Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage write api - support default stream #226

Merged
merged 6 commits into from
Jun 12, 2024
Merged

Conversation

MatanLevy
Copy link
Contributor

This pull request aims to support bigquery emulator for clients using storage write api with the default stream.

Two main requirements (based on google documentation) for default stream that are part of that pull request:

default stream creation

Additionally, every table has a special stream named ‘_default’ to which data can be written. This stream doesn’t need to be created using CreateWriteStream

commited stream in AppendRows

For COMMITTED streams (which includes the default stream), data is
visible immediately upon successful append.

@hankd-v
Copy link

hankd-v commented Nov 9, 2023

+1 to this - @goccy is this repo actively maintained still?

@HiddenSplioGuy
Copy link

HiddenSplioGuy commented Nov 10, 2023

+1 actively need this as well as the default streaming method uses the "_default" stream and you can't use connections pool without that : Trying to enable connection pool in non-default stream.

I tried a workaround for this consisting of creating the "_default" write stream after the table but It seems like the API doesn't allow us to set the name of the stream -> OUTPUT ONLY.

var streamName = WriteStreamName.of(bQProjectId, datasetId, tableId, "_default").toString();
CreateWriteStreamRequest request =
    CreateWriteStreamRequest.newBuilder()
        .setParent(TableName.of(bQProjectId, datasetId, tableId).toString())
        .setWriteStream(
            WriteStream.newBuilder()
                .setType(WriteStream.Type.COMMITTED)
                .setName(streamName)
                .build())
        .build();
WriteStream response = bigQueryWriteClient.createWriteStream(request);
System.out.println("WriteStream created " + response);

With the following code, the stream created have a randomly generated name.

@araj-dev
Copy link

araj-dev commented Nov 20, 2023

+1 i need this with go client.

Here's my example Go code using the managedwriter package to request the bigquery emulator:

	ctx := context.Background()
	grpcCtx, _ := context.WithTimeout(ctx, 1*time.Second)
	conn, _ := grpc.DialContext(grpcCtx, "localhost:9060",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithIdleTimeout(1*time.Second),
	)
	client, err := managedwriter.NewClient(ctx, projectID,
		option.WithGRPCConn(conn),
		option.WithoutAuthentication(),
	) // OK

	pb := &v1.pb{}
	descriptorProto, err := adapt.NormalizeDescriptor(pb.ProtoReflect().Descriptor())
	tableName := fmt.Sprintf("projects/%s/datasets/%s/tables/%s", projectID, datasetID, "{table_name}")
	stream, err := client.NewManagedStream(ctx,
		managedwriter.WithDestinationTable(tableName),
		managedwriter.WithSchemaDescriptor(descriptorProto),
		managedwriter.WithAppendRowsCallOption(gax.WithTimeout(1*time.Second)),
	) // return error
	
	// rpc error: code = Unknown desc = failed to find stream from projects/{project}/datasets/{dataset}/tables/{table}/streams/_default

When calling NewManagedStream() without specifying the stream type option, it internally invokes getWriteStream with the stream name argument projects/{project}/datasets/{dataset}/tables/{table}/streams/_default.

IMO, the API should automatically create a _default stream for each table if it doesn't exist at the time of the call, or alternatively, prepare a _default stream for all tables during the initialization phase.

Here's a quick fix I tried as an alternative, but I'm not sure if it's the right approach: araj-dev#1

@limmatfun
Copy link

+1 for default straem support

Copy link
Owner

@goccy goccy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add test cases for this fixing

@goccy goccy added the reviewed label Apr 6, 2024
@MatanLevy
Copy link
Contributor Author

@goccy added test case for default stream

@winterjung
Copy link

@goccy Could you please merge this PR into the main branch? I also need it.

@goccy
Copy link
Owner

goccy commented Jun 12, 2024

@MatanLevy I would like to check the CI results, so could you push an empty commit to trigger the CI ?

@MatanLevy
Copy link
Contributor Author

@goccy pushed now an empty commit

@goccy goccy self-requested a review June 12, 2024 04:09
@goccy
Copy link
Owner

goccy commented Jun 12, 2024

Thank you for your contribution !!! LGTM 👍

@goccy goccy merged commit 76384ba into goccy:main Jun 12, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants