BigQuery Storage Write API SchemaTransform wrapper for Python SDK #24783

Closed

ahmedabu98
Contributor

@ahmedabu98 ahmedabu98 commented Dec 26, 2022

Implements a Python SDK wrapper that uses the Storage API SchemaTransform (#23988) to write to BigQuery.

Fixes #21961

@codecov

codecov bot commented Dec 26, 2022

Codecov Report

Merging #24783 (dd07d3c) into master (bb582d8) will increase coverage by 0.09%.
The diff coverage is 51.42%.

@@            Coverage Diff             @@
##           master   #24783      +/-   ##
==========================================
+ Coverage   72.95%   73.04%   +0.09%     
==========================================
  Files         745      742       -3     
  Lines       99174    98942     -232     
==========================================
- Hits        72353    72276      -77     
+ Misses      25455    25303     -152     
+ Partials     1366     1363       -3     
Flag Coverage Δ
python 82.45% <51.42%> (-0.02%) ⬇️

Impacted Files Coverage Δ
sdks/python/apache_beam/io/gcp/bigquery.py 70.29% <30.43%> (-1.06%) ⬇️
sdks/python/apache_beam/transforms/external.py 80.51% <91.66%> (+1.77%) ⬆️
...eam/runners/portability/fn_api_runner/execution.py 92.49% <0.00%> (-0.64%) ⬇️
sdks/go/pkg/beam/core/runtime/graphx/serialize.go 27.34% <0.00%> (-0.23%) ⬇️
...hon/apache_beam/runners/worker/bundle_processor.py 93.54% <0.00%> (-0.13%) ⬇️
sdks/go/pkg/beam/core/runtime/graphx/coder.go 53.36% <0.00%> (-0.12%) ⬇️
sdks/go/pkg/beam/io/mongodbio/common.go 0.00% <0.00%> (ø)
sdks/go/pkg/beam/io/mongodbio/id_range_split.go
...s/go/pkg/beam/io/mongodbio/id_range_restriction.go
sdks/go/pkg/beam/io/mongodbio/id_range_tracker.go
... and 5 more

@ahmedabu98 ahmedabu98 marked this pull request as ready for review January 19, 2023 20:57
@ahmedabu98
Contributor Author

R: @chamikaramj

@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Contributor

@chamikaramj chamikaramj left a comment

Thanks. Looks great. Just nits.

@@ -367,6 +367,32 @@ def discover(expansion_service):
            inputs=proto_config.input_pcollection_names,
            outputs=proto_config.output_pcollection_names)

  @staticmethod
  def discover_one(expansion_service, name):
Contributor

How about "discover_config()"?

Contributor Author

Sure, SGTM.
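
For readers following the naming thread, here is a minimal sketch of what a "discover one config by name" helper could look like. It is not the code from this PR: `TransformConfig`, the substring-matching rule, the error messages, and the `example:` identifiers are all assumptions made for illustration, and the real helper takes an expansion service rather than a list of configs.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class TransformConfig:
    """Hypothetical stand-in for one discovered SchemaTransform config."""
    identifier: str
    inputs: List[str]
    outputs: List[str]


def discover_config(configs: Iterable[TransformConfig],
                    name: str) -> TransformConfig:
    """Return the single config whose identifier contains `name`.

    Assumption: matching is by substring, and both "no match" and
    "more than one match" are treated as errors.
    """
    matches = [c for c in configs if name in c.identifier]
    if not matches:
        raise ValueError(f"No SchemaTransform matches {name!r}")
    if len(matches) > 1:
        found = ", ".join(c.identifier for c in matches)
        raise ValueError(f"{name!r} is ambiguous; it matches: {found}")
    return matches[0]


# Made-up identifiers, used only to exercise the helper.
available = [
    TransformConfig(
        "example:schematransform:bigquery_storage_write:v1",
        ["input"], ["errors"]),
    TransformConfig(
        "example:schematransform:bigquery_storage_read:v1",
        [], ["output"]),
]
config = discover_config(available, "bigquery_storage_write")
```

Failing on an ambiguous name, rather than returning the first hit, keeps a caller from silently picking up the wrong transform when several identifiers share a prefix.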

@chamikaramj
Contributor

Did you try triggering the Jenkins integration test suites for this PR?

@ahmedabu98
Contributor Author

@chamikaramj let me know if the mock service unit test is along the lines of what you had in mind.

Contributor

@chamikaramj chamikaramj left a comment

Thanks. I meant running test_xlang_storage_write via a Jenkins trigger.

    _LOGGER.info(
        "Created dataset %s in project %s", self.dataset_id, self.project)

    self.expansion_service = ('localhost:%s' % os.environ.get('EXPANSION_PORT'))
Contributor

Shouldn't we be using this expansion service when running the pipeline?

Also, does this service already get created?
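
One detail worth noting about the line under review: `os.environ.get('EXPANSION_PORT')` returns `None` when the variable is unset, so the format string quietly produces `'localhost:None'`. The sketch below shows one way a test setup could fail fast instead. The helper name `expansion_service_address` and its parameters are hypothetical and not part of this PR.

```python
import os
from typing import Mapping, Optional


def expansion_service_address(env: Optional[Mapping[str, str]] = None,
                              var: str = "EXPANSION_PORT",
                              host: str = "localhost") -> str:
    """Build 'host:port' from an environment variable, or raise if unusable.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict in tests.
    """
    env = os.environ if env is None else env
    port = env.get(var)
    if not port:
        # Unset or empty: better to stop here than to dial 'localhost:None'.
        raise RuntimeError(
            f"{var} is not set; start an expansion service and export its port")
    if not port.isdigit():
        raise ValueError(f"{var} must be a numeric port, got {port!r}")
    return f"{host}:{port}"


address = expansion_service_address({"EXPANSION_PORT": "8097"})
```

The port `8097` is only a placeholder value. Passing the environment in as a mapping keeps the helper easy to unit test without mutating `os.environ`.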

@ahmedabu98 ahmedabu98 changed the base branch from master to mr-runner February 7, 2023 13:38
@ahmedabu98 ahmedabu98 changed the base branch from mr-runner to master February 7, 2023 13:39
@github-actions github-actions bot added the build label Feb 7, 2023
@ahmedabu98
Contributor Author

Run XVR_PythonUsingJava_Dataflow PostCommit

@ahmedabu98
Contributor Author

Closing this PR in favor of #25521 as this one has gotten too messy.

@ahmedabu98 ahmedabu98 closed this Feb 17, 2023
Successfully merging this pull request may close these issues.

[Feature Request]: Add support for BigQuery Storage Write API in Python
2 participants