Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#24789][prism] Handlers for combine, ParDo, GBK, Flatten #25558

Merged
merged 4 commits into from
Feb 20, 2023

Conversation

lostluck
Copy link
Contributor

Add in the handlers for Combine, ParDo, GBK, Flatten
Adds in an executor interface.

No unit tests largely because these are largely rote Proto transformations, and are best tested in the pipeline context. They'll fully covered in the completed internal package.

See #24789 for more information.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

@codecov
Copy link

codecov bot commented Feb 20, 2023

Codecov Report

Merging #25558 (a1e25d8) into master (6667eb4) will decrease coverage by 0.32%.
The diff coverage is 0.00%.

❗ Current head a1e25d8 differs from pull request most recent head a460d5c. Consider uploading reports for the commit a460d5c to get more accurate results

@@            Coverage Diff             @@
##           master   #25558      +/-   ##
==========================================
- Coverage   72.64%   72.33%   -0.32%     
==========================================
  Files         763      766       +3     
  Lines      101060   101494     +434     
==========================================
  Hits        73413    73413              
- Misses      26226    26660     +434     
  Partials     1421     1421              
Flag Coverage Δ
go 51.62% <0.00%> (-0.71%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...o/pkg/beam/runners/prism/internal/handlecombine.go 0.00% <0.00%> (ø)
.../go/pkg/beam/runners/prism/internal/handlepardo.go 0.00% <0.00%> (ø)
...go/pkg/beam/runners/prism/internal/handlerunner.go 0.00% <0.00%> (ø)
...o/pkg/beam/runners/prism/internal/worker/worker.go 63.42% <0.00%> (-1.17%) ⬇️
...python/apache_beam/runners/worker/worker_status.py 74.66% <0.00%> (-0.67%) ⬇️
sdks/python/apache_beam/transforms/combiners.py 93.05% <0.00%> (-0.39%) ⬇️
...ks/python/apache_beam/runners/worker/sdk_worker.py 89.46% <0.00%> (+0.15%) ⬆️
...hon/apache_beam/runners/worker/bundle_processor.py 94.33% <0.00%> (+0.23%) ⬆️
sdks/go/pkg/beam/core/metrics/dumper.go 53.96% <0.00%> (+4.76%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@lostluck
Copy link
Contributor Author

R: @johannaojeling

This is where things get interesting! This converts/prepares the Pipeline Proto representations to fully expanded alternatives and similar. You'll probably find the comments on how SDFs and Combine composites are expanded interesting.

GroupByKeys and Flattens are pretty straight foward so far and do the appropriate in memory work. That would change depending on how data services evolve (eg. Stop being strictly in memory). But as it ism this is an OK abstraction to start.

The code living in an internal package gives us the freedom to evolve the abstractions we're using as needed.

@github-actions
Copy link
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@lostluck lostluck changed the title [prism] Handlers for combine, ParDo, GBK, Flatten [#24789][prism] Handlers for combine, ParDo, GBK, Flatten Feb 20, 2023
@brucearctor
Copy link
Contributor

LGTM - but I've only been skimming the code here [ and https://github.com//pull/25556 and https://github.com//pull/25557 ... and your other related/previous PRs ]. Fantastic contribution, so glad you've gotten it this far already!

Copy link
Contributor

@johannaojeling johannaojeling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is all excellent!

sdks/go/pkg/beam/runners/prism/internal/handlepardo.go Outdated Show resolved Hide resolved
sdks/go/pkg/beam/runners/prism/internal/handlerunner.go Outdated Show resolved Hide resolved
SDKGBK bool // Sets whether the GBK should be handled by the SDK, if possible by the SDK.
}

func Runner(config any) *runner {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Combine(), ParDo() and Runner() functions are exported with an unexported return type, but you might have an intention with this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a public API, returning unexported types is annoying for users, but it's less so under the isolation of the internal folder. It's likely the handlers will end up in their own package, and get exported out that way, with the types changed as appropriate.

While I've got the interfaces (preparer, executer) they're there to reduce the chance the abstraction will be broken. They'll be refined as other features are added to Prism, like Fusion, and handling the TestStream, and PubSubm URNs (and more).

@lostluck lostluck merged commit 85ebc2f into master Feb 20, 2023
@lostluck lostluck deleted the prism-handlers branch February 20, 2023 16:57
lostluck added a commit to lostluck/beam that referenced this pull request Feb 20, 2023
…he#25558)

* [prism] add in handlers

* [prism] executor interface

* delint - rm processor

* [prism] comments
ruslan-ikhsan pushed a commit to akvelon/beam that referenced this pull request Mar 10, 2023
…he#25558)

* [prism] add in handlers

* [prism] executor interface

* delint - rm processor

* [prism] comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants