[Bug][Prism]: Unsupported feature Create/MaybeReshuffle (Python SDK) #32733

Open
1 of 17 tasks
victorrgez opened this issue Oct 10, 2024 · 1 comment

Comments

@victorrgez

What happened?

Following the release of PrismRunner for the Python SDK in Apache Beam 2.59.0, we are trying to adapt our pipeline so that the same code runs both on GCP and in local debugging (DirectRunner lacks many features). The only difference is the source: on Dataflow we read from Pub/Sub, whereas locally we load test JSON files as Python dictionaries and turn them into a PCollection with beam.Create.
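
The setup described above could be sketched roughly as follows. This is a minimal illustration, not the reporter's code: the names `load_test_messages`, `decode_message`, `build_source`, the `local` flag, and the `directory` and `subscription` parameters are all hypothetical. Only `beam.Create`, `beam.io.ReadFromPubSub`, and `beam.Map` are Beam APIs.

```python
import json
from pathlib import Path


def load_test_messages(directory):
    """Read every *.json file in `directory` into a Python dict (hypothetical helper)."""
    return [
        json.loads(path.read_text())
        for path in sorted(Path(directory).glob("*.json"))
    ]


def decode_message(raw_bytes):
    """Decode a Pub/Sub payload, assumed here to be UTF-8 encoded JSON."""
    return json.loads(raw_bytes.decode("utf-8"))


def build_source(pipeline, local, directory=None, subscription=None):
    """Return a PCollection of dicts from local files or from Pub/Sub.

    `local`, `directory` and `subscription` are illustrative parameters.
    """
    # Imported lazily so the file helpers above work without Beam installed.
    import apache_beam as beam

    if local:
        # Local debugging: materialize the test dictionaries with beam.Create.
        return pipeline | "Create" >> beam.Create(load_test_messages(directory))

    # Dataflow: read raw bytes from Pub/Sub and decode them into dictionaries.
    return (
        pipeline
        | "ReadPubSub" >> beam.io.ReadFromPubSub(subscription=subscription)
        | "Decode" >> beam.Map(decode_message)
    )
```

With this shape, everything downstream of `build_source` stays identical between the two environments, which is why support for `beam.Create` matters for the local case.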

The problem we are facing is that this feature (beam.Create) is not implemented in PrismRunner yet and gives us the following error:

INFO:apache_beam.utils.subprocess_server:2024/10/10 12:21:11 ERROR unable to run job cause="unimplemented features" jobname=job errors="unsupported feature \"PTransform.Spec.Urn\" set with value beam:transform:pickled_python:v1 Create/MaybeReshuffle"
INFO:apache_beam.utils.subprocess_server:2024/10/10 12:21:11 ERROR job failed job.key=job-001 job.name=job error="found 1 uses of features unimplemented in prism in job job:\nunsupported feature \"PTransform.Spec.Urn\" set with value beam:transform:pickled_python:v1 Create/MaybeReshuffle"
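
A reproduction would presumably look something like the sketch below. The sample rows and the `run_repro` function name are illustrative assumptions, and this exact snippet has not been verified against the failing job. It only shows the kind of pipeline in which a step labelled `Create` is submitted to Prism.

```python
# Hypothetical sample data standing in for the dictionaries read from test JSON files.
SAMPLE_ROWS = [{"id": 1, "payload": "a"}, {"id": 2, "payload": "b"}]


def run_repro(runner="PrismRunner"):
    """Build and run a tiny pipeline whose only source is beam.Create."""
    # Imported lazily so that defining this function does not require Beam.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions([f"--runner={runner}"])
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Create" >> beam.Create(SAMPLE_ROWS)  # step named in the error above
            | "Print" >> beam.Map(print)
        )
```

Calling `run_repro()` submits the job to Prism, while `run_repro("DirectRunner")` runs the same pipeline on DirectRunner for comparison.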

We are opening this issue because this gap is not documented in the list of missing features, and we want to make sure it does not slip off the roadmap: beam.Create is a basic transform for local development, since it removes the dependency on complex I/O resources. The documented list reads:

In the [2.59.0 release](https://beam.apache.org/blog/beam-2.59.0/), Prism passes most runner validations tests with the exceptions of pipelines using the following features:

OrderedListState, OnWindowExpiry (eg. GroupIntoBatches), CustomWindows, MergingWindowFns, Trigger and WindowingStrategy associated features, Bundle Finalization, Looping Timers, and some Coder related issues such as with Python combiner packing, and Java Schema transforms, and heterogenous flatten coders. Processing Time timers do not yet have real time support.

Thank you!

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@victorrgez
Author

.add-labels prism

@github-actions github-actions bot added the prism label Oct 10, 2024