-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: When using beam.io.kinesis.ReadDataFromKinesis, a java error is raised in DataFlowRunner #27165
Comments
Hello @vishwesh0409, what was the fix that was applied for this ? Tried to look for it but did not find it |
I'll reopen this. I thought it was fixed by #26953 but after updating the beam version to 2.49.0, I'm getting the same problem. |
Yep, also tested this in the current master branch status and getting the same error when running a test job locally so def an issue IMO |
also @vishwesh0409, #26953 was applied in the v2 of the aws kinesis reader and I think current implementation might still rely on v1 maybe (talking about the python code calling the right java package) |
ok I am gonna structure my thought a little bit better here :) So there are 2 #26953 fixed the issues in the I do encounter another issue though which is then when running on my laptop using the I also tried to make the |
Looks like the KinesisReader modules are not properly fixed yet. I observed the same deprication warnings when I tried to run my code on Google Cloud. It should be a simple enough fix tho. |
Basically, there is a need to develop a new |
2.50 release manager here. Please complete work and get it into the main branch in that time, or move this issue to the 2.51 Milestone: https://github.com/apache/beam/milestone/15 |
this was due to issue get reopened and the milestone not removed |
@Abacn @lostluck but then if an issue is open and neither who opened it nor who is participating in the discussion has the knowledge to fix it will the issue be forgotten or someone in the community will pick it up? |
As an open source project, the primary means of fixes is via community and user involvement. As a rule, no one can expect someone else to fix an issue just because it's filed. The story changes somewhat if judged to be a P1 or P0, a regression from the previous release version, or if it's affecting a large Dataflow customer, at which point it's likely that someone from the Dataflow team will drive it to resolution. That does require the customer to have a GCP support ticket in place as well. P3s are typically not release blockers, so it's more likely this remains in the backlog until someone motivated to fix it it comes along. Beam welcomes contributions. |
This issue seems to also affect FlinkRunner. |
@mxm it should affect all runners since the issue is at source code level, the core issue I tried to explain in more details here #27165 (comment) |
The same issue on my side. |
What happened?
"Error message from worker: generic::unknown: org.apache.beam.sdk.util.UserCodeException: java.lang.NullPointerException: Cannot invoke "org.apache.beam.sdk.io.kinesis.ShardReadersPool.getLatestRecordTimestamp()" because "this.shardReadersPool" is null
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:39)
org.apache.beam.sdk.io.Read$UnboundedSourceAsSDFWrapperFn$DoFnInvoker.invokeSplitRestriction(Unknown Source)
org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForSplitRestriction(FnApiDoFnRunner.java:887)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:313)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:245)
org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1788)
org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForPairWithRestriction(FnApiDoFnRunner.java:824)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:313)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:245)
org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1788)
org.apache.beam.fn.harness.FnApiDoFnRunner.access$3000(FnApiDoFnRunner.java:142)
org.apache.beam.fn.harness.FnApiDoFnRunner$NonWindowObservingProcessBundleContext.output(FnApiDoFnRunner.java:2506)
org.apache.beam.sdk.io.Read$OutputSingleSource.processElement(Read.java:1034)
org.apache.beam.sdk.io.Read$OutputSingleSource$DoFnInvoker.invokeProcessElement(Unknown Source)
org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForParDo(FnApiDoFnRunner.java:799)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:313)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:245)
org.apache.beam.fn.harness.BeamFnDataReadRunner.forwardElementToConsumer(BeamFnDataReadRunner.java:213)
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.multiplexElements(BeamFnDataInboundObserver.java:158)
org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:537)
org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:151)
org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:116)
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
org.apache.beam.sdk.util.UnboundedScheduledExecutorService$ScheduledFutureTask.run(UnboundedScheduledExecutorService.java:163)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.beam.sdk.io.kinesis.ShardReadersPool.getLatestRecordTimestamp()" because "this.shardReadersPool" is null
org.apache.beam.sdk.io.kinesis.KinesisReader.getSplitBacklogBytes(KinesisReader.java:172)
org.apache.beam.sdk.io.Read$UnboundedSourceAsSDFWrapperFn$UnboundedSourceAsSDFRestrictionTracker.getProgress(Read.java:999)
org.apache.beam.sdk.transforms.reflect.ByteBuddyDoFnInvokerFactory$DefaultGetSize.invokeGetSize(ByteBuddyDoFnInvokerFactory.java:432)
org.apache.beam.sdk.io.Read$UnboundedSourceAsSDFWrapperFn$DoFnInvoker.invokeGetSize(Unknown Source)
org.apache.beam.fn.harness.FnApiDoFnRunner$SizedRestrictionNonWindowObservingProcessBundleContext.output(FnApiDoFnRunner.java:2415)
org.apache.beam.sdk.io.Read$UnboundedSourceAsSDFWrapperFn.splitRestriction(Read.java:540)
Issue Priority
Priority: 3 (minor)
Issue Components
The text was updated successfully, but these errors were encountered: