[Bug]: YAML JDBC sources broken in Beam 2.65 #35122

@ryanmadden-google

What happened?

The Beam YAML JDBC sources (ReadFromPostgres, ReadFromSqlServer, ReadFromOracle, etc.) appear to be broken in Beam 2.65. For example, running this pipeline:

pipeline:
  type: chain
  transforms:
    - type: ReadFromPostgres
      config:
        url: "jdbc:sqlserver://my-host:1433/database"
        query: "SELECT * FROM table"
    - type: LogForTesting

produces the following error:

ValueError: Error applying transform "ReadFromPostgres" at line 4: java.lang.IllegalArgumentException: If JDBC type is not specified, then Driver Class Name and Driver Jars must be specified.
        at org.apache.beam.sdk.io.jdbc.JdbcReadSchemaTransformProvider$JdbcReadSchemaTransformConfiguration.validate(JdbcReadSchemaTransformProvider.java:369)
        at org.apache.beam.sdk.io.jdbc.JdbcReadSchemaTransformProvider$JdbcReadSchemaTransformConfiguration.validate(JdbcReadSchemaTransformProvider.java:355)
        at org.apache.beam.sdk.io.jdbc.JdbcReadSchemaTransformProvider$JdbcReadSchemaTransform.expand(JdbcReadSchemaTransformProvider.java:217)
        at org.apache.beam.sdk.io.jdbc.JdbcReadSchemaTransformProvider$JdbcReadSchemaTransform.expand(JdbcReadSchemaTransformProvider.java:171)
        at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:559)
        at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:507)
        at org.apache.beam.sdk.expansion.service.TransformProvider.apply(TransformProvider.java:121)
        at org.apache.beam.sdk.expansion.service.ExpansionService.expand(ExpansionService.java:657)
        at org.apache.beam.sdk.expansion.service.ExpansionService.expand(ExpansionService.java:758)
        at org.apache.beam.model.expansion.v1.ExpansionServiceGrpc$MethodHandlers.invoke(ExpansionServiceGrpc.java:306)
        at org.apache.beam.vendor.grpc.v1p69p0.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
        at org.apache.beam.vendor.grpc.v1p69p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:356)
        at org.apache.beam.vendor.grpc.v1p69p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:861)
        at org.apache.beam.vendor.grpc.v1p69p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at org.apache.beam.vendor.grpc.v1p69p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)

Based on the release and source history, this PR seems like a possible cause. The issue does not reproduce in Beam 2.64.
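
To make the failing condition concrete, here is a minimal Python sketch of the check that the error message describes. It is not the actual Java code in JdbcReadSchemaTransformProvider; the function name and parameters are hypothetical and only mirror the wording of the exception. If the real validation works roughly like this, the pipeline above would fail whenever the JDBC type implied by ReadFromPostgres does not reach the configuration.

```python
from typing import Optional


def validate_jdbc_config(
    jdbc_type: Optional[str],
    driver_class_name: Optional[str],
    driver_jars: Optional[str],
) -> None:
    """Hypothetical mirror of the rule stated in the error message."""
    # When a JDBC type is given, the driver is assumed to be derivable from it.
    if jdbc_type:
        return
    # Without a JDBC type, both driver settings must be supplied explicitly.
    if not driver_class_name or not driver_jars:
        raise ValueError(
            "If JDBC type is not specified, then Driver Class Name "
            "and Driver Jars must be specified."
        )
```

Under this reading, a configuration carrying only `url` and `query` passes when a type is attached and raises when no type is attached and no driver settings are given, which matches the behavior reported here.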

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
