Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for downloading flights from alternate endpoints #13900

Closed

Conversation

avantgardnerio
Copy link

@avantgardnerio avantgardnerio commented Aug 16, 2022

  • Add support for downloading flights from alternate EPs
    • arrow-ballista uses multiple executors to run a query in parallel, and this PR allows them each to serve their result
  • Fix NPE when there are no extraParams
  • Exclude more packages from Maven shade so developers can attach a debugger when integration testing shaded jar
    • e.g. when testing from one's favorite SQL IDE (DataGrip, SQLCommander, etc)
  • Fix dep.protobuf.version build error
  • Bump to v9 to match main repo

@jduo @jayhomn-bitquill @andygrove

See rafael-telles#42 for more detail.

@@ -87,3 +87,6 @@ cpp/Brewfile.lock.json
java-dist/
java-native-c/
java-native-cpp/

# files altered by build
java/flight/flight-jdbc-driver/src/main/resources/properties/flight.properties
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file seems to get generated each maven build

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note #13800 is the up-to-date PR and some of these things were addressed there

@@ -16,7 +16,7 @@
<parent>
<artifactId>arrow-flight</artifactId>
<groupId>org.apache.arrow</groupId>
<version>8.0.0-SNAPSHOT</version>
<version>9.0.0-SNAPSHOT</version>
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump version

@@ -93,7 +98,6 @@
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
<version>${dep.protobuf.version}</version>
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This caused the build to fail

<exclude>org.slf4j.**</exclude>
<exclude>org.apache.calcite.**</exclude>
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excluding these from shade allowed me to attach a debugger while running the shaded jar from DataGrip, solving some problems I couldn't re-create in junit tests.


final Map<String, String> keyValuePairs = UrlParser.parse(extraParams, "&");
resultMap.putAll(keyValuePairs);
if(extraParams != null) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This NPEd when there were no extraParams


import java.util.concurrent.CompletableFuture;

public class FlightClientCloser implements FlightStream {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensures FlightSqlClient is closed when the Stream is closed. I'm not sure if this is the right way to do it, but it seemed the least disruptive to the lifecycle of the driver and Future stuff.

URI uri = ep.getLocations().isEmpty() ? null : ep.getLocations().get(0).getUri();
FlightSqlClient sqlClient = this.factory.createConnection(uri);
FlightStream stream = sqlClient.getStream(ep.getTicket(), getOptions());
FlightClientCloser closer = new FlightClientCloser(sqlClient, stream);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap the stream to ensure it gets closed.

* Config options shared by both the {@link ArrowFlightSqlClientHandler.Builder}
* and the {@link ConnectionFactory}.
*/
static class Config {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the builder config into this data class, so it can be re-used later from the factory without breaking the builder API.

@avantgardnerio
Copy link
Author

This was a redo of rafael-telles#42

Brent Gardner added 4 commits August 17, 2022 14:38
Try to publish

Attach workspace

Don't rebuild every time

Only need to deploy jdbc-driver

Only need to deploy jdbc-driver

Might work this time?

Keep trying

Try to tag

Add docker

Build from correct directory

typo
@avantgardnerio
Copy link
Author

Will re-open as a new PR against the apache/arrow repo.

@lidavidm
Copy link
Member

@avantgardnerio this came up on dev@ today - are you still planning on re-opening this? Or would you mind if another contributor picked up this PR and continued it?

@avantgardnerio
Copy link
Author

are you still planning on re-opening this

At my company, it was concluded that for our use-case, proxying the flights through the "orchastrator" scheduler in Ballista was preferable from a deployment perspective, so I can't justify spending any company time on this. So the short answer is "no".

would you mind if another contributor picked up this PR

Of course not, happy to help with backstory, advice, or anything else I can contribute!

@lidavidm
Copy link
Member

Understandable. Thank you!

I filed #34532 for posterity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants