[REFACTOR] Move DF reformat from StatementExecutionManagerImpl to QueryResultWriterImpl #701

Merged

noCharger
Collaborator

Description

Move the DataFrame (DF) reformat step from StatementExecutionManagerImpl to QueryResultWriterImpl, so that users with a custom StatementExecutionManager can still use the default QueryResultWriterImpl to write results to the OpenSearch result index for a custom data source.
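
For readers skimming the change, here is a minimal sketch of the split described above. It is an illustration under stated assumptions, not the actual implementation: `Frame` is a hypothetical stand-in for a Spark DataFrame so the sketch runs without Spark, the trait signatures are simplified, and the result columns are invented for the example. Only the method names `executeStatement`, `reformatDataFrame`, and `writeDataFrame` come from this PR.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a Spark DataFrame (assumption, not the real type).
final case class Frame(columns: Seq[String], rows: Seq[Seq[Any]])

// Simplified signatures; the real traits take Spark and Flint types.
trait StatementExecutionManager {
  def executeStatement(statement: String): Frame
}

trait QueryResultWriter {
  def reformatDataFrame(df: Frame, statement: String, startTime: Long): Frame
  def writeDataFrame(df: Frame, statement: String): Unit
}

// A custom manager only has to produce the raw query result.
class CustomExecutionManager extends StatementExecutionManager {
  def executeStatement(statement: String): Frame =
    Frame(Seq("id"), Seq(Seq(1), Seq(2)))
}

// Stand-in for the default writer: it owns the result-index layout.
class DefaultResultWriter extends QueryResultWriter {
  // Records what would have been written to the result index.
  val written: ListBuffer[Frame] = ListBuffer.empty[Frame]

  def reformatDataFrame(df: Frame, statement: String, startTime: Long): Frame = {
    val runTime = System.currentTimeMillis() - startTime
    // Hypothetical column names, chosen only for this illustration.
    Frame(
      Seq("result", "schema", "query", "queryRunTime"),
      Seq(Seq(df.rows, df.columns, statement, runTime)))
  }

  def writeDataFrame(df: Frame, statement: String): Unit = written += df
}

object RefactorSketch {
  // The caller wires any manager to any writer; reformatting lives in the writer.
  def run(manager: StatementExecutionManager, writer: QueryResultWriter, statement: String): Frame = {
    val startTime = System.currentTimeMillis()
    val df = manager.executeStatement(statement)
    val formatted = writer.reformatDataFrame(df, statement, startTime)
    writer.writeDataFrame(formatted, statement)
    formatted
  }
}
```

Because the reformat logic sits behind the writer interface in this sketch, swapping in a different `StatementExecutionManager` does not require it to know the result-index layout.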

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Louis Chu <clingzhi@amazon.com>
statementsExecutionManager.executeStatement(flintStatement)
val startTime = System.currentTimeMillis()
val df = statementsExecutionManager.executeStatement(flintStatement)
queryResultWriter.reformatDataFrame(df, flintStatement, startTime)
Collaborator

Just wondering why the query result writer has to expose this new API rather than doing this inside the writeDataFrame API. Does anything happen after this call and before writeDataFrame is called?

Collaborator Author

@noCharger noCharger Sep 28, 2024

just wonder why query result writer has to expose this new API and cannot do this in writeDataFrame API? Anything happen after this and before writeDataFrame called?

Reformatting the data calls the 'collect' API on the driver node, which triggers an execution. If only spark.sql() is called, without an execution triggered on the same thread, Spark does not process the query and the thread appears to sit idle.

Verified in the integration tests (IT) that the query cannot be processed in this case:

assertion failed: Timeout occurred after 60000 milliseconds waiting for query result.
java.lang.AssertionError: assertion failed: Timeout occurred after 60000 milliseconds waiting for query result.
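
To illustrate the lazy-evaluation point, here is a loose analogy in plain Scala. It assumes nothing about the Flint code and uses no Spark API: a collection view stands in for the lazy DataFrame plan, building the view plays the role of spark.sql(), and forcing it with `toList` plays the role of `collect`. The object and method names are hypothetical.

```scala
object LazyPlanSketch {
  // Returns (work done before the action, work done after it, resulting rows).
  def demo(): (Int, Int, List[Int]) = {
    var executed = 0

    // Building the pipeline only describes the work, like spark.sql() returning a DataFrame.
    val plan = (1 to 3).view.map { x => executed += 1; x * 2 }
    val beforeAction = executed

    // Forcing the view runs the work, like an action such as collect().
    val rows = plan.toList

    (beforeAction, executed, rows)
  }

  def main(args: Array[String]): Unit = {
    val (before, after, rows) = demo()
    println(s"before action: $before, after action: $after, rows: $rows")
  }
}
```

In Spark the same distinction holds between transformations, which are lazy, and actions such as `collect`, which trigger the job.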

Added more comments in 19ac42b.

Signed-off-by: Louis Chu <clingzhi@amazon.com>
@noCharger noCharger merged commit d76e0cd into opensearch-project:main Sep 30, 2024
4 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 30, 2024
…ryResultWriterImpl (#701)

* Refactor query result writer

Signed-off-by: Louis Chu <clingzhi@amazon.com>

* Add more scala doc and update sbt

Signed-off-by: Louis Chu <clingzhi@amazon.com>

---------

Signed-off-by: Louis Chu <clingzhi@amazon.com>
(cherry picked from commit d76e0cd)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
noCharger pushed a commit that referenced this pull request Sep 30, 2024
…ryResultWriterImpl (#701) (#716)

* Refactor query result writer

* Add more scala doc and update sbt

---------

(cherry picked from commit d76e0cd)

Signed-off-by: Louis Chu <clingzhi@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
noCharger pushed a commit that referenced this pull request Sep 30, 2024
noCharger added a commit to noCharger/opensearch-spark that referenced this pull request Oct 2, 2024
3 participants