SEA: Reduce network calls for synchronous commands #633
base: sea-migration
Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
Some questions inline.
@dataclass
class GetStatementResponse:
    """Representation of the response from getting information about a statement."""

    statement_id: str
    status: StatementStatus
    manifest: ResultManifest
    result: ResultData

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "GetStatementResponse":
        """Create a GetStatementResponse from a dictionary."""
        return cls(
            statement_id=data.get("statement_id", ""),
            status=_parse_status(data),
            manifest=_parse_manifest(data),
            result=_parse_result(data),
        )
I think you would still need this? Probably part of different PRs
        """

        # Create and return a SeaResultSet
        from databricks.sql.backend.sea.result_set import SeaResultSet
Does this lazy import give a perf gain (the module is only loaded when needed)?
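On the lazy-import question above: a function-level import does defer loading until the first call, though the thread does not say whether that is the motivation here (such imports are also a common way to avoid circular imports). A minimal sketch, using the standard-library colorsys module purely as a stand-in and not the connector's own code:

```python
import sys


def red_as_hsv():
    # Function-level import: colorsys is loaded on the first call,
    # not when the enclosing module is imported.
    import colorsys  # stand-in for a heavier or circularly dependent module

    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)


if __name__ == "__main__":
    sys.modules.pop("colorsys", None)  # start from a state where it is not loaded
    print("colorsys" in sys.modules)   # not loaded before the first call
    red_as_hsv()
    print("colorsys" in sys.modules)   # loaded afterwards
```

After the first call the import statement is only a lookup in sys.modules, so any saving is a one-time cost at module import rather than a per-call gain.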
@@ -324,7 +323,7 @@ def _extract_description_from_manifest(
         return columns

     def _results_message_to_execute_response(
-        self, response: GetStatementResponse
+        self, response: ExecuteStatementResponse
Didn't know that GetStatementResponse and ExecuteStatementResponse have the same fields wrt results (interchangeable).
Yes, I did not realise this at first either, but this can be confirmed by comparing the responses in the REST reference as well:
ExecuteStatementResponse: https://docs.databricks.com/api/workspace/statementexecution/executestatement
GetStatementResponse: https://docs.databricks.com/api/workspace/statementexecution/getstatement
This also makes sense to me logically: the purpose of the GET is to get the info related to a statement execution.
@@ -378,7 +399,7 @@ def _check_command_not_in_failed_or_closed_state(

     def _wait_until_command_done(
         self, response: ExecuteStatementResponse
-    ) -> CommandState:
+    ) -> ExecuteStatementResponse:
Seems odd to me that this method does polling and still ends up returning ExecuteStatementResponse. Semantically, this is a response to ExecuteRequest.
@@ -574,9 +591,25 @@ def get_query_state(self, command_id: CommandId) -> CommandState:
            path=self.STATEMENT_PATH_WITH_ID.format(sea_statement_id),
            data=request.to_dict(),
        )
        response = ExecuteStatementResponse.from_dict(response_data)
Is it okay to return ExecuteResponse as a result of Polling?
What type of PR is this?
Description
In execute_command, we first send the execute request to the server. If the request is synchronous, we then poll for the request state until it is no longer pending, and finally make an additional GET request to the server to get the final request information. There are two areas of improvement:
- If the statement completes within the wait_timeout, then we need not poll for completion or make another GET request to the server. We can immediately construct our ResultSet with the provided response.
- The last poll response already contains what is needed to construct the ResultSet. We need not make another GET request to the server after we are done polling and can instead utilise the last response provided.
How is this tested?
Related Tickets & Documents
N/A
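The flow outlined under Description can be sketched roughly as follows. This is an illustration under stated assumptions rather than the connector's actual code: StatementResponse, execute_sync, FakeServer, and the post/get callables are hypothetical names, and the payload shape is simplified.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# States in which the statement has not finished yet.
PENDING_STATES = {"PENDING", "RUNNING"}


@dataclass
class StatementResponse:
    """Hypothetical stand-in for the shared execute/get response shape."""

    statement_id: str
    state: str
    result: Optional[Dict[str, Any]] = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "StatementResponse":
        return cls(
            statement_id=data.get("statement_id", ""),
            state=data.get("status", {}).get("state", "PENDING"),
            result=data.get("result"),
        )


def execute_sync(
    post: Callable[[str], Dict[str, Any]],
    get: Callable[[str], Dict[str, Any]],
    statement: str,
    poll_interval: float = 0.0,
) -> StatementResponse:
    """Run a statement synchronously with as few network calls as possible."""
    response = StatementResponse.from_dict(post(statement))
    # Improvement 1: if the statement finished within the wait timeout,
    # the loop body never runs and the execute response is returned as-is.
    while response.state in PENDING_STATES:
        time.sleep(poll_interval)
        # Improvement 2: keep each poll response, so the last one can be
        # returned directly instead of issuing one more GET afterwards.
        response = StatementResponse.from_dict(get(response.statement_id))
    return response


class FakeServer:
    """Hypothetical in-memory server that replays a list of states."""

    def __init__(self, states: List[str]) -> None:
        self._states = list(states)
        self.calls = 0

    def _next(self) -> Dict[str, Any]:
        self.calls += 1
        state = self._states.pop(0)
        result = {"data_array": [["1"]]} if state == "SUCCEEDED" else None
        return {"statement_id": "stmt-1", "status": {"state": state}, "result": result}

    def post(self, statement: str) -> Dict[str, Any]:
        return self._next()

    def get(self, statement_id: str) -> Dict[str, Any]:
        return self._next()
```

With FakeServer(["SUCCEEDED"]), execute_sync makes a single call; with FakeServer(["PENDING", "RUNNING", "SUCCEEDED"]), it makes three (one execute and two polls) and no trailing GET.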