-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change FileFormat
to depends on TaskContext
rather than SessionState
#5843
Conversation
2816fa8
to
bd8a9d2
Compare
@@ -48,7 +48,7 @@ impl FileFormat for AvroFormat { | |||
|
|||
async fn infer_schema( | |||
&self, | |||
_state: &SessionState, | |||
_task_ctx: &TaskContext, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of this PR is to change the SessionState
references in these traits to TaskContext
@@ -1176,7 +1178,14 @@ impl QueryPlanner for DefaultQueryPlanner { | |||
} | |||
} | |||
|
|||
/// Execution context for registering data sources and executing queries | |||
/// Holds state needed for planning queries within the context of a Session. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to clarify SessionState vs TaskContext
Perhaps I am missing something here, but given FileFormat and the closely related ListingTable are part of query planning, and not execution, I'm not sure why they would depend on TaskContext which is the execution state? |
The update here is that @tustvold and I talked through this, and I think we were using different terms / ideas of what "planning" and "execution" mean. I will attempt to add some more docs to make this clearer |
7ad6bbe
to
d23b5e7
Compare
d23b5e7
to
5b79b5e
Compare
I am taking a different direction for now, so closing this PR -- see #1754 for latest plan |
Which issue does this PR close?
Part of #1754
Rationale for this change
I am trying to extract the physical_plan code into its own crate; and to do so I need to remove the circular dependencies between core --> datasource --> execution --> datasource
Since
SessionState
is staying indatafusion
core crate, in order to pull out datasource into its own crate (and only depend on datafusion_execution) it can not depend onSessionState
Conveniently,
TaskContext
is designed to be exactly the part ofSessionState
needed for executionSee more details in #1754 (comment)
What changes are included in this PR?
FileFormat
to depends onTaskContext
rather than SessionStateAre these changes tested?
Covered by existing tests
Are there any user-facing changes?
Technically the
FileFormat
/ListingTable
is a public API; However, I don't think theFileFormat
is widely used outside of DataFusion so I expect this to have minimal impact on usersI expect the change of TableProvider to use
TaskContext
to be the most disruptive and I am saving it for last.