-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for -f
command to run commands from files
#120
Conversation
|
||
DataFusion-tui provides an extensible terminal based data analysis tool that uses [DataFusion](https://github.com/apache/arrow-datafusion) as query execution engines. It has drawn inspiration and several features from `datafusion-cli`. In contrast to `datafusion-cli` a focus of `dft` is to provide an interface for leveraging DataFusions extensibility (for example connecting to `ObjectStore`s or querying custom `TableProvider`s). | ||
`dft` provides two interfaces to the [DataFusion](https://github.com/apache/arrow-datafusion) query execution engine: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also took a shot at updating the description here to describe the two different UIs. Let me know what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, and I like the scripting option. Only comment from the updated description is that right now there isnt any MySQL integration.
@@ -300,32 +301,75 @@ pub async fn run_app(cli: cli::DftCli, state: state::AppState<'_>) -> Result<()> | |||
info!("Running app with state: {:?}", state); | |||
let mut app = App::new(state, cli.clone()); | |||
|
|||
match &cli.command { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this just removes a seemingly unecessary level of wrapping of Clap argument parsing
@@ -0,0 +1,236 @@ | |||
// Licensed to the Apache Software Foundation (ASF) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here are a bunch of tests for the CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably move some of these to a common
or something similar test since it covers both the CLI and TUI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment I think it only covers the "CLI" as I have defined it (aka running dft
without any interaction)
In terms of testing the TUI I do think that would be valuable too (like fire up the tui and send key strokes like q
and x
to it). I think there are some frameworks that could help here, but I haven't looked into them
What I was thinking is that we could use the "CLI" mode (aka batch mode) to test the various integrations like delta lake and other table providers and focus the UI testing on just the user interaction (rather than the details of execution)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that makes sense. we'll need to do some testing of the execution triggered by the UI but it definitely doesnt need to be concerned with details - more just making sure the results make it into app state.
|
||
/// run and execute SQL statements and commands from a file, against a context | ||
/// with the given print options | ||
pub async fn exec_from_file(ctx: &ExecutionContext, file: &Path) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the "process commands from a file" code -- it is based on the code in datafusion-cli
## `dft` CLI | ||
|
||
The `dft` CLI is a scriptable engine for executing queries from files. It is used in a similar manner to `datafusion-cli` but with the added benefit of being able to query from multiple data sources. | ||
|
||
For example you can run the contents of `query.sql` with | ||
|
||
```shell | ||
$ dft -f query.sql | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool, love this addition
- "Catalog File" support | ||
- Save table definitions *and* data | ||
- Save parquet metadata from remote object stores |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be useful if we had an issue that provided more details on the aspirations here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #122 and will link here
@@ -121,7 +121,8 @@ impl ExecutionContext { | |||
&self.session_ctx | |||
} | |||
|
|||
pub async fn execute_stream_sql(&mut self, query: &str) -> Result<()> { | |||
/// Executes the specified query and prints the result to stdout | |||
pub async fn execute_stream_sql(&self, query: &str) -> Result<()> { | |||
let df = self.session_ctx.sql(query).await.unwrap(); | |||
let physical_plan = df.create_physical_plan().await.unwrap(); | |||
// We use small batch size because web socket stream comes in small increments (each |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to remove this comment
help = "Execute commands from file(s), then exit", | ||
value_parser(parse_valid_file) | ||
)] | ||
pub file: Vec<PathBuf>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub file: Vec<PathBuf>, | |
pub files: Vec<PathBuf>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call -- this was just copied from datafusion-cli. files
is better
dft provides a rich terminal UI as well as a broad array of pre-integrated | ||
data sources, formats to make querying and analyzing data from | ||
various sources. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dft provides a rich terminal UI as well as a broad array of pre-integrated | |
data sources, formats to make querying and analyzing data from | |
various sources. | |
dft provides a rich terminal UI as well as a broad array of pre-integrated | |
data sources and formats for querying and analyzing data from | |
various sources. |
@@ -0,0 +1,236 @@ | |||
// Licensed to the Apache Software Foundation (ASF) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably move some of these to a common
or something similar test since it covers both the CLI and TUI
Thanks for the contribution Andrew! I think this is a great addition. I added some comments / feedback, none of which need to addressed now. I also tested the app locally with all features (with and without a config file) to ensure it still worked and it still ran as expected. Either one of us can decide to work on the comments in follow on PRs. |
Implemented follow ons in #123 |
Implement suggestions from @matthewmturner in #120
This is my (very tardy) downpayment to help
dft
It adds a simple "run commands from a file" mode via the
-f
argument and adds some basic testsHere is an example:
Things I hope to do next:
-c <sql string>