Is your feature request related to a problem or challenge?
Some people want to use DataFusion as a read-only engine (as we do in IOx, for example). We do not want to allow users to:
Create memory-backed tables (the state is ephemeral, so they won't be able to use them)
Write to local files (via COPY), as this is a security issue
Set session configuration (e.g. batch_size), as this can cause unwanted memory use or denial-of-service attacks
Other users, such as datafusion-cli, want to allow all the features.
Also, DataFusion has gained additional capabilities, such as the ability to INSERT into the included table providers like Csv and Json. It may not be obvious to builders on top of DataFusion that such modifications are allowed, and depending on their use case they may actually be a security risk.
While working on #7272 from @UlfarErl, it became pretty clear that the distinction between APIs that handle read-only SQL and SQL that modifies the catalog is confusing. Additionally, the new COPY command is a normal execution plan, and thus without additional work on IOx (see https://github.com/influxdata/influxdb_iox/pull/8515#discussion_r1297654343) DataFusion could allow users to run COPY (and overwrite local files, etc.).
Describe the solution you'd like
Thus I propose adding an API on SessionContext and SessionState that takes specific options controlling which types of operations are supported:
    impl SessionContext {
        /// Existing API will allow all types of SQL:
        pub async fn sql(&self, sql: &str) -> Result<DataFrame> {
            self.sql_with_options(
                sql,
                SQLOptions {
                    allow_ddl: true,
                    allow_dml: true,
                    allow_config: true,
                },
            )
            .await
        }

        /// New API will generate errors if a type of command is not allowed
        pub async fn sql_with_options(&self, sql: &str, options: SQLOptions) -> Result<DataFrame> {
            let plan = ...;
            if is_dml(&plan) && !options.allow_dml {
                return plan_err!("DML Plan {plan} is not allowed");
            }
            ...
        }
    }
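To make the intended check concrete, here is a minimal, standalone sketch of how such an options struct could gate statement types. It does not use DataFusion at all: `PlanKind`, `NotAllowed`, `verify`, `allow_all`, and `read_only` are hypothetical names invented for illustration, and only the three `allow_*` fields come from the proposal. Treating COPY as DML is also an assumption of this sketch.

```rust
use std::fmt;

/// Hypothetical coarse classification of a planned statement
/// (a stand-in for inspecting a real logical plan).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanKind {
    /// Read-only query, e.g. SELECT
    Query,
    /// Catalog changes, e.g. CREATE TABLE
    Ddl,
    /// Data changes, e.g. INSERT (COPY is assumed to belong here)
    Dml,
    /// Session configuration, e.g. SET
    Config,
}

/// Illustrative options struct; the field names follow the proposal.
#[derive(Debug, Clone, Copy)]
pub struct SQLOptions {
    pub allow_ddl: bool,
    pub allow_dml: bool,
    pub allow_config: bool,
}

/// Hypothetical error returned when a statement type is rejected.
#[derive(Debug, PartialEq, Eq)]
pub struct NotAllowed(pub PlanKind);

impl fmt::Display for NotAllowed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?} plan is not allowed", self.0)
    }
}

impl std::error::Error for NotAllowed {}

impl SQLOptions {
    /// Everything permitted, matching what the existing `sql` API would pass.
    pub fn allow_all() -> Self {
        Self { allow_ddl: true, allow_dml: true, allow_config: true }
    }

    /// Only plain queries permitted, as a read-only engine would want.
    pub fn read_only() -> Self {
        Self { allow_ddl: false, allow_dml: false, allow_config: false }
    }

    /// Returns an error if this kind of statement is disabled.
    pub fn verify(&self, kind: PlanKind) -> Result<(), NotAllowed> {
        let allowed = match kind {
            PlanKind::Query => true,
            PlanKind::Ddl => self.allow_ddl,
            PlanKind::Dml => self.allow_dml,
            PlanKind::Config => self.allow_config,
        };
        if allowed { Ok(()) } else { Err(NotAllowed(kind)) }
    }
}
```

With this shape, a read-only deployment would call `SQLOptions::read_only().verify(kind)` before executing a plan, while a tool like datafusion-cli would pass `SQLOptions::allow_all()`.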
I've honestly lost context on this, and haven't been following the recent additions, so I'll refrain from comment. So long as we're able to maintain some notion of atomicity, it's fine by me 😅
Describe alternatives you've considered
No response
Additional context
Related to an earlier proposal #4720