-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic export command #988
Basic export command #988
Conversation
Uses python feature API + OGR drivers, rather than the Kart python driver in contrib/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're likely going to want to set some format-specific defaults for config options eventually. Though I guess we can punt that by saying "we follow GDALs defaults, here's how to change them"
It'd be handy to set OGR_CURRENT_DATE
(GPKG), DBF_DATE_LAST_UPDATE
(SHP), to commit time.
And we should add whatever metadata/descriptions/titles/etc we have avaialble.
How are field-names laundered (eg: >8 chars for shapefiles)? At a minimum it'd be nice to show it's happening (eg: 'MyReallyLongField' has been truncated to 'MYREALLYLO'
) — I'm not sure whether ogr2ogr does that or can expose it.
"--primary-key-as-field/--no-primary-key-as-field", | ||
is_flag=True, | ||
default=True, | ||
help="Include the primary key column as a regular field that is sent to the GDAL driver.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would / wouldn't you want this? Is this exclusive with --primary-key-as-fid
?
@click.option( | ||
"--primary-key-as-fid/--no-primary-key-as-fid", | ||
is_flag=True, | ||
default=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wouldn't we want this to default to true? Or should it be true for some formats and false for others?
IIUC SetFID()
is ignored for formats that require sequential FIDs (like Shapefiles)
# Aliases `kart export` to `kart table-export`. | ||
# One day we might support other types of export, and when we do, this will then | ||
# be a general purpose command that delegates to one or other more specific commands. | ||
# See: how `kart import` delegates to one of table-import, point-cloud-import, raster-import. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some formats can do vectors and rasters (eg: GeoPackage, FileGDB)... how can/should we deal with that in future?
help="Report to the GDAL driver that the FID of each feature is the primary key value.", | ||
) | ||
@click.option( | ||
"--override-geometry", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--override-geometry-type
?
raise click.UsageError(f"Dataset {ds_path} is not a vector or tabular dataset") | ||
|
||
driver, destination = get_driver(destination_spec) | ||
out_ds = driver.CreateDataSource(destination, options=list(dataset_creation_option)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear what this would do if the output already exists? I guess to allow multiple datasets to end up in the same FGDB/GPKG/etc then we should generally allow this.
For some formats the output is a directory (shapefiles); for others it's a file (GPKG), which complicates things a bit.
We don't want the working copy overwritten, that would be bad.
layer_name, ogr_crs, ogr_geom_type, options=list(layer_creation_option) | ||
) | ||
else: | ||
out_layer = out_ds.CreateLayer(layer_name, options=list(layer_creation_option)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here, do we always error if the layer exists already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add some more test cases:
- using dataset creation options
- using layer creation options
- gdal config options (via env)
- existing directories, existing files, etc
- maybe a set of common formats? (SHP, FGDB, GeoJSON, CSV)
- truncated/fixed/clashing field names (SHP in particular)
- error handling / error messages
|
||
|
||
@click.command( | ||
"table-export", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing option - CRS? Exporting into a non-native CRS seems like a fairly common use case.
Uses GDAL's OGR drivers to do so - consult https://gdal.org/drivers/vector/index.html | ||
to find the options specific to a particular driver, which can then be set using | ||
-dsco and -lco just as when using `ogr2ogr`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add a comment about setting GDAL configuration options via the environment.
Uses python feature API + OGR drivers,
rather than the Kart python driver in contrib/
Still documented as experimental, marked as hidden, not yet in CHANGELOG.
Will document more and add to an upcoming release once tested a bit more.
Checklist: