Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic export command #988

Merged
merged 3 commits into from
May 6, 2024
Merged

Basic export command #988

merged 3 commits into from
May 6, 2024

Conversation

olsen232
Copy link
Collaborator

@olsen232 olsen232 commented May 3, 2024

Uses python feature API + OGR drivers,
rather than the Kart python driver in contrib/

Still documented as experimental, marked as hidden, not yet in CHANGELOG.
Will document more and add to an upcoming release once tested a bit more.

Checklist:

  • Have you reviewed your own change?
  • Have you included test(s)?
  • Have you updated the changelog?

Uses python feature API + OGR drivers,
rather than the Kart python driver in contrib/
@olsen232 olsen232 requested review from craigds and rcoup May 3, 2024 05:12
Copy link
Member

@craigds craigds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

kart/tabular/export.py Show resolved Hide resolved
kart/tabular/export.py Show resolved Hide resolved
kart/tabular/export.py Show resolved Hide resolved
kart/tabular/export.py Outdated Show resolved Hide resolved
kart/tabular/export.py Outdated Show resolved Hide resolved
@olsen232 olsen232 merged commit f68f614 into master May 6, 2024
12 checks passed
@olsen232 olsen232 deleted the kart-export-2 branch May 6, 2024 05:29
Copy link
Member

@rcoup rcoup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're likely going to want to set some format-specific defaults for config options eventually. Though I guess we can punt that by saying "we follow GDALs defaults, here's how to change them"

It'd be handy to set OGR_CURRENT_DATE (GPKG), DBF_DATE_LAST_UPDATE (SHP), to commit time.

And we should add whatever metadata/descriptions/titles/etc we have avaialble.

How are field-names laundered (eg: >8 chars for shapefiles)? At a minimum it'd be nice to show it's happening (eg: 'MyReallyLongField' has been truncated to 'MYREALLYLO') — I'm not sure whether ogr2ogr does that or can expose it.

kart/tabular/export.py Show resolved Hide resolved
"--primary-key-as-field/--no-primary-key-as-field",
is_flag=True,
default=True,
help="Include the primary key column as a regular field that is sent to the GDAL driver.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would / wouldn't you want this? Is this exclusive with --primary-key-as-fid?

@click.option(
"--primary-key-as-fid/--no-primary-key-as-fid",
is_flag=True,
default=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't we want this to default to true? Or should it be true for some formats and false for others?

IIUC SetFID() is ignored for formats that require sequential FIDs (like Shapefiles)

Comment on lines +7 to +10
# Aliases `kart export` to `kart table-export`.
# One day we might support other types of export, and when we do, this will then
# be a general purpose command that delegates to one or other more specific commands.
# See: how `kart import` delegates to one of table-import, point-cloud-import, raster-import.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some formats can do vectors and rasters (eg: GeoPackage, FileGDB)... how can/should we deal with that in future?

help="Report to the GDAL driver that the FID of each feature is the primary key value.",
)
@click.option(
"--override-geometry",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--override-geometry-type?

raise click.UsageError(f"Dataset {ds_path} is not a vector or tabular dataset")

driver, destination = get_driver(destination_spec)
out_ds = driver.CreateDataSource(destination, options=list(dataset_creation_option))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear what this would do if the output already exists? I guess to allow multiple datasets to end up in the same FGDB/GPKG/etc then we should generally allow this.

For some formats the output is a directory (shapefiles); for others it's a file (GPKG), which complicates things a bit.

We don't want the working copy overwritten, that would be bad.

layer_name, ogr_crs, ogr_geom_type, options=list(layer_creation_option)
)
else:
out_layer = out_ds.CreateLayer(layer_name, options=list(layer_creation_option))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, do we always error if the layer exists already?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add some more test cases:

  • using dataset creation options
  • using layer creation options
  • gdal config options (via env)
  • existing directories, existing files, etc
  • maybe a set of common formats? (SHP, FGDB, GeoJSON, CSV)
  • truncated/fixed/clashing field names (SHP in particular)
  • error handling / error messages



@click.command(
"table-export",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing option - CRS? Exporting into a non-native CRS seems like a fairly common use case.

Comment on lines +146 to +148
Uses GDAL's OGR drivers to do so - consult https://gdal.org/drivers/vector/index.html
to find the options specific to a particular driver, which can then be set using
-dsco and -lco just as when using `ogr2ogr`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add a comment about setting GDAL configuration options via the environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants