Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSM driver: prefer JSON or other standard to HSTORE #7533

Closed
opk12 opened this issue Mar 29, 2023 · 5 comments · Fixed by #7538
Closed

OSM driver: prefer JSON or other standard to HSTORE #7533

opk12 opened this issue Mar 29, 2023 · 5 comments · Fixed by #7538
Assignees

Comments

@opk12
Copy link

opk12 commented Mar 29, 2023

Expected behavior and actual behavior.

I am a newbie to databases and GDAL. I set up a workflow around a standard SQLite or GeoPackage made out of OpenStreetMap (I tested PBF and GeoJSON sources). As described in the OSM driver docs, ogr2ogr made an HSTORE-formatted column other_tags to hold the OSM tags that do not have a dedicated column (i.e. are not mentioned in osmconf.ini).

As GDAL's parser hstore_get_value() (docs) does not exist in sqlite3, I use regexes when running Python / sqlite3 on another machine, which feels conceptually wrong and is hard or impossible to do safely. Or I could try this workaround to do hstore_to_json and alter the DB. Also GDAL's implementation does not support SELECT DISTINCT (#3671). I cannot even find the function in the PostgreSQL docs and I feel I'm wasting time to learn idiosyncrasies of a non-reusable tool.

What about using JSON or another standard format instead? SQLite has native support for JSON. If the user doesn't know the JSON-specific SQL, they can pass the column to the host programming language, which most likely has a parser. The syntax and concepts are shared to some extent across SQLite's, MySQL's and PostgreSQL's implementations, so time spent to learn this standard would not be wasted.

@jratike80
Copy link
Collaborator

GDAL does support the SELECT DISTINCT case, it is just that the OGR SQL dialect does not. Test with
ogrinfo -sql "SELECT DISTINCT hstore_get_value(other_tags, 'bicycle') from lines" liechtenstein-latest.osm.pbf -dialect sqlite

What is the function that you do not find from PostgreSQL documentation?
I believe that the hstore functions https://www.postgresql.org/docs/current/hstore.html are rather well known by folks who use OSM data because the osm2pgsql https://osm2pgsql.org/ tool can create and populate a hstore column into PostgreSQL database. So time spent to learn also this standard would not be wasted. If the aim is to read GeoPackage with only native SQLite functions then saving tags as json feels reasonable. But how do you plan to parse the GeoPackage geometries in that case?

@opk12
Copy link
Author

opk12 commented Mar 30, 2023

Ah, many thanks for the pointers. Sorry for the noise.

@rouault
Copy link
Member

rouault commented Mar 30, 2023

Re-opening as proposing JSON output as an alternative seems a reasonable enhancement as hstore is a kind of ancestor for json, and json is more widely supported

@rouault rouault self-assigned this Mar 30, 2023
rouault added a commit to rouault/gdal that referenced this issue Mar 30, 2023
…open option, as an alternative for HSTORE formatting for other_tags/all_tags fields (fixes OSGeo#7533)
@rouault
Copy link
Member

rouault commented Mar 30, 2023

implemented in #7538

@opk12
Copy link
Author

opk12 commented Apr 1, 2023

@rouault People from the OSM community are excited about this change. Thank you! After the merge I'll propose the news to weeklyOSM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants