-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tighten up type roundtrips ready for postgis launch #304
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
} | ||
|
||
|
||
# Types that can't be roundtrip perfectly in GPKG, and what they end up as. | ||
APPROXIMATED_TYPES = {"interval": "text", "time": "text", "numeric": "text"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion is numeric should probably map to a Double in gpkg? Feels like "it's-a-number" is conceptually more important than the accuracy?
reckons/discussion welcome
"boolean": (bool,), | ||
"blob": (bytes,), | ||
"date": (str,), | ||
"date": (str,), # might be datetime.date from DB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should setup the DB adapters to map types consistently to Python? Ie. add the adapters into apsw/sqlite3/psycopg2/etc so we always get a datetime.date
/etc.
Though sqlalchemy IIRC it does this consistently already? later
return t | ||
if t is None: | ||
return t | ||
# This makes postgis timestamps behave more like GPKG ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgis timestamps-with-time-zones strings could be 2020-11-13T10:00:00+13:00
though? Seems like suffixing Z
is too simplistic?
|
||
|
||
# See https://github.com/psycopg/psycopg2/blob/master/psycopg/typecast_builtins.c | ||
TIMESTAMP_OID = 1114 | ||
TIMESTAMP = new_type((TIMESTAMP_OID,), "TIMESTAMP", adapt_timestamp_from_db) | ||
psycopg2.extensions.register_type(TIMESTAMP) | ||
|
||
# We mostly want data out of the DB as strings, just as happens in GPKG. | ||
# Then we can serialise it using MessagePack. | ||
# TODO - maybe move adapt-to-string logic to MessagePack? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feels like it should be a wc-specific function, perhaps with utils/base-classes/mixins/helpers to DRY.
Description
Lots of little changes to code dealing with types, now that we have two different working copies that support different types.
Approximated types:
Made sure that every type supported by sno can be written - somehow - to a working copy, even if just as a string.
Implemented "approximated types":
eg "time" is approximated as "text" in GPKG, and "int8" is approximated as "int16" in Postgis.
If approximated types aren't changed by the user and remain the closest possible approximation, they don't count as a diff.
Added tests to ensure that all types are either roundtrippable or known to be "approximated" in both working copies.
Make sure the more exotic types returned by postgis are strings and not eg numeric.Decimal, which we wouldn't be able serialise to a dataset using MessagePack.
Changed schema.json - drop attributes that are null eg {"size": null} or {"primaryKeyIndex": null}. Otherwise we have too many ways to represent the same data - we don't need to distinguish between null and missing, and there was already a
lack of consistency as to which was the right way. So, now the standard is that null attributes are dropped, and loaded schemas are normalised to this standard to prevent spurious diffs.
Standardise how X_TYPE_TO_Y_TYPE mappings - which we have a few of by now - are encoded in python.
Related links:
#293
Checklist: