Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tighten up type roundtrips ready for postgis launch #304

Merged
merged 1 commit into from
Nov 13, 2020
Merged

Conversation

olsen232
Copy link
Collaborator

@olsen232 olsen232 commented Nov 11, 2020

Description

Lots of little changes to code dealing with types, now that we have two different working copies that support different types.

Approximated types:

  • Made sure that every type supported by sno can be written - somehow - to a working copy, even if just as a string.

  • Implemented "approximated types":
    eg "time" is approximated as "text" in GPKG, and "int8" is approximated as "int16" in Postgis.
    If approximated types aren't changed by the user and remain the closest possible approximation, they don't count as a diff.

  • Added tests to ensure that all types are either roundtrippable or known to be "approximated" in both working copies.

  • Make sure the more exotic types returned by postgis are strings and not eg numeric.Decimal, which we wouldn't be able serialise to a dataset using MessagePack.

  • Changed schema.json - drop attributes that are null eg {"size": null} or {"primaryKeyIndex": null}. Otherwise we have too many ways to represent the same data - we don't need to distinguish between null and missing, and there was already a
    lack of consistency as to which was the right way. So, now the standard is that null attributes are dropped, and loaded schemas are normalised to this standard to prevent spurious diffs.

  • Standardise how X_TYPE_TO_Y_TYPE mappings - which we have a few of by now - are encoded in python.

Related links:

#293

Checklist:

  • Have you reviewed your own change?
  • Have you included test(s)?
  • Have you updated the changelog?

Copy link
Member

@craigds craigds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

tests/test_working_copy_postgis.py Outdated Show resolved Hide resolved
@olsen232 olsen232 merged commit 9ca24f1 into master Nov 13, 2020
@olsen232 olsen232 deleted the types branch November 13, 2020 02:54
}


# Types that can't be roundtrip perfectly in GPKG, and what they end up as.
APPROXIMATED_TYPES = {"interval": "text", "time": "text", "numeric": "text"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is numeric should probably map to a Double in gpkg? Feels like "it's-a-number" is conceptually more important than the accuracy?

reckons/discussion welcome

@hamishcampbell @craigds ?

"boolean": (bool,),
"blob": (bytes,),
"date": (str,),
"date": (str,), # might be datetime.date from DB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should setup the DB adapters to map types consistently to Python? Ie. add the adapters into apsw/sqlite3/psycopg2/etc so we always get a datetime.date/etc.

Though sqlalchemy IIRC it does this consistently already? later

return t
if t is None:
return t
# This makes postgis timestamps behave more like GPKG ones.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgis timestamps-with-time-zones strings could be 2020-11-13T10:00:00+13:00 though? Seems like suffixing Z is too simplistic?



# See https://github.com/psycopg/psycopg2/blob/master/psycopg/typecast_builtins.c
TIMESTAMP_OID = 1114
TIMESTAMP = new_type((TIMESTAMP_OID,), "TIMESTAMP", adapt_timestamp_from_db)
psycopg2.extensions.register_type(TIMESTAMP)

# We mostly want data out of the DB as strings, just as happens in GPKG.
# Then we can serialise it using MessagePack.
# TODO - maybe move adapt-to-string logic to MessagePack?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like it should be a wc-specific function, perhaps with utils/base-classes/mixins/helpers to DRY.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants