Add PostGIS GeoJSON support #1564

Merged
merged 2 commits into from
Jun 23, 2022

Conversation

steve-chavez
Member

@steve-chavez steve-chavez commented Jul 22, 2020

A first step towards solving the long-wanted #223!

This feature won't be included in the upcoming release. But I wanted to do a draft and kickstart discussion.

Implementation

PostGIS 3.0 added the ST_AsGeoJSON(record ..) function (not available in prior versions). With it, ST_AsGeoJSON just works on a whole row:

-- with this table
-- create table shops (
--   id   int primary key
-- , address text
-- , shop_geom extensions.geometry(POINT, 4326)
-- );

select st_asgeojson(x) from shops x;

{"type": "Feature", 
 "geometry": {"type":"Point","coordinates":[-71.10044,42.373695]}, 
 "properties": {"id": 1, "address": "1369 Cambridge St"}
} 

It even includes all the non-geometry columns in properties. This means that we don't have to detect the geometry column; ST_AsGeoJSON does this internally.

A table that doesn't have a geometry column will err accordingly:

select st_asgeojson(x) from projects x;

ERROR:  22023: geometry column is missing
LOCATION:  composite_to_geojson, lwgeom_out_geojson.c:197

We take advantage of these for the implementation:

curl -H "Accept: application/geo+json" localhost:3000/shops

{ "type" : "FeatureCollection"
, "features" : [
    {"type": "Feature", "geometry": {"type":"Point","coordinates":[-71.10044,42.373695]}, "properties": {"id": 1, "address": "1369 Cambridge St"}}
  ]
}
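
To make the response shape concrete, here is a minimal Python sketch of the wrapping step. It is not how PostgREST builds the response internally; it only mirrors the structure shown above. The helper names `to_feature` and `to_feature_collection` are hypothetical, and the sample row reuses the `shops` data from this comment.

```python
import json

def to_feature(row, geom_key):
    """Split a row dict into a GeoJSON Feature: the geometry column goes
    under "geometry", every other column under "properties"."""
    properties = {k: v for k, v in row.items() if k != geom_key}
    return {"type": "Feature", "geometry": row[geom_key], "properties": properties}

def to_feature_collection(rows, geom_key):
    """Wrap per-row Features in a FeatureCollection (RFC 7946, section 3.3)."""
    return {
        "type": "FeatureCollection",
        "features": [to_feature(row, geom_key) for row in rows],
    }

# Sample row, with the geometry already expressed as a GeoJSON Geometry object.
shops = [
    {
        "id": 1,
        "address": "1369 Cambridge St",
        "shop_geom": {"type": "Point", "coordinates": [-71.10044, 42.373695]},
    }
]

collection = to_feature_collection(shops, "shop_geom")
print(json.dumps(collection, indent=2))
```

Note that the geometry column name never appears in the output: it is replaced by the fixed "geometry" member, which is why clients don't need to know which column holds the geometry.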

TODO

  • Should we support older PostGIS versions (< 3.0.0)? Not for now; we can point users to the docs for how to do it manually.
  • Should we support the deprecated application/vnd.geo+json media type? We'll see if there's demand for this.
  • According to the GeoJSON RFC, a feature can carry extra attributes, bbox and id. For id, I think it'll be fine to keep it inside the properties attribute and use the standard ST_AsGeoJSON. For bbox, I'm not sure; I think we can omit it for now.
  • geojson should definitely be supported for GET and RPC. Should it also be supported for POST/PUT/DELETE? I think yes, to be consistent. It would only be used for getting the resource after a mutation (return=representation).
  • Resource embedding interaction? It works: the embedded rows go into the properties key.


@steve-chavez steve-chavez force-pushed the geojson branch 2 times, most recently from 47dea30 to 2d58fb3 Compare July 24, 2020 17:32
@steve-chavez
Member Author

Resource embedding is working fine; the embedded resource gets included inside the "properties" key.

Check this test to confirm: 237f4a0#diff-916d27c5920ad2514d327ee39a639b47R56-R71

@ruslantalpa
Contributor

imo the output response shape here is ugly, it might be an RFC for geo data types but the rest of the columns in a row are not properties of the geo object, it's the other way around.
If you are looking for an easy solution (so as to avoid having to detect the column types) one would be to have a special cast type ::GEOASJSON or something, this way people can signal which column is the GEO one (they are already doing it in this PR with a special header).
You only need to add 1 line to this function

pgFmtSelectItem table (f@(fName, jp), Just cast, alias, _) = "CAST (" <> pgFmtField table f <> " AS " <> cast <> " )" <> pgFmtAs fName jp alias
in order to treat this special case and apply st_asgeojson only to the geo column
the url would look like /shops?select=id,shop_geom::GEOASJSON, no need for special headers, response shape stays the same, easy code change.
In case postgrest starts detecting column types in the future (thus not needing the cast), it's easy to remove from frontend code (as opposed to handling the new response shape change) and also postgrest can silently ignore the special cast if need be.

@ruslantalpa
Contributor

Maybe even my suggestion is not necessary since one can define CAST like this https://www.postgresql.org/docs/current/sql-createcast.html and have the geo/json conversion done transparently by the database

@steve-chavez
Member Author

steve-chavez commented Jul 27, 2020

@ruslantalpa The problem with the column casting approach is that it would only give you the Geometry object (the {"type": "Point", "coordinates" ".."} part) and not the Feature object (the wrapping {"type": "Feature", "properties": ".."} part). We need to process the whole row to get the Feature object.

If we don't comply with geojson feature objects, then clients would not be able to easily use ready-made map libs like leaflet, mapbox gl, google maps. And that is really the main use case for geojson.

For example, Leaflet's onEachFeature (see https://leafletjs.com/examples/geojson/) would not work.

@steve-chavez
Member Author

steve-chavez commented Jul 27, 2020

Selecting the geometry column in case of multiple

I think this is already working (check the new test). Basically, select the column with ?select=my_chosen_geom,other_attr and request the geojson media type.

Multiple geometry columns output

Not a common use case, I believe. Users can have a single geometry column with collection types.

That being said, we could work this out in PostgREST. If a table has more than one geometry column, we can join them in a single feature by using a GeometryCollection.

This can be a future enhancement though.
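
As a rough illustration of that idea (not something this PR implements), here is a Python sketch that folds several geometry columns of one row into a single Feature with a GeometryCollection. The function name `merge_geometries` and the column names `entrance_geom` and `parking_geom` are hypothetical.

```python
def merge_geometries(row, geom_keys):
    """Build one Feature from a row with several geometry columns by
    collecting them into a GeometryCollection (RFC 7946, section 3.1.8)."""
    # Skip geometry columns that are NULL for this row.
    geometries = [row[key] for key in geom_keys if row.get(key) is not None]
    properties = {k: v for k, v in row.items() if k not in geom_keys}
    return {
        "type": "Feature",
        "geometry": {"type": "GeometryCollection", "geometries": geometries},
        "properties": properties,
    }

row = {
    "id": 1,
    "entrance_geom": {"type": "Point", "coordinates": [1, 1]},
    "parking_geom": {"type": "Point", "coordinates": [2, 2]},
}
feature = merge_geometries(row, ["entrance_geom", "parking_geom"])
print(feature["geometry"]["type"], len(feature["geometry"]["geometries"]))
```

One trade-off of this shape is that the original column names are lost: the client only sees an ordered list of geometries.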

@steve-chavez
Member Author

Something I've noticed: in the latest PostGIS version, json_agg and row_to_json output the geojson geometry object automatically, with no need to call st_asgeojson.

Though of course, we still need the geojson feature object.

@ruslantalpa
Contributor

in regard to geo columns and json_agg, are you saying in the new postgis casting is done automatically (geo->json)? that is cool, does it work the other way around when inserting (json->geo)? (if that is the case, 2 tests proving that in this pr seem ideal)
related to special geo header, if there are libs that consume this output then indeed it's a cool feature.

if i am understanding this correctly, you can use the custom header and get a format compatible with geo libs or you can just omit the header and get a normal output and the geo columns are transparently casted to geojson (this would be the ideal/flexible feature set)

@steve-chavez
Member Author

are you saying in the new postgis casting is done automatically (geo->json)? that is cool

Yeah, that works well! I've added a test that confirms it: 1fd8a0a

does it work the other way around when inserting (json->geo)

That does not work with the geojson object unfortunately. Though using the literal representation works, as in this test: c3d7ca6#diff-916d27c5920ad2514d327ee39a639b47R89. Haven't tried a CAST for this case.

if i am understanding this correctly, you can use the custom header and get a format compatible with geo libs or you can just omit the header and get a normal output and the geo columns are transparently casted to geojson

Yes, that's pretty much it.
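
From the client side, the only difference between the two modes is the Accept header. A minimal Python sketch, assuming the `localhost:3000/shops` endpoint from the PR description; `build_request` is a hypothetical helper, and the requests are only prepared, not sent:

```python
from urllib.request import Request

GEOJSON = "application/geo+json"

def build_request(url, geojson=False):
    """Prepare (but do not send) a GET request; the Accept header picks
    between plain JSON rows and the GeoJSON FeatureCollection."""
    accept = GEOJSON if geojson else "application/json"
    return Request(url, headers={"Accept": accept}, method="GET")

# Plain rows: geometry columns come back as GeoJSON Geometry objects.
plain = build_request("http://localhost:3000/shops")
# Feature objects wrapped in a FeatureCollection, ready for map libraries.
geo = build_request("http://localhost:3000/shops", geojson=True)

print(plain.get_header("Accept"))
print(geo.get_header("Accept"))
```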

@steve-chavez steve-chavez marked this pull request as ready for review July 29, 2020 00:19
@wolfgangwalther wolfgangwalther changed the base branch from master to main December 31, 2020 14:12
@LorenzHenk

That does not work with the geojson object unfortunately. Though using the literal representation works, as in this test: c3d7ca6#diff-916d27c5920ad2514d327ee39a639b47R89. Haven't tried a CAST for this case.

Could you go more into detail why this doesn't work?

This would be really nice to have 🙂.
Right now we're doing this conversion from and to GeoJSON with views & triggers. Moving this to PostgREST would be a huge relief.

@steve-chavez
Member Author

Could you go more into detail why this doesn't work?

@LorenzHenk Sure. It's just that casting from json to geometry doesn't work right now; I assume PostGIS doesn't define a CAST for this case:

select '{"type":"Point","coordinates":[1,1]}'::geometry
-- error: invalid geometry..

So POSTing a json to a geometry column through PostgREST won't work either.

However casting the literal representation does work:

select 'SRID=4326;POINT(1 1)'::geometry
-- 0101000020E6100000000000000000F03F000000000000F03F

And thus POSTing that will work:

http POST localhost:3000/tbl << JSON
{"geocol": "SRID=4326;POINT(1 1)"}
JSON

(related: PostgREST/postgrest-docs#273)
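
Until GeoJSON input is supported, a client can do this conversion itself before POSTing. A minimal Python sketch, handling only Point geometries; `point_to_ewkt` is a hypothetical helper, and the default SRID of 4326 is an assumption based on GeoJSON using WGS 84 coordinates:

```python
import json

def point_to_ewkt(geometry, srid=4326):
    """Convert a GeoJSON Point object to the EWKT literal that casts to
    geometry, e.g. "SRID=4326;POINT(1 1)". Other types are not handled."""
    if geometry.get("type") != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    x, y = geometry["coordinates"][:2]
    return f"SRID={srid};POINT({x} {y})"

geojson_point = {"type": "Point", "coordinates": [1, 1]}
# Request body for a table with a geometry column named geocol, as above.
body = json.dumps({"geocol": point_to_ewkt(geojson_point)})
print(body)
```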

This would be really nice to have 🙂.

Really, this PR was ready. I didn't merge it because I wanted to keep releases more focused at the time, but that ship has sailed (many features in the pipeline). So I'll rebase and merge this one after #1760.

@LorenzHenk

Does PostgREST automatically try to cast? Could we just create a cast on our own (e.g. by following this guide) from json to geometry and PostgREST would use that one automatically?

@steve-chavez
Member Author

steve-chavez commented Mar 2, 2021

Could we just create a cast on our own (e.g. by following this guide) from json to geometry and PostgREST would use that one automatically?

That one doesn't seem to work unfortunately.

Creating the cast does work:

create or replace function json_to_geometry(json) returns geometry as $$
  SELECT ST_AsText(ST_GeomFromGeoJSON($1))::geometry;
$$ language sql;

create cast (json as geometry) with function json_to_geometry(json) as implicit;

select ('{"type":"Point","coordinates":[1,1]}'::json)::geometry
-- 01010000009279E40F061E48C0F2B0506B9A1F3440

But then PostgREST uses json_populate_recordset, which doesn't apply the cast:

create table entities(
  id int,
  geocol geometry  
);

SELECT "id", "geocol" FROM json_populate_record (null:: "public"."entities" , '{"id": 12, "geocol": {"type":"Point","coordinates":[1,1]}}') _
-- parse error - invalid geometry

I thought json_populate_recordset would apply the cast because it does for composite types, but it looks like this isn't the case.

@LorenzHenk

I thought json_populate_recordset would apply the cast because it does for composite types, but it looks like this isn't the case.

I just checked the docs, here are the notes:

To convert a JSON value to the SQL type of an output column, the following rules are applied in sequence:

  • A JSON null value is converted to a SQL null in all cases.
  • If the output column is of type json or jsonb, the JSON value is just reproduced exactly.
  • If the output column is a composite (row) type, and the JSON value is a JSON object, the fields of the object are converted to columns of the output row type by recursive application of these rules.
  • Likewise, if the output column is an array type and the JSON value is a JSON array, the elements of the JSON array are converted to elements of the output array by recursive application of these rules.
  • Otherwise, if the JSON value is a string literal, the contents of the string are fed to the input conversion function for the column's data type.
  • Otherwise, the ordinary text representation of the JSON value is fed to the input conversion function for the column's data type.

In your case it'd hit the last point, meaning the JSON is taken as string and tried to cast to geometry.
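
To make the rule order easier to follow, here is a small Python model of the sequence quoted above. It is only an illustration of the documented rules, not PostgreSQL's actual code; `conversion_rule` and the `column_kind` labels are hypothetical, and `json.dumps` stands in for the JSON value's text representation.

```python
import json

def conversion_rule(value, column_kind):
    """Return (rule_number, converted) for a parsed JSON value, following the
    six documented rules in order. column_kind is one of "json",
    "composite", "array" or "scalar" (any other base type, e.g. geometry)."""
    if value is None:
        return 1, None                      # JSON null -> SQL null
    if column_kind == "json":
        return 2, value                     # reproduced exactly
    if column_kind == "composite" and isinstance(value, dict):
        return 3, value                     # fields converted recursively
    if column_kind == "array" and isinstance(value, list):
        return 4, value                     # elements converted recursively
    if isinstance(value, str):
        return 5, value                     # string contents -> input function
    return 6, json.dumps(value)             # text of the JSON -> input function

# geometry is a base type, so a GeoJSON object falls through to rule 6:
rule, text = conversion_rule({"type": "Point", "coordinates": [1, 1]}, "scalar")
print(rule, text)

# The EWKT literal is a JSON string, so it takes rule 5 instead:
print(conversion_rule("SRID=4326;POINT(1 1)", "scalar"))
```

In both of the last two rules the value reaches the type's input function as text and no user-defined cast is consulted, which matches the parse error reported above for the GeoJSON object.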

I don't know if it's possible to overwrite the input conversion function, or rather if this is something we should aim for.

@wolfgangwalther
Member

  • Otherwise, the ordinary text representation of the JSON value is fed to the input conversion function for the column's data type.

In your case it'd hit the last point, meaning the JSON is taken as string and tried to cast to geometry.

I don't know if it's possible to overwrite the input conversion function, or rather if this is something we should aim for.

Isn't this just a cast text -> geometry that is needed instead of json -> geometry? Not 100% sure about "input conversion function" == "cast text -> ...", though.

@steve-chavez
Member Author

Isn't this just a cast text -> geometry that is needed

That won't work, because it's already used for the CAST I put above.

Also:

create cast (text as geometry) with function text_to_geometry(text) as implicit;
-- error: cast from type text to type geometry already exists

@steve-chavez
Member Author

From the discussion here, we can conclude that it's not possible to use a CAST for converting the payload to geojson.

Another option could be to do the transformation ourselves. When a Content-Type: application/geo+json is specified for the payload, we can process it with an expression similar to the one proposed in #1567 (comment) - here we'd use ST_GeomFromGeoJSON for each row.

I think we can continue the "writing" part on another issue, the "reading" part here should be good to merge. It just needs a rebase.
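
For the "writing" direction, the first half of such a transformation could look like the following Python sketch: it flattens a GeoJSON payload into rows, keeping each geometry as GeoJSON text. The database side (calling ST_GeomFromGeoJSON on that text for each row) is not shown. `features_to_rows` and the `geom_column` parameter are hypothetical names, not part of PostgREST.

```python
import json

def features_to_rows(payload, geom_column):
    """Flatten a Feature or FeatureCollection into row dicts: properties
    become columns, the geometry is kept as GeoJSON text under geom_column."""
    if payload.get("type") == "FeatureCollection":
        features = payload.get("features", [])
    elif payload.get("type") == "Feature":
        features = [payload]
    else:
        raise ValueError("expected a Feature or FeatureCollection")
    rows = []
    for feature in features:
        row = dict(feature.get("properties") or {})
        # Serialized text, suitable as input to a GeoJSON-parsing function.
        row[geom_column] = json.dumps(feature["geometry"])
        rows.append(row)
    return rows

payload = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [1, 1]},
            "properties": {"id": 12},
        }
    ],
}
rows = features_to_rows(payload, "geocol")
print(rows)
```

Unlike the reading direction, the target geometry column has to be named explicitly here, since a Feature does not say which column its geometry came from.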

@wolfgangwalther
Member

From the discussion here, we can conclude that it's not possible to use a CAST for converting the payload to geojson.

Another option could be to do the transformation ourselves. When a Content-Type: application/geo+json is specified for the payload, we can process it with an expression similar to the one proposed in #1567 (comment) - here we'd use ST_GeomFromGeoJSON for each row.

If we were to implement #1582 (comment) and #1582 (comment), this should be fairly easy with pgrst.accept - in both directions.

I think we can continue the "writing" part on another issue, the "reading" part here should be good to merge. It just needs a rebase.

Assuming we implement a good pgrst.accept solution, I wonder whether we actually want to have any of that in core. In that case both directions could easily be solved with some SQL functions in postgrest-contrib.

@steve-chavez
Member Author

steve-chavez commented Apr 13, 2021

Assuming we implement a good pgrst.accept solution, I wonder whether we actually want to have any of that in core.

Yeah, but the thing is we don't know how good the interface would be, and meanwhile this PR is already useful - geojson is a common use case.

Besides that, if geojson output were in core, I think we could later allow overriding it through the pgrst.accept mechanism, right? Here I mean just the geojson output part; I'm still not sure how the geojson input would work (both in core and with pgrst.accept).

Edit: it'd be nice to show we're entering the postgis space in this release, there are some exciting postgis-related features (postgis filters, mvt output, etc) I have in mind for later.

@wolfgangwalther
Member

Yeah, but the thing is we don't know how good the interface would be, and meanwhile this PR is already useful - geojson is a common use case.

Agreed!

Besides that, if geojson output were in core, I think we could later allow overriding it through the pgrst.accept mechanism, right?

Yes, I think so, too.

Edit: it'd be nice to show we're entering the postgis space in this release, there are some exciting postgis-related features (postgis filters, mvt output, etc) I have in mind for later.

Cool! Let's do it!

We cannot handle postgis and other postgresql dependencies with
stack.

Also delete test/with_tmp_db since it's no longer used.
Works for GET, POST, PATCH, PUT, DELETE, RPC.

4 participants