Update SQL queries, paths and add new SPARQL
Updated SQL/wis2.sql to include additional aggregate functions and improved data extraction from 'active' directories in SQL/README.md. Added a new SPARQL query file, `distributions.rq`, under the OBIS directory. Also, expanded the documentation in SPARQL/OBIS/README.md to describe OBIS API leveraging and added relevant references.
fils committed Jul 15, 2024
1 parent 7c43c05 commit 8197d37
Showing 4 changed files with 45 additions and 7 deletions.
22 changes: 22 additions & 0 deletions SPARQL/OBIS/README.md
@@ -105,3 +105,25 @@ core_df['maximumDepthInMeters'].max() # 353.5 meters

is straightforward to process.

## OBIS API leveraging

There is currently no way to get depth statistics by dataset from the API except by going through all records,
which I wouldn't recommend.
One thing you could do is get dataset lists for depth slices,
e.g. https://api.obis.org/dataset?startdepth=5000&enddepth=6000
This is not the best approach, since it seems you have to query by ranges and then fetch the related resources.
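
The depth-slice idea above can be sketched as a small URL builder. This is only an illustration: the `startdepth` and `enddepth` parameters come from the example URL, while the function name, the slice width and the maximum depth are assumptions, and nothing here says how the API treats slice boundaries or what the response looks like.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in the text.
OBIS_DATASET_ENDPOINT = "https://api.obis.org/dataset"


def depth_slice_urls(max_depth, step):
    """Build one dataset-list URL per depth slice from 0 up to max_depth (hypothetical helper)."""
    urls = []
    for start in range(0, max_depth, step):
        # The last slice is clipped so it never extends past max_depth.
        params = {"startdepth": start, "enddepth": min(start + step, max_depth)}
        urls.append(f"{OBIS_DATASET_ENDPOINT}?{urlencode(params)}")
    return urls
```

Calling `depth_slice_urls(6000, 1000)` yields six URLs, the last of which matches the example above. Each URL would still need to be fetched and its datasets followed up individually, which is the cost noted in the text.
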
However, there is a Parquet (and CSV) export available from https://obis.org/data/access/ . Pieter said
that the Parquet file has depth in the form of the Darwin
Core fields minimumDepthInMeters and maximumDepthInMeters, so this might be the best route.
Pieter doesn't have time to work on this right away, but it might be easy for us to make an
"auxiliary" graph that we can test with and also share with Pieter, in the hope that it helps
him integrate the values into the production service eventually.
I am hoping that the id in the Parquet file is the JSON-LD @id, like https://obis.org/dataset/24e96d02-8909-4431-bc61-8cf8eadc9b7a
If that is the case, this will be very easy! I am currently pulling down the Parquet file (18 GB) and
will report what I find.
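
As a rough sketch of what the per-dataset depth summary for such an auxiliary graph might look like, the function below folds occurrence records into a minimum and maximum depth per dataset. The field names `minimumDepthInMeters` and `maximumDepthInMeters` are the Darwin Core fields mentioned above. The `dataset_id` key is a placeholder, since the actual identifier column in the export has not been confirmed, and the sketch assumes missing depths arrive as `None`.

```python
def depth_stats_by_dataset(records, id_key="dataset_id"):
    """Return {dataset id: (shallowest minimum, deepest maximum)} in meters.

    `records` is any iterable of dicts; `id_key` is a hypothetical column name.
    """
    stats = {}
    for rec in records:
        lo = rec.get("minimumDepthInMeters")
        hi = rec.get("maximumDepthInMeters")
        if lo is None and hi is None:
            continue  # record carries no depth information at all
        # If only one bound is present, use it for both ends of the range.
        lo = hi if lo is None else lo
        hi = lo if hi is None else hi
        ds = rec[id_key]
        if ds in stats:
            cur_lo, cur_hi = stats[ds]
            stats[ds] = (min(cur_lo, lo), max(cur_hi, hi))
        else:
            stats[ds] = (lo, hi)
    return stats
```

Because it consumes an iterable, the function could be fed row batches rather than the whole file at once, which may matter for an export of this size.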


## References

* https://github.com/iodepo/odis-arch/blob/master/book/thematics/depth/index.md
* https://github.com/iodepo/odis-in/tree/master/SPARQL/OBIS (this document)
12 changes: 12 additions & 0 deletions SPARQL/OBIS/distributions.rq
@@ -0,0 +1,12 @@
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <https://schema.org/>

SELECT ?sid ?name ?url
WHERE {
?sid schema:distribution ?dist .
?dist schema:contentUrl ?url .
?sid schema:variableMeasured ?prop .
?prop schema:name ?name .
FILTER regex(str(?name), "depth", "i")
}
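
For illustration, the results of a query like `distributions.rq` could be consumed along the following lines. This is a sketch rather than part of the commit: it relies only on the standard SPARQL 1.1 Protocol (`query` parameter on a GET request) and the standard SPARQL JSON results layout (`results.bindings[*].<var>.value`). The helper names are hypothetical and no real endpoint URL is assumed.

```python
import json
from urllib.parse import urlencode


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET URL (the `endpoint` value is supplied by the caller)."""
    return f"{endpoint}?{urlencode({'query': query})}"


def rows_from_sparql_json(payload, variables=("sid", "name", "url")):
    """Flatten a SPARQL JSON results document into tuples; unbound variables become None."""
    doc = json.loads(payload)
    rows = []
    for binding in doc["results"]["bindings"]:
        rows.append(tuple(binding.get(v, {}).get("value") for v in variables))
    return rows
```

The default `variables` mirror the `SELECT ?sid ?name ?url` projection, so each returned tuple pairs a dataset identifier with a depth-related variable name and a distribution URL.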
12 changes: 6 additions & 6 deletions SQL/README.md
@@ -17,12 +17,12 @@ CREATE TABLE course (id VARCHAR, type VARCHAR, txt_location VARCHAR);
CREATE TABLE person (id VARCHAR, type VARCHAR, address VARCHAR, txt_knowsAbout VARCHAR, txt_knowsLanguage VARCHAR);
CREATE TABLE sup_geo (id VARCHAR, type VARCHAR, placename VARCHAR, geotype VARCHAR, geompred VARCHAR, geom VARCHAR, lat VARCHAR, long VARCHAR, g VARCHAR );

-COPY base FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/*_baseQuery.parquet';
-COPY dataset FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/*_dataset.parquet';
-COPY sup_time FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/*_sup_temporal.parquet';
-COPY course FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/*_course.parquet';
-COPY person FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/*_person.parquet';
-COPY sup_geo FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/*_sup_geo.parquet';
+COPY base FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/active/*_baseQuery.parquet';
+COPY dataset FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/active/*_dataset.parquet';
+COPY sup_time FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/active/*_sup_temporal.parquet';
+COPY course FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/active/*_course.parquet';
+COPY person FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/active/*_person.parquet';
+COPY sup_geo FROM '/home/fils/src/Projects/OIH/odis-arch/graphOps/extraction/mdp/output/active/*_sup_geo.parquet';


```
6 changes: 5 additions & 1 deletion SQL/wis2.sql
@@ -6,6 +6,8 @@ SELECT base_agg.id,
base_agg.b_desc,
base_agg.b_headline,
geo_agg.geom_list,
+geo_agg.wkt_list,
+geo_agg.geojson,
temporal_agg.tc_list,
temporal_agg.dp_list
FROM (SELECT id,
@@ -20,7 +22,9 @@ FROM (SELECT id,
FROM dataset
GROUP BY id) AS dataset_agg
ON base_agg.id = dataset_agg.id
-JOIN (SELECT id, STRING_AGG(DISTINCT geom, ', ') AS geom_list
+JOIN (SELECT id, STRING_AGG(DISTINCT geom, ', ') AS geom_list,
+STRING_AGG(DISTINCT wkt, ', ') AS wkt_list,
+STRING_AGG(DISTINCT geojson, ', ') AS geojson
FROM sup_geo
GROUP BY id) AS geo_agg
ON base_agg.id = geo_agg.id
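
To illustrate what the extended `geo_agg` subquery computes, here is a small stand-in using Python's built-in `sqlite3` module. It is a sketch under assumptions, not the project's setup: SQLite's `group_concat(DISTINCT ...)` is swapped in for `STRING_AGG(DISTINCT ..., ', ')` (so the separator is a plain comma), the rows are invented, and the toy `sup_geo` table is given `wkt` and `geojson` columns, which the `CREATE TABLE sup_geo` statement shown in SQL/README.md does not list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema: only the columns the aggregation touches (wkt/geojson are assumed to exist).
conn.execute("CREATE TABLE sup_geo (id TEXT, geom TEXT, wkt TEXT, geojson TEXT)")
conn.executemany(
    "INSERT INTO sup_geo VALUES (?, ?, ?, ?)",
    [
        ("ds1", "POINT", "POINT (1 2)", '{"type": "Point"}'),
        ("ds1", "POINT", "POINT (3 4)", '{"type": "Point"}'),
        ("ds2", "POINT", "POINT (5 6)", '{"type": "Point"}'),
    ],
)
# One row per id, with the distinct values of each column concatenated.
rows = conn.execute(
    "SELECT id, group_concat(DISTINCT geom), group_concat(DISTINCT wkt) "
    "FROM sup_geo GROUP BY id ORDER BY id"
).fetchall()
geo_agg = {row[0]: (row[1], row[2]) for row in rows}
```

Here the two `ds1` rows collapse to a single `POINT` in the geometry-type list while both WKT strings are kept. One thing to keep in mind is that real WKT and GeoJSON values contain commas themselves, so a `', '` separator may be hard to split apart again downstream.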