Packaging multiple networks #51
Comments
#13 has some examples of networks with large numbers of links which might require streaming |
For GeoJSON, I don't think we need to worry about packaging. As long as the network identifier is included in each feature's properties, a single GeoJSON file could contain nodes or links from multiple networks. |
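To illustrate the point above, here is a minimal Python sketch of how a consumer might split a mixed GeoJSON file back into per-network groups. The property name `network_id` and the sample features are hypothetical; the thread does not name the property here, so the key is passed in as a parameter.

```python
from collections import defaultdict


def group_features_by_network(feature_collection, network_key):
    """Group GeoJSON features by the network identifier in their properties."""
    grouped = defaultdict(list)
    for feature in feature_collection.get("features", []):
        # GeoJSON allows "properties" to be null, so fall back to an empty dict.
        properties = feature.get("properties") or {}
        # Features that lack the identifier are grouped under None.
        grouped[properties.get(network_key)].append(feature)
    return dict(grouped)


# Hypothetical sample: nodes from two networks in a single FeatureCollection.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {"network_id": "net-a", "name": "Node 1"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [1.0, 1.0]},
            "properties": {"network_id": "net-b", "name": "Node 2"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [2.0, 2.0]},
            "properties": {"network_id": "net-a", "name": "Node 3"},
        },
    ],
}

# "network_id" is an assumed property name, used for illustration only.
nodes_by_network = group_features_by_network(sample, "network_id")
```

Grouping on the consumer side keeps the file itself a plain FeatureCollection, so ordinary GIS tools can still open it directly.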
On reflection, I think a bigger issue is API design and streaming support for links and nodes, since each dataset is likely to contain a small number of potentially large networks, rather than a large number of small networks. Edit: See #75 for a proposal on streaming/paginating individual networks. |
@kindly, @lgs85 it would be great to get your thoughts on the proposal below.
Whilst this is true, the standard still needs to specify how to package multiple networks, to avoid a situation in which publishers mint a variety of packaging formats, which would make it difficult to author tools that consume OFDS data.

Proposal

Based on the discussion in open-contracting/standard#1084, offer two packaging formats each for the JSON and GeoJSON publication formats:

The approach to packaging multiple networks in CSV format will depend on the tool chosen in #14.

JSON

Small files and API responses

A top-level JSON object with an array of

{
"networks": [
{...},
{...}
],
"pages": {
"next": "",
"prev": ""
}
}

The preferred approach is to publish embedded nodes and links. For networks that are too large to return in a single API response,

{
"nodes": [
{...},
{...},
{...}
],
"pages": {
"next": "",
"prev": ""
}
}

{
"links": [
{...},
{...},
{...}
],
"pages": {
"next": "",
"prev": ""
}
}

Bulk downloads

A JSON Lines file with one network per line:

{...}
{...}
{...}

The preferred approach is to publish embedded nodes and links. If an individual network is too large to load into memory,

GeoJSON

Small files and API responses

Publish separate files/endpoints for nodes and links, each structured as a top-level FeatureCollection object according to the GeoJSON transformation specification. Each file may contain features from multiple networks. The network each feature relates to is identified by its

For data published via API, add a top-level

{
"type": "FeatureCollection",
"features": [
{...},
{...}
],
"pages": {
"next": "",
"prev": ""
}
}

Bulk downloads

Separate newline-delimited GeoJSON files for nodes and links, with one feature per line structured according to the GeoJSON transformation specification:

{...}
{...}
{...}

Other approaches considered

JSON

Small files and API responses

Do not support packaging multiple networks. Instead, publish networks one at a time, i.e. publish a JSON file for each network containing a top-level

{
"relatedResources": [
{
"href": "",
"rel": "next"
},
{
"href": "",
"rel": "prev"
}
]
}

As in the proposal, the preferred approach is to publish embedded nodes and links. For networks that are too large to return in a single API response,

Pros:

Cons:

Bulk downloads

A ZIP or GZIP file containing a JSON file for each network. As in the proposal, the preferred approach is to publish embedded nodes and links. For networks that are too large to load into memory,

Cons:
|
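As a rough illustration of how a client might consume the proposed `pages` object, the Python sketch below follows `pages.next` until it is empty or missing. Treating an empty or absent `next` as "last page" is an assumption, as are the example URLs and the helper names `fetch_json` and `iter_paged`; none of these are defined by the proposal.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode its body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_paged(start_url, key, fetch=fetch_json):
    """Yield every item under `key` from each page, following pages.next.

    Assumes an empty or missing pages.next marks the last page.
    """
    url = start_url
    seen = set()
    while url and url not in seen:
        seen.add(url)  # guard against a server that links back to a visited page
        page = fetch(url)
        yield from page.get(key, [])
        url = (page.get("pages") or {}).get("next")


# Hypothetical two-page response, keyed by URL, standing in for a real API.
fake_pages = {
    "https://example.com/networks?page=1": {
        "networks": [{"id": "a"}],
        "pages": {"next": "https://example.com/networks?page=2", "prev": ""},
    },
    "https://example.com/networks?page=2": {
        "networks": [{"id": "b"}],
        "pages": {"next": "", "prev": "https://example.com/networks?page=1"},
    },
}

networks = list(
    iter_paged(
        "https://example.com/networks?page=1",
        "networks",
        fetch=fake_pages.__getitem__,
    )
)
```

Because the key is a parameter, the same loop would work for the `nodes` and `links` responses, and for the `features` array of a paginated FeatureCollection.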
This looks fine to me. The pages approach in the GeoJSON format looks odd and may confuse geo users, who expect to receive all the data in one go; if they do not, they may not be able to traverse the links. However, I see no real harm in it. |
Thanks, @kindly. The In case it's of interest, ArcGIS uses pagination to serve GeoJSON data (example). If the data is greater than one page, no link to the next page is provided. Instead |
I think that this approach looks fine. As discussed, it'll be important to make very clear in the guidance that this is unlikely to be used in the majority of cases |
The reference documentation has been updated to reflect the proposal in this issue: https://open-fibre-data-standard.readthedocs.io/en/latest/reference/publication_formats.html This issue will remain open against the beta milestone to gather feedback from the alpha consultation. |
We've not heard any further feedback on this issue so I'm going to close it for now. |
From the data stewardship, publication formats and access methods consultation document:
When designing the format, we'll need to consider streaming. See open-contracting/standard#1084 for a related discussion. We'll also need to consider packaging for the CSV and GeoJSON formats.
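For a sense of what streaming could look like on the consumer side, here is a small Python sketch that reads a JSON Lines file (one network per line, as in the bulk download proposal in this thread) without loading the whole file into memory. The helper name `iter_json_lines` and the sample records are hypothetical.

```python
import io
import json


def iter_json_lines(lines):
    """Yield one decoded JSON value per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample standing in for an open file handle.
sample_file = io.StringIO('{"id": "net-a"}\n{"id": "net-b"}\n')

# Only one network is held in memory at a time while iterating.
network_ids = [network["id"] for network in iter_json_lines(sample_file)]
```

The same function would read a newline-delimited GeoJSON file, yielding one feature per line instead of one network.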