Improvement of systems.csv #249

josee-sabourin · 2020-07-21T13:39:42Z

A handful of current open issues and PRs (28, 34, 98, 203, 209), as well as a considerable number of closed ones, concern systems.csv, either for general upkeep or to point out its shortcomings. We are looking into moving systems.csv to its own repo to minimize the number of PRs and issues that do not concern the specification itself. We are also considering revamping the way the systems are displayed and moving away from the .csv format. Currently, our main consideration is to use DCAT in JSON format. A sample of some systems in this new format can be found here. What are people's thoughts on this change? Does this change affect (either positively or negatively) the way you consume this data?

skinkie · 2020-07-24T13:17:03Z

I think it is a very bad idea to split this from the standard. Considering that GBFS is effectively in JSON, the move towards a system.json seems fine, but I don't understand why a different vocabulary than GeoJSON is used for the geo stuff. In addition, a polygon makes much more sense than a bounding box.

gcamp · 2020-07-27T13:49:57Z

I think the format of the file should probably be part of the standard (this repo) but the data should probably be put somewhere else.

I agree that GeoJSON should be used where it makes sense.

sven4all · 2020-08-03T13:20:42Z

It would be interesting to make it possible to distribute this information.

For example, there could be a repository with all the bike-sharing feeds within the Netherlands that is referenced from a worldwide repository.

bddq · 2020-08-11T20:56:56Z

CSV upgrade

GeoJSON looks like the best alternative to a CSV file if we want to provide locations for systems, but I don't think a system location or coverage polygon is needed in that listing.

City centroid can be generally determined using current country code and location fields.
Coverage polygons will not be accurate/up-to-date if the system shrinks/expands operating coverage.
Feed provider should provide centroid and/or polygon coverage in a new file in spec (think about something like geofencing_zones.json).

A more "traditional" JSON file would make it easy to handle that list with the current data.
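To illustrate what such a plain JSON list could look like, here is a minimal sketch that converts the current CSV rows into a JSON array. The header row in `SAMPLE_CSV` is an assumption about the columns of systems.csv, the example system and its URLs are made up, and the snake_case key names are only one possible choice, not a proposed format.

```python
import csv
import io
import json

# Assumed header of systems.csv plus one made-up row, for illustration only.
SAMPLE_CSV = """Country Code,Name,Location,System ID,URL,Auto-Discovery URL
XX,Example Bikes,"Example City, XX",example_bikes,https://example.com,https://example.com/gbfs/gbfs.json
"""


def systems_csv_to_json(csv_text):
    """Convert systems.csv text into a JSON array of objects.

    Column names become snake_case keys, e.g. "Auto-Discovery URL"
    becomes "auto_discovery_url".
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    systems = []
    for row in reader:
        system = {}
        for key, value in row.items():
            if key is None:
                # Extra cells beyond the header are ignored in this sketch.
                continue
            json_key = key.strip().lower().replace(" ", "_").replace("-", "_")
            system[json_key] = (value or "").strip()
        systems.append(system)
    return json.dumps(systems, indent=2, ensure_ascii=False)


print(systems_csv_to_json(SAMPLE_CSV))
```

Because every value stays a flat string, contributors could keep editing the CSV while consumers read the generated JSON.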

Content authorship

Another key point, raised in #259: some of the Auto-Discovery URLs in systems.csv are not provided (more precisely, not hosted) by network owners.
For example, none of the feeds hosted on https://transport.data.gouv.fr are provided by network owners.
As described, for example, here, the hosted GBFS feeds are converted from other OpenData feeds by the French governmental platform Transports.

So, do we need to restrict the systems file to feeds hosted by network owners?
Who can approve feed providers, and how?

mplsmitch · 2020-08-12T18:39:31Z

I support DCAT in JSON format and moving to a separate repository but I have the same concerns raised by @bixcorp . The purpose of systems.csv has been to direct feed consumers to the location of gbfs.json files. We intentionally limited the scope of the document and chose the csv format to set a low bar for contributors who may or may not be the feed publisher. The addition of location.bbox and location.centroid seems out of scope for this document. These data should live in the specification. If they're to be required, I'd put them in system_information.json since not all providers have reason to publish geofencing_zones.json.

Many of the feeds listed in systems.csv have been contributed by folks who have tracked down feed locations since some publishers (authors) have neglected to add them on their own. Requiring location polygons is too high a bar for non-publishers, particularly if this data is not required as part of the specification.

jcn · 2020-09-25T17:33:35Z

As JSON is a much more verbose format, and not really intended for human-editing, I still support keeping the systems list as a CSV, though I am ambivalent about keeping it in the main repository.

If moving it out helps to keep the history of the main repo then I could see that as a reasonable argument, though I'd almost argue the opposite - that driving people to the main repo to submit their PRs ensures that they have to keep coming to the spec's page, thus forcing them to see any changes or improvements that are being made. And having constant activity on the main repo will continue to tell an interesting story about the interplay between the changes to the spec and the changes to the actual published feeds (to some extent). But I have no actual evidence that this would be true. 😄

cmonagle · 2020-10-21T15:00:16Z

If we're open to building some tooling around this process, some of the more onerous data points (like bounding box or polygon) could be determined programmatically. Similarly, it would be useful to tie the future validator into this process.
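As a rough sketch of that kind of tooling, the snippet below derives a bounding box from the `lat`/`lon` values of a parsed station_information.json and wraps it as a GeoJSON Polygon. The function names are hypothetical, and the approach assumes a station-based system that does not cross the antimeridian; a dockless feed would need another source of coordinates.

```python
def bounding_box(station_information):
    """Return [min_lon, min_lat, max_lon, max_lat] over all stations.

    `station_information` is the parsed JSON of station_information.json,
    where each entry in data.stations carries "lat" and "lon".
    """
    stations = station_information["data"]["stations"]
    if not stations:
        raise ValueError("feed lists no stations")
    lons = [station["lon"] for station in stations]
    lats = [station["lat"] for station in stations]
    return [min(lons), min(lats), max(lons), max(lats)]


def bbox_to_polygon(bbox):
    """Wrap a [west, south, east, north] box as a GeoJSON Polygon."""
    west, south, east, north = bbox
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # a GeoJSON ring repeats its first position
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Made-up feed content, for illustration only.
feed = {
    "data": {
        "stations": [
            {"station_id": "a", "lat": 52.0, "lon": 4.0},
            {"station_id": "b", "lat": 52.5, "lon": 5.0},
        ]
    }
}
print(bbox_to_polygon(bounding_box(feed)))
```

A box computed this way is only as current as the last fetch of the feed, so it would have to be regenerated whenever the listing is rebuilt.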

As for blurring the concerns of system_information and systems.json: I see the distinction as "information for discoverability" and "information for consuming the feed". I think bounding boxes fit in the former, but I see how it could get murky.

drewda · 2020-11-04T21:46:24Z

FYI, we've added GBFS feeds to the Transitland platform:

systems.csv is regularly fetched from this repo and transformed into Transitland's Distributed Mobility Format Registry (DMFR) JSON file format in https://github.com/transitland/transitland-atlas
GBFS feeds can be browsed at https://www.transit.land/feeds (make sure GBFS checkbox is selected)

Transitland is not fetching and parsing contents of the auto-discovery endpoint itself for each feed — that is the next logical set of functionality to add.

Note that if the format of the systems.csv file is changed or it is moved elsewhere, we can try to adjust the Transitland ingest as appropriate.

stale · 2021-03-04T21:49:17Z

This discussion has been automatically marked as stale because it has not had recent activity. It will be closed in 60 days if no further activity occurs. Thank you for your contributions.

stale · 2021-05-03T21:53:40Z

This discussion has been closed due to inactivity. Discussions can always be reopened after they have been closed.

heidiguenin mentioned this issue Aug 11, 2020

Updated feeds #260

Merged

This was referenced Oct 29, 2020

Question: Stability of systems.csv? #28

Closed

bike network service name should be added to root csv file. #34

Closed

stale bot added the stale label Mar 4, 2021

mplsmitch mentioned this issue Apr 8, 2021

License of data available in systems.csv #309

Closed

bartoliniii mentioned this issue Apr 9, 2021

For Spin, removed Rotterdam and updates links to new GBFS hostname #307

Merged

3 tasks

stale bot closed this as completed May 3, 2021

josee-sabourin added the systems.csv label Jun 8, 2021
