Improvement of systems.csv #249

Closed
josee-sabourin opened this issue Jul 21, 2020 · 10 comments

Comments

@josee-sabourin
Contributor

josee-sabourin commented Jul 21, 2020

A handful of current open issues and PRs (28, 34, 98, 203, 209), as well as a considerable number of closed ones, discuss systems.csv either for general upkeep or to discuss its shortcomings. We are looking into moving systems.csv to its own repo to minimize the number of PR/issues that are not in regards to the specification itself. We are also considering revamping the way the systems are displayed and moving away from the .csv format. Currently, our main consideration is to use DCAT in JSON format. A sample of some systems in this new format can be found here. What are people's thoughts on this change? Does this change affect (either positively or negatively) the way you consume this data?
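
For readers weighing the proposal, here is a minimal sketch of what converting rows of the current file into DCAT-style JSON might look like. The linked sample is not reproduced in this thread, so the mapping below is hypothetical: the column names are assumed to match the systems.csv header, the sample row is made up, and the choice of DCAT terms (`dct:title`, `dcat:distribution`, `dcat:accessURL`, and so on) is only one plausible arrangement.

```python
import csv
import io
import json

# Assumed systems.csv header; the data row is invented for illustration.
SAMPLE_CSV = """Country Code,Name,Location,System ID,URL,Auto-Discovery URL
XX,Example Bikes,Example City,example_bikes,https://example.com,https://example.com/gbfs/gbfs.json
"""


def row_to_dcat(row):
    """Map one systems.csv row to a DCAT-style dataset (hypothetical mapping)."""
    return {
        "@type": "dcat:Dataset",
        "dct:identifier": row["System ID"],
        "dct:title": row["Name"],
        # Free-text location only; no bbox or centroid is invented here.
        "dct:spatial": {"country": row["Country Code"], "name": row["Location"]},
        "dcat:landingPage": row["URL"],
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": row["Auto-Discovery URL"],
                "dcat:mediaType": "application/json",
            }
        ],
    }


def csv_to_catalog(text):
    """Wrap every CSV row in a single DCAT-style catalog object."""
    rows = csv.DictReader(io.StringIO(text))
    return {"@type": "dcat:Catalog", "dcat:dataset": [row_to_dcat(r) for r in rows]}


catalog = csv_to_catalog(SAMPLE_CSV)
print(json.dumps(catalog, indent=2))
```

A conversion like this keeps every field the CSV already carries, so consumers who only need the auto-discovery URL would find it under the distribution entry.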

@skinkie

skinkie commented Jul 24, 2020

I think it is a very bad idea to split this from the standard. Considering that GBFS is effectively in JSON, the move towards a system.json seems fine, but I don't understand why a different vocabulary is used for the geo stuff than GeoJSON. In addition, a polygon makes much more sense than a bounding box.

@gcamp

gcamp commented Jul 27, 2020

I think the format of the file should probably be part of the standard (this repo) but the data should probably be put somewhere else.

I agree that GeoJSON should be used where it makes sense.

@sven4all
Contributor

sven4all commented Aug 3, 2020

It would be interesting to make it possible to distribute this information.

For example that there is a repository with all the bike-sharing feeds within the Netherlands that is referenced from a world wide repository.
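
The distributed layout described here could be sketched as a catalog that lists both systems and links to child catalogs, resolved recursively. Everything in this example is an assumption made for illustration: the `systems` and `catalogs` keys, the URLs, and the in-memory `CATALOGS` table, which stands in for fetching each catalog over HTTP.

```python
# Hypothetical catalogs keyed by URL; a real resolver would download these.
CATALOGS = {
    "https://example.org/world.json": {
        "systems": [],
        "catalogs": ["https://example.org/nl.json"],
    },
    "https://example.org/nl.json": {
        "systems": [
            {
                "system_id": "example_nl",
                "auto_discovery_url": "https://example.nl/gbfs.json",
            }
        ],
        # A link back to the parent, to show that cycles are tolerated.
        "catalogs": ["https://example.org/world.json"],
    },
}


def collect_systems(url, load=CATALOGS.get, seen=None):
    """Gather systems from a catalog and every catalog it references."""
    seen = set() if seen is None else seen
    if url in seen:
        return []  # already visited; avoids infinite loops on circular links
    seen.add(url)
    catalog = load(url)
    if catalog is None:
        return []  # unknown or unreachable catalog: skip it
    systems = list(catalog.get("systems", []))
    for child in catalog.get("catalogs", []):
        systems.extend(collect_systems(child, load, seen))
    return systems


all_systems = collect_systems("https://example.org/world.json")
print([s["system_id"] for s in all_systems])
```

Passing a different `load` function is where a real HTTP fetch would plug in, which leaves each regional list maintained in its own repository.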

@bddq
Contributor

bddq commented Aug 11, 2020

CSV upgrade

GeoJSON looks like the best alternative to a CSV file if we want to provide locations for systems, but I don't think a system location or coverage polygon is needed in that listing.

  1. City centroid can be generally determined using current country code and location fields.
  2. Coverage polygons will not be accurate/up-to-date if the system shrinks/expands operating coverage.
  3. Feed provider should provide centroid and/or polygon coverage in a new file in spec (think about something like geofencing_zones.json).

A more "traditional" JSON file would be good for handling that list easily with the current data.

Content authorship

Another key point, raised in #259: some of the Auto-Discovery URLs in systems.csv are not provided (more precisely, not hosted) by the network owners.
For example, none of the feeds hosted on https://transport.data.gouv.fr are provided by network owners.
As described for example here, the hosted GBFS feeds are converted from other open data feeds by the French governmental platform Transports.

So, do we need to restrict the systems file to feeds hosted by network owners?
Who can approve feed providers, and how?

@mplsmitch
Collaborator

I support DCAT in JSON format and moving to a separate repository but I have the same concerns raised by @bixcorp . The purpose of systems.csv has been to direct feed consumers to the location of gbfs.json files. We intentionally limited the scope of the document and chose the csv format to set a low bar for contributors who may or may not be the feed publisher. The addition of location.bbox and location.centroid seems out of scope for this document. These data should live in the specification. If they're to be required, I'd put them in system_information.json since not all providers have reason to publish geofencing_zones.json.

Many of the feeds listed in systems.csv have been contributed by folks who have tracked down feed locations since some publishers (authors) have neglected to add them on their own. Requiring location polygons is too high a bar for non-publishers, particularly if this data is not required as part of the specification.

@jcn
Contributor

jcn commented Sep 25, 2020

As JSON is a much more verbose format, and not really intended for human-editing, I still support keeping the systems list as a CSV, though I am ambivalent about keeping it in the main repository.

If moving it out helps to keep the history of the main repo then I could see that as a reasonable argument, though I'd almost argue the opposite - that driving people to the main repo to submit their PRs ensures that they have to keep coming to the spec's page, thus forcing them to see any changes or improvements that are being made. And having constant activity on the main repo will continue to tell an interesting story about the interplay between the changes to the spec and the changes to the actual published feeds (to some extent). But I have no actual evidence that this would be true. 😄

@cmonagle
Contributor

If we're open to building some tooling around this process, some of the more onerous data points (like bounding box or polygon) could be determined programmatically. Similarly, it would be useful to tie in the future validator to this process.

As for blurring the concerns of system_information and systems.json: I see the distinction as "information for discoverability" and "information for consuming the feed". I think bounding boxes fit in the former, but I see how it could get murky.
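
As an illustration of the tooling idea, a bounding box for a station-based system could be derived from the `lat` and `lon` fields of its station_information.json. This is a rough sketch, not proposed tooling: the payload is made up, the function name is hypothetical, and the simple min/max approach assumes the system does not straddle the antimeridian.

```python
def bounding_box(stations):
    """Return (west, south, east, north), the order GeoJSON uses for bbox."""
    if not stations:
        raise ValueError("no stations to bound")
    lons = [s["lon"] for s in stations]
    lats = [s["lat"] for s in stations]
    return (min(lons), min(lats), max(lons), max(lats))


# Invented payload shaped like the data section of station_information.json.
sample = {
    "data": {
        "stations": [
            {"station_id": "a", "lat": 45.50, "lon": -73.57},
            {"station_id": "b", "lat": 45.53, "lon": -73.60},
            {"station_id": "c", "lat": 45.48, "lon": -73.55},
        ]
    }
}

bbox = bounding_box(sample["data"]["stations"])
print(bbox)
```

Free-floating systems have no stations to draw on, so a tool like this would need another source (for example vehicle positions) in those cases.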

@drewda
Contributor

drewda commented Nov 4, 2020

FYI, we've added GBFS feeds to the Transitland platform:

Transitland is not fetching and parsing contents of the auto-discovery endpoint itself for each feed — that is the next logical set of functionality to add.

Note that if the format of the systems.csv file is changed or it is moved elsewhere, we can try to adjust Transitland ingest as appropriate.
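
For context, parsing the contents of an auto-discovery endpoint might look roughly like the sketch below. It is not Transitland's code. It assumes the gbfs.json layout in use around the time of this thread, with feeds grouped under a language key inside `data`; the sample document and the `list_feeds` helper are made up for illustration.

```python
import json

# Invented auto-discovery document following the assumed layout.
SAMPLE_GBFS = """{
  "last_updated": 1604448000,
  "ttl": 60,
  "data": {
    "en": {
      "feeds": [
        {"name": "system_information",
         "url": "https://example.com/gbfs/en/system_information.json"},
        {"name": "station_information",
         "url": "https://example.com/gbfs/en/station_information.json"}
      ]
    }
  }
}"""


def list_feeds(gbfs_text):
    """Return {language: {feed_name: feed_url}} from a gbfs.json document."""
    doc = json.loads(gbfs_text)
    feeds = {}
    for language, body in doc.get("data", {}).items():
        feeds[language] = {f["name"]: f["url"] for f in body.get("feeds", [])}
    return feeds


feeds = list_feeds(SAMPLE_GBFS)
for name, url in feeds["en"].items():
    print(name, url)
```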

@stale

stale bot commented Mar 4, 2021

This discussion has been automatically marked as stale because it has not had recent activity. It will be closed in 60 days if no further activity occurs. Thank you for your contributions.

@stale

stale bot commented May 3, 2021

This discussion has been closed due to inactivity. Discussions can always be reopened after they have been closed.


9 participants