Bug: City of Montebello (Montebello Bus Lines) not showing up in GTFS Digest #1357

Open
1 of 5 tasks
evansiroky opened this issue Jan 23, 2025 · 4 comments
Labels
admin Administrative work bug Something isn't working

Comments

@evansiroky
Member

Where did the bug occur?
Select from the below, and be sure to affix the appropriate label to this issue (e.g. dataset, jupyterhub, metabase, analysis.calitp.org)

  • Data (the warehouse)
  • JupyterHub
  • Metabase
  • analysis.calitp.org
  • Other (add detail)

Describe the bug

The City of Montebello (Montebello Bus Lines) is not showing up in GTFS Digest, in either the Agency Digest or the District Digest.

To Reproduce

Agency Digest for D7:

Image

D7 District Digest:

Image

Expected behavior

  • Agency Digest: City of Montebello expected at red line.
  • District Digest: City of Montebello expected in red circle.

Additional context

Since a missing transit agency was also reported in #1254, I recommend cross-referencing the complete GTFS Digest Agency List with the list of fixed-route agencies on the Statewide Transit Metrics dashboard.

@evansiroky evansiroky added admin Administrative work bug Something isn't working labels Jan 23, 2025
@amandaha8
Contributor

@evansiroky City of Montebello will show up when I refresh the GTFS Digest with January's data. I cross-referenced the GTFS Digest agency list with the 198 fixed-route agencies on the Metabase dashboard you linked. There were 185 matches, 14 agencies found only in GTFS Digest, and 13 agencies found only in Metabase. Do you believe some of the 13 Metabase-only agencies should be in the GTFS Digest?

Image
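
A cross-reference like the one described in this comment can be sketched with plain set operations. The agency names below are placeholders rather than the real lists, and the whitespace and case normalization is an assumption about how names might be matched, not a description of what was actually done.

```python
def normalize(name: str) -> str:
    """Collapse whitespace and case so trivially different spellings match."""
    return " ".join(name.split()).casefold()


def cross_reference(digest_names, metabase_names):
    """Return (matches, digest_only, metabase_only) as sorted lists."""
    digest = {normalize(n): n for n in digest_names}
    metabase = {normalize(n): n for n in metabase_names}
    matches = sorted(metabase[k] for k in digest.keys() & metabase.keys())
    digest_only = sorted(digest[k] for k in digest.keys() - metabase.keys())
    metabase_only = sorted(metabase[k] for k in metabase.keys() - digest.keys())
    return matches, digest_only, metabase_only


# Placeholder inputs, not the real agency lists.
digest_list = ["Agency A", "agency b ", "Agency C"]
metabase_list = ["Agency A", "Agency B", "Agency D"]

matches, digest_only, metabase_only = cross_reference(digest_list, metabase_list)
print(len(matches), digest_only, metabase_only)
```

As a sanity check on counts like those reported here, matches plus Metabase-only agencies should equal the size of the Metabase list (185 + 13 = 198).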

@evansiroky
Member Author

@amandaha8 I'd like to understand a little more about why these lists are different. What data source is being used to obtain the list of agencies for the GTFS Digest? The Metabase list pulls directly from the data warehouse, which pulls from Airtable.

@amandaha8
Contributor

@evansiroky The GTFS Digest merges RT Vehicle Positions and a number of schedule datasets, like trips and stop times, to create one large dataset. There's a function that tags whether a route is found in schedule only, realtime only, or both. We only feature operators whose routes are tagged either both or schedule_only for the most recent date of data they have available.
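
As a rough sketch of the tagging step described above (not the actual GTFS Digest code), each route can be labeled by the sources it appears in, and an operator kept only if its most recent date has at least one route tagged `both` or `schedule_only`. The `vp_only` label, the record layout, the function names, and the sample dates are all assumptions for illustration.

```python
from datetime import date


def tag_routes(schedule_routes: set, vp_routes: set) -> dict:
    """Label each route by the sources it appears in."""
    tags = {}
    for route in schedule_routes | vp_routes:
        if route in schedule_routes and route in vp_routes:
            tags[route] = "both"
        elif route in schedule_routes:
            tags[route] = "schedule_only"
        else:
            tags[route] = "vp_only"  # hypothetical label for realtime only
    return tags


def featured_operators(records: list) -> list:
    """records: list of (operator, service_date, tag) tuples.

    Keep operators whose most recent date has a route tagged
    "both" or "schedule_only".
    """
    latest = {}
    for operator, service_date, _ in records:
        if operator not in latest or service_date > latest[operator]:
            latest[operator] = service_date
    keep = set()
    for operator, service_date, tag in records:
        if service_date == latest[operator] and tag in {"both", "schedule_only"}:
            keep.add(operator)
    return sorted(keep)


# Hypothetical routes, operators, and dates.
tags = tag_routes({"10", "20"}, {"20", "30"})
records = [
    ("Operator X", date(2025, 1, 15), "both"),
    ("Operator Y", date(2024, 12, 11), "schedule_only"),
    ("Operator Y", date(2025, 1, 15), "vp_only"),
]
print(tags, featured_operators(records))
```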

@evansiroky
Member Author

> @evansiroky The GTFS Digest merges RT Vehicle Positions and a number of schedule datasets like trips and stop times to create one large dataset. There's a function that tags whether a route is found in schedule only, realtime only or both. We only feature operators who are either both or schedule_only for the most recent date of data they have available.

@amandaha8 Thanks for this analysis. If these are the lists of the remaining agencies, then I think the Jan Open Data (#1356) should probably do the trick.

The Transit Data Quality Team uses the following criteria to decide what counts as an "agency" for which it pursues GTFS data. The same criteria determine the list of agencies in the Metabase question:

  1. The organization manages at least one service that satisfies all of the following conditions:
    a. Is currently operating
    b. Is rideable by the general public
    c. Is fixed-route
  2. The organization is either an NTD reporter (indicated by a non-empty NTD ID column) or a public entity (this excludes universities and non-profits)
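
The two-part rule above can be written as a small predicate. This is only an illustration: the class and field names (`ntd_id`, `is_public_entity`, and so on) are hypothetical and do not reflect the actual Airtable or warehouse schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Service:
    currently_operating: bool
    open_to_general_public: bool
    fixed_route: bool


@dataclass
class Organization:
    name: str
    ntd_id: Optional[str] = None      # non-empty means NTD reporter
    is_public_entity: bool = False    # False for universities, non-profits
    services: List[Service] = field(default_factory=list)


def is_agency(org: Organization) -> bool:
    # Condition 1: at least one service meets all three sub-conditions.
    has_qualifying_service = any(
        s.currently_operating and s.open_to_general_public and s.fixed_route
        for s in org.services
    )
    # Condition 2: NTD reporter or public entity.
    ntd_or_public = bool(org.ntd_id) or org.is_public_entity
    return has_qualifying_service and ntd_or_public


# Hypothetical organizations.
university = Organization("Example University", services=[Service(True, True, True)])
city = Organization("Example City", is_public_entity=True,
                    services=[Service(True, True, True)])
print(is_agency(university), is_agency(city))
```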

Therefore, it seems that the explanations are as follows for each agency:

  • Explanation for why some GTFS Digest agencies are not showing up in the Metabase list
    • Might at one point have had a service in a GTFS Schedule feed, but that service no longer operates
      • Blue Lake Rancheria
      • Shasta County
    • Not considered a "public entity" by Transit Data Quality
      • The rest of them
  • Explanation for why some Metabase agencies are not showing up in GTFS Digest
    • GTFS Digest doesn't show ferry service?
      • City of Alameda
      • Golden Gate
      • SF WETA
      • Santa Cruz Harbor
    • Might not be the "primary" organization managing a service that is represented in a GTFS feed with multiple services, or maybe GTFS Digest doesn't include rail?
      • Dumbarton Bridge?
      • San Bernardino
      • Southern California Regional Rail Authority
    • Agency lacks any GTFS data, or Transit Data Quality is unable to access its GTFS data
      • City of Beverly Hills
      • City of Carson
      • City of Compton
      • City of Laguna Niguel
      • City of Newport Beach
      • City of San Fernando


2 participants