Research Request - Speeds chart for 20th, mean, 80th percentile by shape #764

Closed
tiffanychu90 opened this issue May 23, 2023 · 0 comments
Labels
gtfs-rt Work related to GTFS-Realtime research request Issues that serve as a request for research (summary and handoff)


tiffanychu90 commented May 23, 2023

Complete the section below when receiving a research request, and continue to add to this issue as you receive additional details and produce deliverables. Be sure to also add the appropriate project-level label to this issue (e.g., gtfs-rt, DLA).

Research Question

Single sentence description: The original idea was to look at the speeds for each operator-route-stop combination through charts. However, this proved unwieldy due to the sheer amount of data. Instead, each row from speeds_stop_segments and avg_speeds_stop_segments was tagged if meters_elapsed and sec_elapsed were both zero.

Detailed description:

  • After tagging these rows, looked through the previous processing stages to see what could have produced the zeroes.
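
The tagging step described above can be sketched roughly as follows. This is a minimal illustration with toy data, not the notebook's actual code: the column names meters_elapsed and sec_elapsed come from this issue, while the function name `tag_zero_rows`, the flag column `both_zero`, and the `trip_id` values are assumptions made for the example.

```python
import pandas as pd


def tag_zero_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a boolean flag for rows where both
    meters_elapsed and sec_elapsed are zero.

    The flag column name `both_zero` is hypothetical.
    """
    both_zero = (df["meters_elapsed"] == 0) & (df["sec_elapsed"] == 0)
    return df.assign(both_zero=both_zero)


# Toy rows standing in for speeds_stop_segments (values are made up).
speeds = pd.DataFrame(
    {
        "trip_id": ["a", "b", "c", "d"],
        "meters_elapsed": [0.0, 120.0, 0.0, 300.0],
        "sec_elapsed": [0.0, 30.0, 15.0, 60.0],
    }
)

tagged = tag_zero_rows(speeds)
n_flagged = int(tagged["both_zero"].sum())
```

Only rows where both columns are zero get flagged; a row with zero meters but nonzero seconds (trip "c" here) is left untagged, which keeps stationary vehicles separate from rows that carry no information at all.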

Previous:
Single sentence description: Charts for various speed columns - 20th percentile speeds, mean speeds, 80th percentile speeds by segment.

Detailed description:

  • Since we want to publish 3 speed columns, let's take a look at the distribution.
  • Trip-level speeds by segment are available; show a stripplot of these.
  • Average speeds by segment are what we want to publish. Overlay stripplot with the 3 numbers we'll publish - 20th, mean, 80th percentile, which is a summary of trip-level speeds.
  • Use the charting functions to show the distribution of speeds for an individual operator - shape_array_key - stop_id - stop_sequence combination
  • A stop is a stop_id-stop_sequence pair. x-axis ordered by stop sequence.
  • shape_array_key: versioned key for feed_key-shape_id...use this
  • include column of loop_or_inlining value in the chart
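
As a rough sketch of the summary described in the bullets above, the three published numbers could be derived from trip-level speeds like this. The grouping columns shape_array_key, stop_id, and stop_sequence are taken from this issue; the speed column name `speed_mph`, the output column names, and the toy values are assumptions, and this is not necessarily how the pipeline computes them.

```python
import pandas as pd

# Segment identifiers named in the issue: a stop is a stop_id-stop_sequence pair.
SEGMENT_COLS = ["shape_array_key", "stop_sequence", "stop_id"]


def summarize_speeds(trip_speeds: pd.DataFrame, speed_col: str = "speed_mph") -> pd.DataFrame:
    """Collapse trip-level speeds to 20th percentile, mean, and 80th percentile
    per segment. `speed_col` is a hypothetical column name."""
    grouped = trip_speeds.groupby(SEGMENT_COLS)[speed_col]
    summary = pd.DataFrame(
        {
            "p20_speed": grouped.quantile(0.2),
            "mean_speed": grouped.mean(),
            "p80_speed": grouped.quantile(0.8),
        }
    )
    # groupby sorts by its keys, so rows come out ordered by stop_sequence within a shape.
    return summary.reset_index()


# Toy trip-level rows for one shape with two stops (values are made up).
trip_speeds = pd.DataFrame(
    {
        "shape_array_key": ["s1"] * 7,
        "stop_sequence": [1, 1, 1, 1, 1, 2, 2],
        "stop_id": ["A"] * 5 + ["B"] * 2,
        "speed_mph": [10.0, 20.0, 30.0, 40.0, 50.0, 5.0, 15.0],
    }
)

summary = summarize_speeds(trip_speeds)
```

The trip-level rows are what a stripplot would show (for example with `seaborn.stripplot`, as one option), and the three summary columns are the values to overlay on it, with the x-axis ordered by stop_sequence.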

How will this research be used?

  • Use this to figure out which shapes are still problematic for us, and give us an idea of which stops to focus on
  • Previously, mapping only 20th percentile speeds meant we were always looking at the lower range. Do we see unexpected things if we now broaden our range?

Data sources

  • Cal-ITP data sources: GCS folder: rt_segment_speeds
  • March and April dates available
  • speeds_stop_segments_{analysis_date} partitioned parquet: by trip-segment
  • avg_speeds_stop_segments_{analysis_date}.parquet: by segment
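
For orientation, the two file names above could be built from an analysis date as in the sketch below. The file-name patterns come from this issue; the helper name `segment_speed_paths`, the bucket placeholder, the example date, and its YYYY-MM-DD format are all assumptions.

```python
def segment_speed_paths(gcs_folder: str, analysis_date: str) -> dict:
    """Build paths for the two rt_segment_speeds outputs listed in the issue.

    `gcs_folder` is whatever location holds the rt_segment_speeds folder;
    the date format is assumed, not confirmed by the issue.
    """
    base = gcs_folder.rstrip("/")
    return {
        # Partitioned parquet (a directory), one row per trip-segment.
        "trip_segment": f"{base}/speeds_stop_segments_{analysis_date}",
        # Single parquet file, one row per segment.
        "segment": f"{base}/avg_speeds_stop_segments_{analysis_date}.parquet",
    }


# Hypothetical bucket name and date, for illustration only.
paths = segment_speed_paths("gs://example-bucket/rt_segment_speeds/", "2023-04-12")

# Either path could then be passed to pandas.read_parquet, which reads both
# single files and partitioned directories (GCS access additionally needs gcsfs).
```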

Deliverables:

Notebook

@tiffanychu90 tiffanychu90 added the research request Issues that serve as a request for research (summary and handoff) label May 23, 2023
@tiffanychu90 tiffanychu90 added the gtfs-rt Work related to GTFS-Realtime label Nov 15, 2023