Research Request - Segment-trip speed diagnostics #594
Labels
gtfs-rt: Work related to GTFS-Realtime
research request: Issues that serve as a request for research (summary and handoff)
Complete the sections below when receiving a research request, and continue to add to this issue as you receive additional details and produce deliverables. Be sure to also add the appropriate project-level label to this issue (e.g., gtfs-rt, DLA).
Research Question
Single sentence description: Exploratory work to better understand issues that arise from using segments. Start from the most granular level, with each row being one segment-trip.
Detailed description:
(1) How many segments have only 1 point per trip? Either missing min_time/min_dist or max_time/max_dist.
(2) How many segments have unusually high or low speeds calculated? What's happening here? Continue to drop unusually high speeds, but do we want to drop unusually low speeds?
(3) How are routes that inline / loop handled? Hypothesis: we should see unusually low speeds.
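Questions (1) and (2) can be sketched as row-level flags on a segment-trip table. This is only a starting point under stated assumptions: min_time, min_dist, max_time and max_dist come from the description above, while the speed_mph column, the flag_segment_trips helper, the example rows and the 3 mph / 70 mph cutoffs are all hypothetical and would need to match the real tables.

```python
import numpy as np
import pandas as pd

ENDPOINT_COLS = ["min_time", "min_dist", "max_time", "max_dist"]


def flag_segment_trips(df, low_mph=3.0, high_mph=70.0):
    """Add boolean diagnostic flags to a table with one row per segment-trip.

    speed_mph and both thresholds are assumptions made for illustration.
    """
    out = df.copy()
    # (1) Only one point: an endpoint is missing, or min and max coincide.
    missing_endpoint = out[ENDPOINT_COLS].isna().any(axis=1)
    same_point = (out["min_time"] == out["max_time"]) | (
        out["min_dist"] == out["max_dist"]
    )
    out["one_point"] = missing_endpoint | same_point
    # (2) Unusually low or high speeds; NaN speeds compare as False.
    out["low_speed"] = out["speed_mph"] < low_mph
    out["high_speed"] = out["speed_mph"] > high_mph
    return out


# Made-up rows, used only to show the shape of the input.
example = pd.DataFrame(
    {
        "segment_trip": ["a", "b", "c", "d", "e"],
        "min_time": [0, 10, 0, 0, 0],
        "max_time": [60, 10, np.nan, 600, 10],
        "min_dist": [0.0, 100.0, 0.0, 0.0, 0.0],
        "max_dist": [500.0, 100.0, np.nan, 250.0, 450.0],
        "speed_mph": [18.6, np.nan, np.nan, 0.9, 100.7],
    }
)

flagged = flag_segment_trips(example)
print(flagged[["one_point", "low_speed", "high_speed"]].sum())
```

Keeping the flags as separate columns, rather than filtering right away, leaves room to check question (3): whether the low_speed rows concentrate on routes that loop or run in-line.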
How will this research be used?
Stakeholders & End-Users
Metrics
Data sources
analysis_date = '2023-01-18' --> shared_utils.rt_dates["jan2023"]
rt_segment_speeds/speeds_route_segments/. To read in the partitioned parquets, you'll need to use dd.read_parquet("gs://bucket/folder/speeds_{analysis_date}/"), not pd.read_parquet().
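A minimal sketch of that read, assuming dask is installed and gcsfs is available for gs:// paths. The speeds_path and read_speeds helpers are hypothetical, and "gs://bucket/folder" is the placeholder from the line above, not a real location.

```python
def speeds_path(folder: str, analysis_date: str) -> str:
    """Build the directory path of the partitioned parquet for one date."""
    return f"{folder.rstrip('/')}/speeds_{analysis_date}/"


def read_speeds(folder: str, analysis_date: str):
    """Lazily read the partitioned parquet directory as a dask DataFrame."""
    # Imported here so the path helper works even without dask installed.
    import dask.dataframe as dd

    return dd.read_parquet(speeds_path(folder, analysis_date))


path = speeds_path("gs://bucket/folder", "2023-01-18")
print(path)
```

read_speeds returns a lazy dask DataFrame; calling .compute() on a filtered or aggregated result brings it into pandas.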
External data sources:
Remaining data source questions:
Deliverables:
Notebooks, tables saved as parquets
Timeline of deliverables: