Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hqta collinearity #1323

Merged
merged 22 commits into from
Dec 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
9cc6093
track if single or multi-route aggregation
edasmalchi Dec 4, 2024
737c738
actually add nb
edasmalchi Dec 5, 2024
ca3ba00
start crafting stop qualification algorithm
edasmalchi Dec 9, 2024
3109bcc
yolobus test, more complex
edasmalchi Dec 9, 2024
97ec389
more progress, attempt to scale with lbt
edasmalchi Dec 9, 2024
9d3f3cc
refine emperically
edasmalchi Dec 10, 2024
5550054
start moving to scripts, try different arrangements:
edasmalchi Dec 10, 2024
0b89486
keep on scriptin'
edasmalchi Dec 10, 2024
e2c0076
loop speed acceptable but function has a bug somewhere
edasmalchi Dec 10, 2024
0077bfa
working well! all in functions, move to script pending
edasmalchi Dec 11, 2024
9203c3a
min frequency screens should be >=, not >
edasmalchi Dec 12, 2024
3b01f17
more debug, try pipeline
edasmalchi Dec 13, 2024
268c3f1
update hqta segment keep one
edasmalchi Dec 13, 2024
7d94942
run pipeline, notice picking up unwanted cross-street stops, try tuni…
edasmalchi Dec 13, 2024
c20bbdf
35 still a bit high -- dynamic buffer also, filter route_type to bus …
edasmalchi Dec 13, 2024
59f59d3
only keep segments from routes with at least one qualifying stop
edasmalchi Dec 13, 2024
86068ff
exclude looping segments from intersection calculations
edasmalchi Dec 13, 2024
c885729
move all route collinearity steps to script
edasmalchi Dec 13, 2024
5ef30da
successful in full pipeline, changes add about 2min (but also vastly …
edasmalchi Dec 14, 2024
d03d5a4
docstrings, pending readme
edasmalchi Dec 14, 2024
e8b3c7f
new technical notes
edasmalchi Dec 17, 2024
ccc972e
fix typo
edasmalchi Dec 17, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
772 changes: 91 additions & 681 deletions high_quality_transit_areas/08_2024_methodology.ipynb

Large diffs are not rendered by default.

1,166 changes: 1,166 additions & 0 deletions high_quality_transit_areas/09_enforce_collinearity.ipynb

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions high_quality_transit_areas/Makefile
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
hqta_data:
python rail_ferry_brt_stops.py
python create_hqta_segments.py
python create_aggregate_stop_frequencies.py
python sjoin_stops_to_segments.py
python prep_pairwise_intersections.py
python get_intersections.py
Expand Down
7 changes: 5 additions & 2 deletions high_quality_transit_areas/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,11 @@ For more details, see [links](#High-Quality-Transit-Areas-Relevant-Statutes) to
* Corridors are cut for each operator-route-direction every 1,250 meters.
* Stops have a 50 meter buffer drawn around them and are spatially joined to corridors.
* For each 1,250 meter corridor segment, the highest frequency stop is counted (if there are multiple stops).
* We count _all_ trips at the highest frequency stop in each corridor segment, even if they're marketed as different "routes".
* For example, if Route A serves the stop twice an hour, and Route B twice an hour, we consider the stop frequency to be 4 trips/hour, qualifying it as a high quality transit corridor and potentially a major bus stop (if an intersecting corridor is present)
* We count trips at the highest frequency stop in each corridor segment. Where routes providing similar service share a stop, we combine their trips for frequency calculation. However, we don't combine trips across routes where those routes do not provide similar service. We currently do this by counting the number of stops that different routes share in a given direction.
* For example, suppose that Route A and Route B share many stops in the same direction. They may be two routes coming together to provide combined service on a corridor while branching elsewhere, or a local and rapid route on the same corridor.
* In that case if Route A serves the stop twice an hour, and Route B twice an hour, we consider the stop frequency to be 4 trips/hour, qualifying it as a high quality transit corridor and potentially a major bus stop (if an intersecting corridor is present)
* If, however, Route A and Route B only shared one or two stops, for example starting together at a transit center but going off in different directions, we would not combine them for stop frequency calculations.
* See [technical_notes.md](technical_notes.md) for additional details.

### Why Not one "Route" at a Time?

Expand Down
335 changes: 335 additions & 0 deletions high_quality_transit_areas/create_aggregate_stop_frequencies.py

Large diffs are not rendered by default.

60 changes: 60 additions & 0 deletions high_quality_transit_areas/logs/hqta_processing.log
Original file line number Diff line number Diff line change
Expand Up @@ -57,3 +57,63 @@
2024-10-22 11:04:23.355 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:28.143350
2024-10-22 11:05:01.542 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:19.728061
2024-10-22 11:05:44.215 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:25.999522
2024-12-12 16:47:26.757 | INFO | __main__:<module>:277 - A1_rail_ferry_brt_stops 2024-10-21 execution time: 0:00:18.872683
2024-12-12 17:12:30.180 | INFO | __main__:<module>:248 - B1_create_hqta_segments execution time: 0:24:45.443867
2024-12-13 10:18:57.764 | INFO | __main__:<module>:309 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:47.718032
2024-12-13 10:19:23.594 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:06.418968
2024-12-13 10:20:09.524 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:26.010689
2024-12-13 10:20:51.317 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:22.777068
2024-12-13 10:21:31.926 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:21.842912
2024-12-13 10:22:09.622 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:19.468992
2024-12-13 10:55:54.405 | INFO | __main__:<module>:309 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:46.353523
2024-12-13 10:56:20.168 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:06.499986
2024-12-13 10:57:00.377 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:20.644833
2024-12-13 10:57:41.425 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:22.287287
2024-12-13 10:58:22.304 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:22.018224
2024-12-13 10:59:00.824 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:19.198694
2024-12-13 11:29:41.820 | INFO | __main__:<module>:309 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:44.139854
2024-12-13 11:30:07.220 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:06.567972
2024-12-13 11:30:44.176 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:18.665460
2024-12-13 11:31:22.746 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:20.396299
2024-12-13 11:32:02.933 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:21.877206
2024-12-13 11:32:39.694 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:18.424974
2024-12-13 11:55:48.649 | INFO | __main__:<module>:309 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:44.101742
2024-12-13 11:56:12.576 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:06.184147
2024-12-13 11:56:47.404 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:17.150945
2024-12-13 11:57:23.415 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:18.277817
2024-12-13 11:58:00.304 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:19.643261
2024-12-13 11:58:35.391 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:17.158636
2024-12-13 13:00:08.789 | INFO | __main__:<module>:300 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:14.130307
2024-12-13 13:07:31.794 | INFO | __main__:<module>:301 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:13.237527
2024-12-13 13:07:54.987 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:04.741102
2024-12-13 13:08:22.103 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:09.561966
2024-12-13 13:08:53.745 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:13.602800
2024-12-13 13:09:30.997 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:19.646276
2024-12-13 13:10:04.076 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:14.769622
2024-12-13 13:35:37.412 | INFO | __main__:<module>:302 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:13.446776
2024-12-13 13:36:44.557 | INFO | __main__:<module>:301 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:12.039450
2024-12-13 13:41:50.347 | INFO | __main__:<module>:301 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:12.691516
2024-12-13 13:42:12.391 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:04.804543
2024-12-13 13:42:38.996 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:09.349679
2024-12-13 13:43:10.018 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:12.967725
2024-12-13 13:43:48.097 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:19.708957
2024-12-13 13:44:20.201 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:15.203018
2024-12-13 14:33:01.805 | INFO | __main__:<module>:326 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:14.181668
2024-12-13 14:33:23.474 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:04.576992
2024-12-13 14:33:50.423 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:09.325691
2024-12-13 14:34:21.885 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:12.960027
2024-12-13 14:34:59.154 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:19.578227
2024-12-13 14:35:32.414 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:14.906216
2024-12-13 15:09:26.400 | INFO | __main__:<module>:326 - B2_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:14.579218
2024-12-13 15:09:51.521 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:04.461899
2024-12-13 15:10:16.883 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:07.830454
2024-12-13 15:10:47.425 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:11.914511
2024-12-13 15:11:23.514 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:18.371460
2024-12-13 15:11:54.767 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:13.257745
2024-12-13 16:09:11.914 | INFO | __main__:<module>:319 - B2_create_aggregate_stop_frequencies 2024-10-21 execution time: 0:02:30.392578
2024-12-13 16:09:42.724 | INFO | __main__:<module>:326 - B3_sjoin_stops_to_segments 2024-10-21 execution time: 0:00:12.911364
2024-12-13 16:10:05.196 | INFO | __main__:<module>:147 - C1_prep_pairwise_intersections 2024-10-21 execution time: 0:00:05.025651
2024-12-13 16:10:32.475 | INFO | __main__:<module>:124 - C2_find_intersections 2024-10-21 execution time: 0:00:07.908320
2024-12-13 16:11:06.186 | INFO | __main__:<module>:175 - C3_create_bus_hqta_types 2024-10-21 execution time: 0:00:16.642029
2024-12-13 16:11:42.067 | INFO | __main__:<module>:219 - D1_assemble_hqta_points 2024-10-21 execution time: 0:00:19.108129
2024-12-13 16:12:12.529 | INFO | __main__:<module>:168 - D2_assemble_hqta_polygons 2024-10-21 execution time: 0:00:12.842061
75 changes: 50 additions & 25 deletions high_quality_transit_areas/sjoin_stops_to_segments.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,8 @@
import datetime
import geopandas as gpd
import pandas as pd
import numpy as np
import shapely
import sys

from loguru import logger
Expand Down Expand Up @@ -86,26 +88,17 @@ def stop_times_aggregation_max_by_stop(
trips,
on = ["feed_key", "trip_id"]
)

stop_times.direction_id = stop_times.direction_id.fillna(0).astype('int64').astype(str)
stop_times['route_dir'] = stop_times[['route_id', 'direction_id']].agg('_'.join, axis=1)
# Aggregate how many trips are made at that stop by departure hour
trips_per_hour = gtfs_schedule_wrangling.stop_arrivals_per_stop(
trips_per_peak_period = gtfs_schedule_wrangling.stop_arrivals_per_stop(
stop_times,
group_cols = stop_cols + trips_per_hour_cols,
count_col = "trip_id"
).rename(columns = {"n_arrivals": "n_trips"})

# Count based on fixed peak periods, take average in each
am_trips = max_trips_by_group(
trips_per_hour[trips_per_hour.peak == 'am_peak'],
group_cols = stop_cols,
max_col = "n_trips"
).rename(columns = {"n_trips": "am_max_trips"})

pm_trips = max_trips_by_group(
trips_per_hour[trips_per_hour.peak == 'pm_peak'],
group_cols = stop_cols,
max_col = "n_trips"
).rename(columns = {"n_trips": "pm_max_trips"})
am_trips = (trips_per_peak_period[trips_per_peak_period.peak == 'am_peak']).rename(columns = {"n_trips": "am_max_trips"})
pm_trips = (trips_per_peak_period[trips_per_peak_period.peak == 'pm_peak']).rename(columns = {"n_trips": "pm_max_trips"})

max_trips_by_stop = pd.merge(
am_trips,
Expand Down Expand Up @@ -183,7 +176,7 @@ def hqta_segment_keep_one_stop(
# Merge in and keep max trips observation
# Since there might be duplicates still, where multiple stops all
# share 2 trips for that segment, do a drop duplicates at the end
max_trip_cols = ["hqta_segment_id", "am_max_trips", "pm_max_trips"]
max_trip_cols = ["hqta_segment_id", "am_max_trips_hr", "pm_max_trips_hr"]

segment_to_stop_unique = pd.merge(
segment_to_stop_times,
Expand All @@ -203,6 +196,28 @@ def hqta_segment_keep_one_stop(

return segment_to_stop_gdf

def find_inconclusive_directions(hqta_segments: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
'''
Where individual segments loop tightly, segment_direction becomes arbitrary.
Find these cases and mark segment_direction as "inconclusive"
OK to keep in possible HQ corridors, but shouldn't be used for bus intersection major stops
'''
circuitousness_ratio_threshold = 3

hqta_segments['length'] = hqta_segments.geometry.apply(lambda x: x.length)
hqta_segments['start'] = hqta_segments.geometry.apply(lambda x: shapely.Point(x.coords[0]))
hqta_segments['end'] = hqta_segments.geometry.apply(lambda x: shapely.Point(x.coords[-1]))
hqta_segments['st_end_dist'] = hqta_segments.apply(lambda x: shapely.distance(x.start, x.end),axis=1)
hqta_segments['circuitousness_ratio'] = ((hqta_segments.length / hqta_segments.st_end_dist)
.replace(np.inf, 10)
.clip(upper=5))
hqta_segments.segment_direction = hqta_segments.apply(
lambda x: x.segment_direction if x.circuitousness_ratio < circuitousness_ratio_threshold else 'inconclusive', axis=1)
calculation_cols = ['length', 'start', 'end',
'st_end_dist', 'circuitousness_ratio']
hqta_segments = hqta_segments.drop(columns=calculation_cols)
return hqta_segments


def sjoin_stops_and_stop_times_to_hqta_segments(
hqta_segments: gpd.GeoDataFrame,
Expand All @@ -221,7 +236,14 @@ def sjoin_stops_and_stop_times_to_hqta_segments(
Since frequency thresholds for hq corrs and major stops have diverged,
need to track both categories
"""
# Draw 50 m buffer to capture stops around hqta segments
# Only keep segments for routes that have at least one stop meeting frequency threshold
# About 50x smaller, so should both slash false positives and enhance speed
st_copy = stop_times.copy().drop_duplicates(subset=['schedule_gtfs_dataset_key', 'route_id'])
hqta_segments = (hqta_segments.merge(st_copy[['schedule_gtfs_dataset_key', 'route_id']], on=['schedule_gtfs_dataset_key', 'route_id']))
stop_times = stop_times.drop(columns=['route_id']).drop_duplicates() # prefer route_id from segments in future steps
# Identify ambiguous direction segments to exclude from intersection steps
hqta_segments = find_inconclusive_directions(hqta_segments)
# Draw buffer to capture stops around hqta segments
hqta_segments2 = hqta_segments.assign(
geometry = hqta_segments.geometry.buffer(buffer_size)
)
Expand Down Expand Up @@ -261,15 +283,18 @@ def sjoin_stops_and_stop_times_to_hqta_segments(

start = datetime.datetime.now()

# (1) Aggregate stop times - by stop_id, find max trips in AM/PM peak
# takes 1 min
max_arrivals_by_stop = helpers.import_scheduled_stop_times(
analysis_date,
get_pandas = True,
).pipe(prep_stop_times).pipe(stop_times_aggregation_max_by_stop, analysis_date)
# shift to new script which will add collinearity checks
# # (1) Aggregate stop times - by stop_id, find max trips in AM/PM peak
# # takes 1 min
# max_arrivals_by_stop = helpers.import_scheduled_stop_times(
# analysis_date,
# get_pandas = True,
# ).pipe(prep_stop_times).pipe(stop_times_aggregation_max_by_stop, analysis_date)

# max_arrivals_by_stop.to_parquet(
# f"{GCS_FILE_PATH}max_arrivals_by_stop.parquet")

max_arrivals_by_stop.to_parquet(
f"{GCS_FILE_PATH}max_arrivals_by_stop.parquet")
# new step 1!

## (2) Spatial join stops and stop times to hqta segments
# this takes < 2 min
Expand Down Expand Up @@ -299,7 +324,7 @@ def sjoin_stops_and_stop_times_to_hqta_segments(

end = datetime.datetime.now()
logger.info(
f"B2_sjoin_stops_to_segments {analysis_date} "
f"B3_sjoin_stops_to_segments {analysis_date} "
f"execution time: {end - start}")

#client.close()
Loading
Loading