Open Data Portal

  1. HQTA Areas: metadata, feature server, or map server
  2. HQTA Stops: metadata, feature server, or map server
  3. CA Transit Routes: metadata, feature server, or map server
  4. CA Transit Stops: metadata, feature server, or map server
  5. CA Average Transit Speeds by Stop-to-Stop Segments: metadata, feature server, or map server
  6. CA Average Transit Speeds by Route and Time of Day: metadata, feature server, or map server
  7. All GTFS datasets metadata/data dictionary

GTFS Schedule Routes & Stops Geospatial Data

Traffic Ops had a request for all transit routes and transit stops to be published in the open data portal.

  1. Update update_vars.py for the current month
  2. In terminal: make create_gtfs_schedule_geospatial_open_data
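
A minimal sketch of the kind of month/date variable this step updates in update_vars.py; the variable name and value here are illustrative, not the repo's actual ones:

```python
# Illustrative only: update_vars.py centralizes the analysis date that the
# publishing scripts share. The name and format below are assumptions.
from datetime import date

ANALYSIS_DATE = date(2025, 1, 15)  # bump to the current month's service date
```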

(Diagram: stops_routes_mermaid)

Metadata Automation Steps and References

  1. Add your dataset to catalog.yml and run gcs_to_esri.
    • In terminal: cd open_data followed by python gcs_to_esri.py
    • The log will show basics like column names and EPSG. Make sure the metadata reflects the same info!
    • Only use EPSG:4326 (WGS84). All open data portal datasets will be in WGS84 (see the reprojection sketch after this list).
    • Download the zipped shapefiles from the Hub to your local filesystem.
  2. If there are new datasets to add or changes to make, make them in metadata.yml and/or data_dictionary.yml.
  3. Changes to metadata.yml (adding new datasets, changing descriptions, updating contact information, etc.) are infrequent. The analysis date is updated automatically and does not have to be edited here.
  4. In terminal: python supplement_meta.py (run this after any metadata.yml changes).
  5. In terminal: python update_data_dict.py
    • Check the log results, which tell you if there are columns missing from data_dictionary.yml. These columns and their descriptions need to be added. Every column in the ESRI layer must have a definition, and where there's an external data dictionary website to cite, provide a definition source (see the column-check sketch after this list).
  6. In terminal: python update_fields_fgdc.py. This populates the fields with data_dictionary.yml values.
    • Only run this if update_data_dict.py reported changes to incorporate.
  7. Run arcgis_pro_script to create the XML metadata files. It's often easier to run via the notebook, but the script exists for better version control and to track feature changes (see the arcpy sketch after this list).
    • Open a notebook in Hub and find the ARCGIS_PATH (your preferred local path for ArcGIS work).
    • Hardcode that path: arcpy.env.workspace = ARCGIS_PATH
    • Download metadata.json and place it in your local path.
    • The exported XML metadata will be in the file geodatabase (gdb) directory.
    • Upload the XML metadata to the Hub in open_data/xml/.
  8. If new datasets were added, open update_vars.py and modify it accordingly.
  9. In terminal: cd open_data/, then python metadata_update_pro.py
    • The overwritten XML is stored in open_data/xml/run_in_esri/.
    • Download the overwritten XML files locally to run in ArcGIS.
  10. Run arcgis_pro_script after importing the updated XML metadata for each feature class.
    • There are steps to create FGDC templates for each dataset to store field information.
    • This only needs to be done once when a new dataset is created.
  11. In terminal: python cleanup.py to clean up old XML files and remove zipped shapefiles.
    • The YAML and XML files that were created or changed get checked into GitHub.
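
For step 1, a minimal sketch of the WGS84 check, assuming geopandas and an illustrative file name (not an actual repo path):

```python
# Sketch only: confirm a layer is in EPSG:4326 (WGS84) before it is published.
import geopandas as gpd

gdf = gpd.read_parquet("ca_transit_routes.parquet")  # illustrative path

if gdf.crs.to_epsg() != 4326:
    gdf = gdf.to_crs("EPSG:4326")

print(gdf.crs)  # should report EPSG:4326
```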
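
For step 5, a hedged sketch of the kind of column check update_data_dict.py reports, assuming data_dictionary.yml maps each dataset name to column: description entries (the actual YAML structure and paths may differ):

```python
# Hedged sketch: flag columns present in the layer but missing from
# data_dictionary.yml. The YAML structure and file names are assumptions.
import yaml
import geopandas as gpd

with open("data_dictionary.yml") as f:
    data_dict = yaml.safe_load(f)

gdf = gpd.read_parquet("ca_transit_stops.parquet")  # illustrative path

documented = set(data_dict.get("ca_transit_stops", {}))
missing = sorted(set(gdf.columns) - documented - {"geometry"})

print(f"Columns missing from data_dictionary.yml: {missing}")
```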
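
For step 7, a sketch of the kind of arcpy export arcgis_pro_script performs; refer to the script itself for the real implementation, and treat the workspace path and export option here as assumptions:

```python
# Sketch only (run locally in ArcGIS Pro, not in the Hub): export each feature
# class's metadata to XML. ARCGIS_PATH is a hypothetical file gdb location.
import arcpy

ARCGIS_PATH = r"C:\Users\me\Documents\ArcGIS\Projects\open_data\open_data.gdb"
arcpy.env.workspace = ARCGIS_PATH

for fc in arcpy.ListFeatureClasses():
    md = arcpy.metadata.Metadata(fc)
    md.exportMetadata(f"{fc}.xml", "FGDC_CSDGM")
```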

Metadata

  • Metadata
  • Data dictionary
  • update_vars contains many of the variables that frequently get updated in the publishing process.
    • Apply standardized column names across published datasets, even if they differ from internal keys (org_id in favor of gtfs_dataset_key, agency in favor of organization_name).
    • Since we do not save multiple versions of published datasets, the columns are renamed prior to exporting the geoparquet as a zipped shapefile (see the rename sketch below).
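
A hedged sketch of that rename-and-export step, assuming geopandas and illustrative file names (the actual mapping and paths live in the repo's scripts):

```python
# Illustrative only: rename internal keys to the standardized published names,
# then write the shapefile that gets zipped for upload.
import geopandas as gpd

RENAME_COLS = {
    "gtfs_dataset_key": "org_id",
    "organization_name": "agency",
}

gdf = gpd.read_parquet("ca_transit_routes.parquet")  # illustrative path
gdf = gdf.rename(columns=RENAME_COLS)
gdf.to_file("ca_transit_routes.shp")  # zip the .shp/.dbf/.shx/.prj set for upload
```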

Open Data Intake Process