Bring new NTD endpoint sources into the warehouse as staging #3467

Draft: charlie-costanzo wants to merge 2 commits into main

charlie-costanzo (Member) commented Sep 19, 2024

Description

This PR brings the new external NTD tables created in #3465 into the warehouse as staging tables. The staging models apply basic cleaning and typecasting where needed for downstream analytical use.
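As a rough illustration of what "basic cleaning and typecasting" can look like, here is a minimal sketch. It is not the SQL in this PR: SQLite stands in for the warehouse, and the table name `raw_breakdowns` and the columns `ntd_id`, `agency`, and `total_breakdowns` are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical raw external table where every column arrives as text.
conn.execute(
    "CREATE TABLE raw_breakdowns (ntd_id TEXT, agency TEXT, total_breakdowns TEXT)"
)
conn.executemany(
    "INSERT INTO raw_breakdowns VALUES (?, ?, ?)",
    [
        ("90154", "  Example Transit ", "12"),
        ("90155", "Other Agency", ""),
    ],
)

# Staging-style select: trim whitespace, turn empty strings into NULL,
# and cast numeric text to an integer type.
STAGING_SQL = """
SELECT
    ntd_id,
    TRIM(agency) AS agency,
    CAST(NULLIF(TRIM(total_breakdowns), '') AS INTEGER) AS total_breakdowns
FROM raw_breakdowns
ORDER BY ntd_id
"""

staged = conn.execute(STAGING_SQL).fetchall()
```

The point of the sketch is only the shape of the transformation: whitespace is trimmed, empty strings become NULL, and numeric text is cast before any downstream model reads it.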

Because each extract pulls the full source files, every staging table filters to the most recent extract_ts when it is built.
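The latest-extract filter can be sketched as follows. This is an assumption about the general pattern, not the exact SQL in this PR: SQLite stands in for the warehouse, and the table `external_metrics` and its `ntd_id` and `value` columns are hypothetical. Only `extract_ts` comes from the description above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical external table: each extract re-delivers the whole file,
# so the same rows appear once per extract_ts.
conn.execute(
    "CREATE TABLE external_metrics (extract_ts TEXT, ntd_id TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO external_metrics VALUES (?, ?, ?)",
    [
        ("2024-09-18T00:00:00", "90154", 10),
        ("2024-09-18T00:00:00", "90155", 20),
        ("2024-09-19T00:00:00", "90154", 11),
        ("2024-09-19T00:00:00", "90155", 21),
    ],
)

# Keep only rows belonging to the most recent extract.
# ISO-8601 timestamps stored as text sort correctly as strings.
LATEST_SQL = """
SELECT extract_ts, ntd_id, value
FROM external_metrics
WHERE extract_ts = (SELECT MAX(extract_ts) FROM external_metrics)
ORDER BY ntd_id
"""

latest_rows = conn.execute(LATEST_SQL).fetchall()
```

Without a filter like this, every full-file extract would add another copy of each record to the staging table.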

This work also includes the source and staging .yml files that the dbt project requires, plus early documentation to build on iteratively.

This PR addresses #3404, part of Epic #3401, and builds on recent work in #3415 and #3465.

Resolves #3404

Type of change

  • New feature

How has this been tested?

Locally, with dbt.

Post-merge follow-ups

  • Actions required (specified below)
    Observe for expected behavior.

github-actions bot commented Sep 19, 2024

Warehouse report 📦

Checks/potential follow-ups

Checks indicate the following action items may be necessary.

  • For new models, do they all have a surrogate primary key that is tested to be not-null and unique?
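To make the check above concrete, here is a minimal sketch of building a surrogate key and testing it for not-null and uniqueness. It mirrors the general idea (hash a model's grain columns into one key) and is not the project's implementation. The helper names and the `ntd_id` and `mode` columns are hypothetical.

```python
import hashlib

def surrogate_key(*parts):
    """Hash the grain columns into a single key (hypothetical helper).

    None values are rendered as a placeholder so they still hash deterministically.
    """
    joined = "-".join("_null_" if p is None else str(p) for p in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def check_not_null_and_unique(keys):
    """Return True only if no key is None and no key repeats."""
    keys = list(keys)
    has_null = any(k is None for k in keys)
    has_duplicates = len(set(keys)) != len(keys)
    return not has_null and not has_duplicates

# Hypothetical rows whose grain is (ntd_id, mode).
rows = [
    {"ntd_id": "90154", "mode": "MB"},
    {"ntd_id": "90154", "mode": "DR"},
    {"ntd_id": "90155", "mode": "MB"},
]

keys = [surrogate_key(r["ntd_id"], r["mode"]) for r in rows]
keys_ok = check_not_null_and_unique(keys)
```

In a dbt project the equivalent check would normally be declared as not-null and unique tests on the key column in the model's .yml file. The Python above only shows what those tests assert.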

New models 🌱

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__breakdowns

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__breakdowns_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__capital_expenses_by_capital_use

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__capital_expenses_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__capital_expenses_for_existing_service

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__capital_expenses_for_expansion_of_service

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__employees_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__employees_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__employees_by_mode_and_employee_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__fuel_and_energy

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__fuel_and_energy_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__funding_sources_by_expense_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__funding_sources_directly_generated

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__funding_sources_federal

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__funding_sources_local

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__funding_sources_state

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__funding_sources_taxes_levied_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__maintenance_facilities

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__maintenance_facilities_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__metrics

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__operating_expenses_by_function

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__operating_expenses_by_function_and_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__operating_expenses_by_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__operating_expenses_by_type_and_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__service_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__service_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__service_by_mode_and_time_period

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__stations_and_facilities_by_agency_and_facility_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__stations_by_mode_and_age

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__track_and_roadway_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__track_and_roadway_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__track_and_roadway_guideway_age_distribution

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__vehicles_age_distribution

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annual_data__2022__vehicles_type_count_by_agency

DAG

Legend (in order of precedence)

| Resource type | Indicator | Resolution |
| --- | --- | --- |
| Large table-materialized model | Orange | Make the model incremental |
| Large model without partitioning or clustering | Orange | Add partitioning and/or clustering |
| View with more than one child | Yellow | Materialize as a table or incremental |
| Incremental | Light green | |
| Table | Green | |
| View | White | |
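Because the legend applies in order of precedence, the first matching row wins. A minimal sketch of that lookup follows. The function name and its parameters are hypothetical, and the report does not state what size counts as "large", so `is_large` is taken as a given boolean.

```python
def dag_indicator(materialization, is_large, partitioned_or_clustered, child_count):
    """Return (indicator, resolution) for a model, checking legend rows in order.

    Hypothetical helper: `is_large` is an input because the size threshold
    is not specified in the report.
    """
    if is_large and materialization == "table":
        return ("Orange", "Make the model incremental")
    if is_large and not partitioned_or_clustered:
        return ("Orange", "Add partitioning and/or clustering")
    if materialization == "view" and child_count > 1:
        return ("Yellow", "Materialize as a table or incremental")
    if materialization == "incremental":
        return ("Light green", None)
    if materialization == "table":
        return ("Green", None)
    if materialization == "view":
        return ("White", None)
    # Unknown materialization: the legend has no row for it.
    return (None, None)

# A small view with two children is flagged yellow.
example = dag_indicator("view", False, False, 2)
```

Ordering matters here: a large table that is also partitioned still matches the first row, so it is reported as orange and not green.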

charlie-costanzo changed the title from "Bring new NTD endpoint sources into the warehouse" to "Bring new NTD endpoint sources into the warehouse as staging" on Sep 20, 2024
Development

Successfully merging this pull request may close these issues.

NTD Modeling – Staging and Intermediate – Remaining Tables in Dataset