TDDE: Transformations in SQL Stored Proc #7

Open
donaldsawyer opened this issue Sep 14, 2021 · 0 comments
Assignees: mwallacemn
Labels: must have (Required for implementation)

donaldsawyer commented Sep 14, 2021

Develop the transformation as a stored procedure that selects data from ontime_data, joins it to carrier_code, and projects the columns listed below. The result is written to a new table, ontime_carrier.

Transformations:

  1. Join on the carrier column
  2. Rename description to carrier_desc
  3. Calculate arrived_flag
    1. Y when neither cancelled nor diverted, and arr_delay is not NULL
    2. N when cancelled or diverted
    3. NULL when neither cancelled nor diverted, but arr_delay is NULL
  4. Columns: year, month, day_of_month, carrier_code, carrier_desc, flight_number, origin, destination, arrived_flag
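
The arrived_flag rules above can be sketched as a SQL CASE expression. The snippet below is only an illustration under stated assumptions, not the project's stored procedure: it evaluates the expression in an in-memory SQLite database through Python's sqlite3 module, and it assumes cancelled and diverted are stored as 0/1 integers, which the issue does not specify. The helper name arrived_flag is hypothetical.

```python
import sqlite3

# CASE expression for arrived_flag. The 0/1 encoding of cancelled and
# diverted is an assumption; the issue does not state the source data types.
ARRIVED_FLAG_SQL = """
SELECT
    CASE
        WHEN cancelled = 1 OR diverted = 1 THEN 'N'
        WHEN arr_delay IS NULL THEN NULL
        ELSE 'Y'
    END AS arrived_flag
FROM (SELECT ? AS cancelled, ? AS diverted, ? AS arr_delay)
"""


def arrived_flag(cancelled, diverted, arr_delay):
    """Evaluate the CASE expression for one hypothetical flight row."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(
            ARRIVED_FLAG_SQL, (cancelled, diverted, arr_delay)
        ).fetchone()
        return row[0]
    finally:
        conn.close()
```

Checking the cancelled/diverted condition first means a cancelled or diverted flight is flagged N even when its arr_delay is NULL; only flights that were neither cancelled nor diverted fall through to the NULL and Y branches.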

Acceptance Criteria:

  1. Tests written in Python as part of the TDDE framework, covering:
    1. Cancelled
    2. Diverted
    3. Not cancelled/diverted, but arr_delay is NULL (see the feature reference)
    4. A JOIN on a missing carrier should result in UNKNOWN for carrier_desc
    5. There should be only one row per unique flight (think: what happens when there are multiple rows?)
    6. Only the listed columns exist, in the listed order
  2. Stored procedure created
  3. Smoke test run on the full dataset
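
The acceptance criteria above could be exercised roughly as follows. This is a sketch, not the TDDE framework's API: it uses plain assert-based test functions and an in-memory SQLite database, and because SQLite has no stored procedures, the statement a procedure would wrap is kept in the TRANSFORM string. The source column names (carrier, cancelled, diverted, arr_delay, description), the 0/1 flag encoding, the sample rows, and the use of SELECT DISTINCT to collapse exact duplicate rows are all assumptions.

```python
import sqlite3

EXPECTED_COLUMNS = [
    "year", "month", "day_of_month", "carrier_code", "carrier_desc",
    "flight_number", "origin", "destination", "arrived_flag",
]

# Assumed source schemas; the issue names the tables but not every column.
SETUP = """
CREATE TABLE ontime_data (
    year INTEGER, month INTEGER, day_of_month INTEGER, carrier TEXT,
    flight_number TEXT, origin TEXT, destination TEXT,
    cancelled INTEGER, diverted INTEGER, arr_delay REAL
);
CREATE TABLE carrier_code (carrier TEXT, description TEXT);
"""

# Statement a stored procedure would wrap in the target database.
TRANSFORM = """
CREATE TABLE ontime_carrier AS
SELECT DISTINCT
    o.year AS year, o.month AS month, o.day_of_month AS day_of_month,
    o.carrier AS carrier_code,
    COALESCE(c.description, 'UNKNOWN') AS carrier_desc,
    o.flight_number AS flight_number, o.origin AS origin,
    o.destination AS destination,
    CASE
        WHEN o.cancelled = 1 OR o.diverted = 1 THEN 'N'
        WHEN o.arr_delay IS NULL THEN NULL
        ELSE 'Y'
    END AS arrived_flag
FROM ontime_data AS o
LEFT JOIN carrier_code AS c ON c.carrier = o.carrier
"""

# Made-up sample rows, one per acceptance case.
FLIGHTS = [
    (2021, 9, 14, "AA", "100", "JFK", "LAX", 0, 0, -3.0),  # arrived
    (2021, 9, 14, "AA", "101", "JFK", "ORD", 1, 0, None),  # cancelled
    (2021, 9, 14, "DL", "200", "ATL", "SEA", 0, 1, None),  # diverted
    (2021, 9, 14, "DL", "201", "ATL", "BOS", 0, 0, None),  # arr_delay NULL
    (2021, 9, 14, "ZZ", "300", "DEN", "PHX", 0, 0, 5.0),   # carrier not in lookup
    (2021, 9, 14, "AA", "100", "JFK", "LAX", 0, 0, -3.0),  # exact duplicate
]
CARRIERS = [("AA", "American Airlines"), ("DL", "Delta Air Lines")]


def run_transform(flights, carriers):
    """Load the sample rows, run TRANSFORM, and return (column names, rows)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        conn.executemany(
            "INSERT INTO ontime_data VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", flights
        )
        conn.executemany("INSERT INTO carrier_code VALUES (?, ?)", carriers)
        conn.execute(TRANSFORM)
        cur = conn.execute("SELECT * FROM ontime_carrier")
        columns = [d[0] for d in cur.description]
        return columns, cur.fetchall()
    finally:
        conn.close()


def test_columns_and_order():
    columns, _ = run_transform(FLIGHTS, CARRIERS)
    assert columns == EXPECTED_COLUMNS


def test_arrived_flag_cases():
    _, rows = run_transform(FLIGHTS, CARRIERS)
    flags = {row[5]: row[8] for row in rows}  # flight_number -> arrived_flag
    assert flags == {"100": "Y", "101": "N", "200": "N", "201": None, "300": "Y"}


def test_missing_carrier_is_unknown():
    _, rows = run_transform(FLIGHTS, CARRIERS)
    descs = {row[5]: row[4] for row in rows}  # flight_number -> carrier_desc
    assert descs["300"] == "UNKNOWN"


def test_one_row_per_flight():
    _, rows = run_transform(FLIGHTS, CARRIERS)
    assert len(rows) == 5  # six source rows, one of them an exact duplicate
```

SELECT DISTINCT only collapses rows whose output columns are identical. Duplicates that disagree, such as two descriptions for the same carrier, would still produce multiple rows and would need an explicit rule.
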

donaldsawyer added the "must have" (Required for implementation) label on Sep 14, 2021
mwallacemn self-assigned this on Sep 26, 2021