EPA Data Set #200

Open
1 of 11 tasks
akhaleghi opened this issue Apr 30, 2024 · 5 comments
@akhaleghi
Contributor

akhaleghi commented Apr 30, 2024

Overview

REPLACE THIS TEXT -Text here that clearly states the purpose of this issue in 2 sentences or less.

Action Items

  • Add all data sources to Resources section below
    • EDA Tasks
      • Combine data from years into one data set and see differences
      • Data Dictionary
      • Data Cleaning
  • Write one-sheet
    • Define stakeholder (Access the data and 311 teams used for educational purposes)
    • Summarize project including value add
    • Define project 6-month roadmap
    • Detail history (if any)
  • Define tools to be used to visualize combined data

Additional tasks TBD

Resources/Instructions

  • Data source
  • Is there a link to an API to access the data?
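
One possible starting point, assuming the data comes from the EPA AQS Data Mart API (an assumption; this issue does not name the API). The sketch below only builds request URLs, one per year, county, and pollutant, and does not call the network. The endpoint, FIPS codes, and parameter codes are written from memory of the AQS documentation and should be verified before use; the email and key are placeholders, and `build_requests` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: AQS Data Mart daily-summary-by-county endpoint.
BASE_URL = "https://aqs.epa.gov/data/api/dailyData/byCounty"

# California state FIPS code and county FIPS codes (verify before use).
STATE = "06"
COUNTIES = {"Los Angeles": "037", "Ventura": "111"}

# Assumed AQS parameter codes for the criteria pollutants (verify before use).
PARAMS = {
    "Lead (TSP) LC": "14129",
    "Carbon monoxide": "42101",
    "Sulfur dioxide": "42401",
    "Nitrogen dioxide": "42602",
    "Ozone": "44201",
    "PM10 Total 0-10um STP": "81102",
    "Lead PM10 LC FRM/FEM": "85129",
    "PM2.5 - Local Conditions": "88101",
}


def build_requests(email, key, first_year=2016, last_year=2024):
    """Return one request URL per (year, county, pollutant)."""
    urls = []
    for year in range(first_year, last_year + 1):
        for county in COUNTIES.values():
            for code in PARAMS.values():
                query = urlencode({
                    "email": email,
                    "key": key,
                    "param": code,
                    # Each request stays within a single calendar year.
                    "bdate": f"{year}0101",
                    "edate": f"{year}1231",
                    "state": STATE,
                    "county": county,
                })
                urls.append(f"{BASE_URL}?{query}")
    return urls


# Placeholder credentials; a real email and key must be registered with the API.
urls = build_requests("you@example.org", "YOUR_KEY")
print(len(urls))
print(urls[0])
```

Keeping each request inside one calendar year keeps responses small and makes it easy to retry a single failed year.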
@ktie1688 ktie1688 self-assigned this May 7, 2024
@Xeftor Xeftor self-assigned this May 9, 2024
@noelthomas28 noelthomas28 self-assigned this May 15, 2024
@akhaleghi akhaleghi moved this to New Issue Approval in CoP: Data Science: Project Board Jun 10, 2024
@max1million101 max1million101 self-assigned this Jun 30, 2024
@max1million101
Member

Currently working on data dictionary. Will comment here for updates.

@max1million101
Member

EPA_Data_Dictionary.xlsx

The file linked above is the Data Dictionary for the EPA Data Set. Please let me know of any errors that need to be corrected.

@jackson6022 jackson6022 self-assigned this Aug 13, 2024
@Barreliza Barreliza self-assigned this Aug 13, 2024
@Megh-Dave Megh-Dave self-assigned this Sep 17, 2024
@sudhara
Member

sudhara commented Sep 23, 2024

Working on merging the datasets for the last 5 years, performing EDA, and creating plots in Python.
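
A minimal sketch of what the merge step could look like, using only the standard library. The column names (`date_local`, `parameter`, `arithmetic_mean`) and the sample values are assumptions made up for illustration, not the project's actual schema or data; `merge_years` and `yearly_means` are hypothetical helpers.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical per-year extracts; in practice these would be files on disk.
YEARLY_CSV = {
    2022: (
        "date_local,parameter,arithmetic_mean\n"
        "2022-01-01,Ozone,0.031\n"
        "2022-01-02,Ozone,0.035\n"
    ),
    2023: (
        "date_local,parameter,arithmetic_mean\n"
        "2023-01-01,Ozone,0.040\n"
        "2023-01-02,Ozone,0.042\n"
    ),
}


def merge_years(yearly_csv):
    """Concatenate all yearly rows, tagging each row with its source year."""
    merged = []
    for year, text in sorted(yearly_csv.items()):
        for row in csv.DictReader(io.StringIO(text)):
            row["year"] = year
            row["arithmetic_mean"] = float(row["arithmetic_mean"])
            merged.append(row)
    return merged


def yearly_means(rows):
    """Average the measurement per (parameter, year) so years can be compared."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["parameter"], row["year"])].append(row["arithmetic_mean"])
    return {group: mean(values) for group, values in groups.items()}


merged = merge_years(YEARLY_CSV)
summary = yearly_means(merged)
for (parameter, year), value in sorted(summary.items()):
    print(parameter, year, round(value, 4))
```

The same shape maps onto pandas (`pandas.concat` followed by a `groupby`) if the team prefers DataFrames for plotting.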

@akhaleghi
Contributor Author

@Barreliza @ayushpatel1501 @jackson6022 @noelthomas28 @ktie1688 @Xeftor We have too many people assigned to this issue who are inactive. If you would like to work on this issue further, please assign yourself again and provide weekly updates on your progress.

@phunguyen1195
Member

Updates:

  1. Progress:

    • Data Collection:
      • Retrieved data using the EPA API for the years 2016–2024.
      • Included data for Los Angeles County and Ventura County.
      • Focused on the following criteria pollutants:
        • Lead (TSP) LC
        • Carbon Monoxide (CO)
        • Sulfur Dioxide (SO₂)
        • Nitrogen Dioxide (NO₂)
        • Ozone (O₃)
        • PM10 Total 0-10µm STP
        • Lead PM10 LC FRM/FEM
        • PM2.5 - Local Conditions
    • Data Processing:
      • Consolidated data from all years into a single dataset.
      • Segregated data based on pollutant attributes.
  2. Blockers:

    • Need further research on criteria pollutants and their corresponding AQI calculations.
  3. Availability:

    • Monday, Wednesday, Friday
  4. ETA:

    • TBD
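
On the AQI blocker: EPA's AQI for a single pollutant is a piecewise-linear interpolation between concentration breakpoints, AQI = (I_hi - I_lo) / (C_hi - C_lo) * (C - C_lo) + I_lo, rounded to the nearest integer. The sketch below shows that calculation only. The 8-hour ozone breakpoint table is an assumption written from memory and must be checked against EPA's current technical assistance document; each pollutant has its own table, averaging period, and truncation rule. The function assumes the concentration has already been truncated to the table's precision, and `aqi` is a hypothetical helper name.

```python
# Assumed 8-hour ozone breakpoints in ppm: (C_low, C_high, I_low, I_high).
# Verify against EPA's current AQI technical assistance document before use.
OZONE_8H_BREAKPOINTS = [
    (0.000, 0.054, 0, 50),
    (0.055, 0.070, 51, 100),
    (0.071, 0.085, 101, 150),
    (0.086, 0.105, 151, 200),
    (0.106, 0.200, 201, 300),
]


def aqi(concentration, breakpoints):
    """Linearly interpolate the AQI for one pollutant concentration.

    Assumes `concentration` is already truncated to the table's precision,
    so it never falls in the gap between two adjacent ranges.
    """
    for c_low, c_high, i_low, i_high in breakpoints:
        if c_low <= concentration <= c_high:
            slope = (i_high - i_low) / (c_high - c_low)
            return round(slope * (concentration - c_low) + i_low)
    raise ValueError(f"concentration {concentration} is outside the breakpoint table")


print(aqi(0.060, OZONE_8H_BREAKPOINTS))
```

The overall AQI reported for a site and day is the maximum of the per-pollutant values, so this function would be applied per pollutant before taking that maximum.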

Projects
Status: In progress (actively working)