
Feat: Add free weather module using Meteostat (and fix dependencies) #49

Merged
harshitaphadtare merged 1 commit into harshitaphadtare:main from AyushAnand413:feature/meteostat-weather
Oct 24, 2025

Conversation

@AyushAnand413
Contributor

@AyushAnand413 AyushAnand413 commented Oct 21, 2025

Description

Fixes: #47

This PR adds a new, independent feature for fetching rich historical weather data using the free Meteostat library. This solves the core problem of relying on a limited, static precipitation.csv file.

This PR is purely additive and does not change the existing training pipeline. It also fixes critical bugs in the project's setup.

Changes Made

  • Added: src/features/weather_api.py - A new module with a function get_weather_for_trip() to get weather for a single, specific time and location.
  • Added: src/get_historical_weather.py - A bulk-download script to get the entire 2016 weather dataset for NYC in one command, saving it to data/processed/historical_weather.csv.
  • Added: meteostat to requirements.txt.
  • Updated: config.py to include a new path for 'historical_weather' in the DATA_PATHS dictionary.
  • Updated: The bulk-download script to be more robust, handling the case where 'visibility' data is missing from the API response.
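The last bullet describes guarding against a missing 'visibility' column. Below is a minimal sketch of one way to do that with pandas, assuming the API response has already been loaded into a DataFrame. The helper name `ensure_columns` and every column name other than 'visibility' are hypothetical and are not taken from the PR's code.

```python
import pandas as pd

# Hypothetical list of columns the downstream pipeline expects.
EXPECTED_COLUMNS = ["temp", "humidity", "wind_speed", "visibility"]


def ensure_columns(df: pd.DataFrame, expected=EXPECTED_COLUMNS) -> pd.DataFrame:
    """Return a new frame containing exactly the expected columns.

    Columns missing from the input are added and filled with NaN, so
    later code can select them without raising KeyError. Columns not
    in `expected` are dropped.
    """
    return df.reindex(columns=list(expected))


# Toy response that lacks 'visibility' (values are made up).
raw = pd.DataFrame(
    {"temp": [3.1, 2.8], "humidity": [71.0, 74.0], "wind_speed": [12.0, 9.5]}
)
clean = ensure_columns(raw)
```

Using `reindex` keeps the output schema stable regardless of which fields a given response contains, which is convenient when the result is written to a CSV that other code reads by column name.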

Testing

  • Ran unit tests (pnpm test)
  • Tested manually (describe below):
    • Test case 1: Test live API module
      • Run: python src/features/weather_api.py
      • Expected Result: Script prints Success! Received data: ...
    • Test case 2: Test bulk-download module
      • Run: python src/get_historical_weather.py
      • Expected Result: Script prints Success! Saved 4368 hourly records to data/processed/historical_weather.csv
    • Test case 3: Test for regressions
      • Run: python main.py
      • Expected Result: The original pipeline runs successfully (proves that my new code did not break any existing logic).
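The expected count in test case 2 can be sanity-checked with a little arithmetic. 4368 hourly records equals 182 days, which matches 1 January through 30 June 2016 (a leap year). That date range is an assumption inferred from the count; the PR itself does not state it.

```python
import pandas as pd

# Assumed coverage: first half of 2016, one row per hour.
hours = pd.date_range("2016-01-01 00:00", "2016-06-30 23:00", freq="h")
expected_records = len(hours)
print(expected_records)  # 4368
```

A check like this could be compared against the row count of data/processed/historical_weather.csv after the bulk download to detect gaps in the hourly series.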

Screenshots

N/A. This is a backend data pipeline enhancement.

Checklist

  • My code follows the project's coding standards.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas (via docstrings).
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have verified that no API keys or other secrets are committed. (Meteostat is keyless).
  • I have updated .env.example with any new environment variables. (No .env file was needed).

💡 How to Evolve the Model with This Data

This PR adds the tools to improve the model. To see the new results, the pipeline needs to be restructured as follows:

  1. Get the Data:
    First, run the new bulk-download script one time to create the historical_weather.csv file:

    python src/get_historical_weather.py
  2. Modify src/feature_pipe.py:

    • Find the logic that loads precipitation.csv.
    • Remove that logic.
    • Add new logic to load and merge historical_weather.csv:
      # Load the new data
      weather_path = config.DATA_PATHS['historical_weather']
      df_weather = pd.read_csv(weather_path, parse_dates=['datetime_hourly'])
      
      # Create merge key (truncate pickup time to the start of the hour)
      df['datetime_hourly'] = df['pickup_datetime'].dt.floor('h')
      
      # Merge
      df = pd.merge(df, df_weather, on='datetime_hourly', how='left')
      
      # Fill any missing values
      for col in ['temp', 'humidity', 'wind_speed']:
          if col in df.columns:
              df[col] = df[col].fillna(df[col].mean())
  3. Modify config.py:

    • Find the FEATURE_COLUMNS dictionary.
    • Remove the 'precipitation' line.
    • Add the new weather features:
      'weather': ['temp', 'humidity', 'wind_speed']
  4. Re-Run and Compare:
    Finally, run python main.py. The pipeline will re-build the features and re-train the models. The new RMSE scores can then be compared to the old ones to see the improvement.
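The merge described in step 2 can be tried on toy data before touching the real pipeline. The sketch below reuses the column names from the steps above ('pickup_datetime', 'datetime_hourly', 'temp'); the data values are made up for illustration, and one weather hour is left out on purpose to show the fill step.

```python
import pandas as pd

# Toy trip data with three pickups in different hours.
trips = pd.DataFrame({
    "pickup_datetime": pd.to_datetime(
        ["2016-01-01 00:59:59", "2016-01-01 01:30:00", "2016-01-01 02:05:00"]
    )
})

# Toy hourly weather; the 02:00 hour is missing on purpose.
weather = pd.DataFrame({
    "datetime_hourly": pd.to_datetime(["2016-01-01 00:00", "2016-01-01 01:00"]),
    "temp": [4.0, 6.0],
})

# floor() truncates to the start of the hour, so 00:59:59 maps to 00:00,
# not to 01:00 as rounding to the nearest hour would.
trips["datetime_hourly"] = trips["pickup_datetime"].dt.floor("h")

# A left merge keeps every trip, even when no weather row matches.
merged = trips.merge(weather, on="datetime_hourly", how="left")

# Trips with no matching weather hour get the column mean.
merged["temp"] = merged["temp"].fillna(merged["temp"].mean())

print(merged["temp"].tolist())  # [4.0, 6.0, 5.0]
```

Filling with the overall mean is the simple approach from step 2. For hourly weather, interpolating in time or forward-filling might preserve more signal, though that would be a change beyond what the PR proposes.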

@netlify

netlify bot commented Oct 21, 2025

Deploy Preview for gopredict ready!

Name Link
🔨 Latest commit 03922a3
🔍 Latest deploy log https://app.netlify.com/projects/gopredict/deploys/68f7ef636d585b000861662d
😎 Deploy Preview https://deploy-preview-49--gopredict.netlify.app

@netlify

netlify bot commented Oct 21, 2025

Deploy Preview for storied-fudge-a7595c canceled.

Name Link
🔨 Latest commit 03922a3
🔍 Latest deploy log https://app.netlify.com/projects/storied-fudge-a7595c/deploys/68f7ef63157a8600087ab875

@AyushAnand413
Contributor Author

@harshitaphadtare I have made a PR. Please read the description for a full breakdown of the changes and testing instructions.
Also, please add the hacktoberfest label before merging!

@AyushAnand413
Contributor Author

@harshitaphadtare Please reply!

@harshitaphadtare
Owner

Hey @AyushAnand413, sorry for the delay. Loved the way you implemented it! Thank you for contributing :)

@harshitaphadtare harshitaphadtare merged commit 29d6abb into harshitaphadtare:main Oct 24, 2025
15 checks passed
Development

Successfully merging this pull request may close these issues.

[FEATURE]: Enhance Weather Data with a Live API Integration

2 participants