first commit for adding nfl data #66

Open
wants to merge 1 commit into
base: data

Conversation

evanmolinelli

@evanmolinelli evanmolinelli commented Jul 16, 2024

Pull Request Guide

Checklist

Please check the following:

  • You have checked the Pull Request guidelines.
  • Tests for bug fixes or new features have been added.
  • Docs have been added or updated.

Type

What kind of change does this Pull Request introduce?

  • feat: New feature implementation.
  • fix: Bug fix.
  • docs: Documentation changes.
  • style: Code style or format changes.
  • refactor: Changes that are not features or bug fixes.
  • tests: Test additions or corrections.
  • chore: Maintenance code changes.
  • other

Current behaviour

Describe the current behaviour or provide a link to an issue.

What is the new behavior?

Describe the new behaviour.

Does this Pull Request introduce a breaking change?

  • Yes
  • No

Other information

Provide any other information.

@georgedouzas
Owner

Hi @evanmolinelli,

The goal is to create a new branch that contains only the NFL data and follows the naming conventions of the dataloaders. This branch will create a Prefect flow and update the data periodically. Therefore, I suggest the following:

  1. Your nfl-data branch should be empty of commit history.
  2. Check the naming conventions of the data columns and try to follow them (otherwise the data cannot be used directly by the dataloaders).

@evanmolinelli
Author

evanmolinelli commented Aug 16, 2024

@georgedouzas Thanks for the feedback.

  1. I'm going to create a new branch 'data-nfl' with an empty commit history.
  2. Can you point me to the dataloaders specifically (files, etc.)? Would the naming convention be similar to the COLS_MAPPING and SCHEMA dicts in soccer.py, or to the data columns in the processed/*.csv files?

@georgedouzas
Owner

@evanmolinelli

You can check the file soccer.py. The dataloader class validates the column names and expects three different categories:

  • Columns with the naming convention 'odds__{bookmaker}__{betting_market}__{target}', for example 'odds__pinnacle_closing__over_2.5__full_time_goals'. Notice the double underscores between the parts. These columns contain the odds data of the selected bookmaker for a specific betting market and target. The target is a subcategory of a betting market. For example, instead of full time goals, the same betting market over_2.5 could refer to half time goals.
  • Columns with the naming convention 'target__{betting_market}__{target}', for example 'target__home_team__full_time_goals'. These columns contain the target data for a particular betting market and target, as explained above.
  • Any other column name.
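
To check candidate NFL column names against the three categories above before committing the data, a small helper along these lines may be useful. It is only a sketch based on the description in this comment: `classify_column` is a hypothetical function, not part of the dataloader API, and the actual validation in soccer.py may be stricter or differ in detail.

```python
def classify_column(name: str) -> tuple[str, dict[str, str]]:
    """Classify a column name as 'odds', 'target' or 'other'.

    Hypothetical helper: it only mirrors the naming convention described
    above and is not the dataloader's own validation code.
    """
    # The parts of a column name are separated by double underscores.
    parts = name.split("__")
    # odds__{bookmaker}__{betting_market}__{target}
    if parts[0] == "odds" and len(parts) == 4 and all(parts):
        keys = ("bookmaker", "betting_market", "target")
        return "odds", dict(zip(keys, parts[1:]))
    # target__{betting_market}__{target}
    if parts[0] == "target" and len(parts) == 3 and all(parts):
        keys = ("betting_market", "target")
        return "target", dict(zip(keys, parts[1:]))
    # Anything else falls into the third category.
    return "other", {}


if __name__ == "__main__":
    for column in (
        "odds__pinnacle_closing__over_2.5__full_time_goals",
        "target__home_team__full_time_goals",
        "date",
    ):
        print(column, classify_column(column))
```

In this sketch, a name that starts with 'odds' or 'target' but has the wrong number of parts is reported as 'other', so it is worth reviewing any such column by hand.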
