This project scrapes a Wikipedia table of the longest-running scripted U.S. primetime television series and pulls episode information for those shows from the TVmaze API. It then organizes the data into a key/value structure and uploads it to DynamoDB, an AWS NoSQL database.
- Web Scraping
- API Connection
- Aggregation
- Python
- DynamoDB
- NoSQL
- Requests
- Boto3
- bs4
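As a rough illustration of how the API connection, aggregation, and DynamoDB steps listed above could fit together, here is a minimal sketch. It is not the project's actual code: the function names, the `show_name` key, and the `tv_shows` table name are hypothetical, and it assumes episodes are fetched through TVmaze's single-search endpoint with `embed=episodes`. The Wikipedia scraping step is left out, so `main` simply takes a list of show names.

```python
from urllib.parse import urlencode

TVMAZE_SEARCH = "https://api.tvmaze.com/singlesearch/shows"


def episodes_url(show_name):
    """Build a TVmaze single-search URL that embeds the show's episode list."""
    query = urlencode({"q": show_name, "embed": "episodes"})
    return f"{TVMAZE_SEARCH}?{query}"


def build_item(show_name, payload):
    """Shape a TVmaze response into one key/value item per show.

    Using 'show_name' as the key attribute is a hypothetical choice.
    """
    episodes = payload.get("_embedded", {}).get("episodes", [])
    return {
        "show_name": show_name,
        "episode_count": len(episodes),
        # Keep only a few fields per episode to keep the item small.
        "episodes": [
            {
                "season": ep.get("season"),
                "number": ep.get("number"),
                "name": ep.get("name"),
                "airdate": ep.get("airdate"),
            }
            for ep in episodes
        ],
    }


def upload_items(table, items):
    """Write items through a boto3 Table resource using its batch writer."""
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)


def main(show_names, table_name="tv_shows"):  # table name is hypothetical
    # Third-party imports are kept local so the helpers above have no dependencies.
    import boto3
    import requests

    items = []
    for name in show_names:
        response = requests.get(episodes_url(name), timeout=10)
        response.raise_for_status()
        items.append(build_item(name, response.json()))
    upload_items(boto3.resource("dynamodb").Table(table_name), items)
```

Storing one item per show keeps lookups simple, but DynamoDB limits a single item to 400 KB, so a design with one item per episode may suit very long series better.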
AWS credentials must be saved locally in the .aws directory of your home directory for this project to run. See the AWS documentation on configuring credentials to learn more about this process.
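Before running the project, it can help to confirm that a credentials file is where the AWS SDK expects it. The sketch below is an optional check, not part of the project: the function names are hypothetical, and it only looks for the shared credentials file (honoring the `AWS_SHARED_CREDENTIALS_FILE` environment variable), so it will not detect credentials supplied in other ways, such as environment variables.

```python
import os
from pathlib import Path


def credentials_file():
    """Return the path where the AWS SDK looks for the shared credentials file."""
    override = os.environ.get("AWS_SHARED_CREDENTIALS_FILE")
    if override:
        return Path(override)
    return Path.home() / ".aws" / "credentials"


def has_credentials_file(path=None):
    """Report whether the credentials file exists at the given or default path."""
    target = Path(path) if path is not None else credentials_file()
    return target.is_file()
```

Calling `has_credentials_file()` with no arguments checks the default location and returns `False` if the file is missing.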
On the command line of your operating system, navigate to the repository directory (ideally using a Python virtual environment).
Run the following code on the command line to install requirements:
pip install -r requirements.txt
Run the following code on the command line to run this project:
python main.py
requirements.txt
- Python package requirements