This project uses SQL to explore COVID-19 data. The database was built with the help of Python, which extracts information from CSV files of the Coronavirus (COVID-19) Deaths dataset published by Our World in Data. SQLite serves as the database management system; it was chosen so that the database could be uploaded to GitHub for others to view. To make the contents and queries of the database easy to view, the project uses Jupyter notebooks, which at the time of this writing can be viewed on GitHub without any local or remote software installation. Put plainly, the results of the project can be viewed directly from this repository.
There is also a related project that uses the data from this repository to display the information graphically using Tableau:
- The GitHub repository: Tableau Project for COVID
- The Tableau site page for results from Tableau Project for COVID
While not necessary, the software required to run this project is:
You would then need the following Python modules installed; they are included with an installation of Anaconda:
import numpy as np
import pandas as pd
import sqlite3
from sqlalchemy import create_engine
The first file to look at is `initialization_COVID_DB_SQLite.ipynb`. This Python notebook sets up the `COVID_DB.db` database needed for this project. It takes the `owid-covid-data.csv` file downloaded from Our World in Data, then splits it into two CSV files: `CovidDeaths.csv` and `CovidVaccinations.csv`. Most of the results you'll read in `SQL Data Exploration using Python Notebook for SQLite.ipynb` reference `CovidDeaths.csv` as a table.
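The split-and-load step described above could look roughly like the sketch below. It is not the notebook's actual code. The column lists are assumptions (the names exist in the OWID dataset, but which columns the notebook assigns to each file is not stated here), `split_and_load` is a hypothetical helper, and the two sample rows are made-up placeholder values standing in for `pd.read_csv("owid-covid-data.csv")`. It also uses `sqlite3` directly rather than a SQLAlchemy engine to keep the example short.

```python
import sqlite3

import pandas as pd

# Assumed column split; the notebook's real split may differ.
KEY_COLS = ["iso_code", "location", "date"]
DEATH_COLS = ["population", "total_cases", "new_cases", "total_deaths", "new_deaths"]
VACC_COLS = ["total_vaccinations", "new_vaccinations"]


def split_and_load(source: pd.DataFrame, conn: sqlite3.Connection) -> None:
    """Split one wide frame into two tables and write both to SQLite."""
    deaths = source[KEY_COLS + DEATH_COLS]
    vaccinations = source[KEY_COLS + VACC_COLS]
    # The notebook also saves these frames as CovidDeaths.csv and
    # CovidVaccinations.csv; that could be done with DataFrame.to_csv.
    deaths.to_sql("CovidDeaths", conn, if_exists="replace", index=False)
    vaccinations.to_sql("CovidVaccinations", conn, if_exists="replace", index=False)


# Made-up placeholder rows, NOT real COVID-19 figures.
sample = pd.DataFrame(
    {
        "iso_code": ["XXX", "XXX"],
        "location": ["Exampleland", "Exampleland"],
        "date": ["2021-01-01", "2021-01-02"],
        "population": [1_000_000, 1_000_000],
        "total_cases": [1000, 2000],
        "new_cases": [100, 1000],
        "total_deaths": [20, 50],
        "new_deaths": [2, 30],
        "total_vaccinations": [500, 900],
        "new_vaccinations": [50, 400],
    }
)

conn = sqlite3.connect(":memory:")  # use "COVID_DB.db" for a file on disk
split_and_load(sample, conn)

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

Both tables keep the shared key columns (`iso_code`, `location`, `date`) so that they can be joined again later in SQL.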
After that, take a look at `SQL Data Exploration using Python Notebook for SQLite.ipynb`. This is where SQL is used to create queries from `COVID_DB.db`. Python was used to help create SQL queries and then display the results of the queries.
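As a rough illustration of that pattern (build a SQL string in Python, run it, and display the result as a DataFrame), here is a minimal sketch. The query is an example of the kind of exploration involved, not one taken from the notebook. The table is created in memory with made-up placeholder rows so the block runs on its own; against the real data you would connect to `COVID_DB.db` instead.

```python
import sqlite3

import pandas as pd

# In-memory stand-in for COVID_DB.db, filled with made-up placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, total_cases INTEGER, total_deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("Exampleland", "2021-01-01", 1000, 20),
        ("Exampleland", "2021-01-02", 2000, 50),
    ],
)

# Example query: deaths as a percentage of confirmed cases per day.
query = """
SELECT location,
       date,
       total_cases,
       total_deaths,
       100.0 * total_deaths / total_cases AS death_pct
FROM CovidDeaths
WHERE total_cases > 0
ORDER BY location, date
"""

# pandas runs the SQL and returns the result set as a DataFrame,
# which Jupyter renders as a table.
result = pd.read_sql_query(query, conn)
print(result)
```

Multiplying by `100.0` rather than `100` forces floating-point division in SQLite, which would otherwise truncate the integer ratio.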