A command line tool that populates a timescaleDB database with historical stock price data. The project uses the Yahoo API Python package to fetch the data and timescaleDB as the time-series database that stores it.
Stocks list table
An SQL table that contains the tickers and names of all the stocks whose historical price data we want to load into the "stock price" table.
Stock Price table
This table contains all the historical stock price data. It is set up as a hypertable, which means it is optimized for time-series data. Table columns are date, ticker, open, high, low, close, close_adj, and volume. The table's index and primary key are (ticker, date).
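As a rough illustration of the table described above, the following sketch builds the SQL that would create such a table and convert it into a hypertable. The table name `stock_price` and all column types are assumptions made for this example; only the column names, the (ticker, date) primary key, and the use of a hypertable come from the description above.

```python
from __future__ import annotations

# Hypothetical table name; the real one is configured via DB_STOCK_PRICE_TABLE.
STOCK_PRICE_TABLE = "stock_price"

# Column types are assumptions; only the column names and the
# (ticker, date) primary key are taken from the description above.
CREATE_TABLE_SQL = f"""
CREATE TABLE IF NOT EXISTS {STOCK_PRICE_TABLE} (
    date      TIMESTAMPTZ NOT NULL,
    ticker    TEXT        NOT NULL,
    open      DOUBLE PRECISION,
    high      DOUBLE PRECISION,
    low       DOUBLE PRECISION,
    close     DOUBLE PRECISION,
    close_adj DOUBLE PRECISION,
    volume    BIGINT,
    PRIMARY KEY (ticker, date)
);
"""

# create_hypertable is the TimescaleDB function that turns a regular
# table into a hypertable partitioned on the given time column.
CREATE_HYPERTABLE_SQL = (
    f"SELECT create_hypertable('{STOCK_PRICE_TABLE}', 'date', "
    "if_not_exists => TRUE);"
)


def schema_statements() -> list[str]:
    """Return the statements in the order they should be executed."""
    return [CREATE_TABLE_SQL.strip(), CREATE_HYPERTABLE_SQL]


if __name__ == "__main__":
    for statement in schema_statements():
        print(statement)
```

The primary key includes the date column because TimescaleDB requires unique indexes on a hypertable to contain the partitioning column.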
Get to know and install the following tools:
- poetry (use the link to install, do not use brew)
- docker
- docker-compose
To start developing locally you will need to:
- Install the dependencies.
- Start your timescaleDB engines.
make db-up
Spin up a timescaleDB via docker-compose. You can see the running containers with the "docker ps" command.
- Open the browser at http://localhost:9000 for the PGAdmin UI.
- Connect with my@email.com and password.
- Click Add new server.
  - On the first page, name=postgres.
  - On the connection page, host=timescale, username=postgres, pass=1234.
- Click Save.
make db-init-tables
Create tables and indices.
make db-populate-tickers-table
Populate the stocks list table.
make setup
Install the Python libraries in the poetry virtual environment.
- Run the data downloader
make run-get-stocks-data ENV=<env_folder> ARGS=<args>
Download historical price data for a list of stocks.
Arguments:
-n, --number-of-tickers INTEGER (optional) Number of tickers to iterate over.
-t, --tickers TEXT (optional) List of tickers to iterate over, separated by whitespace. This flag replaces the tickers database table functionality.
--help Show this message and exit.
make run-get-stock-data ENV=<env_folder> ARGS=<args>
Download historical price data for a specific stock.
Arguments:
-t, --ticker TEXT (required) A single stock ticker
--help Show this message and exit.
Note: Command line inputs override env variable values.
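The override rule in the note above can be sketched as follows. This is a minimal illustration, not the project's code: it uses the standard library's argparse as a stand-in for whatever CLI library the project uses, and `resolve_options` is a hypothetical helper name. The flag names and the TICKERS and NUMBER_OF_TICKERS variables are the ones documented in this README.

```python
import argparse
import os


def resolve_options(argv, environ=None):
    """Merge CLI flags with environment variables; CLI values win."""
    environ = os.environ if environ is None else environ

    parser = argparse.ArgumentParser(description="Download historical price data.")
    parser.add_argument("-n", "--number-of-tickers", type=int, default=None)
    parser.add_argument("-t", "--tickers", type=str, default=None)
    args = parser.parse_args(argv)

    # Fall back to the environment only when the flag was not given.
    tickers_raw = args.tickers if args.tickers is not None else environ.get("TICKERS")
    number_raw = (
        args.number_of_tickers
        if args.number_of_tickers is not None
        else environ.get("NUMBER_OF_TICKERS")
    )

    return {
        # Tickers arrive as one whitespace-separated string, e.g. "MSFT ADSK".
        "tickers": tickers_raw.split() if tickers_raw else None,
        "number_of_tickers": int(number_raw) if number_raw is not None else None,
    }


if __name__ == "__main__":
    # The CLI value for tickers wins; the count falls back to the environment.
    print(resolve_options(["-t", "MSFT ADSK"], {"TICKERS": "GOOGL", "NUMBER_OF_TICKERS": "5"}))
```

Passing the environment as a plain mapping keeps the precedence logic easy to check without touching the real process environment.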
Dockerfile
A Dockerfile to build a data_downloader image.
Makefile
A Makefile that contains shortcuts for useful commands within the context of the project; see the available commands description.
data_downloader
The Python source code folder that populates an SQL database with data about stocks.
db_vars.py
Contains commands for creating tables, creating indices, and populating the "tickers" table. Used by db_cli.py.
db_cli.py
A command line tool for DB administration.
helpers.py
Contains helper functions, also known as "utils". Right now it contains only a "load_config" function.
timescale.py
Contains a class that encapsulates all the functionality for interacting with our timescaleDB.
yahoo.py
Contains all the functions needed for interacting with the Yahoo API to get data about stocks.
data_downloader.py
The entry point for a command line tool that downloads stock data from Yahoo.
Under data_downloader/files you can find CSV files that contain the data needed for the initial population of the "stocks list table", and a .env file for local development.
The docker_compose folder contains all the configuration files needed for spinning up the database set-up.
The scripts folder contains random scripts needed for the project XD.
Currently, there is only a script used for installing poetry in the data_downloader Docker image. This is done with a local file instead of fetching it from the web with curl. When the file is fetched from the web, Docker does not recognize it as the same layer, which results in all the layers being rebuilt again and again instead of using Docker's layer caching mechanism.
A Makefile that contains shortcuts for useful commands within the context of the project.
db-up
Spin up the database containers ("pgadmin" and "timescale") using docker-compose.
db-down
Remove the database containers and networks using docker-compose.
db-stop
Stop the database containers using docker-compose.
db-rm-volumes
Remove the database containers, networks, and volumes using docker-compose.
db-init-tables
Create tables and indices.
db-populate-tickers-table
Populate the stocks list table.
db-delete-tables-content
Delete the content of all tables.
make run-get-stock-data ARGS=<args>
Run the data_downloader CLI to download data for a specific stock. Use "ARGS=--help" to see all available options.
make run-get-stocks-data ARGS=<args>
Run the data_downloader CLI to download data for a list of stocks. Use "ARGS=--help" to see all available options.
docker-build-data-downloader
Build a Docker image for data_downloader.
run-data-downloader-container
Run the data_downloader container.
run-data-downloader-container-interactive
Run the data_downloader container in interactive mode (bash).
run-data-downloader-container-deatched
Run the data_downloader container in detached mode.
format
Format the Python code of the project.
setup
Set poetry to create a virtual environment in the project folder and install the Python dependencies.
ENV_FILE_PATH
(optional) Path to the environment variables file.
TICKERS
(optional) A whitespace-separated list of tickers, for example: "MSFT ADSK GOOGL".
NUMBER_OF_TICKERS
(optional) Number of tickers to iterate over.
DATA_PERIOD
(optional) How far back to fetch data, in years.
DATA_INTERVAL
(optional) Interval of the price data in days or hours (examples: "1d" or "1h").
DB_STOCK_TICKERS_TABLE
(required) Tickers table name in timescaleDB.
DB_STOCK_PRICE_TABLE
(required) Stock price table name in timescaleDB.
POSTGRES_DB
(required) Database name.
POSTGRES_HOST
(required) Postgres domain name.
POSTGRES_PORT
(required) Postgres port.
POSTGRES_USER
(required) Postgres username.
POSTGRES_PASSWORD
(required) Postgres password.
Note: Command line inputs override env variable values.
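One way to check the required and optional variables listed above before running the downloader is sketched below. `read_settings` is a hypothetical helper written for this example and is not the project's "load_config" function; it simply fails early when a required variable is missing and leaves optional ones as None.

```python
import os

# Variable names are taken from the list above.
REQUIRED = (
    "DB_STOCK_TICKERS_TABLE",
    "DB_STOCK_PRICE_TABLE",
    "POSTGRES_DB",
    "POSTGRES_HOST",
    "POSTGRES_PORT",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
)
OPTIONAL = ("TICKERS", "NUMBER_OF_TICKERS", "DATA_PERIOD", "DATA_INTERVAL")


def read_settings(environ=None):
    """Collect settings from the environment, failing on missing required ones."""
    environ = os.environ if environ is None else environ

    # Treat unset and empty values the same way.
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise ValueError(
            "Missing required environment variables: " + ", ".join(missing)
        )

    settings = {name: environ[name] for name in REQUIRED}
    # Optional variables default to None when they are not set.
    settings.update({name: environ.get(name) for name in OPTIONAL})
    return settings


if __name__ == "__main__":
    try:
        print(read_settings())
    except ValueError as error:
        print(error)
```

Reporting all missing names at once avoids fixing the .env file one variable at a time.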
- Download data about the finances and fundamentals of the companies.