A real-time streaming tool for stocks and cryptocurrencies.
Yampa is a data engineering project that streams real-time cryptocurrency trade data from Coincap into a Redpanda cluster and processes it using ClickHouse for real-time analytics. The processed data can then be analyzed, for example to study the statistical distributions of trading patterns.
- Data Collection: Yampa CLI (Go)
- Data Streaming: Redpanda (Kafka API-compatible)
- Stream Processing: Redpanda Connect
- Storage & Analytics: ClickHouse
Trade Distributions - A companion project that visualizes the statistical patterns in cryptocurrency trading data collected by Yampa. The analysis reveals interesting patterns in trade volume distributions that resemble well-known probability distributions.
- Create a Docker network for the Redpanda containers:
docker network create --driver bridge --attachable redpanda-net
- Start your Redpanda cluster:
cd redpanda && docker compose up -d && cd ..
- Start streaming trades from Coincap into Redpanda:
  - Copy the contents of yampa-cli/.env.example into yampa-cli/.env
  - Start the yampa-cli container:
cd yampa-cli && docker compose up --build -d && cd ..
- View the trade data in the Redpanda Console UI at localhost:8080
- Copy the contents of clickhouse/.env.example into clickhouse/.env
- Start your ClickHouse container:
cd clickhouse && docker compose up -d && cd ..
- Create the trades database and raw_trades table in the ClickHouse UI at localhost:8123 using the clickhouse/trades/tables/raw_trades.sql schema.
- Copy the contents of redpanda-connect/.env.example into redpanda-connect/.env
- Start your Redpanda Connect cluster:
cd redpanda-connect && docker compose up -d && cd ..
- Create the ClickHouse sink connector in the Redpanda Console UI using the redpanda-connect/connectors/clickhouse-sink.json configuration.
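
The three "copy the contents of .env.example" steps and the "view at localhost" checks above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the function names `copy_env`, `copy_all_envs`, and `wait_for_http` are hypothetical, and it assumes you run it from the repository root with `curl` installed.

```shell
# Sketch only: helper functions for the setup steps above.
# All function names are hypothetical and are not shipped with Yampa.

# Copy <dir>/.env.example to <dir>/.env unless .env already exists,
# so local edits are never overwritten.
copy_env() {
  local dir
  dir="$1"
  if [ ! -f "$dir/.env.example" ]; then
    echo "missing $dir/.env.example" >&2
    return 1
  fi
  if [ -f "$dir/.env" ]; then
    echo "$dir/.env already exists, leaving it untouched"
  else
    cp "$dir/.env.example" "$dir/.env"
    echo "created $dir/.env"
  fi
}

# Copy the env files for the three services named in the steps.
copy_all_envs() {
  local dir
  for dir in yampa-cli clickhouse redpanda-connect; do
    copy_env "$dir" || return 1
  done
}

# Poll a URL once per second until it returns a successful HTTP status
# or the attempt budget (default 30) is used up.
wait_for_http() {
  local url attempts i
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for $url" >&2
  return 1
}
```

After sourcing these functions, `copy_all_envs` would cover the three copy steps, and `wait_for_http http://localhost:8123/ping` could be used to confirm that ClickHouse is answering on its HTTP port before creating the table. Calling `wait_for_http http://localhost:8080` for the Redpanda Console assumes the console returns a successful status on its root path.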
This project is open source and available under the MIT License.