CoreCache is a distributed key-value store designed for high performance and scalability.


CoreCache MVP includes the following features:

  1. Leader Election and Coordination: Managed through ZooKeeper.
  2. Data Handling: Reads and writes are processed through a leader node.
  3. Data Storage: Utilizes a Log-Structured Merge Tree (LSM Tree) for efficient data storage.
  4. API Support: Provides GET, PUT, and DELETE operations.
  5. Data Management: Memtables handle data until it is flushed to SSTable, and deletions are managed during compaction.

Not Included in MVP

  • Replicas: Data replication is not implemented in the MVP.
  • Write-Ahead Logs (WAL): WAL is not included.

Memtable and SSTable

  • Data is first read from and written to Memtables.
  • Memtables are flushed to SSTables, at which point they are cleared.
  • DELETE Operations: Data is marked for deletion and collected during compaction.
  • Compaction Role: Handles updates and deletions by rewriting index and data files.
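The Memtable and SSTable flow described above can be sketched in a few lines of Python. This is a minimal illustration, not CoreCache's actual implementation: the class name `TinyLSM`, the `flush_threshold` parameter, and the `TOMBSTONE` marker are hypothetical, and SSTables are kept as in-memory sorted lists rather than index and data files on disk.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical marker for "this key was deleted"; dropped at compaction.
TOMBSTONE = object()


class TinyLSM:
    """Toy store: one memtable plus a list of immutable, sorted SSTables."""

    def __init__(self, flush_threshold: int = 3) -> None:
        self.memtable: Dict[str, object] = {}
        self.sstables: List[List[Tuple[str, object]]] = []  # oldest first
        self.flush_threshold = flush_threshold  # assumed flush condition

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value
        self._maybe_flush()

    def delete(self, key: str) -> None:
        # Deletion only marks the key; the data is collected during compaction.
        self.memtable[key] = TOMBSTONE
        self._maybe_flush()

    def get(self, key: str) -> Optional[object]:
        # Check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return None if v is TOMBSTONE else v
        return None

    def _maybe_flush(self) -> None:
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Write the memtable out as a sorted table, then clear it.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def compact(self) -> None:
        # Merge all tables (newer entries win) and drop deleted keys.
        merged: Dict[str, object] = {}
        for table in self.sstables:
            merged.update(table)
        live = sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE)
        self.sstables = [live] if live else []
```

In this sketch a deleted key keeps occupying space in the SSTables until `compact()` runs, which mirrors the behavior described in the list above.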


CoreCache requires the following dependencies:

  1. Kazoo: A library for interacting with ZooKeeper.
  2. dynaconf: A library for managing application configuration.
  3. Colima: Recommended container runtime for local development (alternative to Docker Desktop).

Running Locally

To run CoreCache locally:

  1. Start ZooKeeper: Use the following Docker command to run ZooKeeper in a container.
    docker run --name some-zookeeper -p 2181:2181 --restart always -d zookeeper
  2. Connect to ZooKeeper: Use the following command to connect to the ZooKeeper container.
    docker run -it --rm --link some-zookeeper:zookeeper zookeeper -server zookeeper
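With ZooKeeper listening on port 2181, a node could connect and take part in leader election through Kazoo's election recipe. The sketch below is an assumption about how that might look, not CoreCache's own code: the znode path `/corecache/election` and the function names `zk_hosts` and `run_for_leader` are hypothetical.

```python
from typing import Callable

try:
    from kazoo.client import KazooClient
except ImportError:  # kazoo is not installed; zk_hosts() below still works
    KazooClient = None

# Hypothetical znode path used for the election; not taken from CoreCache.
ELECTION_PATH = "/corecache/election"


def zk_hosts(host: str = "localhost", port: int = 2181) -> str:
    """Build the host string that KazooClient expects, e.g. 'localhost:2181'."""
    return f"{host}:{port}"


def run_for_leader(
    node_id: str,
    on_elected: Callable[[], None],
    host: str = "localhost",
    port: int = 2181,
) -> None:
    """Join the election and call on_elected() once this node becomes leader."""
    if KazooClient is None:
        raise RuntimeError("kazoo is required: pip install kazoo")
    zk = KazooClient(hosts=zk_hosts(host, port))
    zk.start()  # connects to ZooKeeper; raises if the connection times out
    try:
        election = zk.Election(ELECTION_PATH, node_id)
        # Blocks until this node wins the election, then runs on_elected().
        election.run(on_elected)
    finally:
        zk.stop()
        zk.close()
```

A node would call something like `run_for_leader("node-1", serve_requests)`, where `serve_requests` is a placeholder for whatever the leader should do while it holds leadership.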

Deploying CoreCache

Follow these steps to deploy CoreCache:

  1. Ensure that Python and pip are installed on your system.
  2. Download the CoreCache release.
    curl -L -o corecache-0.11.tar.gz
  3. Extract the downloaded file.
    tar -xvzf corecache-0.11.tar.gz
  4. Run ZooKeeper: Ensure ZooKeeper is running.
  5. Navigate to the scripts directory.
    cd distributed-key-value-store-0.11/scripts
  6. Start the CoreCache server, passing the ZooKeeper host and port.
    --zooKeeperHost localhost --zooKeeperPort 2181


Here are some performance benchmarks:

| Date | CoreCache Version | Number of Nodes | Configuration | Operation | Total Requests | Max Throughput | Avg Latency | p95 Latency | Detailed Report |
| 09/16/2024 | v0.15 | 3 | AWS t2.micro | POST | 10K | 31.2 requests/sec | 3.53 ms | 12.2 ms | More Details |


CoreCache has a few limitations that are being actively addressed:

  • Race Conditions: Potential issues when data is being inserted while the cache is being flushed to SSTable.
  • Configuration Management: Configuration items such as data directory, port range, and flush conditions should be managed via a properties file.
  • Data Retrieval: Only the searched key is made available in Memcache when retrieving data from SSTable.
  • Single Leader: Only the leader node can insert data into the cache.
  • MemTable Flush: This process stops the world, potentially halting data insertion during a flush.
  • Index File Scanning: When the MemTable is empty, locating a key requires scanning all index files, which could be optimized.
  • Timestamp Accuracy: Timestamp on data should reflect when the key-value pair was first inserted.
  • Dependency Management: Consider migrating to Poetry for improved dependency management.
  • Error Handling: Implement proper error handling across all APIs.
  • Pathlib Migration: Migrate file manipulations to pathlib.
  • Data Integrity: If a key marked as deleted (deleted=true) is not flushed before a node crash, the key remains undeleted. This can be mitigated with a Write-Ahead Log (WAL).
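The Write-Ahead Log mitigation mentioned in the last item could work roughly as follows: every PUT or DELETE is appended to a log file and synced to disk before it is applied in memory, so a restarted node can replay the log and recover deletions that were never flushed. This is only a sketch under assumptions, since CoreCache does not include a WAL: the JSON-lines record format and the function names `append_record` and `replay` are hypothetical.

```python
import json
import os
from typing import Dict


def append_record(wal_path: str, op: str, key: str, value: str = "") -> None:
    """Append one operation to the log and force it to disk before returning."""
    record = {"op": op, "key": key, "value": value}
    with open(wal_path, "a", encoding="utf-8") as wal:
        wal.write(json.dumps(record) + "\n")
        wal.flush()
        os.fsync(wal.fileno())  # make the record durable before acknowledging


def replay(wal_path: str) -> Dict[str, dict]:
    """Rebuild the in-memory state from the log after a restart."""
    state: Dict[str, dict] = {}
    if not os.path.exists(wal_path):
        return state
    with open(wal_path, encoding="utf-8") as wal:
        for line in wal:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                break  # a torn final record from a crash mid-write; stop here
            if record["op"] == "put":
                state[record["key"]] = {"value": record["value"], "deleted": False}
            elif record["op"] == "delete":
                # Keep the deletion marker so it survives until compaction.
                state[record["key"]] = {"value": "", "deleted": True}
    return state
```

After a successful flush to SSTable the log could be truncated, since the flushed data no longer depends on it.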