feat!: binary SSTable format #68

Merged
merged 46 commits into from
Aug 3, 2021
8cee03e
feat!: use binary SSTable format
Terkwood Aug 1, 2021
0eef6ee
stub out funcs
Terkwood Aug 1, 2021
52b02ea
Draft
Terkwood Aug 1, 2021
cda70d8
draft
Terkwood Aug 1, 2021
150ebd1
draft
Terkwood Aug 1, 2021
83a608a
Draft
Terkwood Aug 1, 2021
702b6da
draft binary dump
Terkwood Aug 1, 2021
d1687ff
Draft
Terkwood Aug 1, 2021
425b0ba
try try
Terkwood Aug 1, 2021
7416a33
dump seems to work
Terkwood Aug 1, 2021
da535cf
stub query
Terkwood Aug 1, 2021
7a44430
set version
Terkwood Aug 1, 2021
eb0029a
try to query
Terkwood Aug 1, 2021
59a0af1
read correctly
Terkwood Aug 1, 2021
d4faccc
trim
Terkwood Aug 1, 2021
4b42458
stub out merge
Terkwood Aug 1, 2021
6645876
Draft
Terkwood Aug 1, 2021
8c920c6
do not unescape
Terkwood Aug 1, 2021
665a5f1
add sort
Terkwood Aug 1, 2021
23c3a01
docs: Update README.md
Terkwood Aug 2, 2021
c7c7d5b
docs: Update README.md
Terkwood Aug 2, 2021
0b07f64
docs: Update README.md
Terkwood Aug 2, 2021
8f21754
docs: Update README.md
Terkwood Aug 2, 2021
4855976
docs: Update README.md
Terkwood Aug 2, 2021
1582c95
docs: Update README.md
Terkwood Aug 2, 2021
dbf8736
docs: Update README.md
Terkwood Aug 2, 2021
3357d9e
hack up compaction and settings
Terkwood Aug 2, 2021
469f79e
trim
Terkwood Aug 2, 2021
6a09700
work up compaction with binary
Terkwood Aug 2, 2021
ca5ce47
sort lowest
Terkwood Aug 2, 2021
1e6beda
pull out write_kv
Terkwood Aug 2, 2021
9995afd
heckle
Terkwood Aug 2, 2021
5d1e8cc
hack it all together
Terkwood Aug 2, 2021
2c05467
try it out
Terkwood Aug 2, 2021
04ae451
fix error
Terkwood Aug 2, 2021
4a7e8c6
close but no cigar
Terkwood Aug 2, 2021
cb80fb9
hack around
Terkwood Aug 3, 2021
49b299a
fix read_one
Terkwood Aug 3, 2021
591052d
clear IO.inspects
Terkwood Aug 3, 2021
a947a47
clear extra script
Terkwood Aug 3, 2021
a4c0935
remove dead file
Terkwood Aug 3, 2021
f7a8b6c
refactor TSV a bit and delete old file
Terkwood Aug 3, 2021
69acbd5
set period
Terkwood Aug 3, 2021
2b383e2
update docs
Terkwood Aug 3, 2021
1cc75ff
rename
Terkwood Aug 3, 2021
3f7ac84
doc function
Terkwood Aug 3, 2021
64 changes: 55 additions & 9 deletions README.md
@@ -6,17 +6,59 @@ This project is a work in progress 🚧 and is being developed primarily to suit

## Initial design

Use gb_trees as memtable.
All writes are first written to a commit log, protecting against crashes.

Use nimble_csv to generate SSTable format.
Newly written values are stored in a memtable backed by [:gb_trees](https://erlang.org/doc/man/gb_trees.html).
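
The memtable described above can be sketched with the `:gb_trees` functions from the Erlang standard library. This is a minimal illustration, not the project's `Memtable` module: the module name `MemtableSketch`, the `{:value, v}` wrapper, and the `:tombstone` marker are all assumptions made for the example.

```elixir
defmodule MemtableSketch do
  # Hypothetical module: illustrates a sorted in-memory table on :gb_trees.

  # Start with an empty balanced tree.
  def new, do: :gb_trees.empty()

  # enter/3 inserts the key, or overwrites it if it already exists.
  def put(tree, key, value), do: :gb_trees.enter(key, {:value, value}, tree)

  # A delete is recorded as a marker (assumed shape) rather than removing the key.
  def delete(tree, key), do: :gb_trees.enter(key, :tombstone, tree)

  def get(tree, key) do
    case :gb_trees.lookup(key, tree) do
      {:value, {:value, v}} -> {:ok, v}
      {:value, :tombstone} -> :tombstone
      :none -> :none
    end
  end

  # to_list/1 returns entries ordered by key, which is the order an SSTable needs.
  def sorted(tree), do: :gb_trees.to_list(tree)
end
```

Keeping deletes as markers means a later flush can carry them into an SSTable, so that older values for the same key stay hidden.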

Use phoenix to expose a REST API (PUT, GET, DEL).
We define a binary SSTable format.

## Notional distributed system
We use Phoenix to expose a REST API (PUT, GET, DELETE) for creating, reading, updating, and deleting string resources (for now) using `Content-Type: application/json`. [Support for application/octet-stream](https://github.com/Terkwood/AugustDB/issues/24) is forthcoming.

First implement a local key-value store that uses a memtable, SSTables, and a commit log. Then implement a replicating data store which syncs via gossip. Then implement partitioning using vnodes.
### SSTable Format

### Inspiration
This is the specification for [binary SSTables](https://github.com/Terkwood/AugustDB/issues/51).

#### value records

1. Length of key in bytes
2. Length of value in bytes
3. Raw key, not escaped
4. Raw value, not escaped

#### tombstone records

1. Length of key in bytes
2. -1 in place of the value length, to indicate a tombstone
3. Raw key, not escaped
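
The two record layouts above can be sketched with Elixir binary pattern matching. The README does not state the width or signedness of the length fields, so the 32-bit signed big-endian integers used here are an assumption, as are the module name `SSTableRecordSketch` and the `:tombstone` atom.

```elixir
defmodule SSTableRecordSketch do
  # Hypothetical module. Assumed layout: two 32-bit signed big-endian
  # length fields, followed by the raw key and (for value records) the raw value.

  # Tombstone record: key length, -1, raw key.
  def encode(key, :tombstone) when is_binary(key) do
    <<byte_size(key)::32-signed, -1::32-signed, key::binary>>
  end

  # Value record: key length, value length, raw key, raw value.
  def encode(key, value) when is_binary(key) and is_binary(value) do
    <<byte_size(key)::32-signed, byte_size(value)::32-signed, key::binary, value::binary>>
  end

  # Reads one record from the front of a binary and returns
  # {{key, value_or_tombstone}, remaining_bytes}.
  def decode(<<ksize::32-signed, vsize::32-signed, rest::binary>>) do
    case vsize do
      -1 ->
        <<key::binary-size(ksize), rest::binary>> = rest
        {{key, :tombstone}, rest}

      _ ->
        <<key::binary-size(ksize), value::binary-size(vsize), rest::binary>> = rest
        {{key, value}, rest}
    end
  end
end
```

Because each record carries its own lengths, keys and values need no escaping, and a reader can skip from record to record without scanning for delimiters.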

### Making HTTP calls

Create a record

```sh
curl -X PUT -d value='meh meh' http://localhost:4000/api/values/1
```

Update a record

```sh
curl -X PUT -d value='n0 n0' http://localhost:4000/api/values/1
```

Get a record

```sh
curl http://localhost:4000/api/values/1
```

Delete a record

```sh
curl -X DELETE http://localhost:4000/api/values/1
```



## Inspiration

[Kleppmann: Designing Data-Intensive Applications](https://dataintensive.net/) gives a fantastic summary of local-node operation for data stores using SSTable, followed by detail on strategies for replication and partitioning. Check it out!

@@ -30,9 +72,6 @@ You should enable [multi-time warp mode](https://erlang.org/doc/apps/erts/time_c
ELIXIR_ERL_OPTIONS="+C multi_time_warp" iex -S mix
```

## REST API examples

You can see some examples using curl [in value_controller.ex](https://github.com/Terkwood/AugustDB/blob/main/lib/august_db_web/controllers/value_controller.ex).

## Generating docs

@@ -54,3 +93,10 @@ To start your Phoenix server:
Now you can visit [`localhost:4000`](http://localhost:4000) from your browser.

Ready to run in production? Please [check our deployment guides](https://hexdocs.pm/phoenix/deployment.html).




## 🔮 The Glorious Future: a distributed system

~~First implement a local key-value store that uses a memtable, SSTables, and a commit log~~. Then implement a replicating data store which syncs via gossip. Then implement partitioning using vnodes. [See the issue tracker](https://github.com/Terkwood/AugustDB/issues/15).
2 changes: 2 additions & 0 deletions commit.log.test0
@@ -0,0 +1,2 @@
one two 0
no yes 0
2 changes: 2 additions & 0 deletions commit.log.test1
@@ -0,0 +1,2 @@
three four 0
hey now 0
2 changes: 1 addition & 1 deletion lib/august_db/application.ex
@@ -14,7 +14,7 @@ defmodule AugustDb.Application do
# Make sure commit log exists, old entries are written into SSTable, etc.
{Task, fn -> Startup.init() end},
# Start periodic SSTable compaction
Compaction.Periodic,
SSTable.Compaction.Periodic,
# Start the Telemetry supervisor
AugustDbWeb.Telemetry,
# Start the PubSub system
6 changes: 4 additions & 2 deletions lib/august_db/commitlog.ex
@@ -1,4 +1,4 @@
NimbleCSV.define(CommitLogParser, separator: "\t", escape: "\"")
NimbleCSV.define(CommitLogParser, separator: TSV.col_separator(), escape: "\"")

defmodule CommitLog do
@tsv_header_string "k\tv\tt\n"
@@ -12,7 +12,9 @@ defmodule CommitLog do
def append(key, value) do
File.write!(
@log_file,
key <> "\t" <> value <> "\t" <> "#{:erlang.monotonic_time()}" <> "\n",
key <>
TSV.col_separator() <>
value <> TSV.col_separator() <> "#{:erlang.monotonic_time()}" <> TSV.row_separator(),
[:append]
)
end
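
The diff calls `TSV.col_separator()` and `TSV.row_separator()`, but the `TSV` module itself is not shown here. A minimal sketch, assuming the two functions return the `"\t"` and `"\n"` literals they replace; the module name `TSVSketch` and the `row/3` helper are hypothetical.

```elixir
defmodule TSVSketch do
  # Hypothetical stand-in for the TSV module referenced in the diff.

  # Assumed to match the "\t" literal that the old code used between columns.
  def col_separator, do: "\t"

  # Assumed to match the "\n" literal that the old code used at end of row.
  def row_separator, do: "\n"

  # Builds one commit log line: key, value, timestamp.
  def row(key, value, time) when is_binary(key) and is_binary(value) and is_integer(time) do
    key <>
      col_separator() <>
      value <> col_separator() <> Integer.to_string(time) <> row_separator()
  end
end
```

Centralizing the separators keeps the `NimbleCSV` parser definition and the append path in agreement if the format ever changes.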
164 changes: 0 additions & 164 deletions lib/august_db/compaction.ex

This file was deleted.

14 changes: 2 additions & 12 deletions lib/august_db/memtable.ex
@@ -70,18 +70,8 @@ defmodule Memtable do
commit_log_backup = CommitLog.backup()
CommitLog.new()

sstable = SSTable.from(flushing)

time_name = "#{:erlang.system_time()}"

table_fname = "#{time_name}.sst"
table_file_stream = File.stream!(table_fname)
table_stream = sstable.table
table_stream |> Stream.into(table_file_stream) |> Stream.run()

index_binary = :erlang.term_to_binary(sstable.index)
index_path = "#{time_name}.idx"
File.write!(index_path, index_binary)
## Write the current memtable to disk in a binary format
SSTable.dump(flushing)
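
The body of `SSTable.dump` is not part of this hunk. As a rough sketch of what flushing a `:gb_trees` memtable to a binary file could look like, the example below writes length-prefixed records in key order and returns a key-to-offset index. Everything in it is an assumption for illustration: the module name `DumpSketch`, the 32-bit signed length fields, the `:tombstone` atom, and the shape of the returned index.

```elixir
defmodule DumpSketch do
  # Hypothetical module; not the project's SSTable.dump implementation.

  # Assumed record layout: key length, value length (-1 for a tombstone),
  # raw key, raw value.
  defp encode(key, :tombstone),
    do: <<byte_size(key)::32-signed, -1::32-signed, key::binary>>

  defp encode(key, value),
    do: <<byte_size(key)::32-signed, byte_size(value)::32-signed, key::binary, value::binary>>

  # Writes every entry of the tree to `path` in key order and returns
  # a list of {key, byte_offset} pairs.
  def dump(tree, path) do
    {iodata, index, _offset} =
      tree
      |> :gb_trees.to_list()
      |> Enum.reduce({[], [], 0}, fn {key, value}, {acc, idx, offset} ->
        record = encode(key, value)
        {[acc, record], [{key, offset} | idx], offset + byte_size(record)}
      end)

    File.write!(path, iodata)
    Enum.reverse(index)
  end
end
```

Recording the byte offset of each key while writing lets a reader seek directly to a record instead of scanning the file from the start.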

# Finished. Clear the flushing table state.
Agent.update(__MODULE__, fn %__MODULE__{current: current, flushing: _} ->