Store small values in store-tuples instead of blob hunk logs #21

Open
tatsuya6502 opened this issue Jan 29, 2014 · 1 comment

@tatsuya6502
Member

To save both memory and disk space, Hibari v0.3 should store small values (under roughly 64 bytes) directly in the in-memory store-tuples instead of in blob hunk logs.

Here are some size comparisons, measured on Erlang R16B03 with a 64-bit VM:

Hibari v0.3

storage location record #w:
e.g.

#w{wal_seqnum=1000, wal_hunk_pos=1000000000,
   private_seqnum=1000, private_hunk_pos=1000000000, 
   val_offset=50, val_len=200}
  • 64 bytes (in-memory) [*1]
  • 31 bytes (on-disk) [*2]

storage location record #p:
e.g.

#p{seqnum=1000, hunk_pos=1000000000,
   val_offset=50, val_len=200}
  • 48 bytes (in-memory)
  • 21 bytes (on-disk)

blob hunk:

  • byte_size(Value) + hunk overhead
  • The per-blob hunk overhead varies because Hibari v0.3 tries to pack multiple blob values into one hunk (at least 4 KB or so).
  • The overhead of one hunk (type <<"p">>) is (30 + 4 * number of blobs) bytes, including a 16-byte MD5 hash.
  • e.g. value blob size = 4 bytes, one hunk holds 500 values.
    • hunk overhead per blob = (30 + 4 * 500) / 500 = 4.06 bytes (101.5% of the value size)
  • e.g. value blob size = 64 bytes, one hunk holds 60 values.
    • hunk overhead per blob = (30 + 4 * 60) / 60 = 4.5 bytes (7.0% of the value size)
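
The per-blob overhead arithmetic above can be reproduced with a small helper. This is only a sketch: the module name `hunk_overhead` and its function names are hypothetical and are not part of Hibari; the constants 30 and 4 are the figures quoted in the list above.

```erlang
-module(hunk_overhead).
-export([per_blob/1, percent_of_value/2]).

%% Overhead figures for one <<"p">> hunk as quoted in this issue:
%% 30 fixed bytes (including the 16-byte MD5 hash) plus 4 bytes per blob.
-define(FIXED_OVERHEAD, 30).
-define(PER_BLOB_OVERHEAD, 4).

%% Average hunk overhead in bytes per blob when one hunk holds
%% BlobCount blobs.
per_blob(BlobCount) when is_integer(BlobCount), BlobCount > 0 ->
    (?FIXED_OVERHEAD + ?PER_BLOB_OVERHEAD * BlobCount) / BlobCount.

%% The same overhead expressed as a percentage of the value size
%% (ValueSize is in bytes).
percent_of_value(BlobCount, ValueSize)
  when is_integer(BlobCount), BlobCount > 0,
       is_integer(ValueSize), ValueSize > 0 ->
    (?FIXED_OVERHEAD + ?PER_BLOB_OVERHEAD * BlobCount) * 100
        / (BlobCount * ValueSize).
```

For example, `hunk_overhead:per_blob(60)` gives 4.5 and `hunk_overhead:percent_of_value(500, 4)` gives 101.5, matching the two cases listed above.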

(FYI) Hibari v0.1.x

storage location tuple:
e.g. {1000,1000000000}

  • 24 bytes (in-memory)
  • 13 bytes (on-disk)

blob hunk:

  • byte_size(Value) + 37 bytes of hunk overhead (including a 16-byte MD5 hash)

Footnotes:

*1: Calculated as follows:

> Wal = #w{wal_seqnum=1000, wal_hunk_pos=1000000000,
           private_seqnum=1000, private_hunk_pos=1000000000,
           val_offset=50, val_len=200}.
> erts_debug:size(Wal) * erlang:system_info(wordsize).
64

*2: Calculated as follows:

> byte_size(term_to_binary(Wal)).
31
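
The two measurements in the footnotes can be wrapped in a small helper so that the other terms in this comparison can be measured the same way. This is a sketch only: the module name `term_size` and its function names are hypothetical. `erts_debug:size/1` is an undocumented debugging function, and the on-disk figure assumes the term is written with `term_to_binary/1`, as in footnote *2.

```erlang
-module(term_size).
-export([in_memory/1, on_disk/1]).

%% Size of Term in bytes on the process heap: the size in words
%% reported by erts_debug:size/1 multiplied by the VM word size.
in_memory(Term) ->
    erts_debug:size(Term) * erlang:system_info(wordsize).

%% Size of Term in bytes once serialized to the external term format.
on_disk(Term) ->
    byte_size(term_to_binary(Term)).
```

For the v0.1.x location tuple, `term_size:on_disk({1000,1000000000})` returns 13, as listed above. Results for terms that contain atoms, such as the record tags `w` and `p`, may differ on other Erlang releases, because the serialized form of a term is not guaranteed to be byte-identical across releases.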

@ghost assigned tatsuya6502 on Jan 29, 2014

@tatsuya6502
Member Author

Set the target milestone to v0.3.0.
