Week3 Homework #6

Closed
min-guk opened this issue Jul 13, 2022 · 8 comments

min-guk commented Jul 13, 2022

Please submit via the Google Form by next Monday, 7/18, 12 PM.

1. Why do the LSM-tree and LevelDB use a leveled structure?

Hint 1 - Stackoverflow

Hint 2 - Memory hierarchy
Hint 3 - Patrick O'Neil, The Log-Structured Merge-Tree (LSM-Tree), 1996

2. In leveldb, the maximum size of level i is 10^i MB, but the maximum size of level 0 is 8 MB. Why?

Hint 1 - leveldb source code

  • leveldb/db/version_set.cc:VersionSet::Finalize
  • leveldb/db/dbformat.h:kL0_CompactionTrigger

Hint 2 - leveldb-handbook, Compaction (use the Google Chrome translator)
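
As a starting point for reading `VersionSet::Finalize`, here is a minimal Python paraphrase of how a per-level compaction score could be computed. It is a sketch, not the leveldb code itself: the constant value 4 for `kL0_CompactionTrigger` and the 10 MB base for level 1 are assumptions taken from the default source and should be checked against your own checkout. The function names are hypothetical.

```python
# Simplified paraphrase of the compaction-score idea; verify every constant
# against leveldb/db/dbformat.h and leveldb/db/version_set.cc.
K_L0_COMPACTION_TRIGGER = 4  # assumed value of kL0_CompactionTrigger
MB = 1048576


def max_bytes_for_level(level: int) -> float:
    """Size limit for level >= 1: 10 MB for level 1, times 10 per extra level."""
    if level < 1:
        raise ValueError("level 0 is limited by file count, not by bytes")
    return 10.0 * MB * (10 ** (level - 1))


def compaction_score(level: int, num_files: int, level_bytes: int) -> float:
    """A score >= 1 means the level is due for compaction."""
    if level == 0:
        # Level 0 is scored by the number of files it holds.
        return num_files / K_L0_COMPACTION_TRIGGER
    # Other levels are scored by total bytes against the level's size limit.
    return level_bytes / max_bytes_for_level(level)
```

Note that level 0 and the other levels are scored with different units, which is the point to keep in mind when answering the question.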

3. Practice 1

[A] $ ./db_bench --benchmarks="fillseq" 
[B] $ ./db_bench --benchmarks="fillrandom"

Q1. Compare the throughput, latency, and stats of the two benchmarks and explain the differences.
Hint - Seek Time, Key Range, Compaction

Q2. In benchmark A, SSTs are not written to L0. Why?
Hint - Flush, Compaction Trigger

Q3. Calculate SAF (Space Amplification Factor) for each benchmark.
Hint - db_bench meta operation
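
A minimal sketch of the SAF arithmetic, assuming SAF is defined as the on-disk database size divided by the logical size of the inserted key-value pairs. That definition, the function name, and the sample numbers are assumptions for illustration; the on-disk size has to come from your own measurement.

```python
def space_amplification(db_bytes_on_disk: int, num_entries: int,
                        key_size: int, value_size: int) -> float:
    """Ratio of bytes stored on disk to bytes of user data written."""
    logical_bytes = num_entries * (key_size + value_size)
    return db_bytes_on_disk / logical_bytes


# Made-up example: 1,000,000 entries of 16 B keys and 100 B values
# occupying 232,000,000 B on disk.
print(space_amplification(232_000_000, 1_000_000, 16, 100))
```

Compression changes the on-disk size, so note the compression settings of each run when comparing the two values.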

4. Practice 2

[Load] $ ./db_bench --benchmarks="fillrandom" --use_existing_db=0

[A] $ ./db_bench --benchmarks="readseq" --use_existing_db=1
[B] $ ./db_bench --benchmarks="readrandom" --use_existing_db=1
[C] $ ./db_bench --benchmarks="seekrandom" --use_existing_db=1

Note - Before running A, B, and C, run the [Load] benchmark.

Q1. Which user key-value interface does each benchmark use? (Put, Get, Iterator, ...)
Hint 1 - leveldb/doc/index.md
Hint 2 - leveldb/benchmarks/db_bench.cc

Q2. Compare the throughput and latency of each benchmark and explain the differences.
Hint - Seek Time

5. Practice 3

[A] $ ./db_bench --benchmarks="fillrandom" --value_size=100 --num=1000000 --compression_ratio=1
[B] $ ./db_bench --benchmarks="fillrandom" --value_size=1000 --num=114173 --compression_ratio=1

Note 1. key_size = 16B
Note 2. Both runs write the same total size of kv pairs.
Note 3. # of B's entries = 114173 = (16+100)/(16+1000) * 1000000
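
The arithmetic in Note 3 can be checked with a short sketch. The function name is hypothetical; the sizes are the ones given in the notes above.

```python
KEY_SIZE = 16  # bytes, from Note 1


def matching_entry_count(base_entries: int, base_value_size: int,
                         new_value_size: int, key_size: int = KEY_SIZE) -> int:
    """Entries needed at new_value_size to match the base run's total bytes."""
    base_total = base_entries * (key_size + base_value_size)
    # Integer division rounds down, so the new total never exceeds the base.
    return base_total // (key_size + new_value_size)


entries_b = matching_entry_count(1_000_000, 100, 1000)
print(entries_b)                      # 114173
print(1_000_000 * (KEY_SIZE + 100))   # 116000000 bytes for A
print(entries_b * (KEY_SIZE + 1000))  # 115999768 bytes for B
```

The two totals differ by 232 bytes because the entry count for B is rounded down.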

Q. The total size of the input kv pairs is the same, but one benchmark has better throughput and the other has better latency. Explain why.
Hint. Batch Processing
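
When comparing the two runs, it may help to convert between per-operation latency and throughput. The sketch below assumes a single thread, latency reported in microseconds per operation, and throughput reported in MB/s with 1 MB = 1048576 bytes; these are assumptions about the reporting format, and the function names are hypothetical.

```python
def ops_per_sec(micros_per_op: float) -> float:
    """Operations completed per second at a given per-operation latency."""
    return 1e6 / micros_per_op


def mb_per_sec(micros_per_op: float, key_size: int, value_size: int) -> float:
    """Data throughput in MB/s for entries of key_size + value_size bytes."""
    bytes_per_op = key_size + value_size
    return bytes_per_op / 1048576 * ops_per_sec(micros_per_op)
```

With this relation, a run with larger entries can show a higher latency per operation and still move more bytes per second.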


min-guk commented Jul 13, 2022

Of course, it is okay to answer in Korean.


min-guk commented Jul 13, 2022

When the leveldb build is done, please check the installation with $ ./db_bench, not $ db_bench.

Do not install the rocksdb db_bench with sudo apt install rocksdb-tools.


min-guk commented Jul 14, 2022

More hints for the homework have been added.


min-guk commented Jul 14, 2022

When studying leveldb code such as leveldb/benchmarks/db_bench.cc, please use the VS Code "Go to Definition" (F12) and "Go to References" (Shift+F12) features.


min-guk commented Jul 14, 2022

The hints for question 2 have been updated.


min-guk commented Jul 16, 2022

There was a mistake in question 5, and it has now been corrected.
Please check it again.


min-guk commented Jul 18, 2022

[Deadline Extension]

The deadline has been extended until 7/18 12 PM.
It's okay if you can't answer all the questions. Please submit within the extended deadline.


min-guk commented Jul 19, 2022

Great work everyone!

Homework solutions have been uploaded. Individual solutions will be presented today so that everyone has a reference for how others approached the homework.

You can also check how other students answered the questions.

Feedback for your submitted solution will be given starting tomorrow. See you later!

min-guk closed this as completed Jul 19, 2022
min-guk pinned this issue Jul 19, 2022
min-guk unpinned this issue Jul 20, 2022
min-guk pinned this issue Jul 20, 2022