Lots of small sst files generated in delete after put and frequent checkpoint workload #9540

Shenjiaqi · 2022-02-10T05:57:20Z

Actual behavior

If all keys are put and then deleted in ascending order, and checkpoints are triggered frequently, many small sst files are generated.
It seems that these sst files are compacted (trivially moved) from level-0:

Most sst files are in level-1.
All records in these sst files are kTypeDeletion.
Logs such as "Moving #${sst file id} to level-1 ${several KB} bytes" can be found in LOG.

These small sst files never seem to be selected for compaction, and eventually cause a "Too many open files" error.

Code to reproduce the behavior

//
//  main.cpp
//  reproduce
//
//  Created by shenjiaqi on 2022/2/8.
//

#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <filesystem>
#include <iostream>
#include <string>

#include "rocksdb/utilities/checkpoint.h"
#include "rocksdb/db.h"
#include "rocksdb/slice.h"
#include "rocksdb/options.h"

using namespace ROCKSDB_NAMESPACE;
using ROCKSDB_NAMESPACE::DB;
using ROCKSDB_NAMESPACE::Options;
using ROCKSDB_NAMESPACE::PinnableSlice;
using ROCKSDB_NAMESPACE::ReadOptions;
using ROCKSDB_NAMESPACE::Status;
using ROCKSDB_NAMESPACE::WriteBatch;
using ROCKSDB_NAMESPACE::WriteOptions;

std::string kDBPath = "/Users/shenjiaqi/Workspace/rocksdb/data-test"; // needs to be reconfigured

static void createCheckpoint(rocksdb::DB *db, rocksdb::Status &s) {
    std::cout << "create checkpoint" << std::endl;
    std::string chkPath = kDBPath + "-chp";
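    // find() returns an unsigned size_t, so ">= 0" is always true; compare against npos instead.
    assert(chkPath.find("/Users/shenjiaqi/Workspace/rocksdb/data-test") != std::string::npos); // just in case
    system(("rm -rf " + chkPath).c_str()); // use with care.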
    
    Checkpoint* checkpoint_ptr;
    s = Checkpoint::Create(db, &checkpoint_ptr);
    assert(s.ok());
    
    s = checkpoint_ptr->CreateCheckpoint(chkPath);
    assert(s.ok());
    delete checkpoint_ptr; // the caller owns the object returned by Checkpoint::Create
}


int main() {
    DB* db;
    Options options;
    options.IncreaseParallelism();
    options.OptimizeLevelStyleCompaction();
    options.create_if_missing = true;
    options.info_log_level = DEBUG_LEVEL;

    // open DB
    Status s = DB::Open(options, kDBPath, &db);
    assert(s.ok());

    for (int i = 0; i < 1000; ++i) {

        std::string key = "key" + /* std::to_string((int)rand()); // */std::to_string(i);
        std::string value = "value" + std::to_string(i);

        // Put key-value
        s = db->Put(WriteOptions(), key, value);
        assert(s.ok());

        // delete after put
        s = db->Delete(WriteOptions(), key);
        assert(s.ok());

        if (i > 0 && (i % 5) == 0) {
            // Each checkpoint triggers a flush to a level-0 sst file that contains only deletion records.
            // These sst files are then compacted (trivially moved) to level 1.
            createCheckpoint(db, s);
        }
    }
    
    createCheckpoint(db, s);
    delete db; // close the DB and release its resources
    return 0;
}

akankshamahajan15 · 2022-02-11T18:10:44Z

Can you try manual compaction to force compaction of the bottommost level? https://github.com/facebook/rocksdb/wiki/Compaction-Trivial-Move

Shenjiaqi · 2022-02-14T04:02:52Z

@akankshamahajan15
I maintain a long-running job using RocksDB. Should I create a thread that checks the number of sst files and compacts them periodically?

akankshamahajan15 · 2022-02-14T17:38:29Z

@Shenjiaqi Yes, I think manual compaction should help in your case. Let me know if that doesn't work.

ajkr · 2022-02-15T06:18:34Z

You can also try increasing log_size_for_flush to avoid flushing of small files:

rocksdb/include/rocksdb/utilities/checkpoint.h, line 47 in 241b5aa:

uint64_t log_size_for_flush = 0,

Make sure to read the API doc carefully -- it can be dangerous if you set WriteOptions::disableWAL.

Note also that WALs are copied, not hard-linked, so multiple checkpoints containing the same WAL will duplicate data.
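To make the suggestion concrete, the sketch below passes a non-zero `log_size_for_flush` as the second argument of `CreateCheckpoint`. The 64 MiB threshold and the wrapper name `CreateCheckpointWithWalThreshold` are assumptions chosen for illustration. The checkpoint type is a template parameter only so that the snippet compiles without the RocksDB headers; in the reproduction program it would be `rocksdb::Checkpoint`.

```cpp
#include <cstdint>
#include <string>

// Hypothetical threshold: skip the memtable flush unless the live WAL files
// total at least 64 MiB. Tune this to the workload.
constexpr std::uint64_t kLogSizeForFlush = 64ull << 20;

// CheckpointT is expected to be rocksdb::Checkpoint. The wrapper forwards
// log_size_for_flush as the second argument of CreateCheckpoint and returns
// whatever that call returns (a rocksdb::Status for the real class).
template <class CheckpointT>
auto CreateCheckpointWithWalThreshold(
    CheckpointT& checkpoint, const std::string& checkpoint_dir,
    std::uint64_t log_size_for_flush = kLogSizeForFlush) {
  return checkpoint.CreateCheckpoint(checkpoint_dir, log_size_for_flush);
}
```

In the reproduction program this would replace the direct call with `s = CreateCheckpointWithWalThreshold(*checkpoint_ptr, chkPath);`. With the flush skipped, recent writes reach the checkpoint only through the copied WAL, which is why the warning about `WriteOptions::disableWAL` applies.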

ajkr added the question label Feb 15, 2022

ajkr added the waiting label Feb 15, 2022

ajkr closed this as completed Apr 4, 2022
