
The INGEST Job takes lot of time to process SST data Files. #5498

Closed
porscheme opened this issue Apr 13, 2023 · 4 comments
Labels
type/question Type: question about the product

Comments

@porscheme

The INGEST job takes a lot of time to process SST data files. Nebula uses only a fraction of the available resources (3 cores, 30 GB) to process the SST data files, even though the storage nodes have 16 cores and 128 GB. Based on the logs, most of the time is spent in compaction. Is there a way to speed this up?

I found the following logs:

I20230413 20:09:58.910427    81 CompactionFilter.h:92] Do default minor compaction!
I20230413 20:09:59.766036    82 EventListener.h:35] Rocksdb compaction completed column family: default because of LevelMaxLevelSize, status: OK, compacted 3 files into 3, base level is 3, output level is 4
I20230413 20:09:59.887495    82 EventListener.h:21] Rocksdb start compaction column family: default because of LevelMaxLevelSize, status: OK, compacted 2 files into 0, base level is 3, output level is 4
I20230413 20:09:59.887539    82 CompactionFilter.h:92] Do default minor compaction!
I20230413 20:10:01.222849    81 EventListener.h:35] Rocksdb compaction completed column family: default because of LevelMaxLevelSize, status: OK, compacted 2 files into 2, base level is 3, output level is 4
I20230413 20:10:01.223263    81 EventListener.h:21] Rocksdb start compaction column family: default because of LevelMaxLevelSize, status: OK, compacted 2 files into 0, base level is 3, output level is 4
I20230413 20:10:01.223292    81 CompactionFilter.h:92] Do default minor compaction!
I20230413 20:10:02.157040    82 EventListener.h:35] Rocksdb compaction completed column family: default because of LevelMaxLevelSize, status: OK, compacted 2 files into 2, base level is 3, output level is 4
I20230413 20:10:02.157565    82 EventListener.h:21] Rocksdb start compaction column family: default because of LevelMaxLevelSize, status: OK, compacted 2 files into 0, base level is 3, output level is 4
I20230413 20:10:02.157598    82 CompactionFilter.h:92] Do default minor compaction!
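The log lines above show RocksDB compacting continuously while the SST files are being ingested. One common mitigation for bulk loads (a sketch based on general NebulaGraph/RocksDB practice, not something confirmed in this thread; verify the statements against the docs for your version) is to disable automatic compaction before the ingest and trigger a single manual compaction afterwards:

```ngql
# Before the ingest: turn off automatic compaction on the storage nodes
UPDATE CONFIGS storage:rocksdb_column_family_options = {disable_auto_compactions = true};

# Run the ingest of the downloaded SST files
INGEST;

# After the ingest completes: re-enable auto compaction and compact once
UPDATE CONFIGS storage:rocksdb_column_family_options = {disable_auto_compactions = false};
SUBMIT JOB COMPACT;
```

Note that with `--local_config=true` (as in the config below), settings stored in the meta service may not take effect, in which case the equivalent change would have to be made in `nebula-storaged.conf` and the service restarted.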
@Sophie-Xie added the type/question label on Apr 19, 2023
@wenhaocs
Contributor

Yes, you may increase the values of these two options in your storaged.conf: --rocksdb_db_options={"max_subcompactions":"4","max_background_jobs":"4"}

@porscheme
Author

> Yes, you may increase the values of these two options in your storaged.conf: --rocksdb_db_options={"max_subcompactions":"4","max_background_jobs":"4"}

@wenhaocs thanks for the reply. These settings are already set exactly as you suggest.
Should I increase them even more?

Below is our complete nebula-storaged.conf:

Name:         nebula-cluster-storaged
Namespace:    graph
Labels:       app.kubernetes.io/cluster=nebula-cluster
              app.kubernetes.io/component=storaged
              app.kubernetes.io/managed-by=nebula-operator
              app.kubernetes.io/name=nebula-graph
Annotations:  <none>

Data
====
nebula-storaged.conf:
----

########## basics ##########
# Whether to run as a daemon process
--daemonize=true
# The file to host the process id
--pid_file=pids/nebula-storaged.pid
# Whether to use the configuration obtained from the configuration file
--local_config=true

########## logging ##########
# The directory to host logging files
--log_dir=logs
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
--minloglevel=0
# Verbose log level, 1, 2, 3, 4; the higher the level, the more verbose the logging
--v=0
# Maximum seconds to buffer the log messages
--logbufsecs=0
# Whether to redirect stdout and stderr to separate output files
--redirect_stdout=true
# Destination filename of stdout and stderr, which will also reside in log_dir.
--stdout_log_file=storaged-stdout.log
--stderr_log_file=storaged-stderr.log
# Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
--stderrthreshold=2
# Whether logging files' names contain a timestamp.
--timestamp_in_logfile_name=true

########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=127.0.0.1:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=127.0.0.1
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# heartbeat with meta service
--heartbeat_interval_secs=10

######### Raft #########
# Raft election timeout
--raft_heartbeat_interval_secs=30
# RPC timeout for raft client (ms)
--raft_rpc_timeout_ms=500
## recycle Raft WAL
--wal_ttl=14400

########## Disk ##########
# Root data path. split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/
# One path per Rocksdb instance.
--data_path=data/storage

# Minimum reserved bytes of each data path
--minimum_reserved_bytes=268435456

# The default reserved bytes for one batch operation
--rocksdb_batch_size=4096
# The default block cache size used in BlockBasedTable.
# The unit is MB.
--rocksdb_block_cache=4
# Disable page cache to better control memory used by rocksdb.
# Caution: Make sure to allocate enough block cache if disabling page cache!
--disable_page_cache=false

# Compression algorithm, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
# For the sake of binary compatibility, the default value is snappy.
# Recommend to use:
#   * lz4 to gain more CPU performance, with the same compression ratio with snappy
#   * zstd to occupy less disk space
#   * lz4hc for the read-heavy write-light scenario
--rocksdb_compression=lz4

# Set different compressions for different levels
# For example, if --rocksdb_compression is snappy,
# "no:no:lz4:lz4::zstd" is identical to "no:no:lz4:lz4:snappy:zstd:snappy"
# In order to disable compression for level 0/1, set it to "no:no"
--rocksdb_compression_per_level=

############## rocksdb Options ##############
# rocksdb DBOptions in json, each name and value of option is a string, given as "option_name":"option_value" separated by comma
--rocksdb_db_options={"max_subcompactions":"4","max_background_jobs":"4"}
# rocksdb ColumnFamilyOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
--rocksdb_column_family_options={"disable_auto_compactions":"false","write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456"}
# rocksdb BlockBasedTableOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
--rocksdb_block_based_table_options={"block_size":"8192"}

# Whether or not to enable rocksdb's statistics, disabled by default
--enable_rocksdb_statistics=false

# Stats level used by rocksdb to collect statistics, optional values are
#   * kExceptHistogramOrTimers, disable timer stats, and skip histogram stats
#   * kExceptTimers, Skip timer stats
#   * kExceptDetailedTimers, Collect all stats except time inside mutex lock AND time spent on compression.
#   * kExceptTimeForMutex, Collect all stats except the counters requiring to get time inside the mutex lock.
#   * kAll, Collect all stats
--rocksdb_stats_level=kExceptHistogramOrTimers

# Whether or not to enable rocksdb's prefix bloom filter, enabled by default.
--enable_rocksdb_prefix_filtering=true
# Whether or not to enable rocksdb's whole key bloom filter, disabled by default.
--enable_rocksdb_whole_key_filtering=false

############## Key-Value separation ##############
# Whether or not to enable BlobDB (RocksDB key-value separation support)
--rocksdb_enable_kv_separation=false
# RocksDB key value separation threshold in bytes. Values at or above this threshold will be written to blob files during flush or compaction.
--rocksdb_kv_separation_threshold=100
# Compression algorithm for blobs, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
--rocksdb_blob_compression=lz4
# Whether to garbage collect blobs during compaction
--rocksdb_enable_blob_garbage_collection=true

############## storage cache ##############
# Whether to enable storage cache
--enable_storage_cache=false
# Total capacity reserved for storage in memory cache in MB
--storage_cache_capacity=0
# Number of buckets in base 2 logarithm. E.g., in case of 20, the total number of buckets will be 2^20. 
# A good estimate can be ceil(log2(cache_entries * 1.6)). The maximum allowed is 32.
--storage_cache_buckets_power=20
# Number of locks in base 2 logarithm. E.g., in case of 10, the total number of locks will be 2^10. 
# A good estimate can be max(1, buckets_power - 10). The maximum allowed is 32.
--storage_cache_locks_power=10

# Whether to add vertex pool in cache. Only valid when storage cache is enabled.
--enable_vertex_pool=false
# Vertex pool size in MB
--vertex_pool_capacity=50
# TTL in seconds for vertex items in the cache
--vertex_item_ttl=300

# Whether to add empty key pool in cache. Only valid when storage cache is enabled.
--enable_empty_key_pool=false
# Empty key pool size in MB
--empty_key_pool_capacity=50
# TTL in seconds for empty key items in the cache
--empty_key_item_ttl=300

############### misc ####################
--snapshot_part_rate_limit=10485760
--snapshot_batch_size=1048576
--rebuild_index_part_rate_limit=4194304
--rebuild_index_batch_size=1048576

########## Custom ##########
--enable_partitioned_index_filter=true
--max_edge_returned_per_vertex=100000
--move_files=true
--query_concurrently=true

@wenhaocs
Contributor

wenhaocs commented Aug 2, 2023

Sorry, I didn't make it clear. --rocksdb_db_options={"max_subcompactions":"4","max_background_jobs":"4"} are the default values in your xxx.conf.production. You need to increase both values to something like half of your CPU cores.
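For the 16-core storage nodes described above, that advice would translate to something like the following in nebula-storaged.conf (illustrative values assuming half the core count; adjust to your own hardware):

```
--rocksdb_db_options={"max_subcompactions":"8","max_background_jobs":"8"}
```

With --local_config=true, the change takes effect after a restart of the storaged processes.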

@QingZ11
Contributor

QingZ11 commented Sep 18, 2023

@porscheme hi, I have noticed that the issue you created hasn't been updated for nearly a month, so I have to close it for now. If you have any new updates, you are welcome to reopen this issue at any time.

Thanks a lot for your contribution anyway 😊
