manifest: optimize sizeof FileMetadata #2047

Open
sumeerbhola opened this issue Oct 24, 2022 · 1 comment

sumeerbhola commented Oct 24, 2022

The FileMetadata struct has grown to 344 bytes (sizeof), which means 344MB of memory with 1M files (a file count that can happen for various reasons). That is not an insignificant amount of memory. There is, of course, an additional per-FileMetadata cost from the InternalKey.UserKey byte slices, which is discussed in #1741.
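
For a sense of scale, the arithmetic above can be sketched in Go. The helper name `footprintBytes` is hypothetical and not part of Pebble. It only multiplies the per-struct size by the file count, so it ignores the UserKey byte slices and any allocator overhead.

```go
package main

import "fmt"

// footprintBytes is a hypothetical helper: it multiplies the per-file
// metadata size by the number of files. It ignores the separately allocated
// UserKey byte slices and any allocator overhead.
func footprintBytes(sizeofMeta, numFiles uint64) uint64 {
	return sizeofMeta * numFiles
}

func main() {
	const numFiles = 1_000_000
	// 344 bytes per FileMetadata is the figure quoted in this issue.
	fmt.Printf("%d MB\n", footprintBytes(344, numFiles)/1_000_000)
}
```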

We can reduce the FileMetadata size to 216 bytes with some rearrangement and by moving sets of fields that are mostly empty behind pointers in FileMetadata (the changes are marked with ** in the code below):

type FileMetadata struct {
	// Atomic contains fields which are accessed atomically. Go allocations
	// are guaranteed to be 64-bit aligned which we take advantage of by
	// placing the 64-bit fields which we access atomically at the beginning
	// of the FileMetadata struct. For more information, see
	// https://golang.org/pkg/sync/atomic/#pkg-note-BUG.
	Atomic struct {
		// AllowedSeeks is used to determine if a file should be picked for
		// a read-triggered compaction. It is decremented during read sampling
		// in pebble.Iterator after every positioning operation that returns a
		// user key (e.g. Next, Prev, SeekGE, SeekLT, etc).
		AllowedSeeks int64
		// statsValid is 1 if stats have been loaded for the table. The
		// TableStats structure is populated only if valid is 1.
		statsValid uint32
		// ** MOVED refs into this struct so that the total is 16 bytes, instead
		// of a 16-byte struct plus a separate 8-byte refs field **
		// Reference count for the file: incremented when a file is added to a
		// version and decremented when the version is unreferenced. The file is
		// obsolete when the reference count falls to zero.
		refs int32
	}

	// InitAllowedSeeks is the initial value of allowed seeks. This is used
	// to reset allowed seeks on a file once it hits 0.
	InitAllowedSeeks int64

	// FileNum is the file number.
	FileNum base.FileNum
	// Size is the size of the file, in bytes.
	Size uint64
	// File creation time in seconds since the epoch (1970-01-01 00:00:00
	// UTC). For ingested sstables, this corresponds to the time the file was
	// ingested.
	CreationTime int64
	// Smallest and largest sequence numbers in the table, across both point and
	// range keys.
	SmallestSeqNum uint64
	LargestSeqNum  uint64
	// SmallestPointKey and LargestPointKey are the inclusive bounds for the
	// internal point keys stored in the table. This includes RANGEDELs, which
	// alter point keys.
	// NB: these fields should be set using ExtendPointKeyBounds. They are left
	// exported for reads as an optimization.
	SmallestPointKey InternalKey
	LargestPointKey  InternalKey
	// SmallestRangeKey and LargestRangeKey are the inclusive bounds for the
	// internal range keys stored in the table.
	// NB: these fields should be set using ExtendRangeKeyBounds. They are left
	// exported for reads as an optimization.
	// ** Since range keys are rare, made into pointers to save 48 bytes **
	SmallestRangeKey *InternalKey
	LargestRangeKey  *InternalKey
	// Smallest and Largest are the inclusive bounds for the internal keys stored
	// in the table, across both point and range keys.
	// NB: these fields are derived from their point and range key equivalents,
	// and are updated via the MaybeExtend{Point,Range}KeyBounds methods.
	// ** Since range keys are rare, made into pointers to save 48 bytes **
	// If nil, this sstable only has point keys and the SmallestPointKey,
	// LargestPointKey should be used.
	Smallest *InternalKey
	Largest  *InternalKey
	// Stats describe table statistics. Protected by DB.mu.
	Stats TableStats

	// ** Most files are not in L0, so keep all L0 state behind a pointer **
	L0State *L0State
	// NB: the alignment of this struct is 8 bytes. We pack all the bools to
	// ensure an optimal packing.

	CompactionState CompactionState
	// True if compaction of this file has been explicitly requested.
	// Previously, RocksDB and earlier versions of Pebble allowed this
	// flag to be set by a user table property collector. Some earlier
	// versions of Pebble respected this flag, while other more recent
	// versions ignored this flag.
	//
	// More recently this flag has been repurposed to facilitate the
	// compaction of 'atomic compaction units'. Files marked for
	// compaction are compacted in a rewrite compaction at the lowest
	// possible compaction priority.
	//
	// NB: A count of files marked for compaction is maintained on
	// Version, and compaction picking reads cached annotations
	// determined by this field.
	//
	// Protected by DB.mu.
	MarkedForCompaction bool
	// HasPointKeys tracks whether the table contains point keys (including
	// RANGEDELs). If a table contains only range deletions, HasPointKeys is
	// still true.
	HasPointKeys bool
	// HasRangeKeys tracks whether the table contains any range keys.
	HasRangeKeys bool
	// boundsSet tracks whether the overall bounds have been set.
	boundsSet bool
	// boundTypeSmallest and boundTypeLargest provide an indication as to which
	// key type (point or range) corresponds to the smallest and largest overall
	// table bounds.
	boundTypeSmallest, boundTypeLargest boundType
}

type L0State struct {
	SubLevel         int
	L0Index          int
	minIntervalIndex int
	maxIntervalIndex int
	// For L0 files only. Protected by DB.mu. Used to generate L0 sublevels and
	// pick L0 compactions. Only accurate for the most recent Version.
	//
	// IsIntraL0Compacting is set to True if this file is part of an intra-L0
	// compaction. When it's true, IsCompacting must also return true. If
	// Compacting is true and IsIntraL0Compacting is false for an L0 file, the
	// file must be part of a compaction to Lbase.
	IsIntraL0Compacting bool
}
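
To check the per-field savings claimed in the ** comments, here is a rough, self-contained sketch. It assumes a 64-bit platform and an InternalKey made of a byte slice plus a 64-bit trailer. `atomicFields`, `fileBounds`, `SmallestKey`, `LargestKey` and `sizes` are hypothetical names used only for illustration, not Pebble APIs. Under those assumptions, the sizes should come out as 32 bytes for an InternalKey, 8 for a pointer (so 48 bytes saved per pair of keys), 16 for the Atomic struct and 40 for L0State. The accessors illustrate the "if nil, use the point key bounds" fallback described for Smallest and Largest.

```go
package main

import (
	"fmt"
	"unsafe"
)

// InternalKey is a stand-in with an assumed layout: a byte slice plus a
// 64-bit trailer, i.e. 24 + 8 = 32 bytes on a 64-bit platform.
type InternalKey struct {
	UserKey []byte
	Trailer uint64
}

// atomicFields mirrors the proposed Atomic struct, with refs folded in.
type atomicFields struct {
	AllowedSeeks int64
	statsValid   uint32
	refs         int32
}

// L0State mirrors the proposed struct of L0-only fields.
type L0State struct {
	SubLevel            int
	L0Index             int
	minIntervalIndex    int
	maxIntervalIndex    int
	IsIntraL0Compacting bool
}

// fileBounds is a hypothetical cut-down FileMetadata that holds only the
// bounds fields needed to show the nil fallback.
type fileBounds struct {
	SmallestPointKey InternalKey
	LargestPointKey  InternalKey
	Smallest         *InternalKey
	Largest          *InternalKey
}

// SmallestKey is a hypothetical accessor: a nil Smallest means the table has
// only point keys, so the point key bound is the overall bound.
func (m *fileBounds) SmallestKey() *InternalKey {
	if m.Smallest != nil {
		return m.Smallest
	}
	return &m.SmallestPointKey
}

// LargestKey is the counterpart of SmallestKey for the upper bound.
func (m *fileBounds) LargestKey() *InternalKey {
	if m.Largest != nil {
		return m.Largest
	}
	return &m.LargestPointKey
}

// sizes reports the in-struct sizes of the building blocks discussed above.
func sizes() (key, ptr, atomicSz, l0 uintptr) {
	return unsafe.Sizeof(InternalKey{}), unsafe.Sizeof((*InternalKey)(nil)),
		unsafe.Sizeof(atomicFields{}), unsafe.Sizeof(L0State{})
}

func main() {
	key, ptr, atomicSz, l0 := sizes()
	fmt.Println("InternalKey:", key, "pointer:", ptr)
	fmt.Println("saved per pair of keys turned into pointers:", 2*(key-ptr))
	fmt.Println("Atomic struct:", atomicSz, "L0State:", l0)

	m := &fileBounds{SmallestPointKey: InternalKey{UserKey: []byte("a")}}
	fmt.Printf("overall smallest user key: %s\n", m.SmallestKey().UserKey)
}
```

Note the trade-off: a pointer only saves memory while it stays nil. A file that does have range keys (or is in L0) pays for the 8-byte pointer plus a separate allocation, so this layout only wins if such files are rare, as the issue assumes.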

Jira issue: PEBBLE-138

github-actions bot commented May 8, 2024

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it
in 10 days to keep the issue queue tidy. Thank you for your
contribution to Pebble!
