feat(pageserver): add k-merge layer iterator with lazy loading (#8053)
Part of #8002. This pull request adds a k-merge iterator for bottom-most compaction.

## Summary of changes

* Added back lsn_range / key_range in delta layer inner. These fields were
removed due to #8050, but are added back because the iterators need that
information for lazy loading.
* Added a lazy-loading k-merge iterator.
* Added an iterator wrapper as a unified iterator type for the image and
delta iterators.
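
The lazy-loading idea in the list above can be sketched as follows. This is a minimal illustration, not the code added in `merge_iterator.rs` (that diff is not shown on this page). It assumes that each layer exposes a lower bound from its metadata (in the spirit of the restored key / LSN ranges) so that a layer only needs to be opened once the merge actually reaches it. `Layer`, `Source`, `MergeIterator`, the `loads` counter, and the integer `Key` / `Lsn` aliases are all hypothetical stand-ins.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::vec::IntoIter;

// Hypothetical stand-ins for the pageserver's `Key` and `Lsn` types.
pub type Key = u32;
pub type Lsn = u64;

/// A layer as seen before it is opened (hypothetical type).
pub struct Layer {
    /// Lower bound known from metadata, without reading the layer's data.
    /// Must be <= the first entry.
    pub start: (Key, Lsn),
    /// Entries sorted by (key, lsn); stands in for the on-disk contents.
    pub entries: Vec<(Key, Lsn)>,
}

/// One merge input; a single enum covers every state a source can be in.
enum Source {
    /// Not opened yet: only the metadata lower bound sits on the heap.
    Unloaded(Vec<(Key, Lsn)>),
    /// Opened: `head` is the next entry to emit, `rest` the remainder.
    Loaded { head: (Key, Lsn), rest: IntoIter<(Key, Lsn)> },
    /// Fully consumed, or temporarily taken out while being advanced.
    Exhausted,
}

pub struct MergeIterator {
    sources: Vec<Source>,
    /// Min-heap (via `Reverse`) of (sort key, index into `sources`).
    heap: BinaryHeap<Reverse<((Key, Lsn), usize)>>,
    /// Number of layers opened so far; exposed only to observe laziness.
    pub loads: usize,
}

impl MergeIterator {
    pub fn new(layers: Vec<Layer>) -> Self {
        let mut sources = Vec::with_capacity(layers.len());
        let mut heap = BinaryHeap::with_capacity(layers.len());
        for (idx, layer) in layers.into_iter().enumerate() {
            // Only metadata is consulted here; no layer is opened.
            heap.push(Reverse((layer.start, idx)));
            sources.push(Source::Unloaded(layer.entries));
        }
        Self { sources, heap, loads: 0 }
    }
}

impl Iterator for MergeIterator {
    type Item = (Key, Lsn);

    fn next(&mut self) -> Option<(Key, Lsn)> {
        loop {
            let Reverse((_, idx)) = self.heap.pop()?;
            match std::mem::replace(&mut self.sources[idx], Source::Exhausted) {
                Source::Unloaded(entries) => {
                    // The merge reached this layer's lower bound: open it,
                    // push its real first entry, and re-check the heap.
                    self.loads += 1;
                    let mut rest = entries.into_iter();
                    if let Some(head) = rest.next() {
                        self.sources[idx] = Source::Loaded { head, rest };
                        self.heap.push(Reverse((head, idx)));
                    }
                }
                Source::Loaded { head, mut rest } => {
                    // Emit the smallest entry and re-queue the source.
                    if let Some(next_head) = rest.next() {
                        self.sources[idx] = Source::Loaded { head: next_head, rest };
                        self.heap.push(Reverse((next_head, idx)));
                    }
                    return Some(head);
                }
                Source::Exhausted => unreachable!("exhausted sources are never queued"),
            }
        }
    }
}
```

Because an unopened layer is ordered by its lower bound, it can never be overtaken by an entry that should come after its contents, and layers whose range starts late stay closed until the merge gets there. In this sketch the heap holds one slot per input regardless of how many entries each layer contains.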

The current implementation and tests should cover the L0 compaction use
case, so that L0 compaction can bypass the page cache and use a fixed
amount of memory. The next step is to integrate this with the new
bottom-most compaction.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Co-authored-by: Christian Schwarz <christian@neon.tech>
skyzh and problame authored Jul 10, 2024
1 parent e78341e commit 9f4511c
Showing 4 changed files with 452 additions and 3 deletions.
3 changes: 3 additions & 0 deletions pageserver/src/tenant/storage_layer.rs
@@ -7,6 +7,9 @@ pub(crate) mod layer;
mod layer_desc;
mod layer_name;

#[cfg(test)]
pub mod merge_iterator;

use crate::context::{AccessStatsBehavior, RequestContext};
use crate::repository::Value;
use crate::task_mgr::TaskKind;
30 changes: 27 additions & 3 deletions pageserver/src/tenant/storage_layer/delta_layer.rs
@@ -223,6 +223,11 @@ pub struct DeltaLayerInner {
file: VirtualFile,
file_id: FileId,

#[allow(dead_code)]
layer_key_range: Range<Key>,
#[allow(dead_code)]
layer_lsn_range: Range<Lsn>,

max_vectored_read_bytes: Option<MaxVectoredReadBytes>,
}

@@ -742,6 +747,16 @@ impl DeltaLayer {
}

impl DeltaLayerInner {
#[cfg(test)]
pub(crate) fn key_range(&self) -> &Range<Key> {
&self.layer_key_range
}

#[cfg(test)]
pub(crate) fn lsn_range(&self) -> &Range<Lsn> {
&self.layer_lsn_range
}

/// Returns nested result following Result<Result<_, OpErr>, Critical>:
/// - inner has the success or transient failure
/// - outer has the permanent failure
@@ -790,6 +805,8 @@ impl DeltaLayerInner {
index_start_blk: actual_summary.index_start_blk,
index_root_blk: actual_summary.index_root_blk,
max_vectored_read_bytes,
layer_key_range: actual_summary.key_range,
layer_lsn_range: actual_summary.lsn_range,
}))
}

@@ -1639,7 +1656,7 @@ impl<'a> DeltaLayerIterator<'a> {
}

#[cfg(test)]
-mod test {
+pub(crate) mod test {
use std::collections::BTreeMap;

use itertools::MinMaxResult;
@@ -2217,13 +2234,20 @@ mod test {
}
}

-async fn produce_delta_layer(
+pub(crate) fn sort_delta(
+(k1, l1, _): &(Key, Lsn, Value),
+(k2, l2, _): &(Key, Lsn, Value),
+) -> std::cmp::Ordering {
+(k1, l1).cmp(&(k2, l2))
+}
+
+pub(crate) async fn produce_delta_layer(
tenant: &Tenant,
tline: &Arc<Timeline>,
mut deltas: Vec<(Key, Lsn, Value)>,
ctx: &RequestContext,
) -> anyhow::Result<ResidentLayer> {
-deltas.sort_by(|(k1, l1, _), (k2, l2, _)| (k1, l1).cmp(&(k2, l2)));
+deltas.sort_by(sort_delta);
let (key_start, _, _) = deltas.first().unwrap();
let (key_max, _, _) = deltas.first().unwrap();
let lsn_min = deltas.iter().map(|(_, lsn, _)| lsn).min().unwrap();
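
The `sort_delta` comparator extracted in this hunk orders deltas by key first and LSN second, ignoring the value. A minimal standalone sketch of that ordering follows, using hypothetical integer and string stand-ins for the pageserver's `Key`, `Lsn`, and `Value` types. The `sorted_deltas` helper is also an assumption, added only to make the behaviour easy to observe.

```rust
use std::cmp::Ordering;

// Hypothetical stand-ins for the pageserver's `Key`, `Lsn` and `Value`.
type Key = u32;
type Lsn = u64;
type Value = &'static str;

/// Same shape as the comparator in the diff: compare (key, lsn) and
/// ignore the value.
fn sort_delta(
    (k1, l1, _): &(Key, Lsn, Value),
    (k2, l2, _): &(Key, Lsn, Value),
) -> Ordering {
    (k1, l1).cmp(&(k2, l2))
}

/// Hypothetical helper: returns the deltas sorted by (key, lsn).
/// `sort_by` is stable, so entries with equal (key, lsn) keep their
/// original relative order.
fn sorted_deltas(mut deltas: Vec<(Key, Lsn, Value)>) -> Vec<(Key, Lsn, Value)> {
    deltas.sort_by(sort_delta);
    deltas
}
```

Passing the function by name to `sort_by` behaves the same as the closure it replaces, and making it `pub(crate)` lets other test modules sort their expected results in the same way.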
10 changes: 10 additions & 0 deletions pageserver/src/tenant/storage_layer/image_layer.rs
@@ -369,6 +369,16 @@ impl ImageLayer {
}

impl ImageLayerInner {
#[cfg(test)]
pub(crate) fn key_range(&self) -> &Range<Key> {
&self.key_range
}

#[cfg(test)]
pub(crate) fn lsn(&self) -> Lsn {
self.lsn
}

/// Returns nested result following Result<Result<_, OpErr>, Critical>:
/// - inner has the success or transient failure
/// - outer has the permanent failure

1 comment on commit 9f4511c

@github-actions

3129 tests run: 3002 passed, 0 failed, 127 skipped (full report)


Flaky tests (1)

Postgres 14

  • test_ondemand_wal_download_in_replication_slot_funcs: debug

Code coverage* (full report)

  • functions: 32.7% (6968 of 21333 functions)
  • lines: 50.1% (54786 of 109449 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
9f4511c at 2024-07-10T19:40:26.035Z :recycle:
