feat: Add allocation logging and visualization #64

Merged on Jan 13, 2025 (49 commits)
d858877
Basic Logging in binary format
pzittlau Oct 9, 2024
612582a
write correct Segment size
pzittlau Oct 9, 2024
cfb14d0
minor
pzittlau Oct 17, 2024
7c78834
Fix None unwrap
pzittlau Oct 17, 2024
514c9cf
First visualization of allocations
pzittlau Oct 17, 2024
1619443
Storage saving binary logging
pzittlau Oct 21, 2024
73a8906
faster bitmap getting
pzittlau Oct 21, 2024
cbb5042
minor
pzittlau Oct 21, 2024
1f5e399
visualize bitmap by storage layer
pzittlau Oct 23, 2024
a61c46a
Speed up plotting by precalculating things
pzittlau Oct 23, 2024
58753f6
New get with output parameter, initial timestep is 1
pzittlau Oct 24, 2024
98ade18
Experimented with unpacked storage
pzittlau Oct 24, 2024
b172a39
Enforce max timestep
pzittlau Oct 24, 2024
69b8890
Remove simple plot
pzittlau Oct 24, 2024
667fd52
visual improvements
pzittlau Oct 24, 2024
c9a2225
First export to mp4
pzittlau Oct 24, 2024
25ae21a
minor
pzittlau Oct 24, 2024
199654a
Color disks seperately
pzittlau Oct 24, 2024
f50ee77
Validation helpers
pzittlau Oct 25, 2024
6c4b824
minor
pzittlau Oct 25, 2024
55792eb
Calculate and Plot fragmentation
pzittlau Oct 25, 2024
155a857
Indicate Timestep, handle empty storage
pzittlau Oct 25, 2024
c549a57
Type hints, formatting
pzittlau Oct 25, 2024
148b84e
parallel fragmentation building
pzittlau Oct 25, 2024
040f059
Code reorganization
pzittlau Oct 28, 2024
22ddb7c
parallel video exporting
pzittlau Oct 29, 2024
d382ea9
Rewrite of visualization script with better performance, memory usage…
pzittlau Nov 1, 2024
e7a3f60
backend selection parameter for matplotlib
pzittlau Nov 6, 2024
16ebb8c
Handle missing input file correctly
pzittlau Nov 6, 2024
9550c73
Plot correct global fragmentation
pzittlau Nov 6, 2024
d935e15
runtime plot adjustments
pzittlau Nov 10, 2024
921e62d
remove dependency of slider for exporting
pzittlau Nov 10, 2024
c18af85
Plot failed allocations
pzittlau Nov 18, 2024
6c6bc99
CLI improvements
pzittlau Nov 18, 2024
1a9a26c
plot free blocks
pzittlau Nov 18, 2024
104e712
use packed bits for faster plotting
pzittlau Nov 18, 2024
515a021
plot allocation sizes
pzittlau Nov 18, 2024
fb12986
minor changes for visual clarity
pzittlau Nov 18, 2024
dfe6dbe
minor
pzittlau Nov 19, 2024
e34615a
minor
pzittlau Nov 18, 2024
9661f90
unaligned allocation tries
pzittlau Nov 21, 2024
9ebd225
lines to seperate segments
pzittlau Nov 21, 2024
49d1b04
allocation_log feature flag
pzittlau Nov 22, 2024
fc7a1ff
more complete allocation_log feature flag
pzittlau Dec 11, 2024
cefae1b
runtime allocation_log path
pzittlau Dec 11, 2024
a4458aa
information on how to visualize allocations
pzittlau Dec 11, 2024
e864db1
moved scripts to project root
pzittlau Dec 13, 2024
ff6bfb4
removed allocation tries, instead count cycles
pzittlau Dec 18, 2024
8cbe70c
proportion of cycles spent in allocator
pzittlau Dec 18, 2024
2 changes: 2 additions & 0 deletions betree/Cargo.toml
@@ -83,4 +83,6 @@ figment_config = ["figment"]
latency_metrics = []
experimental-api = []
nvm = ["pmdk"]
# Log the allocations and deallocations done for later analysis
allocation_log = []

5 changes: 3 additions & 2 deletions betree/src/allocator.rs
@@ -1,9 +1,10 @@
 //! This module provides `SegmentAllocator` and `SegmentId` for bitmap
 //! allocation of 1GiB segments.

-use crate::{cow_bytes::CowBytes, storage_pool::DiskOffset, vdev::Block};
+use crate::{cow_bytes::CowBytes, storage_pool::DiskOffset, vdev::Block, Error};
 use bitvec::prelude::*;
 use byteorder::{BigEndian, ByteOrder};
+use std::io::Write;

 /// 256KiB, so that `vdev::BLOCK_SIZE * SEGMENT_SIZE == 1GiB`
 pub const SEGMENT_SIZE: usize = 1 << SEGMENT_SIZE_LOG_2;
@@ -55,7 +56,7 @@ impl SegmentAllocator {
 }
 };
 self.mark(offset, size, Action::Allocate);
-Some(offset)
+return Some(offset);
 }

 /// Allocates a block of the given `size` at `offset`.
119 changes: 111 additions & 8 deletions betree/src/data_management/dmu.rs
@@ -6,7 +6,7 @@ use super::{
 CopyOnWriteEvent, Dml, HasStoragePreference, Object, ObjectReference,
 };
 use crate::{
-allocator::{Action, SegmentAllocator, SegmentId},
+allocator::{Action, SegmentAllocator, SegmentId, SEGMENT_SIZE},
 buffer::Buf,
 cache::{Cache, ChangeKeyError, RemoveError},
 checksum::{Builder, Checksum, State},
@@ -17,16 +17,21 @@ use crate::{
 size::{Size, SizeMut, StaticSize},
 storage_pool::{DiskOffset, StoragePoolLayer, NUM_STORAGE_CLASSES},
 tree::{Node, PivotKey},
-vdev::{Block, BLOCK_SIZE},
+vdev::{Block, File, BLOCK_SIZE},
 StoragePreference,
 };
+use byteorder::{LittleEndian, WriteBytesExt};
 use crossbeam_channel::Sender;
 use futures::{executor::block_on, future::ok, prelude::*};
 use parking_lot::{Mutex, RwLock, RwLockReadGuard, RwLockWriteGuard};
 use std::{
+arch::x86_64::{__rdtscp, _rdtsc},
 collections::HashMap,
+fs::OpenOptions,
+io::{BufWriter, Write},
 mem::replace,
 ops::DerefMut,
+path::PathBuf,
 pin::Pin,
 sync::{
 atomic::{AtomicU64, Ordering},
@@ -60,6 +65,8 @@ where
next_modified_node_id: AtomicU64,
next_disk_id: AtomicU64,
report_tx: Option<Sender<DmlMsg>>,
#[cfg(feature = "allocation_log")]
allocation_log_file: Mutex<BufWriter<std::fs::File>>,
}

impl<E, SPL> Dmu<E, SPL>
@@ -76,6 +83,7 @@ where
alloc_strategy: [[Option<u8>; NUM_STORAGE_CLASSES]; NUM_STORAGE_CLASSES],
cache: E,
handler: Handler<ObjRef<ObjectPointer<SPL::Checksum>>>,
#[cfg(feature = "allocation_log")] allocation_log_file_path: PathBuf,
) -> Self {
let allocation_data = (0..pool.storage_class_count())
.map(|class| {
@@ -87,6 +95,16 @@ where
.collect::<Vec<_>>()
.into_boxed_slice();

#[cfg(feature = "allocation_log")]
let allocation_log_file = Mutex::new(BufWriter::new(
OpenOptions::new()
.create(true)
.write(true)
.truncate(true)
.open(allocation_log_file_path)
.expect("Failed to create allocation log file"),
));

Dmu {
// default_compression_state: default_compression.new_compression().expect("Can't create compression state"),
default_compression,
@@ -103,6 +121,8 @@ where
next_modified_node_id: AtomicU64::new(1),
next_disk_id: AtomicU64::new(0),
report_tx: None,
#[cfg(feature = "allocation_log")]
allocation_log_file,
}
}

@@ -120,6 +140,36 @@ where
pub fn pool(&self) -> &SPL {
&self.pool
}

/// Writes the global header for the allocation logging.
pub fn write_global_header(&self) -> Result<(), Error> {
#[cfg(feature = "allocation_log")]
{
let mut file = self.allocation_log_file.lock();

// Number of storage classes
file.write_u8(self.pool.storage_class_count())?;

// Disks per class
for class in 0..self.pool.storage_class_count() {
let disk_count = self.pool.disk_count(class);
file.write_u16::<LittleEndian>(disk_count)?;
}

// Size of each disk in blocks, as returned by `size_in_blocks`
for class in 0..self.pool.storage_class_count() {
for disk in 0..self.pool.disk_count(class) {
let segment_count = self.pool.size_in_blocks(class, disk);
file.write_u64::<LittleEndian>(segment_count.as_u64())?;
}
}

// Blocks per segment (constant)
file.write_u64::<LittleEndian>(SEGMENT_SIZE.try_into().unwrap())?;
}

Ok(())
}
}

impl<E, SPL> Dmu<E, SPL>
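The byte layout that `write_global_header` produces can be read back with plain `std`. The following is a minimal sketch under stated assumptions, not the parser used by the visualization script: `GlobalHeader`, `read_exact_array`, and `read_global_header` are hypothetical names, and the per-disk value is kept as the raw `u64` that the code above obtains from `size_in_blocks`.

```rust
use std::io::{self, Read};

/// Hypothetical in-memory view of the global header (names are assumptions).
#[derive(Debug, PartialEq)]
pub struct GlobalHeader {
    /// Outer index: storage class; inner index: disk. Holds the raw per-disk value.
    pub disk_sizes: Vec<Vec<u64>>,
    /// The constant written last (`SEGMENT_SIZE` in the diff).
    pub blocks_per_segment: u64,
}

/// Reads exactly `N` bytes from the reader into a fixed-size array.
fn read_exact_array<R: Read, const N: usize>(reader: &mut R) -> io::Result<[u8; N]> {
    let mut buf = [0u8; N];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

/// Parses the header in the order the diff writes it:
/// u8 class count, u16 disk count per class, u64 per disk, u64 segment size.
/// All multi-byte values are little-endian.
pub fn read_global_header<R: Read>(reader: &mut R) -> io::Result<GlobalHeader> {
    let class_count_byte: [u8; 1] = read_exact_array(reader)?;
    let class_count = class_count_byte[0] as usize;

    let mut disk_counts = Vec::with_capacity(class_count);
    for _ in 0..class_count {
        let raw: [u8; 2] = read_exact_array(reader)?;
        disk_counts.push(u16::from_le_bytes(raw) as usize);
    }

    let mut disk_sizes = Vec::with_capacity(class_count);
    for &disk_count in &disk_counts {
        let mut sizes = Vec::with_capacity(disk_count);
        for _ in 0..disk_count {
            let raw: [u8; 8] = read_exact_array(reader)?;
            sizes.push(u64::from_le_bytes(raw));
        }
        disk_sizes.push(sizes);
    }

    let raw: [u8; 8] = read_exact_array(reader)?;
    Ok(GlobalHeader {
        disk_sizes,
        blocks_per_segment: u64::from_le_bytes(raw),
    })
}
```

Because the reader is generic over `Read`, the same function would work on a `BufReader<File>` or on an in-memory `&[u8]` slice.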
@@ -201,6 +251,15 @@ where
obj_ptr.offset().disk_id(),
obj_ptr.size(),
);
#[cfg(feature = "allocation_log")]
{
let mut file = self.allocation_log_file.lock();
let _ = file.write_u8(Action::Deallocate.as_bool() as u8);
let _ = file.write_u64::<LittleEndian>(obj_ptr.offset.as_u64());
let _ = file.write_u32::<LittleEndian>(obj_ptr.size.as_u32());
let _ = file.write_u64::<LittleEndian>(0);
let _ = file.write_u64::<LittleEndian>(0);
}
if let (CopyOnWriteEvent::Removed, Some(tx), CopyOnWriteReason::Remove) = (
self.handler.copy_on_write(
obj_ptr.offset(),
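Each logged event in this hunk is written as five little-endian fields: a `u8` action, a `u64` disk offset, a `u32` size, and two `u64` cycle counters (both zero for deallocations), which adds up to 29 bytes per record. The sketch below shows one way to encode and decode such a record with `std` only. `LogRecord`, `RECORD_LEN`, and the helper names are hypothetical, and the action is kept as a raw byte because the value returned by `Action::as_bool` is not shown in this diff.

```rust
/// Size of one record in bytes: 1 (action) + 8 (offset) + 4 (size) + 8 + 8 (cycles).
pub const RECORD_LEN: usize = 29;

/// Hypothetical decoded form of one log record (names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LogRecord {
    /// Raw action byte; which value means "allocate" depends on `Action::as_bool`.
    pub action: u8,
    pub offset: u64,
    pub size: u32,
    /// Cycles spent inside the allocator (zero for deallocations).
    pub allocator_cycles: u64,
    /// Cycles spent in the whole allocation request (zero for deallocations).
    pub total_cycles: u64,
}

/// Decodes a little-endian u64 from an 8-byte slice.
fn le_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_le_bytes(buf)
}

/// Decodes a little-endian u32 from a 4-byte slice.
fn le_u32(bytes: &[u8]) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    u32::from_le_bytes(buf)
}

/// Serializes a record in the field order used by the logging code.
pub fn encode_record(record: &LogRecord) -> [u8; RECORD_LEN] {
    let mut buf = [0u8; RECORD_LEN];
    buf[0] = record.action;
    buf[1..9].copy_from_slice(&record.offset.to_le_bytes());
    buf[9..13].copy_from_slice(&record.size.to_le_bytes());
    buf[13..21].copy_from_slice(&record.allocator_cycles.to_le_bytes());
    buf[21..29].copy_from_slice(&record.total_cycles.to_le_bytes());
    buf
}

/// Parses one fixed-size record.
pub fn decode_record(buf: &[u8; RECORD_LEN]) -> LogRecord {
    LogRecord {
        action: buf[0],
        offset: le_u64(&buf[1..9]),
        size: le_u32(&buf[9..13]),
        allocator_cycles: le_u64(&buf[13..21]),
        total_cycles: le_u64(&buf[21..29]),
    }
}
```

A fixed record length means a reader can seek to record `i` at `header_len + i * RECORD_LEN` without scanning the records before it.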
@@ -484,6 +543,16 @@ where

let strategy = self.alloc_strategy[storage_preference as usize];

// NOTE: Could we mark classes, disks and/or segments as full to prevent looping over them?
// We would then also need to handle this, when deallocating things.
// Would full mean completely full or just not having enough contiguous memory of some
// size?
// Or save the largest contiguous memory region as a value and compare against that. For
// that the allocator needs to support that and we have to 'bubble' the largest value up.
#[cfg(feature = "allocation_log")]
let mut start_cycles_global = get_cycles();
#[cfg(feature = "allocation_log")]
let mut total_cycles_local: u64 = 0;
'class: for &class in strategy.iter().flatten() {
let disks_in_class = self.pool.disk_count(class);
if disks_in_class == 0 {
@@ -536,14 +605,40 @@ where

 let first_seen_segment_id = *segment_id;
 loop {
-if let Some(segment_offset) = self
-.handler
-.get_allocation_bitmap(*segment_id, self)?
-.access()
-.allocate(size.as_u32())
+// Has to be split because else the temporary value is dropped while borrowing
+let bitmap = self.handler.get_allocation_bitmap(*segment_id, self)?;
+let mut allocator = bitmap.access();
+
+#[cfg(not(feature = "allocation_log"))]
+{
+let allocation = allocator.allocate(size.as_u32());
+if let Some(segment_offset) = allocation {
+let disk_offset = segment_id.disk_offset(segment_offset);
+break disk_offset;
+}
+}
+#[cfg(feature = "allocation_log")]
 {
-break segment_id.disk_offset(segment_offset);
+let start_cycles_allocation = get_cycles();
+let allocation = allocator.allocate(size.as_u32());
+let end_cycles_allocation = get_cycles();
+total_cycles_local += end_cycles_allocation - start_cycles_allocation;
+
+if let Some(segment_offset) = allocation {
+let disk_offset = segment_id.disk_offset(segment_offset);
+let total_cycles_global = end_cycles_allocation - start_cycles_global;
+
+let mut file = self.allocation_log_file.lock();
+file.write_u8(Action::Allocate.as_bool() as u8)?;
+file.write_u64::<LittleEndian>(disk_offset.as_u64())?;
+file.write_u32::<LittleEndian>(size.as_u32())?;
+file.write_u64::<LittleEndian>(total_cycles_local)?;
+file.write_u64::<LittleEndian>(total_cycles_global)?;
+
+break disk_offset;
+}
 }
+
 let next_segment_id = segment_id.next(disk_size);
 trace!(
 "Next allocator segment: {:?} -> {:?} ({:?})",
@@ -1031,3 +1126,11 @@ where
self.report_tx = Some(tx);
}
}

fn get_cycles() -> u64 {
unsafe {
//let mut aux = 0;
//__rdtscp(aux)
_rdtsc()
}
}
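`std::arch::x86_64` is only available when compiling for x86_64, so `get_cycles` as written above ties the crate to that architecture. One possible alternative is sketched below, with assumptions labeled: the non-x86_64 fallback returns nanoseconds from `Instant` rather than CPU cycles, and `elapsed_cycles` and `allocator_share` are hypothetical helpers that are not part of this PR. `elapsed_cycles` uses a saturating subtraction so that a counter reading that appears to go backwards yields zero instead of an overflow panic in debug builds.

```rust
/// Reads the time-stamp counter on x86_64.
#[cfg(target_arch = "x86_64")]
pub fn get_cycles() -> u64 {
    // SAFETY: `_rdtsc` only reads the time-stamp counter and touches no memory.
    unsafe { std::arch::x86_64::_rdtsc() }
}

/// Assumed fallback for other targets: nanoseconds since the first call.
/// These are not CPU cycles, so values are not comparable across architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn get_cycles() -> u64 {
    use std::sync::OnceLock;
    use std::time::Instant;

    static START: OnceLock<Instant> = OnceLock::new();
    START.get_or_init(Instant::now).elapsed().as_nanos() as u64
}

/// Difference between two counter readings, clamped at zero.
pub fn elapsed_cycles(start: u64, end: u64) -> u64 {
    end.saturating_sub(start)
}

/// Share of a request's cycles that were spent inside the allocator.
/// Returns `None` when the total is zero, as it is for deallocation records.
pub fn allocator_share(allocator_cycles: u64, total_cycles: u64) -> Option<f64> {
    if total_cycles == 0 {
        None
    } else {
        Some(allocator_cycles as f64 / total_cycles as f64)
    }
}
```

With these helpers, the two counters logged per allocation (`total_cycles_local` and `total_cycles_global` in the diff) could be turned into a per-request ratio via `allocator_share(local, global)`.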
11 changes: 10 additions & 1 deletion betree/src/database/mod.rs
@@ -31,7 +31,7 @@ use serde::{de::DeserializeOwned, Deserialize, Serialize};
 use std::{
 collections::HashMap,
 iter::FromIterator,
-path::Path,
+path::{Path, PathBuf},
 sync::{
 atomic::{AtomicU64, Ordering},
 Arc,
@@ -147,6 +147,9 @@ pub struct DatabaseConfiguration {

/// If and how to log database metrics
pub metrics: Option<MetricsConfiguration>,

/// Where to log the allocations
pub allocation_log_file_path: PathBuf,
}

impl Default for DatabaseConfiguration {
@@ -162,6 +165,7 @@ impl Default for DatabaseConfiguration {
sync_interval_ms: Some(DEFAULT_SYNC_INTERVAL_MS),
metrics: None,
migration_policy: None,
allocation_log_file_path: PathBuf::from("allocation_log.bin"),
}
}
}
@@ -237,6 +241,8 @@ impl DatabaseConfiguration {
strategy,
ClockCache::new(self.cache_size),
handler,
#[cfg(feature = "allocation_log")]
self.allocation_log_file_path.clone(),
)
}

@@ -432,6 +438,9 @@ impl Database {
dmu.set_report(tx.clone());
}

#[cfg(feature = "allocation_log")]
dmu.write_global_header()?;

let (tree, root_ptr) = builder.select_root_tree(Arc::new(dmu))?;

*tree.dmu().handler().current_generation.lock_write() = root_ptr.generation().next();
5 changes: 1 addition & 4 deletions betree/src/tree/imp/mod.rs
@@ -393,14 +393,11 @@ where
 self.msg_action().apply(key, &msg, &mut tmp);
 }

-// This may never be false.
-let data = tmp.unwrap();
-
 drop(node);
 if self.evict {
 self.dml.evict()?;
 }
-Ok(Some((info, data)))
+Ok(tmp.map(|data| (info, data)))
 }
 }
 }
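The hunk above replaces `tmp.unwrap()` with `tmp.map(...)`, so a lookup whose messages produce no value now returns `None` instead of panicking. A minimal illustration of that pattern follows; `Info`, `Value`, and `finish_lookup` are hypothetical stand-ins, not the types used in `betree`.

```rust
/// Hypothetical stand-ins for the tree's key info and value types.
pub type Info = u32;
pub type Value = Vec<u8>;

/// Pairs the info with the value if one exists, and propagates `None`
/// otherwise. An `unwrap()` on `tmp` would panic in the `None` case.
pub fn finish_lookup(info: Info, tmp: Option<Value>) -> Option<(Info, Value)> {
    tmp.map(|data| (info, data))
}
```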
37 changes: 37 additions & 0 deletions scripts/README.md
@@ -0,0 +1,37 @@
# Allocation Log Visualization

This script visualizes the allocation and deallocation of blocks within the key-value database. It helps to understand how storage space is being used and identify potential optimization opportunities.

The allocation log visualization script is tested with Python 3.12.7 and the packages listed in `requirements.txt`.

The main dependencies are matplotlib, tqdm and sortedcontainers.

## Setup

Run the following to create a working environment for the script:

```bash
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r scripts/requirements.txt
```

## Generating the Allocation Log

To generate the `allocation_log.bin` file, enable the `allocation_log` feature flag when compiling the `betree` crate, for instance by running
```bash
cargo build --features allocation_log
```
or by enabling it in the `Cargo.toml`.

The path where the log is saved can be set with the runtime configuration parameter `allocation_log_file_path`. The default is `$PWD/allocation_log.bin`.

## Using the Allocation Log

Once a log file has been obtained, run the following to visualize the recorded (de-)allocations:
```bash
./scripts/visualize_allocation_log allocation_log.bin
```

To get help and see the available options, run the script with the `-h` flag.

13 changes: 13 additions & 0 deletions scripts/requirements.txt
@@ -0,0 +1,13 @@
contourpy==1.3.1
cycler==0.12.1
fonttools==4.55.3
kiwisolver==1.4.7
matplotlib==3.9.3
numpy==2.2.0
packaging==24.2
pillow==11.0.0
pyparsing==3.2.0
python-dateutil==2.9.0.post0
six==1.17.0
sortedcontainers==2.4.0
tqdm==4.67.1