FoundationDB Release 8.0 Planning
Target date to cut the release: Monday, March 10, 2025. Going forward, the goal is two major releases per year.
This release is mainly about scale and efficiency improvements, which unlock many opportunities for further efficiency gains.
- Sharded RocksDB: tested in production with 7.3 releases via TSS (testing storage servers). The target is for it to be production-ready in the 8.0 release.
- Physical shard move: builds on top of Sharded RocksDB so that data movement among storage servers happens through file transfers instead of range reads and writes, avoiding extra SST file writes and compactions (a checkpoint export/import sketch follows this list).
- Distributed Storage Audit: a distributed correctness audit tool, providing consistency checks as well as location-metadata and shard-assignment checks.
- Bulk loading: allows faster data ingestion into FDB by bypassing the transaction system (an SST ingestion sketch follows this list).
- gRPC: the first two use cases are bulk loading (copying files from blob storage to storage servers) and physical shard move (data movement among storage servers). Other use cases include client functionality and fetching status JSON from the cluster controller.
- New backup: introduces backup workers that retrieve mutations from tlogs and upload them to blob stores. This roughly doubles write throughput to tlogs, since backup mutations no longer need to be written to them a second time.
- Version vector: avoids the broadcast of commit messages from commit proxies to tlogs, addressing the scalability and tail latency of the transaction system (a version-vector sketch follows this list).
- Gray failure improvements: functional changes (including remote processes in latency and disconnect health checks, and including primary and remote storage servers in those checks), observability (machine-readable status, richer trace events), and official documentation. Note that the functional changes will be guarded by knobs; based on experiments, we will decide which knobs to turn on.
- Corruption prevention features: in-memory page checksum, data move comparison, backup agent comparison, mutation checksum, cumulative checksum, and RocksDB checksum (a checksum-chaining sketch follows this list).
- Others: DD (data distribution) improvements, a newer compiler, the latest Boost, a newer fmt, etc.
- Perf cluster and benchmarks: integrate these more tightly into the FDB release process and run regression analysis on each patch release.
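
A minimal sketch of the file-transfer mechanism behind physical shard move, assuming RocksDB's column-family export/import APIs are the vehicle; the actual FDB code paths may differ, and `exportShard`/`importShard` are hypothetical helper names:

```cpp
#include <string>

#include <rocksdb/db.h>
#include <rocksdb/metadata.h>
#include <rocksdb/options.h>
#include <rocksdb/utilities/checkpoint.h>

// Source storage server: export the shard's column family as a set of
// immutable SST files plus metadata. This is file-system-level work, not
// a range scan, so no key-value pairs are re-read or re-written.
rocksdb::Status exportShard(rocksdb::DB* db,
                            rocksdb::ColumnFamilyHandle* shardCf,
                            const std::string& exportDir,
                            rocksdb::ExportImportFilesMetaData** metadata) {
    rocksdb::Checkpoint* checkpoint = nullptr;
    rocksdb::Status s = rocksdb::Checkpoint::Create(db, &checkpoint);
    if (!s.ok()) return s;
    s = checkpoint->ExportColumnFamily(shardCf, exportDir, metadata);
    delete checkpoint;
    return s;
}

// Destination storage server: after the exported files arrive (e.g., over
// gRPC), adopt them directly as a new column family. The SST files join
// the LSM tree as-is, avoiding extra SST writes and compactions.
rocksdb::Status importShard(rocksdb::DB* db,
                            const std::string& cfName,
                            const rocksdb::ExportImportFilesMetaData& metadata,
                            rocksdb::ColumnFamilyHandle** shardCf) {
    rocksdb::ImportColumnFamilyOptions opts;
    opts.move_files = true; // take ownership of the files instead of copying
    return db->CreateColumnFamilyWithImport(
        rocksdb::ColumnFamilyOptions(), cfName, opts, metadata, shardCf);
}
```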
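The FDB bulk-loading design has its own formats and plumbing; as a rough sketch of the underlying idea, here is how RocksDB's external SST file ingestion places prebuilt files directly into the LSM tree without going through the write path. `buildSstFile` and `ingest` are hypothetical helpers:

```cpp
#include <string>
#include <utility>
#include <vector>

#include <rocksdb/db.h>
#include <rocksdb/sst_file_writer.h>

// Build an SST file offline, e.g., from data fetched out of blob storage.
// SstFileWriter requires keys in ascending order.
rocksdb::Status buildSstFile(
    const rocksdb::Options& options, const std::string& path,
    const std::vector<std::pair<std::string, std::string>>& sortedKVs) {
    rocksdb::SstFileWriter writer(rocksdb::EnvOptions(), options);
    rocksdb::Status s = writer.Open(path);
    if (!s.ok()) return s;
    for (const auto& [k, v] : sortedKVs) {
        s = writer.Put(k, v);
        if (!s.ok()) return s;
    }
    return writer.Finish();
}

// Hand the finished file to a running RocksDB instance. Ingestion links
// the file into the LSM tree directly; nothing flows through the normal
// write path (memtable/WAL), let alone FDB's commit pipeline.
rocksdb::Status ingest(rocksdb::DB* db, const std::string& path) {
    rocksdb::IngestExternalFileOptions opts;
    opts.move_files = true;
    return db->IngestExternalFile({ path }, opts);
}
```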
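A toy sketch of the version-vector idea, not FDB's actual data structure: each tlog records only the commit versions it actually received, and the vector tells readers how far they must catch up on each tlog, so commit proxies no longer broadcast every commit to every tlog:

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

using Version = int64_t;
using TLogId = int; // hypothetical identifier type, for illustration only

struct VersionVector {
    std::map<TLogId, Version> latest; // last version each tlog received
    Version maxVersion = 0;           // highest commit version cluster-wide

    // A commit proxy sends the commit only to the tlogs hosting tags that
    // the commit actually touched.
    void onCommit(const std::vector<TLogId>& touchedLogs, Version v) {
        for (TLogId id : touchedLogs) latest[id] = v;
        maxVersion = std::max(maxVersion, v);
    }

    // A reader at maxVersion needs mutations from tlog `id` only up to the
    // last version that tlog actually saw; the gap up to maxVersion is
    // known to contain no data for it, so no broadcast was needed.
    Version peekUpTo(TLogId id) const {
        auto it = latest.find(id);
        return it == latest.end() ? 0 : it->second;
    }
};
```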
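An illustrative sketch of the difference between a per-mutation checksum and a cumulative checksum, using FNV-1a as a stand-in hash and a simplified mutation type; the real features may use a different hash and layout:

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// FNV-1a stands in here for whatever hash the real features use; the
// point is the chaining structure, not the hash function.
uint64_t fnv1a(std::string_view data, uint64_t h = 14695981039346656037ull) {
    for (unsigned char c : data) { h ^= c; h *= 1099511628211ull; }
    return h;
}

// Simplified mutation: real FDB mutations also carry a type code.
struct Mutation { std::string param1, param2; };

// Per-mutation checksum: carried with the mutation through the pipeline
// and re-verified at each hop, catching in-flight corruption of one mutation.
uint64_t mutationChecksum(const Mutation& m) {
    return fnv1a(m.param2, fnv1a(m.param1));
}

// Cumulative checksum: a stream (e.g., per storage-server tag) chains every
// mutation into a running value, so a lost, duplicated, or reordered
// mutation breaks the chain even if each mutation is individually intact.
struct CumulativeChecksum {
    uint64_t acc = 0;
    void add(const Mutation& m) { acc = fnv1a(m.param2, fnv1a(m.param1, acc)); }
};
```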
Planned testing and qualification work:
- Testing of the features above
- New CI pipeline for releases
- TLog streaming
- DD improvements
- Storage engine comparison: SQLite, Redwood, RocksDB, Sharded RocksDB
- Topology: colocating the transaction system; tlog-to-storage-server ratio
- New hardware qualification
The following features are either not staffed or do not yet have solutions:
- Sharded RocksDB backup and restore
- Range partitioned backup mutation logs
- Deterministic simulation for RocksDB, Sharded RocksDB, and gRPC
- Per region failure monitoring
- Singleton scalability
- Global coordination in the FDB Operator
- Limited transaction security profiles (limiting blast radius of malicious actors)
- FDB Tiering: storing cold data on S3
- RF6 to RF4: after we have range-based backup and restore, we can make this change to increase cluster capacity by 50%, knowing that S3 can restore any key range within a short period of time (to address [RF4 mean time to data loss]). See the arithmetic note after this list.
- Versioned storage engine: the LSM tree stores multi-version data, allowing a wider MVCC window for reads, rollback to earlier versions, and storage checkpoints/snapshots (a key-encoding sketch follows this list).
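
A note on the RF6-to-RF4 arithmetic above: for a fixed raw disk footprint C, logical capacity is C / RF, so the change moves capacity from C/6 to C/4, and (C/4) / (C/6) = 6/4 = 1.5, i.e., the same hardware holds 50% more logical data.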
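A sketch of one common encoding for multi-version data in an LSM tree, not necessarily the encoding a versioned storage engine for FDB would adopt: suffix each user key with the version so that newer versions of the same key sort first.

```cpp
#include <cstdint>
#include <string>

// Append the bitwise complement of the version, big-endian, so that for a
// given user key newer versions sort earlier under bytewise comparison.
std::string versionedKey(const std::string& userKey, uint64_t version) {
    std::string out = userKey;
    uint64_t inv = ~version;
    for (int shift = 56; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((inv >> shift) & 0xff));
    return out;
}

// A read at version V seeks the LSM iterator to versionedKey(k, V); the
// first entry whose user-key prefix is k holds the newest version <= V.
// Older versions remain addressable, which is what widens the MVCC read
// window and enables rollback and checkpoint/snapshot at a chosen version.
```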