From 8ffc775fc41a7b691db81bbe7e18a122137999fb Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Tue, 11 Jan 2022 23:09:57 +0100
Subject: [PATCH 1/2] docs: badger in config.md

This should fix the issue of users thinking badger is "no-brainer faster
choice" and then running into problems.
---
 docs/config.md | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/docs/config.md b/docs/config.md
index 1cc938c2753..3ad5cfbc8bc 100644
--- a/docs/config.md
+++ b/docs/config.md
@@ -188,37 +188,37 @@ documented in `ipfs config profile --help`.
 
 - `flatfs`
 
-  Configures the node to use the flatfs datastore.
+  Configures the node to use the flatfs datastore (the default).
 
-  This is the most battle-tested and reliable datastore, but it's significantly
-  slower than the badger datastore. You should use this datastore if:
+  This is the most battle-tested and reliable datastore.
+  You should use this datastore if:
 
   - You need a very simple and very reliable datastore and you trust your
     filesystem. This datastore stores each block as a separate file in the
     underlying filesystem so it's unlikely to lose data unless there's an issue
     with the underlying file system.
-  - You need to run garbage collection on a small (<= 10GiB) datastore. The
-    default datastore, badger, can leave several gigabytes of data behind when
-    garbage collecting.
-  - You're concerned about memory usage. In its default configuration, badger can
-    use up to several gigabytes of memory.
+  - You need to run garbage collection in a way that reclaims free space as soon as possible.
+  - You want to minimize memory usage.
+  - You are ok with the default speed of data import (or prefer to use `--nocopy`).
 
   This profile may only be applied when first initializing the node.
 
 
 - `badgerds`
 
-  Configures the node to use the badger datastore.
-
-  This is the fastest datastore. Use this datastore if performance, especially
-  when adding many gigabytes of files, is critical. However:
+  Configures the node to use the experimental badger datastore (warning: uses an outdated badger 1.x).
 
+  Use this datastore if some aspects of performance,
+  especially the speed of adding many gigabytes of files, are critical. However, be aware that:
+
   - This datastore will not properly reclaim space when your datastore is
-    smaller than several gigabytes. If you run IPFS with '--enable-gc' (you have
+    smaller than several gigabytes. If you run IPFS with `--enable-gc` (you have
     enabled block-level garbage collection), you plan on storing very little data in
     your IPFS node, and disk usage is more critical than performance, consider using
-    flatfs.
-  - This datastore uses up to several gigabytes of memory.
+    `flatfs`.
+  - This datastore uses up to several gigabytes of memory.
+  - Good for medium-size datastores, but may run into performance issues if your dataset is bigger than a terabyte.
+  - The current implementation is based on old badger 1.x which is no longer supported by the upstream team.
 
   This profile may only be applied when first initializing the node.

From c4b7595b663ec41be6b9da7c6fc772b63771d1ac Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Fri, 28 Jan 2022 21:02:56 +0100
Subject: [PATCH 2/2] docs: apply suggestions from code review

Co-authored-by: Johnny <9611008+johnnymatthews@users.noreply.github.com>
---
 docs/config.md | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/docs/config.md b/docs/config.md
index 3ad5cfbc8bc..f855dbc1996 100644
--- a/docs/config.md
+++ b/docs/config.md
@@ -188,32 +188,31 @@ documented in `ipfs config profile --help`.
 
 - `flatfs`
 
-  Configures the node to use the flatfs datastore (the default).
+  Configures the node to use the flatfs datastore. Flatfs is the default datastore.
 
   This is the most battle-tested and reliable datastore.
   You should use this datastore if:
 
-  - You need a very simple and very reliable datastore and you trust your
+  - You need a very simple and very reliable datastore, and you trust your
     filesystem. This datastore stores each block as a separate file in the
     underlying filesystem so it's unlikely to lose data unless there's an issue
     with the underlying file system.
   - You need to run garbage collection in a way that reclaims free space as soon as possible.
   - You want to minimize memory usage.
-  - You are ok with the default speed of data import (or prefer to use `--nocopy`).
+  - You are ok with the default speed of data import, or prefer to use `--nocopy`.
 
   This profile may only be applied when first initializing the node.
 
 
 - `badgerds`
 
-  Configures the node to use the experimental badger datastore (warning: uses an outdated badger 1.x).
+  Configures the node to use the experimental badger datastore. Keep in mind that this **uses an outdated badger 1.x**.
 
   Use this datastore if some aspects of performance,
   especially the speed of adding many gigabytes of files, are critical. However, be aware that:
 
   - This datastore will not properly reclaim space when your datastore is
-    smaller than several gigabytes. If you run IPFS with `--enable-gc` (you have
-    enabled block-level garbage collection), you plan on storing very little data in
+    smaller than several gigabytes. If you run IPFS with `--enable-gc`, you plan on storing very little data in
     your IPFS node, and disk usage is more critical than performance, consider using
     `flatfs`.
   - This datastore uses up to several gigabytes of memory.