
Release v0.12 #8344

Closed · 57 of 59 tasks
BigLep opened this issue Aug 14, 2021 · 22 comments · Fixed by #8742

Comments

@BigLep
Contributor

BigLep commented Aug 14, 2021

go-ipfs 0.12.0 Release Notes

We're happy to announce go-ipfs 0.12.0. This release switches the storage of IPLD blocks to be keyed by multihash instead of CID.

As usual, this release includes important fixes, some of which may be critical for security. Unless the fix addresses a bug being exploited in the wild, the fix will not be called out in the release notes. Please make sure to update ASAP. See our release process for details.

🛠 BREAKING CHANGES

  • ipfs refs local will now list all blocks as if they were raw CIDv1 instead of with whatever CID version and IPLD codecs they were stored with. All other functionality should remain the same.

Note: This change also affects ipfs-update, so if you use that tool to manage your go-ipfs installation, then grab ipfs-update v1.8.0 from dist.

Keep reading to learn more details.

🔦 Highlights

There is only one change since 0.11:

Blockstore migration from full CID to Multihash keys

We are switching the default low-level datastore to be keyed only by the Multihash part of the CID, deduplicating some blocks in the process. The blockstore will become codec-agnostic.

Rationale

The blockstore/datastore layers are not concerned with data interpretation, only with the storage of binary blocks and verification that the Multihash they are addressed with (which comes from the CID) matches the block. In fact, different CIDs with different codec prefixes may carry the same multihash and reference the same block. Carrying the CID abstraction so low in the stack means potentially fetching and storing the same blocks multiple times just because they are referenced by different CIDs. Prior to this change, a CIDv1 with a dag-cbor codec and a CIDv1 with a raw codec, both containing the same multihash, would result in two identical blocks being stored. A CIDv0 and a CIDv1 referencing the same dag-pb block would also result in two copies.
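
To make the deduplication concrete, here is a minimal Go sketch (using the go-cid and go-multihash libraries; the payload and variable names are purely illustrative) showing how two CIDs with different codecs can wrap the exact same multihash, which is the only part the blockstore will key on after this change:

package main

import (
    "fmt"

    cid "github.com/ipfs/go-cid"
    mh "github.com/multiformats/go-multihash"
)

func main() {
    // Hash some arbitrary block bytes with sha2-256 (the go-ipfs default).
    data := []byte("example block payload")
    hash, err := mh.Sum(data, mh.SHA2_256, -1)
    if err != nil {
        panic(err)
    }

    // Wrap the same multihash in two different CIDv1s: one dag-pb, one raw.
    dagPB := cid.NewCidV1(cid.DagProtobuf, hash)
    raw := cid.NewCidV1(cid.Raw, hash)

    // The CIDs differ, but after this migration both resolve to the same
    // blockstore entry, because the key is the multihash alone.
    fmt.Println("dag-pb CID:", dagPB)
    fmt.Println("raw CID:   ", raw)
    fmt.Println("same multihash:", dagPB.Hash().B58String() == raw.Hash().B58String())
}

Before this change, adding the block under both CIDs would store two copies; afterwards there is a single entry keyed by the shared multihash.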

How migration works

In order to perform the switch, and start referencing all blocks by their multihash, a migration will occur on update. This migration will take the repository version from 11 (current) to 12.

One thing to note is that any content addressed by a CIDv0 (all the hashes that start with Qm..., the current default in go-ipfs) does not need any migration, as CIDv0s are raw multihashes already. This means the migration will be very lightweight for the majority of users.

The migration process will take care of re-keying any CIDv1 block so that it is only addressed by its multihash. Large nodes with lots of CIDv1-addressed content will need to go through a heavier process as the migration happens. This is how the migration works:

  1. Phase 1: The migration script will perform a pass over every block in the datastore and add all CIDv1s found to a file named 11-to-12-cids.txt, in the go-ipfs configuration folder. No blocks are modified in this first phase; it only serves to identify the keys that will be migrated in phase 2.
  2. Phase 2: The migration script will perform a second pass where every CIDv1 block will be read and re-written with its raw-multihash as key. There is 1 worker performing this task, although more can be configured. Every 100MiB-worth of blocks (this is configurable), each worker will trigger a datastore "sync" (to ensure all written data is flushed to disk) and delete the CIDv1-addressed blocks that were just renamed. This provides a good compromise between speed and resources needed to run the migration.

At every sync, the migration emits a log message showing how many blocks need to be rewritten and how far the process is.
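
To give a more concrete picture of phase 2, here is a deliberately simplified sketch in Go. It uses a hypothetical Store interface and schematic key strings rather than the real go-datastore API and FlatFS/Badger key layout, and it runs single-threaded; the actual fs-repo-11-to-12 code differs in those respects:

package migration

import (
    cid "github.com/ipfs/go-cid"
)

// Store is a stand-in key-value interface used only for this sketch;
// the real migration talks to the repo's datastore instead.
type Store interface {
    Get(key string) ([]byte, error)
    Put(key string, value []byte) error
    Delete(key string) error
    Sync() error
}

// rekeyCIDv1Blocks models phase 2: each CIDv1 listed in 11-to-12-cids.txt is
// read under its CID-based key, re-written under a multihash-based key, and
// the CID-addressed copy is deleted right after the periodic sync.
func rekeyCIDv1Blocks(store Store, cids []cid.Cid, syncSizeBytes int) error {
    var pendingDeletes []string
    written := 0

    flush := func() error {
        if err := store.Sync(); err != nil { // make sure new keys are on disk
            return err
        }
        for _, k := range pendingDeletes { // then drop the old CID-addressed copies
            if err := store.Delete(k); err != nil {
                return err
            }
        }
        pendingDeletes = pendingDeletes[:0]
        written = 0
        return nil
    }

    for _, c := range cids {
        oldKey := "/blocks/" + c.String()           // CID-addressed key (schematic)
        newKey := "/blocks/" + c.Hash().B58String() // multihash-addressed key (schematic)

        data, err := store.Get(oldKey)
        if err != nil {
            continue // not found: likely already re-keyed during a previous attempt
        }
        if err := store.Put(newKey, data); err != nil {
            return err
        }
        pendingDeletes = append(pendingDeletes, oldKey)
        written += len(data)

        if written >= syncSizeBytes { // default 100MiB, configurable (see below)
            if err := flush(); err != nil {
                return err
            }
        }
    }
    return flush()
}

Because blocks whose CID-addressed key is already gone are simply skipped, re-running this loop after an interruption is safe, which is what makes the retry behavior described below possible.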

FlatFS specific migration

For those using a single FlatFS datastore as their backing blockstore (i.e. the default behavior), the migration (but not reversion) will take advantage of the ability to easily move/rename the blocks to improve migration performance.

Unfortunately, other common datastores do not support renames which is what makes this FlatFS specific. If you are running a large custom datastore that supports renames you may want to consider running a fork of fs-repo-11-to-12 specific to your datastore.

If you want to disable this behavior, set the environment variable IPFS_FS_MIGRATION_11_TO_12_ENABLE_FLATFS_FASTPATH to false.
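
To illustrate why a rename-capable datastore makes this fast, here is a schematic sketch (the directory layout, file naming, and the fastRekeyFlatFS helper are hypothetical; real FlatFS uses a sharded directory structure):

package migration

import (
    "os"
    "path/filepath"
)

// fastRekeyFlatFS sketches the FlatFS fast path: because each block lives in
// its own file named after its key, re-keying is a cheap rename instead of a
// read-rewrite-delete cycle through the datastore API.
func fastRekeyFlatFS(blocksDir, oldKey, newKey string) error {
    oldPath := filepath.Join(blocksDir, oldKey+".data")
    newPath := filepath.Join(blocksDir, newKey+".data")
    return os.Rename(oldPath, newPath)
}

A rename only touches filesystem metadata, so no block data has to be read and rewritten.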

Migration configuration

For those who want to tune the migration more precisely for their setups, there are two environment variables to configure:

  • IPFS_FS_MIGRATION_11_TO_12_NWORKERS : an integer describing the number of migration workers - defaults to 1
  • IPFS_FS_MIGRATION_11_TO_12_SYNC_SIZE_BYTES : an integer describing the number of bytes after which migration workers will sync - defaults to 104857600 (i.e. 100MiB)
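
As a rough sketch of how these two variables are typically read with their documented defaults (optionsFromEnv and migrationOptions are hypothetical names, not the actual fs-repo-11-to-12 code):

package migration

import (
    "os"
    "strconv"
)

// migrationOptions holds the two documented tunables with their defaults.
type migrationOptions struct {
    NWorkers      int // IPFS_FS_MIGRATION_11_TO_12_NWORKERS, default 1
    SyncSizeBytes int // IPFS_FS_MIGRATION_11_TO_12_SYNC_SIZE_BYTES, default 104857600 (100MiB)
}

func optionsFromEnv() migrationOptions {
    opts := migrationOptions{NWorkers: 1, SyncSizeBytes: 100 << 20}
    if v := os.Getenv("IPFS_FS_MIGRATION_11_TO_12_NWORKERS"); v != "" {
        if n, err := strconv.Atoi(v); err == nil && n > 0 {
            opts.NWorkers = n
        }
    }
    if v := os.Getenv("IPFS_FS_MIGRATION_11_TO_12_SYNC_SIZE_BYTES"); v != "" {
        if n, err := strconv.Atoi(v); err == nil && n > 0 {
            opts.SyncSizeBytes = n
        }
    }
    return opts
}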

Migration caveats

Large repositories with very large numbers of CIDv1s should be mindful of the migration process:

  • We recommend ensuring that IPFS runs with an appropriate (high) file-descriptor limit, particularly when Badger is used as the datastore backend. Badger is known to open many tables when experiencing a high number of writes, which may trigger "too many open files" errors during the migration. If this happens, the migration can be retried with a higher FD limit (see below).
  • Migrations using the Badger datastore may not immediately reclaim the space freed by the deletion of migrated blocks, thus space requirements may grow considerably. A periodic Badger-GC is run every 2 minutes, which will reclaim space used by deleted and de-duplicated blocks. The last portion of the space will only be reclaimed after go-ipfs starts (the Badger-GC cycle will trigger after 15 minutes).
  • While there is a revert process detailed below, we recommend keeping a backup of the repository, particularly for very large ones, in case an issue happens, so that the revert can happen immediately and cases of repository corruption due to crashes or unexpected circumstances are not catastrophic.

Migration interruptions and retries

If a problem occurs during the migration, it is possible to simply restart and retry it:

  1. Phase 1 will never overwrite the 11-to-12-cids.txt file, but only append to it (so that a list of things we were supposed to have migrated during our first attempt is not lost - this is important for reverts, see below).
  2. Phase 2 will proceed to continue re-keying blocks that were not re-keyed during previous attempts.

Migration reverts

It is also possible to revert the migration after it has succeeded, for example to go to a previous go-ipfs version (<=0.11), even after starting and using go-ipfs in the new version (>=0.12). The revert process works as follows:

  1. The 11-to-12-cids.txt file is read, which has the list of all the CIDv1s that had to be rewritten for the migration.
  2. A CIDv1-addressed block is written for every item on the list. This work is performed by 1 worker (configurable), syncing every 100MiB (configurable).
  3. It is ensured that every CIDv1 pin and every CIDv1 reference in MFS are also written as CIDv1-addressed blocks, regardless of whether they were part of the original migration or were added later.

The revert process does not delete any blocks; it only makes sure that blocks that were accessible with CIDv1s before the migration are again keyed with CIDv1s. This may result in the datastore becoming up to twice as large (i.e. if all the blocks were CIDv1-addressed before the migration). It is done this way to cover corner cases: a user can add CIDv1s after the migration, which may reference blocks that existed as CIDv0 before the migration. The revert aims to ensure that no data becomes unavailable on downgrade.
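
For completeness, here is a hedged sketch of the revert direction, reusing the hypothetical Store interface from the phase 2 sketch above (key strings remain schematic, and the real tool additionally walks pins and MFS as described in step 3):

package migration

import (
    cid "github.com/ipfs/go-cid"
)

// revertCIDv1Blocks models the revert: for every CIDv1 recorded in
// 11-to-12-cids.txt, the block is read under its multihash-based key and
// written back under its CID-based key. Nothing is deleted.
// Store is the same stand-in interface used in the phase 2 sketch above.
func revertCIDv1Blocks(store Store, cids []cid.Cid) error {
    for _, c := range cids {
        mhKey := "/blocks/" + c.Hash().B58String() // multihash-addressed key (schematic)
        cidKey := "/blocks/" + c.String()          // CID-addressed key (schematic)

        data, err := store.Get(mhKey)
        if err != nil {
            continue // block no longer present; nothing to restore
        }
        if err := store.Put(cidKey, data); err != nil {
            return err
        }
    }
    return store.Sync()
}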

While go-ipfs will auto-run the migration for you, it will not run the reversion. To do so you can download the latest migration binary or use ipfs-update.

Custom datastores

As with previous migrations if you work with custom datastores and want to leverage the migration you can run a fork of fs-repo-11-to-12 specific to your datastore. The repo includes instructions on building for different datastores.

For this migration, if your datastore has fast renames you may want to consider writing some code to leverage the particular efficiencies of your datastore similar to what was done for FlatFS.

🚢 Estimated shipping date

RC1: 2021-12-13
ECD: 2022-01-06

✅ Release Checklist

For each RC published in each stage:

  • version string in version.go has been updated (in the release-vX.Y.Z branch).
  • tag commit with vX.Y.Z-rcN
  • upload to dist.ipfs.io
    1. Build: https://github.com/ipfs/distributions#usage.
    2. Pin the resulting release.
    3. Make a PR against ipfs/distributions with the updated versions, including the new hash in the PR comment.
    4. Ask the infra team to update the DNSLink record for dist.ipfs.io to point to the new distribution.
  • cut a pre-release on github and upload the result of the ipfs/distributions build in the previous step.
  • Announce the RC:
    • On Matrix (both #ipfs and #ipfs-dev)
    • To the early testers listed in docs/EARLY_TESTERS.md. Do this by copy/pasting their GitHub usernames and checkboxes as a comment so they get a GitHub notification. (example)

Checklist:

  • Stage 0 - Automated Testing
    • Fork a new branch (release-vX.Y.Z) from master and make any further release related changes to this branch. If any "non-trivial" changes (see the footnotes of docs/releases.md for a definition) get added to the release, uncheck all the checkboxes and return to this stage.
      • Follow the RC release process to cut the first RC.
      • Bump the version in version.go in the master branch to vX.(Y+1).0-dev.
    • Automated Testing (already tested in CI) - Ensure that all tests are passing; this includes:
  • Stage 1 - Internal Testing
    • CHANGELOG.md has been updated
    • Infrastructure Testing:
      • Deploy new version to a subset of Bootstrappers
      • Deploy new version to a subset of Gateways
      • Deploy new version to a subset of Preload nodes
      • Collect metrics every day. Work with the Infrastructure team to learn of any hiccup
    • IPFS Application Testing - Run the tests of the following applications:
  • Stage 2 - Community Prod Testing
  • Stage 3 - Release
    • Final preparation
      • Verify that version string in version.go has been updated.
      • Merge release-vX.Y.Z into the release branch.
      • Tag this merge commit (on the release branch) with vX.Y.Z.
      • Release published
      • Cut a new ipfs-desktop release
    • Submit this form to publish a blog post, linking to the GitHub release notes
    • Broadcasting (link to blog post)
  • Post-Release
    • Merge the release branch back into master, ignoring the changes to version.go (keep the -dev version from master).
    • Create an issue using this release issue template for the next release.
    • Make sure any last-minute changelog updates from the blog post make it back into the CHANGELOG.
    • Mark PR draft created for IPFS Desktop as ready for review.

⁉️ Do you have questions?

The best place to ask your questions about IPFS, how it works and what you can do with it is at discuss.ipfs.io. We are also available at the #lobby:ipfs.io Matrix channel which is bridged with other chat platforms.

Release improvements for next time

< Add any release improvements that were observed this cycle here so they can get incorporated into future releases. >

@BigLep BigLep added the topic/release Topic release label Aug 14, 2021
@BigLep BigLep added this to the go-ipfs 0.11 milestone Aug 14, 2021
@BigLep BigLep changed the title Release v0.11 Release v0.12 Aug 14, 2021
@BigLep BigLep modified the milestones: go-ipfs 0.11, go-ipfs 0.12 Aug 14, 2021
@BigLep BigLep pinned this issue Aug 14, 2021
@BigLep
Contributor Author

BigLep commented Dec 2, 2021

Changelog

Full Changelog

❤ Contributors

Contributor        Commits  Lines ±     Files Changed
Gus Eggert         10       +333/-321   24
Steven Allen       7        +289/-190   13
Hector Sanjuan     9        +134/-109   18
Adin Schmahmann    11       +179/-55    21
Raúl Kripalani     2        +152/-42    5
Daniel Martí       1        +120/-1     1
frrist             1        +95/-13     2
Alex Trottier      2        +22/-11     4
Andrey Petrov      1        +32/-0      1
Lucas Molas        1        +18/-7      2
Marten Seemann     2        +11/-7      3
whyrusleeping      1        +10/-0      1
postables          1        +5/-3       1
Dr Ian Preston     1        +4/-0       1

@hsanjuan
Contributor

hsanjuan commented Dec 10, 2021

I add my edits here, rather than editing the top-level comment, but feel free to port them there:

(moved to the top comment; the original is kept below for reference)

Blockstore migration from full CID to Multihash keys

We are switching the default low-level datastore to be keyed only by the Multihash part of the CID, deduplicating some blocks in the process. The blockstore will become codec-agnostic.

The blockstore/datastore layers are not concerned with data interpretation, only with storage of binary blocks and verification that the Multihash they are addressed with (which comes from the CID) matches the block. In fact, different CIDs with different codec prefixes may carry the same multihash and reference the same block. Carrying the CID abstraction so low on the stack meant potentially fetching and storing the same blocks multiple times just because they are referenced by different CIDs. For example, a CIDv1 with dag-cbor codec and a CIDv1 with raw codec containing the same multihash would result in two identical blocks being stored. A CIDv0 and CIDv1 both being the same dag-pb block would also result in two copies.

In order to perform the switch, and start referencing all blocks solely by their multihash, a migration will occur on update. This migration will take the repository version from 11 (current) to 12.

One thing to note is that any content addressed by a CIDv0 (all the hashes that start with Qm..., the current default in go-ipfs) does not need any migration, as CIDv0s are raw multihashes already. This means the migration will be very lightweight for the majority of users.

On the other hand, the migration process will take care of re-keying any CIDv1 block so that it is only addressed by its multihash. Large nodes with lots of CIDv1-addressed content will need to go through a heavier process as the migration happens. This is how the migration works:

  1. Phase 1: The migration script will perform a pass over every block in the datastore and add all the CIDv1s found to a file named 11-to-12-cids.txt, in the go-ipfs configuration folder. No blocks are modified in this first phase; it only serves to identify the keys that will be migrated in phase 2.
  2. Phase 2: The migration script will perform a second pass where every CIDv1 block will be read and re-written with its raw multihash as key. There will be 4 workers performing this task in parallel. Every 20MB-worth of blocks, each worker will trigger a datastore "sync" (to ensure all written data is flushed to disk) and delete the CIDv1-addressed blocks that were just renamed. This provides a good compromise between speed and resources needed to run the migration.

During the migration, the user will see log messages informing of how many blocks need to be rewritten and how far along the process is at every sync.

Caveats

Large repositories with very large numbers of CIDv1s should be mindful of the migration process:

  • We recommend ensuring that IPFS runs with an appropriate (high) file-descriptor limit, particularly when Badger is used as the datastore backend. Badger is known to open many tables when experiencing a high number of writes, which may trigger "too many open files" errors during the migration. If this happens, the migration can be retried with a higher FD limit (see below).
  • Migrations using the Badger datastore may not immediately reclaim the space freed by the deletion of migrated blocks, thus space requirements may grow considerably. A periodic Badger-GC is run only every 15 minutes, and this will be the moment when space used by deleted and de-duplicated blocks is reclaimed. The last portion of the space will only be reclaimed after go-ipfs starts (the Badger-GC cycle will trigger after 15 minutes).
  • While there is a revert process (detailed below), we recommend keeping a backup of the repository, particularly for very large ones, in case an issue happens, so that the revert can happen immediately and cases of repository corruption due to crashes or unexpected circumstances (mostly Badger, again) are not catastrophic.

Migration failures and reverts

If a problem occurred during the migration, it is possible to simply restart and retry it:

  1. Phase 1 will never overwrite the 11-to-12-cids.txt file, but only append to it (so that a list of things we were supposed to have migrated during our first attempt is not lost - this is important for reverts, see below).
  2. Phase 2 will proceed to continue re-keying blocks that were not re-keyed during previous attempts.

It will also be possible to revert the migration after it has succeeded, for example to go to a previous go-ipfs version, even after starting and using go-ipfs in the new version. The revert process works as follows:

  1. The 11-to-12-cids.txt file, which has the list of all the CIDv1s that had to be rewritten for the migration, is read.
  2. A CIDv1-addressed block is written for every item on the list. This work is performed by 1 worker, syncing every 20MB.
  3. Additionally, it is ensured that every CIDv1 pin and every CIDv1 reference in MFS are also written as CIDv1-addressed blocks, regardless of whether they were part of the original migration or were added later.

The revert process does not delete any blocks; it only makes sure that blocks that were accessible with CIDv1s before the migration are again keyed with CIDv1s. This may result in a datastore becoming twice as large (i.e. if all the blocks were CIDv1-addressed before the migration). It is done this way to cover corner cases: a user can add CIDv1s after the migration, which may reference blocks that existed as CIDv0 before the migration. The revert aims to ensure that no data becomes unavailable on downgrade.

@lidel
Member

lidel commented Dec 10, 2021

Thank you @hsanjuan – looks great! – I've moved it to the top-comment and added some sub-headers.

@hsanjuan
Contributor

I have updated the GC interval and the revert workers mention based on fs-repo-migrations#144.

@BigLep
Contributor Author

BigLep commented Dec 14, 2021

@guseggert : from the 2021-12-14 standup, when we update Discuss to announce RC1, let's make sure it is clear that there are both 0.11.0 and 0.12.0-rc1.

Let's also please ensure we deploy the RC to some of our infra.

@BigLep
Contributor Author

BigLep commented Dec 16, 2021

2021-12-16 conversation:

  1. When we do the final release, let's add a note about backing up (we haven't seen problems in the RC phase, but if it is mission-critical not to have any datastore problems, the standard recommendation is to take a backup).
  2. Remind people about pin verify, etc. commands.
  3. Add a note that "if your log messages look like X, you have some data that wasn't accessible that is now accessible."
  4. Update the log message to be really clear about the non-CID case. We don't want a case of us fixing things looking bad.

Currently 0.12-rc1 is on one bank. That is good.

@BigLep
Contributor Author

BigLep commented Jan 4, 2022

2022-01-04 conversation:

  1. (Adin) Double check with IPFS gateway operators
  2. (Gus) Need to fix ipfs-update: Update to v0.12.0-rc1 fails during blockstore migration ipfs-update#151
  3. (Adin) Need to improve dist HTTP reliability.
  • Talk with George, potentially create followup task to make dist a Gateway
  • go-ipfs should have fallen back to IPFS

@hsanjuan
Contributor

hsanjuan commented Jan 4, 2022

We need to test a large migration on the nft1-cluster nodes before decommissioning it. This will be possible in a few days.

@BigLep
Contributor Author

BigLep commented Jan 10, 2022

@hsanjuan : is this possible now? Who is owning this?

@hsanjuan
Contributor

I'm owning. We can probably start that migration today.

@hsanjuan
Contributor

223943126 CIDv1 keys added to backup file for /blocks <- it has started. It should do about 200k per hour at this rate, i.e. 5 hours per million; 224 million × 5 h/million ≈ 1120 hours ≈ 46 days until completion.

@BigLep
Contributor Author

BigLep commented Jan 18, 2022

2022-01-18 Remaining work:

  1. Release notes updates about migration callouts/suggestions
  2. Landing fs-repo-11-to-12: add env vars for configuring the number of workers and the sync size fs-repo-migrations#149

@aschmahmann
Contributor

  • People with non-SHA2-256 multihashes in their blockstores might see something like could not parse /blocks/UDSAEIBDUQ4F2KOA2LRMMN3H46IL2MD5XW4FRNHEMH53T4KVYGU6VCP724 as a Cid during the migration. This appears to be an erroneous error message that doesn't actually affect user data, although it comes with some amount of extra work done by the migration.

@RubenKelevra
Contributor

After I saw what @aschmahmann described, my MFS was partly broken (I couldn't remove a file which ipfs files stat reported to exist). So I reimported my data, which slowed down to a crawl with no chance of finishing within the next month or so.

This is obviously a blocker for 0.12 for me.

The corresponding ticket: #8694

@RubenKelevra
Contributor

RubenKelevra commented Jan 29, 2022

@aschmahmann I can confirm this issue for 0.11 as well. Basically zero IO on the disk, high CPU load (above two cores), and slow as a snake to add more files or to ls the root of the MFS:

$ time ipfs files ls /
manjaro.pkg.pacman.store

real	0m33.394s
user	0m0.251s
sys	0m0.057s
$ ipfs version --all
go-ipfs version: 0.11.0-67220edaa
Repo version: 11
System version: amd64/linux
Golang version: go1.17.6

@BigLep
Contributor Author

BigLep commented Feb 15, 2022

2022-02-15 notes:

  • There are no code reviews out
  • There is a short spike that may help with migrations (instead of gets/puts, do renames with files on FlatFS). We hope to include it by default. This would isolate caveats/warnings to other datastores like Badger. If this isn't 10x+ faster, we will give up. We'll ask @hsanjuan to review.
  • Expected completion date: 2022-02-16.

@aschmahmann
Contributor

The spike has been successful and ipfs/fs-repo-migrations#152 should make things a lot faster (I got 10x on a slower HDD, but have seen greater speedups on machines with faster disk access).

The plan is to get the last changes reviewed and merged so we can ship. This will come with updated release notes explaining more about how the migration works as well as some tunable parameters.


Also, for those following the issue: #8694 is not related to v0.12.0, so it's not going to block the v0.12.0 release.

@BigLep BigLep linked a pull request Feb 17, 2022 that will close this issue
@BigLep
Contributor Author

BigLep commented Mar 1, 2022

2022-02-28 update: 0.12 has been released: https://github.com/ipfs/go-ipfs/releases/tag/v0.12.0

We're working on wrap-up this week (announcements, ipfs-desktop).

@BigLep
Contributor Author

BigLep commented Mar 8, 2022

@guseggert : I don't see the blog post under: https://blog.ipfs.io/
I assume the marketing Airtable request hasn't been handled by marketing? Can you please provide the Airtable link so I can follow?
I'm going to reopen to denote that this isn't done-done-done.

@BigLep BigLep reopened this Mar 8, 2022
@guseggert
Contributor

Ah thanks, it was premature to resolve this. I don't have a link to the Airtable req; it is just a form. I can follow up with Emily.

@BigLep
Contributor Author

BigLep commented Mar 8, 2022

Brave release for go-ipfs 0.12 is here: https://github.com/brave/brave-browser/milestone/269

@guseggert
Contributor

Confirmed blog entry is published.

@BigLep BigLep unpinned this issue Mar 15, 2022