Automatically download and run migrations if needed #2939

Merged
whyrusleeping merged 7 commits into master from feat/auto-migrate on Jul 22, 2016

Conversation

whyrusleeping
Member

@whyrusleeping whyrusleeping commented Jul 3, 2016

This will prompt the user when a migration is needed to let ipfs try to automatically download and run the migrations.

If a version of the migrations tool is installed that is high enough, ipfs will use it instead of downloading its own.

Currently, the download path will fail, as there is not yet a build of fs-repo-migrations with ipfs-3-to-4 up on the distributions page.

Resolves: #2907

License: MIT
Signed-off-by: Jeromy <why@ipfs.io>

"strings"
)

var DistPath = "https://ipfs.io/ipns/dist.ipfs.io"

let's make this overridable -- will also make it much much easier to test-drive this

Member

@Kubuxu Kubuxu Jul 3, 2016

Yup, as we can then deploy to just fs:/ipfs/.

Member Author

preference on an env var name?

@Stebalien
Member

Why not just build the migration functionality into IPFS and run it on start if needed? Downloading and running arbitrary code like this makes auditing IPFS impossible and is generally considered a big no-no from a security perspective.

@jbenet
Member

jbenet commented Jul 5, 2016

Why not just build the migration functionality into IPFS and run it on start if needed? Downloading and running arbitrary code like this makes auditing IPFS impossible and is generally considered a big no-no from a security perspective.

Certifying the migration code the same way we certify go-ipfs code is enough. Meaning:

  • we have the migrations accounted for and all versions hash addressed
  • we should do code signing (and we will, havent had the bandwidth)

It's not "arbitrary code", it's "code from the developers", which if you don't trust, then you probably shouldn't be running go-ipfs anyway.

If we really wanted to, we could version lock (with a hash) directly from go-ipfs, but i don't think it's necessary. An IPNS pointer and code signing will be enough.

There are many reasons NOT to embed that code in go-ipfs:

  • the migrations need to run the code from those old versions exactly, so they just end up vendoring the old code. putting that code in go-ipfs would blow up the binary size
  • we can avoid having to lug around code for old migrations
  • we can make sure to freeze those migrations and make sure they always run that way (instead of keeping them in a mutable, high velocity codebase)
  • these migrations are on the ipfs fs-repo, which is go-ipfs independent (other impls will be able to read the same formats and have to evolve, too)

@Stebalien
Member

It's not "arbitrary code", it's "code from the developers", which if you don't trust, then you probably shouldn't be running go-ipfs anyway.

No, it's whatever happens to be at https://ipfs.io/ipns/dist.ipfs.io when the migrations run. I generally trust the IPFS developers to not be actively malicious but I don't expect them to be perfect and never lose control of a code signing key. Also, as implemented here, all an attacker needs is a cert for ipfs.io and the ability to MITM the client (no developer involvement whatsoever).

If we really wanted to, we could version lock (with a hash) directly from go-ipfs, but i don't think it's necessary.

This is absolutely necessary. While unfortunate (it kills disconnected upgrades), it still (mostly) fixes the security issue.

Would these migrations really be that big? Usually, they're pretty tiny (on the order of kilobytes).

An IPNS pointer and code signing will be enough.

Until someone steals your keys.


Basically, this all comes down to (1) ensuring that everyone gets the same code and (2) minimizing the attack surface.

  1. If everyone gets the same code, it's much harder to perform targeted attacks. Personally, I use the Tomoyo MAC system to jail most programs on my system so I generally know when a program does something fishy (usually, this turns out to be "just a bug" but it's still nice to know what's going on). If everyone is running the same code, then chances are I (or some other security researcher) will notice the backdoor and report it. However, if you rely on code signing, someone could steal your private keys, hack the distribution server, and distribute malicious binaries to targeted organizations only.
  2. I already have to trust the Arch Linux developers to (1) not lose control of their keys and (2) verify updates to some extent before packaging. I don't expect them to examine the code but I do expect them to verify that there's nothing too fishy about the release. If you rely on code signing and distribute code out-of-band, I lose this extra verification step and now have to trust even more people not to make mistakes.

@Kubuxu
Member

Kubuxu commented Jul 5, 2016

The fs-repo-migrations, as of today's build, are 10MiB.

@Stebalien
Member

A lot of that is the go runtime and shared dependencies. For example, if I remove the latest migration (3-4), it's 9.9MiB. However, some of the other migrations appear to be bigger (on the order of a megabyte) so I agree that building this into IPFS may not be the best way to do this (basically, IPFS would never be able to drop any dependencies needed to read old datastore formats).

@ghost ghost mentioned this pull request Jul 6, 2016
@whyrusleeping
Member Author

@Stebalien it prompts you before doing anything. If you want to download and verify everything yourself you still can. This just makes it easier on the 99% of users who are okay with auto updates

@Kubuxu Kubuxu added this to the ipfs-0.4.3 milestone Jul 6, 2016
}

return RunMigration(tovers)
}

Sorry for creeping up on you with this feedback one day before the planned release :( :)

This should be part of the daemon command. With previous migrations, the daemon would abort with a non-zero exit code, now it'll just hang waiting for input. It'll be useful to have a -migrate=true|false option which assumes yes/no and omits the prompt. (Actually it's really important for server environments.)

Member

Agreed! 👍
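
The requested behaviour (an option that answers the prompt up front, so a daemon never hangs waiting for input) can be sketched roughly as below. This is a minimal sketch, not the code from this PR: the `Decision` type, `shouldMigrate`, and the `interactive` parameter are hypothetical names, and refusing to migrate when there is no option and no interactive input is an assumption.

```go
package main

import "fmt"

// Decision is a hypothetical tri-state for a --migrate style option.
type Decision int

const (
	Ask Decision = iota // option not given: fall back to prompting
	Yes                 // option given as true
	No                  // option given as false
)

// shouldMigrate decides whether to run migrations without ever blocking a
// non-interactive process. prompt is only called when the option is unset
// and interactive is true.
func shouldMigrate(d Decision, interactive bool, prompt func() bool) bool {
	switch d {
	case Yes:
		return true
	case No:
		return false
	}
	if !interactive {
		// Assumption: with nobody to ask, refuse rather than hang.
		return false
	}
	return prompt()
}

func main() {
	neverCalled := func() bool { panic("prompt must not run") }
	fmt.Println(shouldMigrate(Yes, false, neverCalled))
	fmt.Println(shouldMigrate(Ask, false, neverCalled))
	fmt.Println(shouldMigrate(Ask, true, func() bool { return true }))
}
```

Keeping the prompt behind a function value means a server wrapper can pass the option and be sure the prompt is never reached.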

@ghost

ghost commented Jul 6, 2016

we could version lock (with a hash) directly from go-ipfs

I agree that version locking with a hash is good. Any hijacking of the ipfs.io CNAME DNS record will make the update vulnerable. This doesn't even require SSL MitM since we're not talking about A/AAAA records.

@Stebalien
Member

it prompts you before doing anything. If you want to download and verify everything yourself you still can. This just makes it easier on the 99% of users who are okay with auto updates

Good point. That with code signing is probably good enough for most cases. Unfortunately, it doesn't cover the unattended upgrade use case (e.g., running IPFS as a daemon on a NAS/server) but distros could probably include a subset of the migrations in their packages and run them in a post-upgrade script. As a matter of fact, they'll probably do this regardless.

However, I don't really see a reason to not just hash the migration code and embed the hash in IPFS -- checking a hash is much easier than code signing. Yes this means you can't fix bugs in the migration code after the fact without also releasing a new version of IPFS. However, this would only be an issue if you expected the migration code to be significantly buggier than IPFS itself.
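
To illustrate why checking an embedded hash is simple, here is a minimal sketch that compares a downloaded payload against a pinned digest. It assumes a plain hex-encoded SHA-256 digest for clarity; IPFS paths use multihashes, and nothing here is taken from the PR's code. `verifyDigest` and the sample payload are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verifyDigest returns nil only if data hashes to the expected hex digest.
func verifyDigest(data []byte, wantHex string) error {
	sum := sha256.Sum256(data)
	got := hex.EncodeToString(sum[:])
	if got != wantHex {
		return fmt.Errorf("digest mismatch: got %s, want %s", got, wantHex)
	}
	return nil
}

func main() {
	payload := []byte("pretend this is the migrations archive")
	sum := sha256.Sum256(payload)
	// In practice the pinned value would be a constant compiled into the binary.
	pinned := hex.EncodeToString(sum[:])

	fmt.Println(verifyDigest(payload, pinned) == nil)
	fmt.Println(verifyDigest(append(payload, '!'), pinned) == nil)
}
```

The trade-off raised in this thread still applies: a pinned digest can only change by shipping a new release of the program that embeds it.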

@whyrusleeping
Member Author

@Stebalien

this means you can't fix bugs in the migration code after the fact without also releasing a new version of IPFS

Thats the reason i'm not as comfortable hard coding a hash. If that ends up happening (even though i'm fairly confident it wont) then the only reasonable response (from my end) is to re-release the same version with a new hard-coded migrations hash. Otherwise we end up with a 'bad' version that users would still be able to download and fail with in the future.

@ghost

ghost commented Jul 6, 2016

I agree that version locking with a hash is good. Any hijacking of the ipfs.io CNAME DNS record will make the update vulnerable. This doesn't even require SSL MitM since we're not talking about A/AAAA records.

Nevermind, this is actually only an issue if you use a local IPFS daemon.

@Stebalien
Member

Stebalien commented Jul 6, 2016

Thats the reason i'm not as comfortable hard coding a hash. If that ends up happening (even though i'm fairly confident it wont) then the only reasonable response (from my end) is to re-release the same version with a new hard-coded migrations hash. Otherwise we end up with a 'bad' version that users would still be able to download and fail with in the future.

At the end of the day, this is solving a different concern. Think of it this way:

By default, X would be bundled with Y as it is logically a part of Y. However, X is being extracted into a separate binary to be downloaded on-demand because...

Case 1: ...it would make Y significantly larger and isn't needed for normal operation.
Case 2: ...it can't be distributed with Y (e.g. licensing issues).
Case 3: ...it needs to be updated more frequently than Y.

Only case 3 motivates using code signing over including a hash but only case 1 is applicable here. Now, you could argue that, because we're doing this anyways, why not tackle two birds with one stone. However, I feel that a bug in the migration code is actually significantly less likely than a bug in the rest of IPFS so the chances of there being a reason to release a new migration tool and there existing no reason to release a new version of IPFS is vanishingly small. Basically, doing this would be sacrificing some security and simplicity (again, code signing is non-trivial) for negligible gain.

@whyrusleeping
Member Author

@Stebalien want to try running the migrations to help test it for me? If i have more people tell me it works well then i'll be more inclined to hard code a hash in

@Stebalien
Member

I have. I'm running ad5730d.

@whyrusleeping
Member Author

okay cool, everything went smoothly i take it?

@Stebalien
Member

I haven't noticed any problems so far. I just tried extracting an Arch Linux ISO (large file) I put into IPFS before the migration and got the correct result (md5 was correct) so it appears to be working.

@whyrusleeping whyrusleeping force-pushed the feat/auto-migrate branch 2 times, most recently from 650a0eb to 88a72d8 Compare July 7, 2016 01:29
@@ -16,6 +16,12 @@ import (

var DistPath = "https://ipfs.io/ipns/dist.ipfs.io"

func init() {
if dist := os.Getenv("IPFS_DIST_PATH"); dist != "" {
Member

@jbenet jbenet Jul 7, 2016

i think this should be a variable, not an env var. env vars are really pernicious as is, but add to that the fact that this one controls the distribution updates. rough!

the issue with env vars and why they're really bad for security is that it can turn secure scripts and commands into something unexpected by running before. meaning, in something like this:

> ./something
...
> /secure/bin/ipfs daemon --yes-run-migrations

something may not have been secured to not download anything, doesnt have permissions to change binaries or anything, but it can change an env var. it adds attack surface area in ways an option does not.

Member

though i think this is irrelevant if the hashes are hard coded.
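
The concern about the environment can be made concrete with a small sketch. It assumes the truncated `init` above simply assigns the variable's value to `DistPath` when it is non-empty; `resolveDistPath` and the injected `getenv` are hypothetical names used so the behaviour can be exercised without touching the real environment.

```go
package main

import "fmt"

const defaultDistPath = "https://ipfs.io/ipns/dist.ipfs.io"

// resolveDistPath picks the distribution path. getenv is injected so the
// function can be exercised without touching the process environment.
func resolveDistPath(getenv func(string) string) string {
	if dist := getenv("IPFS_DIST_PATH"); dist != "" {
		return dist
	}
	return defaultDistPath
}

func main() {
	clean := func(string) string { return "" }
	// Whatever ran earlier in the same shell can set the variable.
	tampered := func(key string) string {
		if key == "IPFS_DIST_PATH" {
			return "https://example.invalid/dist"
		}
		return ""
	}
	fmt.Println(resolveDistPath(clean))
	fmt.Println(resolveDistPath(tampered))
}
```

An explicit command-line option would have to appear on the invocation itself, which is the difference being pointed out here.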

@jbenet
Member

jbenet commented Jul 7, 2016

Thanks everyone for this awesome discussion. Very happy with the careful thought expressed here!

Couple of points, some already mentioned above:

  • Relying on dnslink is not great in general. DNS lumps in a ton of authorities and can get co-opted relatively easily, and that sucks. Though yes, we already rely on it, which sucks.
  • while we dont have code signing yet, and dont use IPNS key links, we should hardcode the hash for safety. i think it's fine to have to release a new version of the code to get the new hash.
  • this would also make it possible to use IPFS itself to download the migration! :) ❤️ instead of using HTTP and relying on the gateway. (would have to bring up an ephemeral daemon with a temporary repo (because the binary expects a new version of the repo and cant use the current one), or could even use ipget)

@jbenet
Member

jbenet commented Jul 7, 2016

@whyrusleeping give me a set of commands to try

@whyrusleeping
Member Author

@jbenet just install this version and make sure your repo is an older v3 version. Then run ipfs daemon

@ghost

ghost commented Jul 11, 2016

I have a commit to add to this: bc6d71f

It detects whether the running go-ipfs binary is linked against musl libc, and sets the OS to linux-musl in case it does, so that we end up downloading fs-repo-migrations_v1.1.0_linux-musl-amd64.tar.gz. This allows our docker image to auto-migrate :) docker run -it -v /path/to/repo:/data/ipfs ipfs/go-ipfs --migrate=true
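
A rough sketch of how the archive name could be assembled once the libc is known. The musl detection itself is not shown in this thread, so it is taken as a boolean input. `archiveName` is a hypothetical helper, and the assumption is that non-musl builds follow the same naming pattern as the musl file name quoted above.

```go
package main

import "fmt"

// archiveName builds the tarball name for a given release and platform.
// When the binary is linked against musl on Linux, the OS part becomes
// "linux-musl", matching the file name mentioned in the comment above.
func archiveName(version, goos, goarch string, musl bool) string {
	if goos == "linux" && musl {
		goos = "linux-musl"
	}
	return fmt.Sprintf("fs-repo-migrations_%s_%s-%s.tar.gz", version, goos, goarch)
}

func main() {
	fmt.Println(archiveName("v1.1.0", "linux", "amd64", true))
	fmt.Println(archiveName("v1.1.0", "linux", "amd64", false))
}
```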

@ghost ghost force-pushed the feat/auto-migrate branch from 5a66520 to 4f82eb7 Compare July 17, 2016 18:43
@ghost

ghost commented Jul 17, 2016

Okay, rebased this branch on top of a recent-ish master, added my OS detection commit, amended to not spawn a shell.

@ghost

ghost commented Jul 17, 2016

BTW, here's how I've been testing this, with go-ipfs/ipfs being from the v0.4.2 tarball: image=$(docker build -f Dockerfile.fast .) && IPFS_PATH=ipfspath go-ipfs/ipfs init && docker run -it -v $(pwd)/ipfspath:/data/ipfs d2dea496599f --migrate=true

whyrusleeping and others added 6 commits July 19, 2016 06:50
License: MIT
Signed-off-by: Jeromy <why@ipfs.io>
License: MIT
Signed-off-by: Jeromy <why@ipfs.io>
License: MIT
Signed-off-by: Jeromy <why@ipfs.io>
License: MIT
Signed-off-by: Jeromy <why@ipfs.io>
License: MIT
Signed-off-by: Lars Gierth <larsg@systemli.org>
License: MIT
Signed-off-by: Jeromy <why@ipfs.io>
@@ -15,7 +15,7 @@ import (
"strings"
)

-var DistPath = "https://ipfs.io/ipns/dist.ipfs.io"
+var DistPath = "https://ipfs.io/ipfs/QmUnvqDuRyfe7HJuiMMHv77AMUFnjGyAU28LFPeTYwGmFF"

👍

License: MIT
Signed-off-by: Jeromy <why@ipfs.io>
@ghost

ghost commented Jul 20, 2016

Feedback from @jbenet about the prompt and userspace: would be good to prompt only when there's a TTY, and otherwise assume --migrate=false. Otherwise SGTHim

@Stebalien
Member

Where does it verify the hash or is that not happening (yet)?

@ghost

ghost commented Jul 21, 2016

Gonna test this with ipfs-ctl too tomorrow

@ghost

ghost commented Jul 21, 2016

Where does it verify the hash or is that not happening (yet)?

  • hardcoded hash -- this will require an fs-repo-migrations release before bumping the repo version on master the next time. I think that's good as it makes it easier for others to test migrations.

Feedback from @jbenet about the prompt and userspace: would be good to prompt only when there's a TTY, and otherwise assume --migrate=false.

  • It looks like fmt.Scanf() already does the right thing for us.

Gonna test this with ipfs-ctl too tomorrow

  • I tested this and I think it's fine -- js-ipfsd-ctl seems to always create a fresh repo.
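
The remark about `fmt.Scanf()` can be illustrated with a small sketch. It models standard input with an `io.Reader` and uses `fmt.Fscanf`, which returns an error when the input is empty or closed. Treating any such error as "no" is an assumption about the intended behaviour, and `confirm` and `confirmString` are hypothetical helpers, not functions from this PR.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// confirm reads one whitespace-delimited token from r and reports whether it
// is an affirmative answer. Any read error, including EOF on closed or empty
// input, counts as "no".
func confirm(r io.Reader) bool {
	var answer string
	if _, err := fmt.Fscanf(r, "%s", &answer); err != nil {
		return false
	}
	switch strings.ToLower(answer) {
	case "y", "yes":
		return true
	}
	return false
}

// confirmString is a convenience wrapper used only to exercise confirm.
func confirmString(input string) bool {
	return confirm(strings.NewReader(input))
}

func main() {
	fmt.Println(confirmString("y\n")) // a user typed "y"
	fmt.Println(confirmString(""))    // no input available at all
}
```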

@whyrusleeping
Member Author

@lgierth did @jbenet get a chance to look at this then?

@whyrusleeping
Member Author

Chooo Chooo!

@whyrusleeping whyrusleeping merged commit 83d9c1c into master Jul 22, 2016
@whyrusleeping whyrusleeping deleted the feat/auto-migrate branch July 22, 2016 12:47
@kevina kevina mentioned this pull request Aug 28, 2016
"strings"
)

var DistPath = "https://ipfs.io/ipfs/QmUnvqDuRyfe7HJuiMMHv77AMUFnjGyAU28LFPeTYwGmFF"
Member

doesn't this need a migration usable by 0.4.3(-*)? i dont think this has the last migration needed


it has fs-repo-migrations-1.1.0 which includes the 3-to-4 migration: https://github.com/ipfs/fs-repo-migrations/tree/v1.1.0

Development

Successfully merging this pull request may close these issues.

Automate downloading and running of migrations
4 participants