Discussion: Leveraging IPFS #148

Closed
JAremko opened this issue Aug 29, 2017 · 17 comments

@JAremko

JAremko commented Aug 29, 2017

Food for thought: ipfs/notes#171

Can it be the future 🤔

@raxod502
Member

How does this play with Git? Or in other words, what would be gained by using IPFS? The linked issue seems to be mostly talking about centralized package managers.

@JAremko
Author

JAremko commented Aug 29, 2017

The idea is to store packages in a decentralized way, in IPFS.

Permanent storage for packages, addressed by their content instead of by name, without relying on a single server (GitHub). It will also be much faster, since the data will be fetched from the closest peers (a super fast local network or the Internet). And package content cannot be forged, because packages are addressed by their content.

So basically everyone's Emacs package folder becomes a mini GitHub (exposed via IPFS). When a package can't be found on the IPFS network, we can fall back on the centralized method (getting the package from GitHub) and then, if the user has an IPFS gateway, announce to IPFS peers that they have the files.
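
The lookup-then-fallback flow described above can be sketched roughly as follows. This is a simplified illustration, not how IPFS itself works: a plain SHA-256 hex digest stands in for a real IPFS content identifier, and `content_address`, `fetch_verified`, the peer callables, and the `fallback` callable are all hypothetical names.

```python
import hashlib
from typing import Callable, Iterable, Optional, Tuple


def content_address(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in for an IPFS hash."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(
    address: str,
    peers: Iterable[Callable[[str], Optional[bytes]]],
    fallback: Callable[[], bytes],
) -> Tuple[bytes, str]:
    """Try each peer in order and accept only data whose digest matches.

    Returns (data, source), where source is "peer" or "fallback".
    """
    for peer in peers:
        data = peer(address)
        # A forged or corrupted payload hashes to a different address,
        # so it is rejected and the next peer is tried.
        if data is not None and content_address(data) == address:
            return data, "peer"
    # No peer returned a valid copy: use the centralized source instead
    # (in the discussion above, that would be GitHub).
    return fallback(), "fallback"
```

Because the address is derived from the bytes, an untrusted peer cannot substitute different content without the mismatch being detected.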

Example: https://github.com/whyrusleeping/gx#dependencies

The tricky thing is discovery: you need to know the hash of the package that you want to download. @syl20bnr Projects like Spacemacs can provide a file (generated in CI, like we do with the http://develop.spacemacs.org files) that maps package names (or GitHub URLs) and versions to the hashes used to fetch the package from IPFS. This way we'll be able to download packages quickly and reliably over an untrusted network.
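
A minimal sketch of such a mapping file, assuming a JSON layout keyed by `name@version`. The layout, the function names `build_index` and `lookup`, and the use of SHA-256 in place of a real IPFS hash are illustrative assumptions, not an existing Spacemacs or IPFS format.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


def build_index(packages: Dict[Tuple[str, str], bytes]) -> str:
    """Build a JSON index mapping "name@version" to a content hash.

    `packages` maps (name, version) tuples to the package bytes.
    """
    index = {
        f"{name}@{version}": hashlib.sha256(blob).hexdigest()
        for (name, version), blob in packages.items()
    }
    # sort_keys keeps the generated file stable between CI runs.
    return json.dumps(index, indent=2, sort_keys=True)


def lookup(index_json: str, name: str, version: str) -> Optional[str]:
    """Return the hash for a package, or None if the index has no entry."""
    return json.loads(index_json).get(f"{name}@{version}")
```

A client would download this index from a trusted location, look up the hash, and then fetch the content from any peer.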

Theoretically it can be done without such a file, via IPFS announcements and package signing by the authors (we would still need a way to verify their identity), but I'm not so familiar with the topic 🤔

@raxod502
Member

That sounds very fascinating. There are indeed infrastructural problems, like the fact that each user must keep track of the hashes of their packages (which will change on every update), but I can see it working. (Although I probably won't be the one to implement it.)

@raxod502 raxod502 changed the title Leveraging IPFS Discussion: Leveraging IPFS Aug 29, 2017
@JAremko
Author

JAremko commented Aug 29, 2017

@raxod502 This task probably requires more inside knowledge of straight.el than of IPFS specifics. IPFS has an RPC API over HTTP and a pretty straightforward CLI tool.

Examples

Looks really promising and simple https://github.com/ipfs/examples/tree/master/examples/git

Unfortunately the official demo gateway https://gateway.ipfs.io seems to be always overwhelmed

@raxod502
Member

From my perspective, this aspect of straight.el seems quite simple: downloading a package is literally just git clone on the relevant URL retrieved from the package recipe. Pulling is git pull. Publishing changes is git push. Etc.
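
To make that mapping concrete, here is a rough sketch of the three operations as git command lines. It is only an illustration in Python, not straight.el's actual code (which is Emacs Lisp); `git_argv` and the operation names are hypothetical.

```python
from typing import List


def git_argv(operation: str, repo_dir: str, url: str = "") -> List[str]:
    """Return the git command line for one package operation."""
    if operation == "download":
        if not url:
            raise ValueError("download needs the URL from the package recipe")
        return ["git", "clone", url, repo_dir]
    if operation == "update":
        # `git -C DIR` runs the command as if started inside DIR.
        return ["git", "-C", repo_dir, "pull"]
    if operation == "publish":
        return ["git", "-C", repo_dir, "push"]
    raise ValueError(f"unknown operation: {operation}")
```

Each returned list could be passed to a process runner such as `subprocess.run`.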

What types of inside knowledge would be useful to someone implementing this functionality? Or equivalently—I know I won't be able to get to this anytime soon, since the 1.0 release will come first, so is there anything I can do to make sure that lack of knowledge about straight.el internals is not a blocking issue?

@JAremko
Author

JAremko commented Aug 29, 2017

@raxod502 So straight.el uses git clone. Interesting.

Actually it should be much faster to get the last N revisions of a package with something like:
https://github.com/dashpay/dash/zipball/master
https://github.com/dashpay/dash/zipball/master~1
https://github.com/dashpay/dash/zipball/master~2
....
https://github.com/dashpay/dash/zipball/master~N

or a particular one https://github.com/dashpay/dash/zipball/3069e0c
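
The URL pattern listed above can be generated with a small helper. This sketch only builds the strings; it assumes, as the comment does, that GitHub resolves `~N` suffixes in zipball URLs, which is not verified here. `zipball_urls` is a hypothetical name.

```python
from typing import List


def zipball_urls(owner: str, repo: str, ref: str = "master", count: int = 3) -> List[str]:
    """Return zipball URLs for `ref` and its next `count - 1` ancestors."""
    if count < 1:
        raise ValueError("count must be at least 1")
    base = f"https://github.com/{owner}/{repo}/zipball"
    urls = [f"{base}/{ref}"]
    # ref~1, ref~2, ... follow the first-parent chain back from ref.
    urls += [f"{base}/{ref}~{n}" for n in range(1, count)]
    return urls
```

For a single known revision, passing the commit hash as `ref` with `count=1` yields the "particular one" form.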

Unfortunately it will not have the .git stuff. But a package can be downloaded "normally", i.e. cloned, just before its user wants to do something other than just... use it. 🤔

@raxod502
Member

I understand that many package managers operate by downloading a snapshot of the package's code, but that's not how straight.el works. One of the basic principles of straight.el is that when you want to contribute changes to a package upstream, there are no additional steps other than jumping to the source code and pushing your changes.

Now, I'm not saying there shouldn't be an option for an alternative mode of operation where the revision history is not downloaded, but that is entirely unexplored territory. And this is why I am saying that the conceptual work here does not really concern the current implementation of straight.el.

@dieggsy
Contributor

dieggsy commented Aug 30, 2017

There are no additional steps other than jumping to the source code and pushing your changes.

@raxod502 This is what got me interested in the first place, despite initial skepticism. And though it's unrelated to this issue, I don't think I got the chance to say: I'm really enjoying it so far, well done! I hope straight.el thrives, cause it just feels like the way I wanted to do (and should have done) things from the start.

@syl20bnr

syl20bnr commented Aug 30, 2017

One of the basic principles of straight.el is that when you want to contribute changes to a package upstream, there are no additional steps other than jumping to the source code and pushing your changes.

From the context of a user installing straight.el explicitly it makes sense, but from the context of Spacemacs it is very different.

In the context of Spacemacs, 80% of the users won't contribute to any package, and the 20% who do will contribute to maybe 20% of all the packages installed by Spacemacs. Eagerly downloading the source code history is then a guaranteed way to waste both time and space.

In the context of Spacemacs, the eager download strategy should be replaced by a lazy strategy, for instance by providing an interactive function to prepare a package for contribution in one step.

Then all users benefit from this approach: by default the PM doesn't waste precious bandwidth and space, but at any moment any user can turn a bare package installation into a full package installation, with all the source code history, the feature branch already in place, upstream configured, and so on.
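
One way to picture the lazy strategy is a function that plans the git commands needed to upgrade a snapshot install on demand. This is a hedged sketch only: `Package`, `has_history`, and `prepare_for_contribution` are hypothetical names, and it assumes the snapshot directory has been moved aside so that `git clone` can create it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Package:
    name: str
    url: str
    directory: str
    has_history: bool = False  # False: bare snapshot, True: full clone


def prepare_for_contribution(pkg: Package, branch: str) -> List[List[str]]:
    """Plan the commands that make a package ready for contribution.

    A snapshot install is replaced by a full clone first; a package
    that already has its history only gets the feature branch.
    """
    commands: List[List[str]] = []
    if not pkg.has_history:
        # Assumption: the snapshot directory was moved out of the way.
        commands.append(["git", "clone", pkg.url, pkg.directory])
        pkg.has_history = True
    commands.append(["git", "-C", pkg.directory, "checkout", "-b", branch])
    return commands
```

Users who never contribute never pay for the clone; the cost is deferred to the first call.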

@raxod502
Member

For instance by providing an interactive function to prepare a package for contribution in one step.

I'm not opposed to adding such a mode of operation. It wouldn't be enabled by default in straight.el, but Spacemacs could certainly enable it by default.

waste both time

You'd be surprised how much network latency drowns out any difference between git clone and package.el HTTP, in practice. I think the real issue is disk space, which is usually cheap but not always.

@raxod502
Member

Another thing to consider is shallow clones (see #2). I have a hunch that if shallow cloning is enabled, then straight.el would take less than 10% more time than package.el and less than 20% more disk space to install the same Spacemacs packages. I may do a benchmark, and we can see if my guesses are totally wrong or not.

@syl20bnr

I'm not opposed to adding such a mode of operation. It wouldn't be enabled by default in straight.el, but Spacemacs could certainly enable it by default.

That would be perfect.

I think the real issue is disk space, which is usually cheap but not always.

Also the volume of data downloaded: depending on the scale, it can make a big difference. I think both Spacemacs and straight.el should provide an ecologically responsible solution.

@syl20bnr

I have a hunch that if shallow cloning is enabled, then straight.el would take less than 10% more time than package.el and less than 20% more disk space to install the same Spacemacs packages.

💜

@JAremko
Author

JAremko commented Aug 30, 2017

You'd be surprised how much network latency drowns out any difference between git clone and package.el HTTP, in practice. I think the real issue is disk space, which is usually cheap but not always.

For me
git clone https://github.com/dashpay/dash takes ~22sec

curl https://github.com/dashpay/dash/zipball/master takes ~3sec + unzip (negligible time)

It should be even faster if we can avoid doing an SSL handshake each time.

@raxod502
Member

I was talking about small repositories, which is the majority of Emacs packages. Indeed, there is a big difference for big repositories. However, the gap is reduced to virtually zero if you take shallow cloning into account:

  • curl: 7.602s
  • git clone --depth 1: 7.644s

(this is on the repository you linked)
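
A comparison like this can be reproduced with a small timing helper. The sketch below is an assumption about how one might measure it, not the method used for the numbers above; `time_command` is a hypothetical name, and results will vary with network conditions.

```python
import subprocess
import time
from typing import List


def time_command(argv: List[str]) -> float:
    """Run a command to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command fails,
    # so a failed download is not mistaken for a fast one.
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


# Example invocations (they need network access, so they are left commented out):
# time_command(["git", "clone", "--depth", "1", "https://github.com/dashpay/dash", "dash-shallow"])
# time_command(["curl", "-L", "-o", "dash.zip", "https://github.com/dashpay/dash/zipball/master"])
```

The `-L` flag tells curl to follow redirects, which matters if the zipball URL redirects to another host.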

Now, shallow cloning might not be feasible due to complications pointed out by @vyp at #2 (comment). So it may be necessary indeed to implement an alternate installation mechanism.

@raxod502
Member

Looks like I was indeed wrong about my predictions vis-à-vis package.el vs. straight.el on runtime and space usage.

  • straight.el: 13m9s.
  • straight.el with shallow cloning: 8m51s.
  • package.el: 4m8s.

(this is for installing the packages from #128 (comment))

Also, the disk usage is 78 MB for package.el versus 408 MB for straight.el with shallow cloning.
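
For reference, the measurements above can be turned into ratios against package.el. The snippet assumes the shallow-clone figure means 8m51s, and `parse_duration` is a hypothetical helper.

```python
import re


def parse_duration(text: str) -> int:
    """Convert a duration such as "13m9s" into whole seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


baseline = parse_duration("4m8s")    # package.el: 248 s
full = parse_duration("13m9s")       # straight.el: 789 s
shallow = parse_duration("8m51s")    # straight.el, shallow cloning: 531 s

time_ratio_full = full / baseline        # about 3.18
time_ratio_shallow = shallow / baseline  # about 2.14
disk_ratio_shallow = 408 / 78            # MB vs MB, about 5.23
```

So even with shallow cloning, straight.el took roughly 2.1 times as long and used roughly 5.2 times the disk space, far from the earlier guess of under 10% more time and under 20% more space.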

Indeed, it will be necessary to support an alternate initial mode of installation if straight.el is to be used in Spacemacs.

@raxod502 raxod502 modified the milestones: Spacemacs integration, Spacemacs Sep 2, 2017
@raxod502
Member

I think this discussion has run its course. Please open a new one for further inquiries.
