Add initial SPEC draft #2

Merged
lidel merged 2 commits into master from docs/spec
Oct 3, 2019

@lidel lidel commented Aug 22, 2019

This PR proposes a simple spec for MFS-based cohosting scheme suggested in ipfs/ipfs-desktop#1034.

Motivation

  • make it easier for people to contribute storage and bandwidth to sites and datasets they care about
    • support IPNS (libp2p keys) and DNSLink roots (human-readable)
  • make it easy to implement in companion, desktop and webui
  • periodically detect updates to cohosted sites and preload them to a local node

TL;DR

  • proposed scheme does not introduce any config files, path convention is used instead
    • /cohosting - presence of directory in MFS root enables cohosting logic
    • /cohosting/<site-id> - presence of directory enables update checks for this site
    • /cohosting/<site-id>/<timestamp> - site snapshot at a point in time
  • Files API is all one needs to manage cohosted websites:
    • ipfs files ls /cohosting/ returns a list of cohosted websites
    • ipfs files mkdir -p /cohosting/docs.ipfs.io adds a website to cohosting list
    • ipfs files rm -r /cohosting/docs.ipfs.io stops cohosting of a site
  • a periodic check every 12h can be done by multiple apps without duplicated work
  • MFS takes care of keeping snapshots around (protecting them from GC)
  • User can browse and manage cohosted snapshots via Web UI's Files screen
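
A minimal sketch of how the path convention above could be driven from a shell script, assuming a running IPFS node and the `ipfs` CLI on `PATH`. The function names (`cohost_add`, `cohost_update`, and so on) and the timestamp format are hypothetical and not part of the proposed spec; the sketch also assumes that a site name resolves to a bare `/ipfs/<cid>` path.

```shell
# Root MFS directory defined by the proposed path convention.
COHOST_ROOT="/cohosting"

# Start cohosting a site: the presence of the directory enables update checks.
cohost_add() {
  ipfs files mkdir -p "$COHOST_ROOT/$1"
}

# List cohosted sites, one per line.
cohost_list() {
  ipfs files ls "$COHOST_ROOT/"
}

# Stop cohosting a site and remove its snapshots from MFS.
cohost_rm() {
  ipfs files rm -r "$COHOST_ROOT/$1"
}

# Print the CID of the newest snapshot of a site, or nothing if there is none.
# Assumption: snapshot names are timestamps that sort lexicographically.
cohost_latest_cid() {
  local latest
  latest=$(ipfs files ls "$COHOST_ROOT/$1" | sort | tail -n 1)
  [ -n "$latest" ] || return 0
  ipfs files stat --hash "$COHOST_ROOT/$1/$latest"
}

# Check one site for updates and add a snapshot only when the root CID changed.
cohost_update() {
  local site=$1 resolved cid
  resolved=$(ipfs resolve -r "/ipns/$site") || return 1   # prints /ipfs/<cid>
  cid=${resolved#/ipfs/}
  [ "$cid" = "$(cohost_latest_cid "$site")" ] && return 0  # already up to date
  # Hypothetical timestamp format; the thread does not define one.
  ipfs files cp "/ipfs/$cid" "$COHOST_ROOT/$site/$(date -u +%Y-%m-%dT%H%M%SZ)"
}

# Check every cohosted site; intended to be run periodically (e.g. every 12h).
cohost_update_all() {
  cohost_list | while IFS= read -r site; do
    cohost_update "$site" </dev/null || echo "update failed: $site" >&2
  done
}
```

Because `cohost_update` compares the resolved CID with the newest snapshot before copying, running it from more than one app should not create duplicate snapshots for an unchanged site.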

@lidel
Member Author

lidel commented Aug 22, 2019

@hacdias @autonome mind glancing at the initial draft? (anything missing, rephrasing, typos)
(I want to write cohosting.sh as a PoC and include it in this PR, but want to ensure the spec is sane first)

License: MIT
Signed-off-by: Marcin Rataj <lidel@lidel.org>
@autonome
Collaborator

Let's take a stab at evaluating this spec against a set of use-cases.

Maybe we could mark in or out of scope, and for in-scope use-cases, describe how it would work?

Here's a set off the top of my head, which traverses the publisher-user and the reader-user flows, and also when those are the same user:

  • I want to save a new timestamped copy of the web page I'm currently browsing
  • I want to save a new timestamped copy of the web site I'm currently browsing
  • I want to see which websites, pages and data I have saved locally
  • I want to serve a specific web page or site I've saved locally
  • I want to stop serving a specific web page or site I've saved locally
  • I want to start serving all my local web pages and sites
  • I want to stop serving all my local web pages and sites
  • I'm offline, and want to read one of the websites I have saved locally
  • I'm offline but connected to a local network, and I want to know whether a specific website is cohosted on the network
  • I'm offline but connected to a local network, and I want to load a website I know is cohosted on the network
  • I'm offline but connected to a local network, and I want to see a list of websites and pages that are cohosted on the network

@lidel
Member Author

lidel commented Aug 23, 2019

I took a stab at answering those, it surfaced some caveats (marked with 🍊 below).

It would be useful to think about what this spec should cover, and what can be built in userland on top of it. Personally I'd like to keep the spec as small as possible: the bare minimum needed to switch tools like ipfs-cohost and ipfs-companion from pins to MFS.


🍏 = in scope
🍊 = could be in scope, but there are caveats and/or additional work needed
🚫 = not in scope

  • 🍊 I want to save a new timestamped copy of the web page I'm currently browsing
    • working at the page level is not supported by the current spec draft
      (it was designed to cohost entire websites)
      • we could add it by introducing paths and sites namespaces:
        • /cohosting/paths/<site-id>/<url-escaped-path>/ - cohosted pages (only specific paths)
          • open question: how to handle subresources such as images? should we enumerate and copy them to cohosting directory as well? this gets hairy and complex really fast
        • /cohosting/sites/<site-id> - cohosted sites (entire roots)
      • ipfs-companion would provide options to "cohost this page" and "cohost entire website"
    • related feature was suggested in Alert on changes to dynamic website w/ option to inspect ipfs/ipfs-companion#749:

      Every time a dynamic ipfs-website (ipns or dnslink) is loaded through companion, companion could write down the resolved CID (in local storage)

    • there is also a related feature request to "save a detached copy of a page", but it is error-prone, as it requires mutating the HTML to fix subresource paths: Save entire Web page to IPFS ipfs/ipfs-companion#91. Adding data as-is to MFS is much better: it does not create any new data, and deduplication happens at the block level.
  • 🍏 I want to save a new timestamped copy of the web site I'm currently browsing
    • replace pinning toggle in ipfs-companion UI with cohosting one
      • once website is cohosted, every future visit will automatically refresh it
  • 🍏 I want to see which websites, pages and data I have saved locally
    • short term: traversal of /cohosting tree with Web UI or in CLI with ipfs files ls and ipfs files stat
    • future: custom UIs built on top of ipfs files ls and ipfs files stat
  • 🍏 I want to serve a specific web page or site I've saved locally
    • happens automatically for everything added to repo cache and MFS keeps it from being gc'd
  • 🍊 I want to stop serving a specific web page or site I've saved locally
    • remove /cohosting/<site-id> from MFS (+ optionally remove associated blocks)
      • this one is tricky because IPFS automatically provides every block that is cached in its repo, so even when user removed site from MFS, blocks are still provided until GC happens
      • personally I think keeping blocks in cache is ok, but if we want to be strict and stop providing immediately, potential solution would be to do ipfs block rm on top of removal from MFS or just trigger global gc
  • 🍏 I want to start serving all my local web pages and sites
    • just create valid /cohosting tree in MFS
  • 🍊 I want to stop serving all my local web pages and sites
    • (same caveats as for stopping serving of a single webpage)
  • 🍊 I'm offline, and want to read one of the websites I have saved locally
    • website data would be cached locally and ready for use, but DNSLink websites require DNS lookup to work. To have /ipns/docs.ipfs.io work in offline mode, we need to add a persistent DNSLink cache that is used as a fallback when online DNS lookup is not possible.
  • 🍊 I'm offline but connected to a local network, and I want to know whether a specific website is cohosted on the network
    • same DNSLink caveat as in previous one
    • the list of peers providing the website root could be checked with ipfs dht findprovs <cid> or by posting an ask on a well-known pubsub channel based on the local network mask
      • in theory dht findprovs would return peers from the local network that announced having it; however, the exact behavior and lookup performance in disjoint DHT shards is unknown to me, so pubsub sounds like a good alternative
  • 🍊 I'm offline but connected to a local network, and I want to load a website I know is cohosted on the network
    • same discovery and DNSLink caveats as in previous one
  • 🍊 I'm offline but connected to a local network, and I want to see a list of websites and pages that are cohosted on the network
    • tricky, but could be done by posting an ask on a well-known pubsub channel based on the local network mask and/or other proximity-detection heuristics
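
The "stop serving" and provider-lookup items above can be sketched as follows. This is only an illustration: `cohost_stop` and `cohost_provider_count` are hypothetical helper names, and the strict mode uses the global GC option mentioned above, which drops every unpinned block outside MFS, not just the blocks of the removed site.

```shell
# Stop cohosting a site by removing it from MFS.
# With "strict" as the second argument, also trigger a repo-wide GC so the
# node stops providing the now-unreferenced blocks without waiting for the
# next scheduled GC. Note: GC is global and not limited to this site.
cohost_stop() {
  ipfs files rm -r "/cohosting/$1" || return 1
  if [ "${2:-}" = "strict" ]; then
    # --quiet still prints the removed CIDs, so discard the output.
    ipfs repo gc --quiet >/dev/null
  fi
}

# Count distinct peers that announce a given root CID via the DHT.
# Newer Kubo releases may expose this command as "ipfs routing findprovs".
cohost_provider_count() {
  ipfs dht findprovs "$1" | sort -u | wc -l | tr -d ' '
}
```

For example, `cohost_stop docs.ipfs.io` keeps the blocks cached until GC happens, while `cohost_stop docs.ipfs.io strict` trades cache contents for stopping providing immediately.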

@hacdias
Member

hacdias commented Aug 24, 2019

Thanks for bringing up those questions, @autonome! Those are excellent points, and I believe @lidel did an awesome job giving feedback on each one. I have some additions and questions:

To have /ipns/docs.ipfs.io work in offline mode, we need to add a persistent DNSLink cache that is used as a fallback when online DNS lookup is not possible.

Can't we just look at MFS to check if the website is there? That way we can show the latest version of the website to the user. Am I missing sth?

I'm offline but connected to a local network, and I want to see a list of websites and pages that are cohosted on the network

Love this idea!

@lidel
Member Author

lidel commented Aug 26, 2019

To have /ipns/docs.ipfs.io work in offline mode, we need to add a persistent DNSLink cache that is used as a fallback when online DNS lookup is not possible.

Can't we just look at MFS to check if the website is there? That way we can show the latest version of the website to the user. Am I missing sth?

We can, but there is a caveat: the IPFS node won't be able to load /ipns/docs.ipfs.io, because a DNSLink path triggers a DNS lookup inside the IPFS node, and in offline mode DNS will be down.

We could mitigate it by:

  • (no changes to go/js-ipfs) ipfs-companion could check MFS as a DNSLink fallback, then instead of loading /ipns/docs.ipfs.io from the gateway it would load the last known snapshot from /ipfs/Qmd41WqbCsfTx4wJvP6vvv3hHb46bEHG1hC6Kqt7mhGQUR
  • (changes in go/js-ipfs) go-ipfs adds support for checking MFS as a fallback for DNSLink lookup when upstream DNS resolver is down
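
The first mitigation could look roughly like the sketch below, written as a shell function rather than as ipfs-companion code. `cohost_resolve` is a hypothetical name, and the sketch assumes that snapshots live under /cohosting/<site-id>/<timestamp> with timestamps that sort lexicographically.

```shell
# Print an /ipfs/<cid> path for a DNSLink site.
# Prefers a live lookup and falls back to the newest snapshot stored under
# /cohosting/<site> in MFS when the lookup fails (e.g. DNS is down offline).
cohost_resolve() {
  local site=$1 latest cid
  if ipfs resolve -r "/ipns/$site" 2>/dev/null; then
    return 0
  fi
  # Assumption: snapshot directory names are sortable timestamps.
  latest=$(ipfs files ls "/cohosting/$site" 2>/dev/null | sort | tail -n 1)
  [ -n "$latest" ] || return 1
  cid=$(ipfs files stat --hash "/cohosting/$site/$latest") || return 1
  echo "/ipfs/$cid"
}
```

A caller would then load the printed path from the gateway instead of the /ipns/ path, and treat a non-zero exit status as "not available offline".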

@lidel lidel marked this pull request as ready for review August 26, 2019 11:29
@lidel
Member Author

lidel commented Aug 26, 2019

I think it's ready for review: let me know if it is good to merge as the initial spec and open PRs against it, or whether we should change something from the get-go.

Member

@hacdias hacdias left a comment

LGTM!

Collaborator

@autonome autonome left a comment

👍🏼 great start, just a few minor clarifications needed

@hacdias
Member

hacdias commented Sep 13, 2019

@lidel @autonome I believe this is quite ready to be merged.

+ apply changes from review

License: MIT
Signed-off-by: Marcin Rataj <lidel@lidel.org>
@lidel
Member Author

lidel commented Oct 3, 2019

Thank you for the feedback!

I applied suggested changes and made it very clear that this is an exploratory experiment.
Let's merge this draft to enable linking to it and further discussion via PR/issues.

@lidel lidel changed the title Add cohosting SPEC Add initial SPEC draft Oct 3, 2019
@lidel lidel merged commit 820972a into master Oct 3, 2019
@lidel lidel deleted the docs/spec branch October 3, 2019 10:39
@DavidBurela

Would a lot of the cohosting scenarios be solved more simply by having IPFS have better support for pinning /ipns addresses?

Currently ipfs pin add /ipns/blog.ipfs.io only pins once, and doesn't pin new content as the IPNS address is updated. It seems a lot of cohosting could be done if IPFS monitored IPNS updates and repinned the names it is watching.

@lidel
Member Author

lidel commented Dec 12, 2019

Yes, ability to pin ipns names was discussed in ipfs/kubo#1467
and pinning that follows updates is tracked in ipfs/kubo#4435

Both are good ideas but require changes to Core APIs and implementations.
We are experimenting with this spec in userland to better understand the problem space before making upstream changes (e.g. ways to use "lazy" cohosting (#6) when disk space is limited)

4 participants