This repository has been archived by the owner on Apr 16, 2020. It is now read-only.

add an experimental version of mount to go-ipfs #74

Closed
3 of 20 tasks
djdv opened this issue Jul 11, 2019 · 19 comments

@djdv

djdv commented Jul 11, 2019

Fundamentally, we want interoperability between an IPFS node and a variety of existing platforms and tools (such as using rsync on a POSIX system with IPFS source and target paths).

Just as the gateway is a bridge between an IPFS node and HTTP, we would like to build a bridge between IPFS and various file system APIs.

We can achieve this with a platform-agnostic protocol that clients use to interact with IPFS in a way that is conventional for their platform, oriented around file system operations to start.

For our experimental version, we want to create a server and a client that use this protocol to construct a conventional Unix file system on the client side, one that is usable/mountable by the operating system.
(Exposing file-like objects to the host API that map to IPFS constructs rather than files on disk.)

Currently, we have a daemon plugin for go-ipfs that hosts a service, and we're testing it against various client implementations.
The protocol we're using is 9p2000.L, since it already takes some of our interface needs into consideration, such as platform and transport independence.
In addition, there are existing server and client implementations to compare and test against.

When various milestones are reached, we should merge them back into go-ipfs, behind an experimental flag.

A check signifies an experimental version of this has worked in testing, but the implementation may still need to be changed again for standard compliance, operation additions, edge cases, etc.


go-ipfs "filesystem" plugin (resource server implementation)

  • Exposes read-only output support for the following IPFS subsystems
    • IPFS
    • Pin API (We treat /ipfs as a directory of the node's pins)
    • IPNS
    • Key API
    • Files API (currently synonymous with MFS)
  • Exposes and maintains standard metadata across all subsystem nodes
    • Basic (bare requirements for operation: type, size, etc.)
    • Standard (standard expectations from POSIX: maintained timestamps, permission/access info, etc.)
    • Extended (UFSv2?)
  • Exposes read+write support for the following IPFS subsystems
    • IPNS + Key API
    • Files API
    • Pin API (? Delete should incur pin rm)
    • IPFS (? Delete should incur block rm)

File system clients

  • Can attach to the IPFS node without protocol conflicts or operation faults
    • Our own test client
    • v9fs (standard mount handler)
    • 9pfuse (standard mount handler; requires 9p2000 or 9p2000.u server support)
    • our own fuse client (*nix+Windows ipfs mount "filesystem-client" plugin)
@djdv
Author

djdv commented Jul 23, 2019

Status update:
The current plan is to create a go-ipfs daemon plugin that exposes a filesystem(-server) upon daemon startup.

A client (the mount program in this case) will request and perform operations on this filesystem, and do whatever translations are needed (for the local system).
e.g. translating IPFS semantics into FUSE ones, or overriding certain functions (such as injecting metadata)

The interface around this will require some kind of protocol between the client program and the file system(-server), with the options being either language-native or language-agnostic semantics.
In the former case you can imagine Go interfaces; in the latter, something like 9P (the latter would still be abstracted by interfaces, but implies serialization).

In the short term I'm going to investigate the viability of modifying the 9P protocol, so that we can use it to handle transparency at a data transport layer.
i.e. the interface for interacting with a node's filesystem should work transparently, regardless of local/remote or cross-language considerations.

Something like this should allow us to better compose and manipulate tree structures, in a distributed and shared way.
(imagine a single node constructing a tree and serving it to multiple mount clients, concurrently).

It should also allow us to separate parts of the logic, pushing edge cases or special handling into separate packages that compose a specific view by wrapping other filesystem servers.
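
As a rough illustration of the "wrapping" idea, here is a minimal Go sketch. Everything in it (the `FileSystem` interface, `Attr`, `memFS`, `withMTime`) is a hypothetical name invented for this example and is not part of go-ipfs or the plugin. It only shows how one server could wrap another to inject metadata while leaving the other operations untouched.

```go
package main

import (
	"errors"
	"fmt"
)

// Attr is a hypothetical, minimal attribute set (not a go-ipfs type).
type Attr struct {
	Size  int64
	MTime int64 // Unix seconds; zero means "unknown"
}

// FileSystem is a hypothetical language-native interface that a client,
// such as a mount program, could be written against.
type FileSystem interface {
	Stat(path string) (Attr, error)
	Read(path string) ([]byte, error)
}

// ErrNotFound is returned when a path has no entry.
var ErrNotFound = errors.New("not found")

// memFS stands in for a server backed by IPFS data.
type memFS map[string][]byte

func (m memFS) Stat(path string) (Attr, error) {
	data, ok := m[path]
	if !ok {
		return Attr{}, ErrNotFound
	}
	return Attr{Size: int64(len(data))}, nil
}

func (m memFS) Read(path string) ([]byte, error) {
	data, ok := m[path]
	if !ok {
		return nil, ErrNotFound
	}
	return data, nil
}

// withMTime wraps any FileSystem and injects a modification time into
// Stat results; Read is promoted unchanged from the embedded server.
type withMTime struct {
	FileSystem
	mtime int64
}

func (w withMTime) Stat(path string) (Attr, error) {
	attr, err := w.FileSystem.Stat(path)
	if err != nil {
		return Attr{}, err
	}
	attr.MTime = w.mtime
	return attr, nil
}

func main() {
	base := memFS{"/ipfs/QmExample/hello.txt": []byte("hello")}
	var view FileSystem = withMTime{FileSystem: base, mtime: 1563840000}

	attr, err := view.Stat("/ipfs/QmExample/hello.txt")
	if err != nil {
		fmt.Println("stat:", err)
		return
	}
	data, err := view.Read("/ipfs/QmExample/hello.txt")
	if err != nil {
		fmt.Println("read:", err)
		return
	}
	fmt.Println(attr.Size, attr.MTime, string(data))
}
```

With a language-agnostic protocol such as 9P, the same composition would presumably happen behind serialized messages rather than direct method calls, but the layering would look similar.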

This remains to be seen though. If things start to sprawl out, I'm going to pivot into a fork of go-ipfs that is only capable of mounting itself, and then try to achieve the other goals through other means more gradually.

Backlink: #71

@Stebalien
Contributor

Note: The plan is to have a basic demo of the 9P based system by Monday or a solid understanding of why it won't work.

@warpfork
Collaborator

It might be neat to use gvisor as a test target -- IIUC, they have a filesystem plugin system which is fairly literally 9P protocols. Might be good for testing convenience that's free of host system hijinx, as well as being a cool demo in and of itself.

Ignore this if it's not on your hot/convenience path of course. Just throwin' it out there.

@djdv
Author

djdv commented Jul 24, 2019

Originally I was looking at https://godoc.org/github.com/docker/go-p9p
intending on building to an end that looks similar to what I'm seeing in gvisor's fs pkg's.
With that in mind, it would probably make the most sense to utilize the gvisor libs if we can here.
Very cool!

Edit: while cool, those packages seem to depend on Linux syscalls.
Edit 2: this also seems cool https://github.com/droyo/styx

@djdv
Author

djdv commented Jul 29, 2019

Status update:
A prototype of the daemon plugin exists and exposes a 9P2000 resource server for the IPFS namespace.
An existing 9P shell can attach to it and exchange protocol messages.
https://www.youtube.com/watch?v=KpwFhV3aNYI
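
For readers unfamiliar with what "exchanging protocol messages" looks like on the wire, here is a small Go sketch of the first message a 9P client sends, Tversion. It follows the published 9P2000 framing (`size[4] type[1] tag[2] msize[4] version[s]`, little-endian, strings prefixed by a 2-byte length). The function name and the `msize` value are illustrative assumptions, and this is not code from the plugin or from any 9P library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	msgTversion = 100    // 9P message type number for Tversion
	noTag       = 0xFFFF // NOTAG, the tag used during version negotiation
)

// encodeTversion builds a Tversion message:
// size[4] type[1] tag[2] msize[4] version[s]
// where version[s] is a 2-byte length followed by the string bytes.
func encodeTversion(msize uint32, version string) []byte {
	size := 4 + 1 + 2 + 4 + 2 + len(version)
	buf := make([]byte, size)
	binary.LittleEndian.PutUint32(buf[0:4], uint32(size)) // size counts itself
	buf[4] = msgTversion
	binary.LittleEndian.PutUint16(buf[5:7], noTag)
	binary.LittleEndian.PutUint32(buf[7:11], msize)
	binary.LittleEndian.PutUint16(buf[11:13], uint16(len(version)))
	copy(buf[13:], version)
	return buf
}

func main() {
	// 8192 is an arbitrary example for the maximum message size.
	msg := encodeTversion(8192, "9P2000")
	fmt.Printf("% x\n", msg)
}
```

A server answers with an Rversion carrying the version it agrees to speak, which is where dialect differences (9P2000, 9P2000.u, 9P2000.L) first show up.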

This introduces separation between the node's file system(s) and its client implementations, which are unified around the 9P protocol.
IPFS nodes will expose a 9P service.
Clients will utilize this service, interpreting the data to fit their native requirements (without requiring changes in the node/server implementation).

The demo is not fully functional right now, so the current plan is to keep implementing 9P operations on the server, until we have basic read support for the IPFS namespace, exposed over 9P.

A client program that utilizes this namespace, providing 9P <-> cgofuse bindings, should follow.

For go-ipfs, this will likely mean a series of PRs where we gradually add support,
starting with a plugin that can serve a read-only /ipfs tree,
with a client that can read from it spawned in a separate repo (probably something like pkg/9p-cgofuse/cmd/mount-ipfs).
We would then repeat, adding more file trees to this plugin (like /ipns) for clients to attach to (as they get implemented, reviewed, merged, etc.).


Additional notes:
Testing with existing 9P tools such as mount(1), 9pfuse, and v9fs will also have to be done at some point.
Using 9P software directly with the node if possible seems preferable, while the 9P <-> cgofuse bridge would still be available for compatibility with a wider range of systems/tools.

This may require changes to increase compatibility with other versions of the styx/9p2000 protocol (9p2000.u, 9p2000.L, 9p2000.e, et al.)
Likewise we may want to make our own modifications to 9P messages / the protocol, later.

@hugelgupf

Hey yall, I just split out gVisor's 9p code into its own lib. I feel it's a bit cleaner than the alternatives mentioned here, and I'm still cleaning it up a bit: http://github.com/hugelgupf/p9

@Stebalien
Contributor

Status update: @djdv ran into some trouble getting that P9 implementation working on Windows and will focus on getting a usable IPFS plugin working on Linux first.

@djdv
Author

djdv commented Aug 7, 2019

I'm putting this note here because it seems like a good spot for it.

Given the long time between the start of this effort and now, it may be best to recap some things and summarize the status.

We wanted to support mount on more platforms, as well as add some features to it, and improve things such as the performance (common complaint).

Progress was made on this, and resulted in a reasonably functional fork that did the thing (most of the time). All related demos are up here and there are even more textual updates in the related issues.

It mounts on a few more platforms than mainline, and the read performance + scheduling was pretty good. Write support was also there for IPNS and MFS; however, these suffered severely in performance for batches of small writes. Copying gigabyte-sized files seemed to run at native speeds during testing, but doing a shallow clone of the source repo took longer than a day to complete.

This matches up with things that have been reported by @dirkmc and @andrew, in regards to adding files and mirroring package repos.

The fuse branch(es) that exist are made up of spaghetti code and unfortunately never became stable enough to replace the existing implementation. There was a lot of complexity initially, and this was reduced several times over, but issues persisted in some areas.

Since then, research on other file and operating systems has been done, and designs have continued to be discussed, with more people joining in.

Long term, the ideal is to not only have something compatible, but also flexible. We should be able to have the same relationship the gateway has with HTTP, but at a file system protocol/ABI/API level. (the ideal being that supporting other protocols/platforms should not be a maintenance burden or be too unapproachable for those not familiar with the entire project codebase)
Code review with the existing fork would have been just unreasonable.

With all that in mind, the current plan is to just take all this context and take a more sensible approach to the same problem. Gradually migrating step by step from the existing fork, and making design decisions in phases (rather than not at all / all at once).


Editorial:
I feel as though this sort of approach was much less feasible when the effort began; since starting, a lot of our APIs have changed for the better. Despite gaining flexibility, the complexity has been reducing over time, and indirectly our foundational APIs have been able to improve based on requirements from efforts like these. Things such as the coreapi have been a great aid.

@djdv
Author

djdv commented Aug 8, 2019

Quoting @Stebalien: "Status update: @djdv ran into some trouble getting that P9 implementation working on Windows and will focus on getting a usable IPFS plugin working on Linux first."

To clarify this: the p9 library couldn't receive messages on Windows, but this was quickly amended thanks to @hugelgupf.

I'm at a point currently where I can start a daemon which hosts a 9p resource server/file system server, and have a client connect to it and do some operations. Right now we have a root that contains a soft directory, and that subdirectory contains the contents of your IPFS pins.

Doing development on a Windows machine primarily at the moment.
[Image: vfs - listing directories]

bonus image from yesterday's meeting (content warning, broken font rendering in my terminal)

Next should come metadata translations, and file reading support for IPFS objects under /ipfs/Qm....

@djdv
Author

djdv commented Aug 8, 2019

The topic of cross-platform error values did come up, though (centring on the use of syscall.Errno). It remains to be seen whether this will cause issues when dealing with different types of machines and operating systems.
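
One possible way to sidestep platform-dependent `syscall.Errno` values is sketched below, under the assumption that the wire protocol wants Linux-numbered error codes (as 9P2000.L's numeric error replies do). The function and constant names are hypothetical; the numeric values are the standard Linux errno numbers, written out explicitly so they do not change with the OS the server is compiled on. The choice of EACCES for permission errors and EIO as the fallback is an assumption made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// Linux errno numbers, spelled out so they are identical on every
// platform the server might be built for.
const (
	linuxENOENT uint32 = 2
	linuxEIO    uint32 = 5
	linuxEACCES uint32 = 13
	linuxEEXIST uint32 = 17
	linuxEINVAL uint32 = 22
)

// toLinuxErrno maps portable Go error values onto Linux error numbers.
// Anything unrecognised falls back to EIO (a choice made for this sketch).
func toLinuxErrno(err error) uint32 {
	switch {
	case err == nil:
		return 0
	case errors.Is(err, fs.ErrNotExist):
		return linuxENOENT
	case errors.Is(err, fs.ErrPermission):
		return linuxEACCES
	case errors.Is(err, fs.ErrExist):
		return linuxEEXIST
	case errors.Is(err, fs.ErrInvalid):
		return linuxEINVAL
	default:
		return linuxEIO
	}
}

// lookupError is a demo helper that wraps fs.ErrNotExist with context,
// the way a path resolver might.
func lookupError(path string) error {
	return fmt.Errorf("lookup %q: %w", path, fs.ErrNotExist)
}

func main() {
	fmt.Println(toLinuxErrno(lookupError("/ipfs/QmExample")))
	fmt.Println(toLinuxErrno(nil))
	fmt.Println(toLinuxErrno(errors.New("something unexpected")))
}
```

The point of matching with `errors.Is` on portable sentinel errors is that the server logic never has to compare raw platform errno values at all.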

@djdv
Author

djdv commented Aug 14, 2019

Status update: (djdv/go-ipfs@3dc315f)
I managed to read some IPFS paths.
[Image: vfs - Reading]
Unfortunately my terminal fonts still look weird.

Next I'm going to try and put this behind some kind of experimental flag, so that the server only starts listening when enabled. This should help us towards getting a PR against go-ipfs.

I'm also going to point a 9P-aware client (like v9fs) at it and see what happens.
The client in that screenshot is written by us and lives inside the plugin directory here.
It simply connects and spits out a lot of info. In the future we're going to want one of these that connects to the server and translates between 9P <-> FUSE.

There's also the issue about error return values being different across platforms. I'm going to ignore this for now and continue testing the client and server on the same platforms, with cross-platform support considerations to come later. Ideally any client should be able to connect to any server without inconsistencies.

@djdv
Author

djdv commented Aug 19, 2019

Note:
For testing the client, I needed to enable the kernel config option CONFIG_NET_9P_DEBUG; otherwise the debug parameter in mount -v -t 9p -o debug=4 127.0.0.1 ./9test had no effect, and only errors were being printed.
[Image: VFS Linux debug messages]

Related debug kernel packages for Arch: https://drive.google.com/open?id=1Bf-vKSsscNa4C4ZSaCQfDEhrdTAjpsYq


When mounting, it seems like the kernel client is trying to open the root as a regular file, with write access. The server responds with (-22), which I assume is EINVAL. The server should be telling the client that the mode of the root is ModeDirectory, so I'll have to see why the client is making this request.

@djdv
Author

djdv commented Aug 19, 2019

Followup to the previous post.
https://youtu.be/89XNgCO3Dw0

Linux's ls is making a request that should be valid, but is getting denied here
https://github.com/hugelgupf/p9/blob/7ba0920fba11f7a99b28401f1a01852643f3717d/p9/handlers.go#L273-L277
before the library actually invokes our own Open call
https://github.com/hugelgupf/p9/blob/7ba0920fba11f7a99b28401f1a01852643f3717d/p9/handlers.go#L294-L295

We need to see where the discrepancy is. The request from ls may be valid, and the check returning EINVAL may be too strict. It's possible the server is returning bad data somewhere. Going to continue checking the debugger and message logs.
cc: @Stebalien

@Stebalien
Contributor

Ah, I found the issue. As far as I can tell, we're trying to open the file non-blocking. That is, the 0o4000 in 0o304000 is O_NONBLOCK.

I misread the code. It's not checking for a read/write flag. It's checking to make sure there aren't any bits we don't understand. We need to modify the library to handle this "don't block" request and turn it into an "operate offline" request (maybe prefetching on the side?).
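
To make the flag arithmetic concrete, here is a small Go sketch of an "unknown bits" check applied to the 0o304000 value from the logs. It is not the p9 library's actual code. The constants are the Linux `open(2)` values as defined on x86-64; reading the remaining bits as O_LARGEFILE and O_DIRECTORY is my interpretation and is not stated in the thread.

```go
package main

import "fmt"

// Linux open(2) flag values on x86-64, written in octal. Other
// architectures may number some of these differently.
const (
	oAccMode   uint32 = 0o3      // mask for read/write access mode
	oNonblock  uint32 = 0o4000   // O_NONBLOCK
	oLargefile uint32 = 0o100000 // O_LARGEFILE
	oDirectory uint32 = 0o200000 // O_DIRECTORY
)

// unknownBits returns the bits in flags that are not in the understood
// set. A strict server would reject the request if this is non-zero.
func unknownBits(flags, understood uint32) uint32 {
	return flags &^ understood
}

func main() {
	const requested uint32 = 0o304000

	strict := oAccMode
	lenient := oAccMode | oNonblock | oLargefile | oDirectory

	fmt.Printf("strict check leaves:  %#o\n", unknownBits(requested, strict))
	fmt.Printf("lenient check leaves: %#o\n", unknownBits(requested, lenient))
	fmt.Println("non-blocking requested:", requested&oNonblock != 0)
}
```

A server that wanted to honour the "don't block" hint could test `requested&oNonblock` after the check passes and pick an offline code path, as suggested above.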

@Stebalien
Contributor

Filed an issue: hugelgupf/p9#6.

@djdv
Author

djdv commented Aug 20, 2019

Not there yet, but getting there.
https://youtu.be/e719oTQUu1U

@djdv
Author

djdv commented Aug 22, 2019

Status update: (djdv/go-ipfs@f97fde4)
I fixed some issues around walking paths, as well as some issues around returning bad metadata (times were simply wrong (lol), and "kind" conversion between IPFS types (core, ufs) <-> 9P (9P FileMode, QIDType) was ironed out a bit).
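
The "kind" conversion mentioned above could look roughly like the following Go sketch. The `Kind` type and function names are hypothetical stand-ins, not the actual IPFS or p9 types. The QID type bits and the Linux file-type mode bits are the standard published values.

```go
package main

import "fmt"

// Kind is a hypothetical stand-in for the node type reported by IPFS.
type Kind int

const (
	KindFile Kind = iota
	KindDirectory
	KindSymlink
)

// 9P QID type bits.
const (
	qtFile    uint8 = 0x00
	qtSymlink uint8 = 0x02
	qtDir     uint8 = 0x80
)

// Linux file-type bits as carried in a mode field (octal).
const (
	modeRegular   uint32 = 0o100000 // S_IFREG
	modeDirectory uint32 = 0o040000 // S_IFDIR
	modeSymlink   uint32 = 0o120000 // S_IFLNK
)

// qidType converts a Kind into the type byte of a 9P QID.
func qidType(k Kind) (uint8, error) {
	switch k {
	case KindFile:
		return qtFile, nil
	case KindDirectory:
		return qtDir, nil
	case KindSymlink:
		return qtSymlink, nil
	}
	return 0, fmt.Errorf("unsupported kind %d", k)
}

// modeType converts a Kind into the file-type portion of a mode.
func modeType(k Kind) (uint32, error) {
	switch k {
	case KindFile:
		return modeRegular, nil
	case KindDirectory:
		return modeDirectory, nil
	case KindSymlink:
		return modeSymlink, nil
	}
	return 0, fmt.Errorf("unsupported kind %d", k)
}

func main() {
	for _, k := range []Kind{KindFile, KindDirectory, KindSymlink} {
		q, _ := qidType(k)
		m, _ := modeType(k)
		fmt.Printf("kind=%d qid=%#x mode=%#o\n", k, q, m)
	}
}
```

Keeping both conversions next to each other makes it harder for the QID type and the reported mode to disagree, which is the sort of mismatch that can confuse a kernel client.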

I managed to remotely interact with an IPFS node running on a Windows machine, from inside a Linux vm. Using IPFS data with a few existing programs.

https://youtu.be/0XXn8iqBtGs

@djdv
Author

djdv commented Aug 29, 2019

A draft PR has been put up here ipfs/kubo#6612
I'll need to write some tests, and get input on some work that can be done in other packages before this can go in, but it should be functional at the moment.

After all that, these are the current plans for next steps:

  • IPNS Read+Write
  • Maybe add 9P2000.u support to server library (p9) (would support existing 9pfuse (written in Plan9-C) client)
    | (both should probably happen, independent of order)
  • Maybe write our own fuse client (in Go) that uses 9P2000.L (or our own) specification

In speaking with @Stebalien about metadata, it seems like we'll want to couple attributes with newly generated data, rather than try to retrofit existing data with non-distributed attributes (essentially would have been a node-local database). So this will be considered going forward.
e.g. We'll see write calls producing new formats containing attributes instead of adding a layer which adds attributes during calls on old formats.

@djdv
Author

djdv commented Jan 21, 2020

I failed to meet expectations for this endeavour, and thus progress is officially halted.

... ipfs mount support, has been inflight for over a year without shipping any incremental milestones or value.

The branches for read and write were never deemed acceptable experiments and thus remain out of tree.
The last status update was posted here.

I apologize to the community, to which I made promises I could not keep.

An updated version of the branch lives here.
