Implications of the Filecoin launch for IPFS #435
Good alert Mikeal! I think it's a huge win that it's so easy to use IPFS and Filecoin together (IPFS users will rejoice about being able to just reference datasets already loaded onto IPFS when persisting with Filecoin!). However, I agree that now is the time to look at some of the user defaults and make sure they make sense for the influx of new users building on IPFS and Filecoin together. @aschmahmann / @petar - now may actually be the time to switch the default pinning strategy to avoid the large-data failure case Mikeal mentions above.

FWIW, I don't think point 1 is actually a blocker/concern. AFAIK you can configure your lotus node (either client or miner) to instead/additionally listen on your special private IPFS DHT and load in data that way - so I don't think this will actually be decisive for overall network dynamics.
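For reference, a sketch of the kind of lotus configuration being described here, assuming the launch-era option names (they may have moved or been renamed since): lotus talks to a local IPFS node's API, and that node can carry a swarm.key to join a private IPFS network.

```toml
# lotus config.toml -- a sketch only; check your lotus version's config
# reference before relying on these exact option names.
[Client]
  # Route client deal data through a local IPFS node instead of
  # lotus's own blockstore.
  UseIpfs = true
  # API multiaddr of that IPFS node. If the node was initialized with a
  # swarm.key, it participates in a private IPFS network/DHT, which is
  # the "special private IPFS DHT" setup mentioned above.
  IpfsMAddr = "/ip4/127.0.0.1/tcp/5001"
```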
I'll get the technical note out of the way before going into the higher level question here.

Technical: The main reason we haven't switched the defaults is that if you only advertise the root of the file, then once you've downloaded the root block and your connection terminates there's basically no way to resume the download: if you try again, nobody is advertising the blocks below the root, so you can't find providers for the rest of the graph. This is, and has been, an important problem to solve. If we expect it to be critical for onboarding a large number of new IPFS users in the near term then we can prioritize it (which means other things will get deprioritized), but it requires understanding what the ask is.

Higher level: I don't have enough context on what the expected experience is for Filecoin users that are planning to use IPFS this way. For example, are these users that are uploading large amounts of data to Filecoin storage miners also planning on utilizing their bandwidth to freely distribute that data? If not, then we should probably create a simple tool that is the inverse of this import flow.
I suspect there are a few issues on the boundary between Filecoin and IPFS that are worth discussing in detail to ensure that these new IPFS users have a good experience. Similarly, I suspect there are some patterns and issues in Filecoin that could be solved with solutions shared with IPFS. For example, filecoin-project/lotus#2152 sounds like a request for content routing in Filecoin. While IPFS's current default content routing system (i.e. the public DHT) may not be what Filecoin wants or needs, there have been numerous discussions and proposals within the core team and in the community that may be of use.
Very cool! I didn't know this ;) I think this is probably enough that the existing DHT forks won't come back, but I still think we'll see fewer forks in the future, because as people build services around this feature it'll be a lot easier to keep your data in the main IPFS network rather than in a fork.
Something to consider: could IPFS just have a default, but configurable, max limit on the number of CIDs it tries to broadcast? A reasonable default here would greatly reduce the risk profile. I can't think of a case in which a regular user would need to broadcast more than 10K CIDs, and the user experience when someone does is so poor that it's hard to imagine it's something anyone would want to do.

Breaching the max limit would cause an import error that could point them towards the settings for only publishing the CIDs of your pins, which we expect is what they probably want with a graph this size (a sketch of those settings is below).

This doesn't solve every concern, but when someone does the wrong thing it would greatly reduce the potential harm, and it would take many, many more users doing the wrong thing all at once to produce the load that only a few can generate today.
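For context, the "only publish your pins" behavior already exists in go-ipfs as a non-default setting; the cap itself would be new. A sketch (the Reprovider.Strategy values are real go-ipfs config; the limit knob is purely hypothetical):

```sh
# Real, existing go-ipfs setting: announce only pin roots ("roots") or
# all pinned blocks ("pinned") instead of every block ("all", the default).
ipfs config Reprovider.Strategy roots

# Hypothetical knob as proposed above -- this does NOT exist in go-ipfs
# today: a configurable cap on announced CIDs whose breach fails the
# import with a pointer at Reprovider.Strategy.
# ipfs config --json Provider.MaxAnnouncedCIDs 10000
```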
Linking another issue for continuity: filecoin-project/lotus#2875
I'm logging this as a note because it touches on a lot of areas that affect all implementations of IPFS.
`lotus` can now create a new storage deal from any CID that is available in the IPFS network. It'll pull the entire graph out of the network for the user and store that data through the normal deal flow. This is almost the only way you can load data into Filecoin other than passing a single file you have locally (there's also a more complicated offline flow that I won't get into). Some of the applications being built to store data in Filecoin that you may have heard of are also using this feature, effectively leveraging IPFS as the transport for creating a new storage deal.

This has some implications I don't think IPFS has had time to fully consider, because it's going to create some new incentives in the IPFS network that weren't there before.
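Concretely, the deal-from-CID flow looks roughly like this (launch-era lotus CLI shape with placeholder values; argument order may differ in newer releases):

```sh
# With [Client] UseIpfs enabled in the lotus config, the deal data can
# be any CID resolvable on the IPFS network; lotus pulls the whole
# graph out of the network and runs it through the normal deal flow.
#   lotus client deal <data-CID> <miner> <price-per-epoch> <duration>
lotus client deal bafybeig...datasetroot f01234 0.0000000005 518400
```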
1. There’s now a substantial penalty for forking the DHT.
In the past, the main penalty for forking the DHT was losing some of the ease of use IPFS provides. But if you were doing anything other than the default IPFS data model, you already couldn't leverage a lot of the simpler DX that IPFS ships with, so we saw some notable users fork the DHT.
That’s unlikely to happen in the future. In fact, we may see some of those forks come back.
2. A lot more people are going to be loading data into IPFS in order to get it into Filecoin
We probably already expected this, but it’s worth looking at in a little more detail. The easiest way, by far, to get data into Filecoin if it’s anything other than a single file will be to load it into IPFS. That means a lot of new people using the default IPFS configuration will be loading data into IPFS just to get it into Filecoin.
We have a lot of non-default options that we push people towards when they hit a particular scale. I think it's worth looking at these and making sure that we default to what is best for large-scale datasets, because many more new IPFS users will almost immediately be loading large amounts of data.
For example: `raw` leaves (lotus turns this on by default when it imports a file, but that's not the default in ipfs; a sketch of this and related flags is below).

For the launch we prepared a bunch of CAR files with Bitcoin data. It's only about 400GB of data, but it's enough CIDs to overwhelm the DHT. The easiest way someone would have to set up a deal with this data would also insert millions of unnecessary records into the DHT.
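For comparison, a sketch of the flags involved on the ipfs side (these are real go-ipfs flags as of the 0.7-era CLI, but worth re-checking against your version):

```sh
# lotus enables raw leaves on import by default; in go-ipfs you opt in
# explicitly (or use CIDv1, which implies raw leaves):
ipfs add --raw-leaves my-file.bin
ipfs add --cid-version 1 -r my-dataset/

# Other non-default options people usually reach for at scale:
# a datastore better suited to large repos, chosen at init time...
ipfs init --profile badgerds
# ...and the (experimental) filestore, which references large files in
# place instead of copying them into the repo.
ipfs config --json Experimental.FilestoreEnabled true
ipfs add --nocopy big-dataset.tar
```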