-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimistic Provide #783
Optimistic Provide #783
Conversation
This can be difficult to deal with operationally because the work for handling the request can't exert backpressure and control flow isn't tied to the req (which e.g. complicates debugging). One alternate approach here might be to kick off 21 or 22 ADD_PROVIDER RPCs and stop when 20 succeed. This might generate slightly more load on DHT servers, but how much and can we find a way to make it negligible? Also if we do the work async, we should make sure it's using a bounded work queue, have metrics on the work queue, and drop work on the floor when the queue is full. (May not be literally a work queue, e.g. could be goroutines + semaphore.) |
Thanks for the review @guseggert
Valid points :/ this wouldn't be the first time though in the libp2p stack :D (this obviously doesn't justify it here).
I hoped we can avoid any additional networking overhead but that'd certainly work. Right now, we return to the user when we received responses from 15 peers while we send the request to 20. If we wanted the same ratio and only return after we have received responses from 20 peers we'd need to shoot for 26.6 peers. This sounds a lot (at least to me) but now I'm thinking that we're saving so many requests (not quantified) in the DHT walk that we'd presumably still be net negative in terms of networking overhead. The graph (Just a note: A finding from another project is that we actually only need 15 provider records)
Just to understand you correctly: We could have a limit on the number of in-flight Is this would you mean? |
@guseggert : did you have other code review comments you were going to provide? |
fb6fa01
to
07000e8
Compare
71fef1e
to
6138b9e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That the code quality things that sticked out to me.
routing.go
Outdated
@@ -441,6 +441,23 @@ func (dht *IpfsDHT) Provide(ctx context.Context, key cid.Cid, brdcst bool) (err | |||
return ctx.Err() | |||
} | |||
|
|||
func (dht *IpfsDHT) OptimisticProvide(ctx context.Context, key cid.Cid) (err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not expose this as a second method, now every caller needs to have it's if
check to know what to call.
I would do if dht.enableOptProv {
in the Provide
method, and then dispatch to either classical or optimistic client.
I do not think we are gonna start having provides where some of them can be done optimistically and other where we really want them to be precise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed public API and added the following to the Provide
method:
if dht.enableOptProv {
err := dht.optimisticProvide(ctx, keyMH)
if errors.Is(err, netsize.ErrNotEnoughData) {
logger.Debugln("not enough data for optimistic provide taking classic approach")
return dht.classicProvide(ctx, keyMH)
}
return err
}
return dht.classicProvide(ctx, keyMH)
I thought of falling back to the classic approach if we don't have enough data points for the optimistic one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Jorropo IMO a second method makes sense as OptimisticProvide
isn't exactly the same use case as the Provide
operation (see this comment). WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guillaumemichel will we write code anytime soon that is Optimistic Provide aware ?
By that I mean will we have code that use different publish strategy for different workloads ?
I don't belive so, thus no need to make the API more complex and require all consumers to purposely call the right method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will we write code anytime soon that is Optimistic Provide aware ?
I don't think so, as IMO kubo isn't the best use case for Optimistic Provide.
However, if other applications want to use it (the goal of this feature is still to be useful), let's not screw their reprovide operation, otherwise no one will want to use this feature at all.
IMO the job achieved by optimistic provide is different as the one achieved by the normal Provide, and each have their use cases, let users use both.
lookup_estim.go
Outdated
case es.dht.optProvJobsPool <- struct{}{}: | ||
// We were able to acquire a lease on the optProvJobsPool channel. | ||
// Consume doneChan to release the acquired lease again. | ||
go es.consumeDoneChan(rpcCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why launch a goroutine here ?
Can't putProviderRecord
do the:
// Release acquired lease for other's to get a spot
<-es.dht.optProvJobsPool
// If all RPCs have finished, close the channel.
if int(es.putProvDone.Add(1)) == until {
close(es.doneChan)
}
by itself ? If that is because it used twice (once in the non optimistic path and once in the optimistic path).
Then just make two functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe with optimistic vs. non-optimistic code paths you mean
- optimistic: the
putProviderRecord
call in thestopFn
- non-optimistic: the
putProviderRecord
call afterrunLookupWithFollowup
has returned
When we spawn the putProviderRecord
go-routines we don't know in either case (optimistic/non-optimistic code path) if this will need to release the acquired lease. So I don't think we can put your proposed lines in there.
Imagine a very long-running putProviderRecord
call in the optimistic code path. This will eventually acquire a lease on the job pool and need to release it. On the other hand, a short-running call would not need to do that.
lookup_estim.go
Outdated
go es.consumeDoneChan(rpcCount) | ||
case <-es.doneChan: | ||
// We were not able to acquire a lease but an ADD_PROVIDER RPC resolved. | ||
if int(es.putProvDone.Add(1)) == rpcCount { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do the inverse, set es.PutProvDone
to rpcCount
at the start, and have workers do:
var minusOne uint32
minusOne-- // go doesn't want to constant cast -1 to an uint32, that does the same thing, this also gets folded.
if es.putProvDone.Add(minusOne) == 0 {
Also I guess putProvDone
should be renamed putProvInflight
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the challenge with both your suggestions is that we don't know ahead of time if we are going to acquire a lease for an inflight putProviderRecord
call. We would need to inflight communicate to it that we acquired a lease for it so that it'll release all resources for us.
I think I see where you want to go with both of your suggestions but I can't see a simpler way to do that.
lookup_estim.go
Outdated
es.peerStatesLk.Unlock() | ||
|
||
// wait until a threshold number of RPCs have completed | ||
es.waitForRPCs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happen if less than some minimum of puts fails ?
Let's say all of the puts fail, I expect to return an error the consumer (the network is probably broken), I would expect es.waitForRPCs
to synchronise that and have it return an error
. (this doesn't have to be waitForRPCs
precisely)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behaviour I have implemented here mimics how the classic approach handles failed put - it does not handle them.
We could indeed define a lower limit for when we consider a provider operation successful. But what would be that number, 1, 5, 10?
Then should the implementation try its best to increase the number if it detects that it only provided to less than this threshold?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would involve quite a bit of work to implement this behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My 2c: I think it's fine to stay consistent with the classic approach and treat this as a separate issue.
tests are failing because |
You can add an implementation for older versions of Go, using a build tag |
If you rebase the PR on the master you will have tests for go1.19 and go1.20. This has been updated in #812 |
Sorry if this was mentioned here before and I missed it, but do we need to reconsider anything here in light of the recent DHT issues? I remember discussion about how the impact might have been much worse if much of the network was using optimistic provide for publishing records. |
Sharing the relevant graph from Engelberg: These graphs show the differences between Optimistic Providing vs. Classic Providing for the IPFS and Filecoin networks. The context was that >60% of nodes in the IPFS network were not behaving well (unreachable). The graphs show that the classic approach was generally able to store more provider records with nodes in the network. However, at the cost of a higher distance to the target key. The following graph shows that: So, the classic approach stores more records but at potentially unsuitable locations that don't lie on the path to the target key. I'm hesitant to jump to the conclusion that it probably isn't that bad. I would love to back any conclusion up with some data, but I don't know how we would do that. @guillaumemichel to the rescue? :D So, I think it's hard to judge if optimistic provide is actually much worse. @FotiosBistas did a study for his bachelor's thesis on the retrieval success rate using optimistic provide. He performed his study during the "challenging" network times. He did a test with 1.5k random CIDs. This is his finding: So, "almost all" provider records stayed retrievable. "Almost all" is not super reassuring, but he also suggests that this might be a measurement issue. I also think that with these two additions: we wouldn't run into the problem that we observed back then. |
Okay if you and @guillaumemichel are on board with this, in light of those findings, then I'm happy to proceed. Should we land #811 before this? |
I agree that there is much room for improvement in the provide process. What is the specific goal for optimistic provide? Accelerating the provide operation can be decomposed in multiple components:
1 Makes a lot of sense, and I think that Optimistic Provide is doing it right (didn't look at the implementation yet though). 2 represents return once 15 out of the 20 Provider Records have been allocated. I think that the benefit of a function returning fast depends on applications. e.g if you want to advertise a file temporarily for a one time transfer, it doesn't make sense to wait for minutes before the remote peer can start looking for the provider record. However, when periodically reproviding content, no one is waiting for the operation to terminate. So here, providing more information wrt. the status of the provide operation would make sense (e.g with a 3 I don't think this is a good approach, because the responsiveness of a peer depends on the vantage point. The provider records must be stored at the very least on the X closest peers to the CID location (even if X<20), otherwise it may not be discoverable in the DHT (at least with using the current lookup implementation). So I don't think that optimistic provide should stop after allocating 20, 22, or 25 provider records, but once the node cannot find new remote peers that are closer to CID. It means that the request will certainly time out, but it isn't an issue, as applications that are time sensitive with the provide operation shouldn't need to wait for the end of the provide operation. |
Yes, to all three of your remarks 👍 I think number 2 could be a follow-up where we give the application layer more control over the provide process. Until now, the goal was to keep API compatibility by reusing the exposed
I'm on board 👍 |
lookup_optim.go
Outdated
// calculate distance of peer p to the target key | ||
distances[i] = netsize.NormedDistance(p, os.ksKey) | ||
|
||
// Check if we have already interacted with that peer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the wording can be more precise here, we haven't necessarily interacted with the peer when it's in peerStates
, it just means that we've scheduled interaction with the peer. (There are some other comments with this same ambiguity.)
I can't find any instances of this ambiguity resulting in an actual race, but might as well clear up the comments so that one isn't accidentally added in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to understand you correctly. Do you mean the brief moment of time when we have stored e.g. Sent
into peerStates
until the corresponding goroutine putProviderRecord
was scheduled? This would indeed be an instance where we haven't interacted with the peer but just scheduled interaction 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's all I meant, it's definitely a nit but just trying to be precise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that optimistic provide should not replace the current provide operation.
I think that the Provide
function is mainly used to reprovide content (in addition to Providing it for the first time), and thus isn't time sensitive.
IIUC the main use case of optimistic provide, is to enable the Provide
function to return faster to the user, i.e for an instant file sharing app where A pins a file to the DHT and B immediately fetches it. The trade off here is to sacrifice some accuracy against speed, which is totally reasonable in this use case. However, I don't think that kubo would benefit from it, because we don't want to sacrifice accuracy for speed for non time sensitive operations (e.g reprovide).
I recognize the value that optimistic provide brings to IPFS, I think that it should not replace, but rather extend the DHT interface, but exposing a new OptimisticProvide
in addition to the already existing Provide
operation. This would allow files being provided with OptimisticProvide
to be reprovided using the more accurate Provide
operation. And it allows applications that benefiting from OptimisticProvide
to use it in a more controlled way.
This is how I had it before the first round of feedback. This is the relevant comment. From that comment:
I think you just gave the perfect example @guillaumemichel 👍 so, I'd also be in favour of having two separate methods or alternatively some way of letting the application signal what it wants. The latter would also result in a second exposed method or be a breaking API change. (there are also several places in libp2p where behaviour is changed down the stack by passing a value in the context down the stack - I'm not a fan of that). |
I fully agree! The new method could expose additional information (such as the number of provider records pinned so far). |
If the UX is not going to be "the same but with these different provide tradeoffs", but instead is going to be "new APIs, mix-and-match optimistic provide and traditional provide", then I think we should do a lot more due diligence about what the actual UX will be before we merge this, since that should drive the design here. I'm concerned that this is scope creep. My preference would be to ship what we have now as experimental, collect some feedback, and then continue to iterate on it until we think it meets the bar for "non-experimental". This can include additional UX conversations / design. |
I'm also quite keen to get this PR landed sooner than later because it's been sitting around for quite some time already. So, I'm totally up for postponing this discussion in favour of making some progress and gathering feedback. I also think that the current PR is minimally interfacing with the existing code, so changing it later should be quite easy. Flagging it as experimental signals that we're not making any promises regarding this feature - so it should be fine to change this behaviour later. |
I think there's consensus here to move forward with an experimental release and I'm in favour of doing so. A couple of thoughts:
Without trying to keep things back, I think it is useful to define what are the criteria to move this out of the experimental release and into the standard one. If that's already discussed somewhere (above or elsewhere), I'd appreciate a pointer :) |
When someone is pushing an update to a website, they provide a new CID, hence it is a provide not a reprovide. Please correct me if I am wrong, IPNS records are published using the I am not convinced that many applications would (and should) fully replace the Provide operation with the OptimisticProvide (or FastProvide), because some accuracy is lost. If we want this feature to be usable, IMO the best option is to extend the API with a new |
617a6ea
to
9cb79d1
Compare
I have deployed the latest commit on our DHT lookup measurement infrastructure. These are the numbers (source): p50
p90
Sample SizeOptimistic: ~875 (I just deployed the optimistic version today, hence fewer provides) Both approaches have 100% retrieval success rates What the graphs don't show:
Network Size Estimations |
@guseggert Incorporated feedback, and tests are running fine on my machine. |
Go Checks / All is failing, and isn't part of the flaky tests.
|
Thanks for pointing that out @guillaumemichel! I added the test file just recently and previously unrelated tests were failing. Here's a PR that tries to address both issues: #833 |
This is the pull request for the Optimistic Provide project. It contains
netsize
OptimisticProvide
Highlights
though that may be a bit misleading... see below
Network Size Measurements
Peers refresh their routing table periodically every 10min. This refresh-process iterates through all buckets of the routing table and in each iteration it generates a random CID that falls into that bucket and searches for the 20 closest peers to that CID. We are using the distances of the 20 closest peers to the randomly generated CIDs to derive a network size estimate. How we do this is described here.
Since the routing table maintenance happens anyways this estimate incurs no additional networking overhead
Optimistic Provide
The optimistic provide process uses the regular
Provide
algorithm but just defines an additional termination condition that gets checked every time we receive a new peers during the DHT-Walk.In this termination logic we define two thresholds:
individualThreshold
- If the distance of a peer to the CID that we provide is below this threshold we store the provider record with it right away - but continue the DHT-WalksetThreshold
- If the average distance of the currently known 20 closest peers is below that threshold we terminate the DHT-WalkAfter DHT-Walk termination we take the 20 closest peers from the query set and attempt to store the provider record with all that have not been contacted due to their distance being less than the
individualThreshold
.Important addition: If 15 of the 20
ADD_PROVIDER
RPCs finished (regardless of success) we return to the user and continue with the remaining five in the background. At this point the provider record can already be found and the probability is high that we run into at least one timeout for remaining five. In the below graphs, we differentiate these times betweenreturn
anddone
.done
is the time until the 20th provider record was stored.Measurements
Durations
Speed Up
Precision
CDF of the selected peers distances across all provide operations
Successful ADD_PROVIDER RPCs
Time to First Successfully Written Provider Record
Corresponding Speed Up:
Network Size Measurements
Compare this with the same time period here from Nebula.
Limitations
Implementation Details
stopFn
hook to implement a custom termination logicnetsize
package could be tested but I had trouble crafting CIDs/PeerIDs. I think I could just use some of the measurement above.queryPeerset
in thestopFn
without paying attention to locking etc. If that's not the case, I have a thread-safe implementation lying around somewhere.netsize
package draws inspiration from the gonum weighted mean and linear regression implementations. I wanted to avoid another dependency.lookup_estim.go
to compute the distance thresholds. So, I pulled-ingonum
...Possible Next Steps