Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread edge proposal #477

Merged
merged 10 commits into from
Feb 24, 2021

Conversation

dgtony
Copy link
Contributor

@dgtony dgtony commented Dec 24, 2020

Hi folks, here is another improvement for GetRecords.

This time we propose a new concept: thread edge. Basically it's just a 64-bit deterministic hash of all heads in a thread. Using the edge values we can easily compare two states of the same thread and decide if they are equal without fetching all thread's information from the logstore. Such optimization at first should improve request processing time on a fast path for the frequent GetRecords requests.
As a follow-up: introducing thread edges enables implementing fast checks for thread changes without getting and sending thread information on both communicating nodes (protocol extension in progress for now).

Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
@sanderpick
Copy link
Member

Hey @dgtony, this is great! I wonder, could we also include log addresses in the hash? Updated peer addresses should also be considered info worth sharing even if the head hasn't changed.

@dgtony
Copy link
Contributor Author

dgtony commented Jan 19, 2021

Hi, Sander, sorry for the late reply. Hope to come up with solution in a couple of days.

@dgtony
Copy link
Contributor Author

dgtony commented Feb 2, 2021

Finally I've got a chance to get back to the PR )

Thanks for your corrective, adding peer addresses to the edge computation is totally reasonable for distinguishing thread states in general. But I'm not sure its worth it in this particular PR, so let me explain.

Current PR was intended as a first step of optimization of ThreadsDB-nodes communication. It makes processing of GetRecords easier for high-degree nodes, but the main idea here is compressing thread's state information into tiny hashes. So the real contribution is ThreadEdge method (btw, it seems to be better renamed into HeadsEdge).
And the next logical step should be extending internal ThreadsDB interface with a new method (lets say GetThreadEdge), enabling very lightweight difference checks between local and remote states of the thread. And at this point we can make a good use of additional information like peer addresses as you suggested. Message GetThreadEdgeResponse will contain pair of such hashes: current state of the thread's heads provided by HeadsEdge method and similar hash of all peer addresses associated with the given thread's logs. The latter will be fetched with unimplemented yet method AddrsEdge.
Having this two hashes in the response, node can compare it with its local values and decide if any of full GetRecords or GetLogs calls are neccessary in order to ensure local state is up-to-date.

Regarding this PR, it looks like this small improvement preserves current GetRecords behaviour. At the moment if some logID was included into request you'll get only new records for it and no log information (and peer address updates) in the response. Otherwise fresh LogInfo will be returned in the response whenever you omit said logID in the request, but currently it may happen only if the log was previously unknown to the requester. But in this case local and remote edge values will be different and request processing will move along the slow path as usual.
So there are two options I can see. Rename ThreadEdge -> HeadsEdge, implement AddrsEdge and PR could be merged in order to make the next one (with GetThreadEdge) smaller. Or we can close current PR and I'll use this branch as a ground for the entire feature, but future merge could be pretty massive.
@sanderpick What do you think?

@sanderpick
Copy link
Member

Hey @dgtony, thanks for the summary. I agree doing this change stepwise makes sense. So, first the renaming (ThreadEdge -> HeadsEdge), then deal with addresses.

BTW, have you experimented with testground? We're thinking of spinning that up soon to get a better idea of network performance and bottlenecks.

Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
Signed-off-by: Anton Dort-Golts <dortgolts@gmail.com>
@dgtony
Copy link
Contributor Author

dgtony commented Feb 10, 2021

Ok, so current PR now contains HeadsEdge implementation and I'll submit AddrsEdge in a separate one.
As for Testground it seems to be very promising for identifying network/performance/consistency issues. It'd be great to make use of it and we've been looking at it during the last couple of months, but unfortunately have no hands-on experience yet.

Copy link
Member

@sanderpick sanderpick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carsonfarmer may be interested in taking a look at this. He mentioned this idea sounds something like a vector clock. Maybe there are some ideas from that realm that can also be applied here. Or maybe we can call the edges clocks?

In any case, I'll merge this now and we can follow up with additional ideas.

}
return hs[i].LogID < hs[j].LogID
})
hasher := fnv.New64a()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice use of FNV 💪

@sanderpick sanderpick merged commit 7a964ca into textileio:master Feb 24, 2021
@carsonfarmer
Copy link
Member

Yeh this is neat. Almost like a vector clock, except more like a vector clock "summary" or digest.

To really take advantage of a vector clock approach, we'd need to "track" more than just the head of the logs here. We'd also have to track an update logical clock/integer, that would get incremented on updates. It'd only be a few extra bytes, but the benefit with that approach is it'd tell us which logs were off, whether they were behind (or ahead), and by how far off they are. This would make it possible to do really fine-grained syncing. The vector clock approach is something I've been thinking about for a while, and I just need to write some stuff down. At the end of the day, a thread really is just a vector clock of logs. Almost a perfect fit. But it also has me thinking about the approach outlined here. Its a super cool way to create a fixed-length digest of a vector clock as well. This is cool because you get the vector clock on the one hand, and you get a digest to use as a sorting mechanism.

The vector clock thing is worth thinking seriously about and writing up a new proposal discussion for implementing it. But of course, it would require some non-trivial code changes, whereas the approach in this PR answers the question are they off, yes or no pretty quickly and easily. So in the mean/shorter term, this is a super clever move in an awesome direction! Love it.

@dgtony dgtony deleted the improvement/get-records-handler branch February 25, 2021 11:33
@dgtony dgtony mentioned this pull request Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants