-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Next generation of GC #4149
Next generation of GC #4149
Conversation
@whyrusleeping direct pins are a bit of a problem currently. I can introduce this concept to tricolor set (make it four color) but it complicates things a lot. If we want to do it, I would also have to introduce one more color for internal pins (recursive but stopping on direct pins). This version of GC should reduce the pause time by about 50% (or even more) but I can't make the sweep phase incremental without tackling the direct pins. What do you think? Do you think it is ok to treat direct pins as best effort pins? |
4d6702e
to
4e8803f
Compare
Looks like Short live feature branches. 😄 |
@Kubuxu so are direct pins being handled correctly right now? That is if there is nothing but a direct pin pinning a node, will all of its children still be garbage collected? |
Yes, I am currently using the same "cheat" as the older GC was using. |
pin/gc/gc.go
Outdated
finished, err := tri.EnumerateStep(ctx, bestEffortGetLinks, getLinks) | ||
if err != nil { | ||
output <- Result{Error: err} | ||
criticalError = ErrCannotFetchAllLinks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we dont want to halt here? Whats the point in continuing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To collect all errors like that. Instead of doing it one by one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add that this is how I modified the GC to work in #3712. It make it easier for the user to figure out and fix the problem in one go.
pin/gc/gc.go
Outdated
output <- Result{Error: &CannotFetchLinksError{cid, err}} | ||
// Reenumerate, fast as most will be duplicate | ||
for { | ||
finished, err := tri.EnumerateStep(ctx, getLinks, bestEffortGetLinks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldnt we technically loop back up to the top of the function? technically you're supposed to just keep doing all that until you end up with an empty gray set, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand, Line 156 loops this until finished is true L162.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm, i see that we do loop up above.
pin/gc/gc.go
Outdated
c, err := cid.Cast([]byte(v)) | ||
if err != nil { | ||
// this should not happen | ||
panic("error in cast of cid, inpossibru " + err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol, probably want to un-meme this panic.
// colors are per triset allowing fast color swap after sweep, | ||
// the update operation in map of structs is about 3x as fast | ||
// as insert and requres 0 allocations (keysize allocation in case of insert) | ||
white, gray, black trielement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this different from just having constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
colmap: make(map[string]trielement), | ||
} | ||
|
||
tr.freshColor = tr.white |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something feels too complicated here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is future proofing for concurrent/incremental GC and lower overhead for re-GCing.
// InsertFresh inserts fresh item into a set | ||
// it marks it with freshColor if it is currently white | ||
func (tr *triset) InsertFresh(c *cid.Cid) { | ||
e := tr.colmap[c.KeyString()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e, ok := ....
, right? i guess its just a uint8, and 0 is colorNull?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, 0 is null value
pin/gc/triset.go
Outdated
cl := e.getColor() | ||
// conditions are: | ||
// 1. empty | ||
// 2. while |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/while/white/ ?
// select getLinks method | ||
gL := getLinks | ||
if strict { | ||
gL = getLinksStrict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the strict thing wrap the passed in getLinks? or maybe strict gets passed into getLinks?
pin/pin.go
Outdated
{ | ||
Get: nilErr(p.InternalPins), | ||
Strict: true, | ||
Inernal: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
pin/pinsource.go
Outdated
Strict bool | ||
// Direct marks the pinned object as the final object in the traversal | ||
Direct bool | ||
// Inernal marks the pin source which recursive enumeration should be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Inernal/Internal/
pin/gc/gc.go
Outdated
close(output) | ||
return output | ||
} | ||
addRoots(tri, pn, bestEffortRoots) | ||
|
||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this goroutine should probably be its own function now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am working on refactoring the GC to not be just a function but to have structure, so it will be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it will be a really great improvement. Needs some cleanup I think, and probably more testing.
ref #3679 |
cafb902
to
d292196
Compare
@Kubuxu update here? |
I will fix the conflict and it should be good for merge IIRC. |
af03696
to
deec07d
Compare
Introduce GC rework with concurrent mark phase, tricolor set and rework that in future will allow to incremental and/or concurrent mark and sweep phases. Currently it is made a bit difficult by direct pins. It will also allow for reuse of colorset for improved performance and no set reallocation. From observations pre-rewrok it should reduce GC lock time by half as about 50% of the time is spent in mark phase. This is consistent with observations made after the implementation. License: MIT Signed-off-by: Jakub Sztandera <kubuxu@protonmail.ch>
dc27017
to
e26bae6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass. Mostly naming gripes (not really your fault but we should fix them. As-is, they'll confuse everyone who looks at this code except @whyrusleeping). I'll make another pass at the actual logic later tonight.
func (kr *Root) PinSource() *pin.Source { | ||
return &pin.Source{ | ||
Get: func() ([]*cid.Cid, error) { | ||
err := kr.Flush() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this won't trigger a recursive GC. We take a lock, right? However, this can cause us to go over the GC limit even more. We should probably document the fact that the "max repo size" isn't really the max.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also get in a bad state here where we can't flush (no disk space) and can't GC (can't flush). Not sure if we can do something about it now but...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't much we can do here to recoverm I think, I need the actual state of the tree to perform the GC without data loss and I can't get it unless I flush the tree so it propagates the DAG updates.
Direct bool | ||
// Internal marks the pin source which recursive enumeration should be | ||
// terminated by a direct pin | ||
Internal bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have better names for these? I'm still confused, even after reading the comments.
- I understand
Strict
and can't think of a better name. Direct
, IIRC, means non-recursive. That comment is technically correct but leads one to believe that it's a terminal pin in a DAG (Internal
). Maybe invert this and useRecursive
?Internal
->Terminal
? Or, maybe,Never
(i.e., never pin)? Technically, it's a pin that's "internal" to a recursive pin but the name doesn't indicate anything about it being a "terminal" pin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, "internal" is a rather big hammer. I guess it's a decent starting point (for, e.g., excluding large files globally) but, really, I'd like to be able to say, e.g., "pin A recursively but stop at X and Y" and have this pin not mess with other pins (that may want X/Y). Are pins like this expressible with this GC without major surgery? This isn't a reason to hold this PR (future work) up but we're going to want to be able to express constraints like this (and, eventually, arbitrary IPLD selectors for, e.g., ipfs-cluster).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe invert this and use Recursive?
Direct
works much nicer with default values (false).
"pin A recursively but stop at X and Y"
This might be a bit hard and we will have to think carefully about it. If we are able to sort the selectors according to their reach (some may be super-sets of others, so may have parts in common but then select different objects).
Big optimization that comes from color set (it was already implemented before but not explicit) is that we don't need to enumerate parts of the tree that were already enumerated
we're going to want to be able to express constraints like this
The color system allows for a lot of flexibility and having it here explicitly is IMO great step forward. Right now internal means "enumerate it last", which means it stops as elements that are already marked black (as they are supposedly fully enumerated but it is not the case in case of Direct
pins).
EDIT: for selectors we might just have additional marking overhead of marking parts of the tree we already have marked.
I am note sure how extensive selectors are planned to be but being able to simplify selectors as we go deeper into the DAG to would allow for some nice GC optimizations in future. Shift as in: Selector on node A is X and on node B is Y, they have both selected node C (the same node) and they both simplify to Z. Then if we marked node C (and its children) with selector Z already while marking node A with selector X we don't have to remark children of node C with while marking node B with selector Y.
I hope I was able to convey myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Direct
works much nicer with default values (false).
Right now internal means "enumerate it last", which means it stops as elements that are already marked black (as they are supposedly fully enumerated but it is not the case in case of Direct pins).
Oh, now I understand. Direct is an overloaded term/variable. Direct pins act as both non-recursive pins and stop-pins for internal pins. Is that right? At the very least, this algorithm needs to be documented somewhere. Personally, I'd prefer something like:
type PinKind int
const (
PinRecursive PinKind = iota
PinTerminal // Terminates an internal pin.
PinInternal // Better name...
PinSingle // one-of pins, add these last.
) // sort order is const order.
struct PinSource {
Kind PinKind // Can only be one type.
Strict bool
Get GetFunc
}
Big optimization that comes from color set (it was already implemented before but not explicit) is that we don't need to enumerate parts of the tree that were already enumerated
So, I know of a way to make this work but it will be a lot of work and will require storing extra data; we can deal with it later. The TL;DR is to maintain a (semi-accurate) topological sort of the DAG and operate over that. We'll have to see how much extra work maintaining this datastructure will be but it will make executing IPLD queries much faster (IMO, it makes sense to optimize for this).
However, that's not something we'll need to deal with now.
Shift as in: Selector on node A is X and on node B is Y, they have both selected node C (the same node) and they both simplify to Z.
With topological sorting, I believe we can do this. Without it, we have the problem where we may have already marked C and it's children with respect to query X rooted at A by the time we get around to processing query Y rooted at B.
if p.Direct { | ||
v |= 1 << 1 | ||
} | ||
if p.Internal { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can internal be combined with the other two options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not be combined with Direct
as it might lead to not enumerating internal tree fully. I will add runtime panic (on PinSource registration) and comment as they should never really be used together.
It can be used with Strict without any issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mostly care for documentation purposes.
} | ||
} | ||
|
||
func (tr *triset) blacken(c *cid.Cid, strict trielement) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this isn't exported while the other functions are?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blaken
has to be handled with care (as it can lead to tree not being enumerated fully), ideally the GC itself would not be using it but it is a "workaround" for the Direct pins (in future we can make Direct pins separate color and this will allow fully concurrent or incremental GC).
elock.Done() | ||
elock = log.EventBegin(ctx, "GC.locked") | ||
emark := log.EventBegin(ctx, "GC.mark") | ||
type gctype struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a more descriptive name? like triColorGc
?
// InsertFresh inserts fresh item into a set | ||
// it marks it with freshColor if it is currently white | ||
func (tr *triset) InsertFresh(c *cid.Cid) { | ||
e := tr.colmap[c.KeyString()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to bind c.KeyString()
to a variable and reuse it.
Still reviewing, but doesnt this changeset make it so that we now hold every single cid in the blockstore in memory? (as opposed to currently just holding the pinned objects set) |
You are right, it does. This is Memory vs Time needed during locked phase compromise. In future we will have to probably keep even the pinned set on disk. I can change if you think it is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another pass. Sorry, I still haven't fully read everything.
// 8bit color is overkill, 3 bits would be enough, but additional functionality | ||
// will probably increase it future | ||
// if ever it blows over 64bits total size change the colmap in triset to use | ||
// pointers onto this structure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd not use pointers, even if this grows large. Allocations are expensive and we already have to do a lot here.
return | ||
} | ||
for _, c := range cids { | ||
if !r.Direct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how good the go optimizer is but, assuming it isn't, I'd invert this. That is, put the if statement on the outside and have two loops, one in each branch.
tri.InsertGray(c, r.Strict) | ||
} else { | ||
// this special case prevents incremental and concurrent GC | ||
tri.blacken(c, enumStrict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we have to wait until after we've re-enumerated before we can do this? Otherwise, we can blacken a direct pin before we've explored it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
// Any calls to AddPinSource have to be done before any calls to Run. | ||
func (g *gctype) AddPinSource(s ...pin.Source) error { | ||
g.roots = append(g.roots, s...) | ||
sort.SliceStable(g.roots, func(i, j int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of sorting these, have you considered just storing them in separate arrays/sets? IMO, that would be simpler.
func (tr *triset) EnumerateStep(ctx context.Context, getLinks dag.GetLinks, getLinksStrict dag.GetLinks) (bool, error) { | ||
var c *cid.Cid | ||
var e trielement | ||
for next := true; next; next = e.getColor() != tr.gray { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh... is this a do { } while ()
loop? Why not just use a for {}
loop and put an if e.getColor() != tr.gray { break }
at the end of it?
Direct bool | ||
// Internal marks the pin source which recursive enumeration should be | ||
// terminated by a direct pin | ||
Internal bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Direct
works much nicer with default values (false).
Right now internal means "enumerate it last", which means it stops as elements that are already marked black (as they are supposedly fully enumerated but it is not the case in case of Direct pins).
Oh, now I understand. Direct is an overloaded term/variable. Direct pins act as both non-recursive pins and stop-pins for internal pins. Is that right? At the very least, this algorithm needs to be documented somewhere. Personally, I'd prefer something like:
type PinKind int
const (
PinRecursive PinKind = iota
PinTerminal // Terminates an internal pin.
PinInternal // Better name...
PinSingle // one-of pins, add these last.
) // sort order is const order.
struct PinSource {
Kind PinKind // Can only be one type.
Strict bool
Get GetFunc
}
Big optimization that comes from color set (it was already implemented before but not explicit) is that we don't need to enumerate parts of the tree that were already enumerated
So, I know of a way to make this work but it will be a lot of work and will require storing extra data; we can deal with it later. The TL;DR is to maintain a (semi-accurate) topological sort of the DAG and operate over that. We'll have to see how much extra work maintaining this datastructure will be but it will make executing IPLD queries much faster (IMO, it makes sense to optimize for this).
However, that's not something we'll need to deal with now.
Shift as in: Selector on node A is X and on node B is Y, they have both selected node C (the same node) and they both simplify to Z.
With topological sorting, I believe we can do this. Without it, we have the problem where we may have already marked C and it's children with respect to query X rooted at A by the time we get around to processing query Y rooted at B.
|
||
// this const is used to enable runtime check for things that should never happen | ||
// will cause panic if they happen | ||
const pedantic = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not used.
gcs.Add(k) | ||
// Reenumerate, fast as most will be duplicate | ||
for { | ||
finished, err := tri.EnumerateStep(ctx, getLinks, bestEffortGetLinks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are getLinks
and bestEffortGetLinks
flipped?
@Kubuxu said:
This concerns me. For very large repositories this could be a problem. This could also be a problem if the in-memory data structures are larger. Do we have some sort of estimate on the memory usage with the orig and this GC for a large repository that is say 20% pinned, 50% pinned, 80% pinned? |
Kinda makes me think we should have a |
return unlocker, elock | ||
} | ||
|
||
func addRoots(tri *triset, pn pin.Pinner, bestEffortRoots []*cid.Cid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cruft, removed in working branch (not pushing yet because I am afraid of loosing code comments).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just push it now. The comments should stay.
Worst case scenario it will be |
So in case of sha-256 and CIDv1 it would be about 28M keys per GiB. |
if !ok { | ||
break loop | ||
} | ||
tri.InsertFresh(c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a bug here (well, not here but this is where I started writing the comment...). Specifically, what happens if I add a node (e.g., from the network) after the above EnumerateStep
calls? I believe it will be garbage collected even if it shouldn't be.
The solution, as far as I can tell, is to leave unexplored gray nodes in the gray set and try to re-explore them after taking the GC lock.
Ideally, we'd have some nice way to subscribe to events from the blockstore so we could just process new blocks...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InsertFresh
in this inserts as white. I will change it to InsertWhite
for time being.
what happens if I add a node (e.g., from the network) after the above EnumerateStep calls
If it is pinned it will be enumerated and marked black by enumeration when the lock is taken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You sure? I'm pretty sure those roots won't be re-enumerated because they're already black. I'm worried about the a (pinned) -> b -> c (added late)
case. That is, a
is a pin root and we only add c
to the repo after the initial enumeration. I believe, in this case, we'll never mark c
as black because a
is already black.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b can't be black as it references a white (or null, treated as white) node c.
If a is pinned strictly (by recursive pin) then marking b would fail with an error.
if a is pinned non-strictly (by Files API) then it is still better than it was before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b can't be black as it references a white (or null, treated as white) node c.
No, if you have b
, you'll definitely blacken it. getLinks
will succeed because you have it (even if you don't have c
). Yes, if you're pinning strictly you'll fail out.
if a is pinned non-strictly (by Files API) then it is still better than it was before.
It's still not correct. Actually, can't you just add the white nodes first (before you enumerate reachable nodes)? IIRC, that's how this algorithm usually works (for this reason).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, if you have b, you'll definitely blacken it. getLinks will succeed because you have it (even if you don't have c). Yes, if you're pinning strictly you'll fail out.
Yes, then the c
will be marked gray and processed in the next EnumerateStep.
can't you just add the white nodes first (before you enumerate reachable nodes)?
I treat every node that doesn't exist in the tricolor set as white. They are being added here as it reduced time needed what we take the repo's GC lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not if it isn't strict. In that case, it will be marked gray, in the first EnumerateStep
and then removed (and assumed to have no children).
They are being added here as it reduced time needed what we take the repo's GC lock.
That doesn't mean you can't add them before the first enumerate step. Then, you'd follow the standard algorithm of moving nodes from the white set to the gray set to the black set until the gray set is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true, let's do that.
return err | ||
defer close(output) | ||
|
||
bestEffortGetLinks := func(ctx context.Context, cid *cid.Cid) ([]*node.Link, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we combine this with the below getLinks
function, using a flag to handle the err != dag.ErrNotFound
case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, two types of GetLinks is cruft from previous (or currently used GC), I should had eliminated it.
if len(tr.grays) == 0 { | ||
return true, nil | ||
} | ||
// get element from top of queue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we mean to say get element from stack
here - based on line 54 of this file.
@whyrusleeping @kevina for future we might just switch to using something like badger for GC, especially if we nail down incremental/concurrent GCing. In case of extremely bit repos the GC process could take days so having it done incrementally and being resumable would be a requirement. |
} | ||
|
||
for _, k := range pn.DirectKeys() { | ||
gcs.Add(k) | ||
// Reenumerate, fast as most will be duplicate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not true for Internal
pins as far as I can tell. Do we use them a lot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there will be very little as they are used only for building sharding of the pinset.
|
||
func getGCLock(ctx context.Context, bs bstore.GCBlockstore) (bstore.Unlocker, *logging.EventInProgress) { | ||
elock := log.EventBegin(ctx, "GC.lockWait") | ||
unlocker := bs.GCLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually need to lock the entire repo or can we just lock the pins/roots?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, to prevent for example blocks for being added. In future we can have incremental GC where those pauses (this is what taking major lock by GC is called) are short but spread over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or to highlight the problem more cleanly, to prevent ipfs add
from adding blocks that are referencing some white marked nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I was thinking that that case might be equivalent to doing the GC before adding the block but that's not true because the fact that we had both blocks at the same time can be observed. Having a GCed system where something can go from unreachable (by a root) to reachable in the middle of a GC run is weird.
seems like there is some overlap with #5765 |
Possibly. The primary issue with this PR is that it loads the entire pin-set into memory. That's not going to scale. |
@Stebalien that makes sense let's close this with that explanation, unless you or @Kubuxu have objections there's a lot of interest in improved GC, but it seems that this branch has diverged enough that basing any further work on it will probably be counterproductive |
SGTM. However, let's not delete this branch (the logic is still good). |
Introduce GC rework with concurrent mark phase, tricolor set and rework
that in future will allow to incremental and/or concurrent mark and
sweep phases.
Currently it is made a bit difficult by direct pins.
It will also allow for reuse of colorset for improved performance and no
set reallocation.
From observations pre-rewrok it should reduce GC lock time by half as
about 50% of the time is spent in mark phase.
This is consistent with observations made after the implementation.