Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'pin rm' in the context of broken pins #3796

Open
kevina opened this issue Mar 18, 2017 · 13 comments
Open

'pin rm' in the context of broken pins #3796

kevina opened this issue Mar 18, 2017 · 13 comments
Labels
need/community-input Needs input from the wider community

Comments

@kevina
Copy link
Contributor

kevina commented Mar 18, 2017

From IRC (Starting around 16:00 EDT on Mar 17, 2017):

kevina: whyrusleeping: would you mind if I take a pass on the pinning code and try to make pinning operations more more robust in general?
kevina: I will do it in small steps and explain what I did to make reviewing easier.
kevina: The GC should fail if there are pinned objects can't be retrieved, but other operations don't need to fail...
whyrusleeping: kevina: hrm... what changes are you thinking?
kevina: like when we scan for an indirect pin, don't abort if a pinned object is unavailable, instead issue a warning and continue
kevina: and possible if necessary add a "--force" option to pin rm
whyrusleeping: kevina: hrm... on the pinning stuff
whyrusleeping: the problem here is that if even a single indirectly pinned block is missing, it invalidates any guarantees we might have about the rest of the pinset
whyrusleeping: we essentially at that point have a set of things that are for sure pinned
whyrusleeping: and everything else might be pinned
kevina: whyrusleeping: correct, but in some cases it may be okay to continue on a best effort bases.
kevina: in the G.C. no
whyrusleeping: yeah
kevina: listing pins maybe
kevina: removing pins, yes
kevina: etc.
whyrusleeping: Yeah, if we're trying to check if a thing is pinned
whyrusleeping: we can return true or ERROR CANT TELL
whyrusleeping: which might be acceptable for some usecases
kevina: something like that
whyrusleeping: what would be useful though, would be to say "recursive pin X references a missing block indirectly"
whyrusleeping: and have ways to remedy that
kevina: correct
* whyrusleeping thinks about the UX of an 'ipfs pin repair'
kevina: it should always be possible to remove a pin
whyrusleeping: as long as you always understand the consequences of doing so
kevina: I do. I wrote the G.C. code to be very careful
kevina: really case by case, what would happen if we ignore the missing/bad block
whyrusleeping: i didnt mean we as devs
whyrusleeping: i meant we as users
whyrusleeping: at any time a user is going to remove a pin like this, it needs to be very apparent the consequences
kevina: I understand, that why I suggested a "--force" option if it becomes required
kevina: actually, for removal I can't think of anything bad that can happen by removing a recursive pin where part of the pin is missing
kevina: the worst that can happen is we might not detect a block is pinned directly and report PIN NOT FOUND instead of reporting the recursive pin pinning it
kevina: so now that I think of it a "--force" option it not needed for removal expect to save time by avoiding the indirect check...
whyrusleeping: kevina: The issue is that the recursive pin we force remove might be a block that is referenced by a pinned block that is missing
whyrusleeping: and we won't be able to tell that
whyrusleeping: so while removing one broken pin might seem fine, it doesnt give the context that its removal also might result in the loss of data from other pins

@kevina kevina added the need/community-input Needs input from the wider community label Mar 18, 2017
@kevina
Copy link
Contributor Author

kevina commented Mar 18, 2017

@whyrusleeping okay, then we can use a --force option that basically says try to remove the pin if possible, don't scan for indirect pins.

Although I am still unsure what to think of "the recursive pin we force remove might be a block that is referenced by a pinned block that is missing". Offhand I don't necessary see this as a problem although I can't articulate exactly why at this time. @Kubuxu (and others) thoughts?

@whyrusleeping
Copy link
Member

Yeah, would love to get thoughts from @Kubuxu and @lgierth and @hsanjuan here. This is tricky failure case UX we need to make sure to get right

@Kubuxu Kubuxu self-assigned this Mar 19, 2017
@hsanjuan
Copy link
Contributor

Thoughts:

  • If a pin is missing, all its parents should show "error" state. (prob "error_indirect", "error_recursive" etc.) because that whole tree (the whole pin set) is compromised. GC cannot proceed then.
  • It should never be possible to remove a pin who has a pinned parent (force or not, errored or not). All parents must be unpinned before.
  • Indirect pins without parents (orphans - both in error or not) should only be removed with a --force option when there are other pins in error state (otherwise it is just like GC-ing).
  • Removal of pins in error state should follow the same rules as now (their children have a pinned parent then they remain pinned).

The whole idea is using --force for orphans (indirect pins without parents). For everything else, you need to unpin from the root. It it has a known pinned parent, there should be no way to delete a pin, even when it is in error state because some child is missing.

@kevina
Copy link
Contributor Author

kevina commented Mar 23, 2017

I may be misunderstanding how pins are designed to work but @hsanjuan and to some extent @whyrusleeping concern strike me as odd. For clarity lets assume we have this:

       A
    b     C
  D  e     f
g

where nodes in uppercase are recursive pins. I don't see any problem with deleting the recursive pin for D or C. The user may not even be aware that D is also indirectly pinned. If there where not aware where the only bad thing that will happen is that D will remain pinned when they expected the data to be unpinned. If the where aware then maybe they where removing redundant pins. In either case I don't see why it should be disallowed.

If somehow the node for b becomes corrupt or missing, I still don't see the problem with deleting the recursive pin for D. Some data could potentially get lost, but as I see the data is already gone, for the indirect pinned node e could be considered lost because b is unreadable and hence we have no way of knowing it is pinned.

@whyrusleeping
Copy link
Member

@kevina my concern is in the case where b is missing and unpinning D causes us to unnecessarily lose g.

Really, when data is borked like this bad things are going to happen. This is about making it as clear as possible to the user what the consequences of this are.

In response to @hsanjuan:

If a pin is missing, all its parents should show "error" state. (prob "error_indirect", "error_recursive" etc.) because that whole tree (the whole pin set) is compromised. GC cannot proceed then.

Agreed, any pin with missing children should should show an error. Though we need to consider what to do in this state, What happens if we have data loss like this and the user needs to fix it and free up space? Blanket disabling GC isnt a good option. We need to have a way for the user to proceed.

It should never be possible to remove a pin who has a pinned parent (force or not, errored or not). All parents must be unpinned before.

I think you mean it should never be possible to remove an indirectly pinned block. In kevins example its perfectly fine to remove the pin 'D' in a normal, non-corrupted, repo.

Indirect pins without parents (orphans - both in error or not) should only be removed with a --force option when there are other pins in error state (otherwise it is just like GC-ing).

The thing is, if their parents are missing, we have no way of telling that they are indirectly pinned.

@hsanjuan
Copy link
Contributor

hsanjuan commented Mar 23, 2017

hence we have no way of knowing it is pinned

Ah ok! I assumed that indirect pins were tracked as such somewhere (as in a shallow dag or something). If this is not true, there is nothing stopping e from being GC'ed.

So yeah, I agree with both of you then.

Though we need to consider what to do in this state

Well, if you don't stop GC data loss may be worse and repair (i.e. running pin add on errored pins) might be slower (and has more chances of not working if a larger part of the pinset has been GC'ed).

@kevina
Copy link
Contributor Author

kevina commented Mar 23, 2017

Blanket disabling GC isnt a good option. We need to have a way for the user to proceed.

With the way pins are currently stored I see disabling GC as the only option to prevent data lose. A user needs to get involved and manually remove the broken pins before GC can be safe again. (@hsanjuan our messages crossed but I thing we agree)

I personally think that indirect pins should somehow be marked directly to prevent these sort of problems. I believe the datastore should be expanded to allow storing metadata with a key, and that the metadata should store a reference count of the number of times a block is pinned. But perhaps that is a discussion for another issue.

@kevina
Copy link
Contributor Author

kevina commented Mar 23, 2017

Okay. I think a non-controversial first step would be to create a command that will simply find all broken pins and present the output in as clear a manor as possible. @whyrusleeping agree.

@whyrusleeping
Copy link
Member

Marking indirect pins is what we used to do. The construction we had then was really inefficient and had a fairly significant disk space overhead.

Just thinking out loud, not proposing anything:
We could have a better notion of pin security if we had a link caching database as @Voker57 proposed. But the obvious downside there is a very significant disk overhead on every single object in the repo.

@kevina Yeah, having a command to print out broken pins would be good. Could be ipfs pin check, or maybe a flag on pin ls?

@Kubuxu
Copy link
Member

Kubuxu commented Mar 23, 2017

In my opinion (with force) flag it should be possible to remove any pin, as lone as user is notified about possible data loss. User might want to even remove the pin that is causing the pin tree to be broken (imagine scenario where lost block can't be recovered and the only option is to remove recursive pin that references lost block).

@Kubuxu Kubuxu removed their assignment Mar 24, 2017
@kevina
Copy link
Contributor Author

kevina commented Mar 24, 2017

@whyrusleeping I am thinking a separate command the command will also be able to point out if there are redundant recursive pins. For example giving my example the output if b is missing could be:

recursive pin A broken
  unable to read child b
recursive pin C ok
  also indirectly pinned by A
recursive pin D ok

there is no way to tell that D is also indirectly pinned by A since b is missing.

This should be fairly self contained and easy to do so I can work on this Friday if you want.

@kevina
Copy link
Contributor Author

kevina commented Mar 28, 2017

@whyrusleeping please note, that this may not be intentional, but right now you can remove a pin even if some of the blocks to that (or another pin) is bad, this is because if the pin exists "pin rm" won't scan and try to explain why, if something like #3600 goes through pin rm should never error due to broken pins.

I tested this by hacking t0087-repo-robust-gc.sh.

@kevina kevina mentioned this issue Mar 28, 2017
@kevina kevina self-assigned this Apr 30, 2017
@magik6k
Copy link
Member

magik6k commented Feb 4, 2019

There is now pin rm --force

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/community-input Needs input from the wider community
Projects
None yet
Development

No branches or pull requests

5 participants