Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adder: Fix multi-file add so it works as expected. #3258

Closed
wants to merge 4 commits into from

Conversation

kevina
Copy link
Contributor

@kevina kevina commented Sep 25, 2016

If neither "-w" or "-r" is specified don't use an mfs-root. Add and
pin each file separately.

Closes #2811

Part of #2634

License: MIT
Signed-off-by: Kevin Atkinson k@kevina.org

@Kubuxu Kubuxu added the status/in-progress In progress label Sep 25, 2016
@kevina kevina added topic/commands:add Topic commands:add need/review Needs a review and removed status/in-progress In progress labels Sep 25, 2016
@whyrusleeping whyrusleeping self-assigned this Sep 28, 2016
@kevina kevina force-pushed the kevina/multi-file-add-fix branch from 2d016a6 to 551f540 Compare September 30, 2016 15:13
…dually

Add (currently failing and marked as such) test that each file
specified on the command line is pinned individually when using
"ipfs add".

License: MIT
Signed-off-by: Kevin Atkinson <k@kevina.org>
@kevina kevina force-pushed the kevina/multi-file-add-fix branch from 8355c74 to 31add0f Compare September 30, 2016 15:20
@kevina
Copy link
Contributor Author

kevina commented Sep 30, 2016

Added test case.

@whyrusleeping
Copy link
Member

whyrusleeping commented Sep 30, 2016

This doesnt handle the case where i ipfs add -r A B and A and B are directories.

I think that instead of skipping using the mfs root, we rework how the 'pin finalize' works. We need to be able to convey what we want eventually pinned to the adder.

Maybe we do a logically separate add for each input? (that seems to make the most sense to me)

@kevina
Copy link
Contributor Author

kevina commented Sep 30, 2016

You are right it doesn't. Handling that case will be very tricky due to how the request is sent to the server. For now, I think we should detect and disallow adding multiple directories at once.

I don't particularly like the idea of using an mfs root when it is not necessary. In a script for the filestore (add-dir, see the README (section Adding all files in a directory) for info on using it) I am adding 1000s of unrelated files (about as many as I can get away with before running into system limits on the size of the command line) at the same time, the mfs root in those cases is completely irrelevant.

@kevina
Copy link
Contributor Author

kevina commented Oct 1, 2016

Also, we talked a lot about this in #2811. I tried one request per file in that commit and it turned out to me more code so I went with what I was currently using.

To see how an alternative GetLinks() is provided see filestore/support/dagservice.go in #2634.

License: MIT
Signed-off-by: Kevin Atkinson <k@kevina.org>
@kevina kevina force-pushed the kevina/multi-file-add-fix branch from 4033959 to b831b3c Compare October 3, 2016 04:11
If neither "-w" or "-r" is specified don't use an mfs-root.  Add and
pin each file separately.

Closes #2811

License: MIT
Signed-off-by: Kevin Atkinson <k@kevina.org>
@kevina kevina force-pushed the kevina/multi-file-add-fix branch from b831b3c to 543f611 Compare October 3, 2016 17:55
@kevina
Copy link
Contributor Author

kevina commented Oct 3, 2016

@whyrusleeping I recently pushed another test case to cover the other problem caused by using an MFS Root as summarized in #2811

For now, I think we should detect and disallow adding multiple directories at once.

Due to the way "-r" is handled this will need to be done outside of the add command. The directory content are already expanded and there is no way for the add command to see which ones where specified on the command line.

@whyrusleeping
Copy link
Member

@kevina the Files object we get in the command request is an array of all the top level files the user is trying to add right?

This needs to be verified, but i think the structure looks something like:

ipfs add -r dir
Files = [
    dir{...}
]

ipfs add file
Files = [
    file={...}
]

ipfs add fileA fileB
Files = [
    fileA = {...}
    fileB = {...}
]

ipfs add -r dirA dirB
Files = [
    dirA = {...}
    dirB = {...}
]

(still need to confirm that, but) If thats the case, we can just iterate over the immediate children of the Files object and run 'add and pin' on each of them individually.

@kevina
Copy link
Contributor Author

kevina commented Oct 4, 2016

@whyrusleeping I did some checks to double check. The Files object in the PreRun is as you described. However, when the request is sent to the server the directory contents are expended and there is no way to tell what was specified on the command line.

For now the easiest thing would be to detect the situation in the PreRun and abort with an clean error. Agreed?

Adding multiple directories without "-w" will not produce the expected
result, so for now it is best to disallow it.

This also added a NumFiles() method to SliceFile.

License: MIT
Signed-off-by: Kevin Atkinson <k@kevina.org>
@kevina
Copy link
Contributor Author

kevina commented Oct 5, 2016

@whyrusleeping, okay, as discussed over IRC, I added a test to disallow multiple directories and filed an new issue.

Hopefully this is good to go.

@kevina kevina added this to the Filestore implementation milestone Oct 19, 2016
@whyrusleeping whyrusleeping added the status/blocked Unable to be worked further until needs are met label Oct 29, 2016
@kevina
Copy link
Contributor Author

kevina commented Nov 8, 2016

@whyrusleeping I would like you to consider giving this pull request some serious considering. Not using a MFS root for single file adds turns out to required for my filestore to function. Here is what happens in a single file add without this patch:

  1. A small file is added (let say the hash is QmXXYZ) and it is added to the blockstore as a FilestoreNode node, so far no problem. But then:
  2. PinRoot is then called, this calls adder.RootNode() that retrieves the same node from the blockstore (with the hash QmXXYZ), however now the node is a ProtoNode.
  3. PinRoot readds the block to the blockstore. If the blockstore is the special filestore blockstore it will error out because it doesn't know what to do with it.

If the MFS root is not used than 2 and 3 do not happen when adding single files and there is no problem.

For now, as a hack, I will ignore a ProtoNode if the block already exists in the filestore.

(And in case there is any confusion I can't just blindly accept FilestoreNode and pass on ProtoNode as that will lead to storing the same block twice, once in the filestore and once in the normal datastore.)

Steps (2) and (3) seam like a horrible waste of time even without the problems it causes the filestore.

@whyrusleeping
Copy link
Member

In which cases will the mfs root actually be a file? I think most cases it will be a directory node, or at least an intermediate node. And thats fine to have in the normal blockstore anyway. You only really care about having leaf nodes (data containing nodes) in the filestore, right?

@kevina
Copy link
Contributor Author

kevina commented Nov 8, 2016

adder.RootNode() will return the hash of the leaf node when adding a single small file with just a Leaf. https://github.com/ipfs/go-ipfs/blob/master/core/coreunix/add.go#L147:

func (adder *Adder) RootNode() (node.Node, error) {
...
    // if not wrapping, AND one root file, use that hash as root.
    if !adder.Wrap && len(root.Links()) == 1 {
        nd, err := root.Links()[0].GetNode(adder.ctx, adder.dagService)
        if err != nil {
            return nil, err
        }

        root = nd
    }
...
}

@kevina
Copy link
Contributor Author

kevina commented Nov 8, 2016

Also, although not strictly relevant to this problem, I also store non-leaf nodes related to a file in the filestore. They are used for maintenance operations and to be able to easily list the files stored in it by only listing the roots. I do not store directory nodes in the filestore.

@whyrusleeping
Copy link
Member

Lets loop back around and see about getting this issue fixed

@magik6k
Copy link
Member

magik6k commented Jun 19, 2019

This code was rewritten at least once since this PR was opened, feel free to reopen if it's still relevant.

@magik6k magik6k closed this Jun 19, 2019
@hacdias hacdias deleted the kevina/multi-file-add-fix branch May 9, 2023 10:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/review Needs a review status/blocked Unable to be worked further until needs are met topic/commands:add Topic commands:add
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi file add does not work as I would expect
4 participants