-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement basic filestore 'no-copy' functionality #3629
Conversation
6176525
to
f49f52e
Compare
filestore/filestore.go
Outdated
if has { | ||
return nil | ||
} | ||
|
||
switch b := b.(type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whyrusleeping I don't know what you have in mind for the filestore. But I very specially designed my filestore so that writes go though. This way if a file is moved it can simply be readded and any invalid blocks are automatically fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That becomes problematic, What then happens if i add dataset A, which contains some subset C. Then later, i add dataset B that also contains some subset C. Then i change my mind and remove dataset B. By your description, the original dataset references would now be lost, with no indication to the user that this has happened.
I think this scenario is worse UX than the reverse (deleting A causing bad refs in B). In either case, repairs are going to have to be manual, and some things will get sticky.
cc @jbenet for his thoughts on this scenario.
From an inline comment above:
Yes this was originally a problem. In my implementation I fixed this. See ipfs-filestore#12 and ipfs-filestore#23. However, that solution also requires writes to go through.
Not if my solution in ipfs-filestore#12 is implemented.
|
Another option is to only replace the block if it is invalid. I considered that, but decided that it was better to support allowing multiple files to point to the same block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One major thing I noticed. A few other minor things that I comment on later.
filestore/fsrefstore.go
Outdated
out, err := f.readDataObj(&dobj) | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly recommend you verify the block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah. NewBlockWithCid
only does that check if u.Debug
is true. good catch
I see that "--nocopy" is allowed when the daemon is online without any sort of additional checks. If the blocks are ever not verified this could become a security problem. Even without the security problems this could lead to strange results if files are added on another machine with an identical path. |
For this to be usable outside of |
var FilestorePrefix = ds.NewKey("filestore") | ||
|
||
type FileManager struct { | ||
ds ds.Batching |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note. In my early implementations I wrote the filestore on top of the datastore also. However, I changed this to directly use the leveldb as I found little value in the abstraction and extra layers. What actually prompted it was ipfs-filestore#10. In particular querying the datastore was very very slow, however I then latter find it very useful to use the features of the LevelDB in order to provide safe verification of the filestore while the deamon was running by using the LevelDB snapshot feature. With my recent improvements to the datastore query the performance may no longer be a problem, but I still see little value in the extra indirection.
@whyrusleeping concerning replacing blocks. How about for now we make it an option? There needs to be a way to fix broken blocks due to files moving. One option would be to remove the block and then readd it, but pinning makes that a more complicated operation. As far as allowing multiple blocks for a hash (ipfs-filestore#12 and ipfs-filestore#23), I agree that we should aim for that but to start with it might be a bit much to review. |
filestore/fsrefstore.go
Outdated
return nil, err | ||
} | ||
|
||
out := make(chan *cid.Cid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest you use a buffered channel. Use dsq.KeysOnlyBufSize.
filestore/filestore.go
Outdated
return nil, err | ||
} | ||
|
||
out := make(chan *cid.Cid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest you use a buffered channel. Use dsq.KeysOnlyBufSize.
d9c40b5
to
299f5c9
Compare
Is this branch ready for me to test on my 30 TB of webcam videos or should I wait? |
@jefft0 Its almost ready, should be merged in a couple days. |
299f5c9
to
1a94991
Compare
Rebased, waiting on a 👍 from @Kubuxu and CI |
At @jefft0 I am really not sure how usable this code is going to be for you as it does not allow for files "outside the root handler". For example if you directory layout is:
and you want to share files in /aux/media you won't be able to and will get an error. |
@whyrusleeping I would like this comment, to at least be addressed:
|
@kevina @jefft0 In this case, simply set your @kevina Re the comment, blocks are checked now (thanks for pointing that out), and this feature should not be used with client and daemon on separate machines. We can probably implement a check of some sort for this... any ideas on how to do that off the top of your head? |
License: MIT Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT Signed-off-by: Jeromy <jeromyj@gmail.com>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Jeromy <jeromyj@gmail.com>
1aaf351
to
9ec4c25
Compare
I'm confused. I have several webcam videos totalling 60 GB. With a fresh installation of go-ipfs (master branch), I did |
@whyrusleeping anything on this? |
Ah, theres a check i forgot to add, erroring out if @jefft0 take a look at "How to enable" here: #3397 (comment) |
@whyrusleeping Much better, thanks! Now |
@jefft0 You can report them here, or open a new issue to discuss. Either way :) |
@whyrusleeping I want to keep my main repo in |
@jefft0 hrm... unsure about the implications of this. I'll have to think about the security model of using/allowing symlinks like that. My first thought though is that it should be fine |
@jefft0 you can also put the symbolic link in just |
every time I get
|
@atommixz Wanna open a new issue for this? |
This helps to me:
This is don't make sence to me |
I'm trying to download many images from filestore. Everytime it stops at the same place. Last commit on two pc.
|
Can you report it in separate issue. It is much easier to track this way. |
whyrusleeping said "You can report them here, or open a new issue to discuss. Either way :)" |
I think he meant results as it, it worked, performance, small problems. |
Yeah, Keeping perf and general 'how it went' things here is fine, but actual problems need to have their own issues so we can address them |
As per this design doc: https://gist.github.com/whyrusleeping/5565651011b49ddc9ddeec9ffd169050
License: MIT
Signed-off-by: Jeromy why@ipfs.io