Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/commands/ls: wrap NewDirectoryFromNode error #5166

Merged
merged 1 commit into from
Aug 28, 2018
Merged

core/commands/ls: wrap NewDirectoryFromNode error #5166

merged 1 commit into from
Aug 28, 2018

Conversation

schomatis
Copy link
Contributor

@schomatis schomatis commented Jun 28, 2018

Fixes #5165.

@schomatis schomatis requested a review from Kubuxu as a code owner June 28, 2018 13:36
@ghost ghost assigned schomatis Jun 28, 2018
@ghost ghost added the status/in-progress In progress label Jun 28, 2018
@schomatis schomatis added topic/docs-ipfs Topic docs-ipfs topic/files Topic files topic/UnixFS Topic UnixFS and removed status/in-progress In progress labels Jun 28, 2018
@schomatis schomatis added this to the Files API Documentation milestone Jun 28, 2018
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more correct to say "dag node is not a unixfs directory"?

@schomatis
Copy link
Contributor Author

Depends on how you see the IPFS stack I guess, in my mental model (I haven't seen any literature supporting or contradicting this, would love to hear how you think) I see UnixFS as a different (upper) layer on top of the DAG (node) layer.

For me the node it's always a node (not a directory or any other entity for that matter), it just happens to carry data that the UnixFS layer knows how to interpret (as a directory in this case). Many parts of the code refer to nodes as a file system objects (e.g., FSNode) but (as you recently showed me) a directory doesn't neccessarily need to be a single node, it can be partitioned in any DAG topology, so I don't equate DAG nodes with files or directories.

@Stebalien
Copy link
Member

"is" isn't the same as "is and only is". I can say "this node is a node and a directory and a unixfs node".

However, I misread the patch. This is the case where the error is something other than "not a dir". I'd say something like: fmt.Errorf("invalid unixfs node %s at %q: %s", dagnode.Cid(), paths[i], err). That is, give the user something they can understand (and something we can understand when they complain).

@schomatis
Copy link
Contributor Author

"is" isn't the same as "is and only is".

Yes, I understand and agree with that, but

"this node is a node and a directory and a unixfs node"

what I meant before was that I can't even use the comparison "is" (or "isn't") between UnixFS and node (DAG) because they are in different categories (layers), one contains the other (or doesn't contain it). Associating them together like in

fmt.Errorf("invalid unixfs node

makes me think that there may be something wrong with the DAG node, e.g., the digest doesn't match, the encoding isn't supported, etc., when the error is actually at the UnixFS layer, and only there, the DAG node is valid, the UnixFS format found inside the DAG node was invalid. My objection is about associating the term node with UnixFS (#5058). I understand that colloquially we can just say the term unixfs node and mean all said above but I'd rather show the (new) user a more informative description (e.g., "invalid UnixFS directory format inside DAG node") to help form a more clear model of what's happening inside IPFS.

However, I can't put my foot down here as this was just my interpretation when I was reading the code, and unixfs node may be the correct (or de facto) term, I don't want to force my own view upon the reader so I can just use that term in the error message if you deem it the most appropriate.

dagnode.Cid(), paths[i], err). That is, give the user something they can understand (and something we can understand when they complain).

👍 to adding more information to the error.

@Stebalien
Copy link
Member

UnixFS and node (DAG) because they are in different categories (layers), one contains the other (or doesn't contain it).

We call them all nodes. For example, at the filesystem level, you just have "files". However, the file browser will likely interpret these "files" at a higher layer as "JPEG", "PNG", "DOC", etc. UnixFS Node just means an IPLD Node that's a valid (wellformed) node that can appear in a UnixFS filesystem tree.

makes me think that there may be something wrong with the DAG node, e.g., the digest doesn't match, the encoding isn't supported, etc., when the error is actually at the UnixFS layer, and only there, the DAG node is valid, the UnixFS format found inside the DAG node was invalid.

Exactly. The node is a valid IPLD Node, just not a valid UnixFS Node.

My objection is about associating the term node with UnixFS (#5058).

IIRC, we used to call them "objects" however, that comes from the day when we didn't have IPLD. Objects refer to MerkleDAG (DagProtobuf) IPLD Nodes. However, UnixFS can now have, e.g., Raw IPLD Nodes and, in the future, will have other IPLD nodes as well.

Now, we could continue to use the term object or unixfs object to refer to IPLD Nodes used by UnixFS, but I'd rather not add even more terminology.

"invalid UnixFS directory format inside DAG node"

To me, that's even more confusing. You don't say "invalid PNG format inside file", you say "file isn't a valid PNG (file)".

@ghost ghost added the status/in-progress In progress label Jul 1, 2018
@schomatis schomatis removed the status/in-progress In progress label Jul 1, 2018
@schomatis
Copy link
Contributor Author

@Stebalien Corrected the error message to the text you suggested.


To me, that's even more confusing. You don't say "invalid PNG format inside file", you say "file isn't a valid PNG (file)".

The PNG is a file format encoded inside a file, it's not a file itself. But yes, I definitely don't use the convoluted "invalid PNG format inside file" phrase because I already have a clear mental model of how files work and how we use file formats to represent different entities.

To use short phrases that omit information (already known by us but not -IMO- by the standard user) we need to provide the user (a lot of) education about what we're talking about.

@schomatis
Copy link
Contributor Author

/cc @Mr0grog

@schomatis
Copy link
Contributor Author

but I'd rather not add even more terminology.

I personally would like to, IPFS is by no means an easy concept to grasp, and names (if used correctly) help differentiate separate (even if closely related) concepts.

@Stebalien
Copy link
Member

The PNG is a file format encoded inside a file, it's not a file itself. But yes, I definitely don't use the convoluted "invalid PNG format inside file" phrase because I already have a clear mental model of how files work and how we use file formats to represent different entities.

But a PNG file is a the file itself. It's a file, but, more specifically, it's a PNG file. In this case, we have an IPLD Node. However, more specifically, it's a unixfs node.

To use short phrases that omit information (already known by us but not -IMO- by the standard user) we need to provide the user (a lot of) education about what we're talking about.

Unnecessary information is a guaranteed way to confuse a user.

I personally would like to, IPFS is by no means an easy concept to grasp, and names (if used correctly) help differentiate separate (even if closely related) concepts.

In this case, it'll just confuse users. We want to make it clear that UnixFS Nodes are just a subclass of IPLD Nodes. If we switch to a term like object, we loose this association.

@Stebalien
Copy link
Member

To be more explicit, we could say "ipld node $cid at $path is not a valid unixfs node". That would indicate that it's a valid IPLD node but not a valid UnixFS node.

@schomatis
Copy link
Contributor Author

Unnecessary information is a guaranteed way to confuse a user.

Good point. 👍

@Mr0grog
Copy link
Contributor

Mr0grog commented Jul 6, 2018

Wow, I’m glad our error messages are get this level of consideration, at least! :P

I think my mental model of this aligns more with @Stebalien’s — a UnixFS node is a MerkleDAG node is an IPLD node (after all, the UnixFS data isn’t just the content of the MerkleDAG node’s Data field, it’s also in the Links field).

But I also agree with @schomatis’s concern about “validity” being confusing here. I get that the only thing that technically separates a UnixFS node from any old IPLD node is its validity/parse-ability, but I think that idea is a confusing and unclear one for someone who is new to IPFS or even just not an expert. And once someone is an expert, they’ll already understand this is about validity anyway.

So, I like @Stebalien’s last formulation, but might remove the word “valid.” I also think the “IPLD node” terminology might be unclear, too, since the general nature of IPLD is still unclear (*cough* ipld/ipld#39 *cough*), so might suggest also dropping that:

node $cid at $path is not a unixfs node

or

the data in $cid at $path is not a unixfs node

or just

$cid at $path is not a unixfs node

If we are also going to keep the underlying error, could we add some more explanatory text? e.g:

$cid at $path is not a unixfs node (while attempting to parse it as one, we encountered the error: $error)

Or more terse:

$cid at $path is not a unixfs node (error parsing node as unixfs: $error)

A couple other minor niggles:

  • Since this is written text, not a package name or symbol in code, shouldn’t the message use capitalization? e.g. “UnixFS node,” not “unixfs node.”

  • Could we drop the “at $path” if the path is just the CID? Or maybe just use the path instead of ever showing the CID at all? ‘node xyz at "xyz"’ seems confusing to me. (But I totally understand how this is useful in the more complex case of ‘node xyz at "/ipfs/abc/some/directory"’.)

@Mr0grog
Copy link
Contributor

Mr0grog commented Jul 6, 2018

(But also, no matter what, the message we have now is a vast improvement. I am 👍 for whatever we go with here!)

@schomatis
Copy link
Contributor Author

Thanks for the input @Mr0grog.

(after all, the UnixFS data isn’t just the content of the MerkleDAG node’s Data field, it’s also in the Links field).

Good point.

but I think that idea is a confusing and unclear one for someone who is new to IPFS or even just not an expert.

Agreed, my main concern here is explaining to the user where is the error located.

but might remove the word “valid.”

👍

I also think the “IPLD node” terminology might be unclear, too, since the general nature of IPLD is still unclear (cough ipld/ipld#39 cough),

☝️ ☝️ ☝️

If we are also going to keep the underlying error, could we add some more explanatory text?

We should keep it, but fixing that needs to be resolved in a separate PR as it is beyond the control of this function (the error may or may not be a parse error, we can't assume that, the called function has to clarify it).

shouldn’t the message use capitalization? e.g. “UnixFS node,” not “unixfs node.”

Agreed, "UnixFS" is how we're introducing the concept to the user in the documentation (well, if we had UnixFS documentation, getting there :))

Could we drop the “at $path” if the path is just the CID? Or maybe just use the path instead of ever showing the CID at all? ‘node xyz at "xyz"’ seems confusing to me.

Yes, it is confusing, there should be an error processing function (maybe there is and I don't know it) that takes care of things like omitting a node's path if it's equal to the CID. That should be handled elsewhere.

(But also, no matter what, the message we have now is a vast improvement. I am for whatever we go with here!)

Yes, let's get this merged, I'm pushing an error message with your suggestions, any error message is definitely better than no error message here, I have a few more questions but I'll continue the discussion in #5058, as most of what we've talked about here exceeds this particular case.

@schomatis schomatis closed this Jul 9, 2018
License: MIT
Signed-off-by: Lucas Molas <schomatis@gmail.com>
@schomatis
Copy link
Contributor Author

Accidentally closed this branch.

@schomatis schomatis reopened this Jul 9, 2018
@ghost ghost added the status/in-progress In progress label Jul 9, 2018
@schomatis
Copy link
Contributor Author

schomatis commented Jul 9, 2018

So, I finally changed the "node" term to "directory", as this is what the command is expecting from NewDirectoryFromNode to lists its contents and is a concept much more familiar to the user than "node". The rest of the error information is still there.

@schomatis
Copy link
Contributor Author

@Stebalien PTAL

@Stebalien Stebalien merged commit 2362c6d into ipfs:master Aug 28, 2018
@ghost ghost removed the status/in-progress In progress label Aug 28, 2018
@schomatis schomatis deleted the fix/commands/ls-error-unixfs branch August 29, 2018 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic/docs-ipfs Topic docs-ipfs topic/files Topic files topic/UnixFS Topic UnixFS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants