New Vocabulary for Datasets #422

Closed
dustmop opened this issue May 30, 2018 · 1 comment · Fixed by #462

dustmop commented May 30, 2018

Some confusion is unavoidable whenever we talk about data (what "data" is, what the word means, how it differs from similar terms). Even so, I believe we can improve some of the terms used in Qri to help tell better stories. One particular example involves Dataset, which is distinct from DatasetPod, which itself has a field called Data. DatasetPod is part of the recent move to naming things "Pod" when they are Plain-Old-Data: types built only from basic types that can be trivially serialized. Yet both it and Dataset are still distinct from the field Data, and it's possible to have either a Dataset or a DatasetPod without any Data, which is strange.

Furthermore, other terms like Meta and Structure also describe Datasets, but in different ways. It's important to have good explanations of what each of these terms means relative to the others.

My suggestion is to strongly embrace analogies to HTML, a well-understood and widely discussed technology. The main change is to rename the field Data to Body: an HTML document may have meta tags, but the body is the main thing a consumer cares about reading. This leads to the following comparisons in Qri:

  • Structure :: XSD (XML Schema Definition, describes an XML / HTML file's structure)
  • Meta :: meta tags (contains descriptions, keywords, search engine stuff)
  • Body :: body tag (the document's real contents)
  • Dataset :: webpage
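
To make the comparison concrete, here is a minimal Go sketch of how the pieces might fit together. The type and field names below (Meta.Title, Structure.Format, HasBody, and so on) are hypothetical and simplified for illustration; they are not Qri's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NOTE: Meta, Structure, and Dataset are hypothetical, simplified types
// written for this example. They are not Qri's real definitions.

// Meta plays the role of HTML meta tags: descriptive information.
type Meta struct {
	Title    string   `json:"title,omitempty"`
	Keywords []string `json:"keywords,omitempty"`
}

// Structure plays the role of a schema: it describes the body's shape.
type Structure struct {
	Format string `json:"format,omitempty"`
}

// Dataset plays the role of the webpage. Body is the main content and
// may be absent, for example when only a reference is available.
type Dataset struct {
	Meta      *Meta           `json:"meta,omitempty"`
	Structure *Structure      `json:"structure,omitempty"`
	Body      json.RawMessage `json:"body,omitempty"`
}

// HasBody reports whether the dataset carries any body content.
func (d *Dataset) HasBody() bool {
	return len(d.Body) > 0
}

func main() {
	// A dataset known only by its metadata: a "Dataset without a Body".
	ref := &Dataset{Meta: &Meta{Title: "test_name"}}

	// The same dataset once its body has been loaded.
	full := &Dataset{
		Meta:      &Meta{Title: "test_name"},
		Structure: &Structure{Format: "json"},
		Body:      json.RawMessage(`[1,2,3]`),
	}

	fmt.Println("ref has body:", ref.HasBody())
	fmt.Println("full has body:", full.HasBody())
}
```

With this shape, a dataset that has Meta and Structure but no Body is an ordinary, valid value rather than a contradiction in terms.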

Most importantly, this last point reveals that we can talk about Datasets that don't have bodies, perhaps because they haven't been loaded yet, or because we only have a reference to them. This is similar to the HTTP method HEAD, which retrieves information about a webpage without getting the full body. No more talking nonsensically about "Datasets without Data"; saying instead "Datasets without Bodies" makes much more sense.
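
For anyone who hasn't used HEAD directly, the following Go sketch shows the behavior the analogy relies on. It uses only the standard library; the helper names fetch and newTestServer are made up for this example, and the server is a throwaway local one, not anything from Qri.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetch (a hypothetical helper) sends a request with the given method and
// returns the Content-Type header plus the number of body bytes received.
func fetch(method, url string) (string, int, error) {
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		return "", 0, err
	}
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", 0, err
	}
	defer res.Body.Close()

	body, err := io.ReadAll(res.Body)
	if err != nil {
		return "", 0, err
	}
	return res.Header.Get("Content-Type"), len(body), nil
}

// newTestServer (also hypothetical) starts a local server that always
// answers with a small JSON document.
func newTestServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `[1,2,3]`)
	}))
}

func main() {
	srv := newTestServer()
	defer srv.Close()

	// HEAD returns the headers only; GET returns headers and body.
	for _, method := range []string{http.MethodHead, http.MethodGet} {
		contentType, n, err := fetch(method, srv.URL)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Printf("%s: content-type=%s, body bytes=%d\n", method, contentType, n)
	}
}
```

The HEAD response still says what kind of content lives at the URL, it just doesn't carry the content itself, which is the same situation as a Dataset whose Body hasn't been loaded.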

dustmop commented May 30, 2018

Some implications:

qri add --data=stuff.json

should become:

qri add --body=stuff.json

And

qri data me/test_name

should become:

qri body me/test_name

or

qri getbody me/test_name

maybe even

qri get me/test_name

which again resembles HTTP.
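
As a rough illustration of what the flag rename means for callers, here is a small Go sketch using the standard library's flag package. This is an assumption-laden toy: parseAdd is a hypothetical function, and Qri's real command-line handling may be built quite differently.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseAdd is a hypothetical parser for the arguments of an "add"
// subcommand after the rename. It returns the path given via --body.
func parseAdd(args []string) (string, error) {
	fs := flag.NewFlagSet("add", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	body := fs.String("body", "", "path to the dataset body file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *body, nil
}

func main() {
	// The new spelling is accepted.
	path, err := parseAdd([]string{"--body=stuff.json"})
	fmt.Println("body path:", path, "error:", err)

	// The old spelling is no longer a known flag, so parsing fails.
	_, err = parseAdd([]string{"--data=stuff.json"})
	fmt.Println("old flag error:", err)
}
```

In this sketch the old `--data` spelling is rejected outright, so any script still using it would need updating.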

@b5 b5 added this to the 0.4.1 milestone Jun 8, 2018
b5 added a commit to qri-io/dataset that referenced this issue Jun 11, 2018
After a bunch of discussion, we've decided to rename "Data" to "Body", our logic is outlined here: qri-io/qri#422. Basically it's just a rename, but it renames a required field. This'll break all sorts of stuff, but considering we're still _very_ early days, I'd rather get it merged now.
To me this dataset-as-html-page metaphor is worth a breaking change, so let's make one.

BREAKING CHANGE: dataset.DataPath is renamed to dataset.BodyPath.
b5 added a commit that referenced this issue Jun 13, 2018
closes #422

BREAKING CHANGE: this change will break hashes. `dataPath` is now `bodyPath`.
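
For readers wondering why a field rename breaks hashes: any hash computed over the serialized dataset changes when a field name in that serialization changes, even if the value is identical. The Go sketch below demonstrates this with SHA-256 over JSON. Both the hashing scheme and the oldDataset / newDataset types are assumptions made for illustration; Qri's actual content addressing may work differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// oldDataset and newDataset are hypothetical stand-ins for the serialized
// form before and after the rename. Only the JSON field name differs.
type oldDataset struct {
	DataPath string `json:"dataPath"`
}

type newDataset struct {
	BodyPath string `json:"bodyPath"`
}

// hashJSON serializes v to JSON and returns the hex-encoded SHA-256 of
// the resulting bytes. Using SHA-256 over JSON is an assumption here.
func hashJSON(v interface{}) (string, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	before, err := hashJSON(oldDataset{DataPath: "/path/to/body"})
	if err != nil {
		fmt.Println("hashing failed:", err)
		return
	}
	after, err := hashJSON(newDataset{BodyPath: "/path/to/body"})
	if err != nil {
		fmt.Println("hashing failed:", err)
		return
	}

	fmt.Println("before:", before)
	fmt.Println("after: ", after)
	fmt.Println("hashes match:", before == after)
}
```

The path value is the same in both cases; only the key differs, and that alone is enough to produce a different hash.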
@ghost ghost assigned b5 Jun 13, 2018
@b5 b5 closed this as completed in #462 Jun 13, 2018
@ghost ghost removed the in progress label Jun 13, 2018