Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation error thrown for number that should be recognized as a number? #290

Closed
nathantchan opened this issue Mar 2, 2018 · 2 comments · Fixed by #306
Closed

Validation error thrown for number that should be recognized as a number? #290

nathantchan opened this issue Mar 2, 2018 · 2 comments · Fixed by #306
Assignees
Labels
bug code that is not behaving as expected
Milestone

Comments

@nathantchan
Copy link

I have a csv with two columns, the first of which is the number of seconds since the unix epoch began, with nanosecond decimals, and a float after it.

timestamp, value
1519955200.898057929, 1.0
1519955200.899679433, 2.0
1519955200.900753377, 3.0

I added it to qri with the following command:

qri add --data <file.csv> <dataset name>

It threw a validation error:

qri validate <dataset name>
0: expected "1519955200.898057929" to be of type number

Seems like this should be recognized as a number, though.

@b5 b5 added the bug code that is not behaving as expected label Mar 2, 2018
@b5 b5 added this to the 0.2.0 milestone Mar 2, 2018
@b5
Copy link
Member

b5 commented Mar 2, 2018

right you are, this is most likely a bug in our csv schema detection. Given that you've provided example data we have everything we need to replicate see if we can't ship a fix in the next release (we're aiming for Monday).

Thanks for filing @nathantchan!

@ramfox
Copy link
Member

ramfox commented Mar 7, 2018

Great! So on further investigation, it looks like this is an issue with the dsio package.

In order to use the jsonschema validator on CSV data, we manipulate the csv data into a slice of bytes using the dsio.NewValueReader and dsio.NewValueBuffer functions.

NewValueReader, unfortunately, does not "encode" the way json.Marshal does. So, we end up with an array of strings, rather than an array of the most naturally fitting type. This creates the weird error in jsonschema. The schema specifies that it should be a number, but the data passed in has encoded it as a string.

b5 added a commit that referenced this issue Mar 9, 2018
So, turns out a *bunch* of stuff needed fixing when it comes to valiation.
We've gone through and cleanup up the following to make validation a better experience.

jsonpointer:
* added a new "Descendant" method on a pointer to generate child paths

jsonschema:
* complete overhaul of Validation interface to support multiple errors, consistent error
  formatting, and added a jsonpointer path to the location of each error

dataset/dsio:
* added a new "Entry" struct to represent top-level entries in a dataset
* overhauled dsio to remove vals package. dsio now works with go prinitive types
* CSVReader has been completely reworked to convert entry rows to the types specified
  by the Structure schema of the reader (this was the root cause of #290)

dataset/validate:
* added a new Data method that consumes a dsio.EntryReader to validate data that auto-converts
  incoming data to JSON for validation (a necessary step to work with JSONSchemas)

dataset/dsfs:
* dsfs.CreateDataset now uses validate.Data

dataset:
* updated all packages to use new dsio interfaces

qri/core:
* dataset.Validate now is mainly a wrapper around validate.Data

This sets the stage for some exciting stuff in the future, and makes qri work properly / better today.
I'm particularly excited to work with the new dsio Reader/Writer interfaces, as they set the stage for
filtering & sorting subsections of datasets on the fly

closes #290
@ghost ghost assigned b5 Mar 9, 2018
@ghost ghost added the in progress label Mar 9, 2018
@b5 b5 closed this as completed in #306 Mar 9, 2018
@ghost ghost removed the in progress label Mar 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug code that is not behaving as expected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants