Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(csv): treat omitted values with no defaults as null in csv #691

Merged
merged 1 commit into from
Jan 8, 2019

Conversation

wolffcm
Copy link

@wolffcm wolffcm commented Jan 8, 2019

Fixes #632.

Done checklist

  • docs/SPEC.md updated
  • Test cases written

panic(fmt.Errorf("invalid value: %s", t.KeyValues[j]))
var v values.Value
if t.KeyValues[j] == nil {
v = values.NewNull(flux.SemanticType(t.ColMeta[idx].Type))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to make it so values.New(nil) works instead of adding a new function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that too, but a values.Value has to have a type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want null values to have a type, so I don't see a way to do it just by passing nil. If New accepted a pointer to a value, then it would be possible by passing a null pointer... but that seems kind of hacky to me.

Copy link
Contributor

@nathanielc nathanielc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one note about how null and default should work together.

csv/result.go Outdated
@@ -397,6 +397,8 @@ func readMetadata(r *csv.Reader, c ResultDecoderConfig, extraLine []string) (tab
return tableMetadata{}, errors.Wrapf(err, "column %q has invalid default value", label)
}
defaultValues[j] = v
} else {
defaultValues[j] = values.NewNull(flux.SemanticType(cols[j].ColMeta.Type))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a constant for the nullValue and set it to "" for now? We want to make it clear how the null value interacts with the default value. The if statement above it would become switch statement with three cases:

switch defaults[j] {
    case nullValue:
        // Set explicit null as default
        defaultValues[j] = values.NewNull(flux.SemanticType(cols[j].ColMeta.Type))
    case "":
        // Do nothing there is no default
    default:
        v, err := decodeValue(defaults[j], cols[j])
        // ...
}

Technically the second case will not be possible until nullValue is dynamic. But it makes it clear that its still possible to not have a default value.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't write the code like you've done it here because Go complains about duplicate cases, but I will add nullValue and comments to this effect.

csv/result.go Outdated
@@ -1104,23 +1104,38 @@ func encodeValue(value values.Value, c colMeta) (string, error) {
}
}

func encodeValueFrom(i, j int, c colMeta, cr flux.ColReader) (string, error) {
func encodeValueFrom(i, j int, c colMeta, cr flux.ArrowColReader) (string, error) {
var v string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the nullValue constant here too

Copy link
Contributor

@aanthony1243 aanthony1243 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, I have 2 comments about how we might enhance this more, to make the code more robust. but that shouldn't stop this PR.

@@ -1192,6 +1192,10 @@ func (b *ColListTableBuilder) SetValue(i, j int, v values.Value) error {
}

func (b *ColListTableBuilder) AppendValue(j int, v values.Value) error {
if v.IsNull() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so useful and safety, I want to propose making AppendBool, AppendInt, etc. all private functions, and forcing us to use AppendValue(j, values.New(false)), etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we'll still want to expose the methods that accept raw types for performance reasons, since values.Value uses interface{} and reflection under the hood. Were it not for this I would totally agree with you.

@@ -148,6 +152,13 @@ func New(v interface{}) Value {
}
}

func NewNull(t semantic.Type) Value {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great. I'm getting closer to voting that we can only add values.Value types to table builders, to make sure we don't have any errors of omission re: Nulls.

Copy link
Contributor

@nathanielc nathanielc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. When we add support for the null annotation we can revisit some of these edge cases.

@wolffcm wolffcm merged commit 9fb276b into master Jan 8, 2019
@wolffcm wolffcm deleted the feat/csv-nulls branch January 8, 2019 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants