-
Notifications
You must be signed in to change notification settings - Fork 73
JSON reading: unified numbers #1073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…nifyNumbers argument. Used by json reading
7fe4d6c
to
b07b7a1
Compare
b07b7a1
to
0c25a04
Compare
@Jolanrensen may be add |
definitely, that sounds like a good idea :) I will make it |
# Conflicts: # core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/util/deprecationMessages.kt
14b7321
to
576c65b
Compare
@@ -383,7 +410,7 @@ class JsonTests { | |||
).alsoDebug("df:") | |||
|
|||
val res = DataFrame.readJsonStr(df.toJson()).alsoDebug("res:") | |||
res shouldBe df | |||
res shouldBe df.convert { colsOf<Double?>() }.toFloat() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that we are reducing Double -> Float what is possible, but why not Float -> Double as a most common type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no number unification happening in this test at all; in fact, we're not reducing Double to Float on purpose, however because we're writing a Double
"1.0" and "3.0" to JSON it can be read back from JSON as Float
. If any of the numbers were too large to fit in a Float
, or if the column contained both floats and integers, the result would have been Double
.
v.floatOrNull != null -> collector.add(v.float) | ||
|
||
v.doubleOrNull != null -> collector.add(v.double) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zaleslaw It's because of this change that we can now get floats out of json too instead of just doubles.
Fixes #557
Helps #961 by preventing
Number
columns from appearing in the first place, and by making it easier to create columns with unified number types.Makes behavior also more consistent with other reading options, like CSV.
Float
. This was done for consistency with old behavior in the past (double parsing was checked before float parsing), but it can halve the memory usage of DataFrame in a lot of cases.unifyNumbers
parameter toguessValueType
andcreateColumnGuessingType
.unifyNumbers = true