-
Notifications
You must be signed in to change notification settings - Fork 71
Added some name repairing strategies #386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
package org.jetbrains.kotlinx.dataframe.io | ||
|
||
/** | ||
* This strategy defines how the repeatable name column will be handled | ||
* during the creation new dataframe from the IO sources. | ||
*/ | ||
public enum class NameRepairStrategy { | ||
/** No actions, keep as is. */ | ||
DO_NOTHING, | ||
|
||
/** Check the uniqueness of the column names without any actions. */ | ||
CHECK_UNIQUE, | ||
|
||
/** Check the uniqueness of the column names and repair it. */ | ||
MAKE_UNIQUE | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
package org.jetbrains.kotlinx.dataframe.io | ||
|
||
/** | ||
* This strategy defines how the repeatable name column will be handled | ||
* during the creation new dataframe from the IO sources. | ||
*/ | ||
public enum class NameRepairStrategy { | ||
/** No actions, keep as is. */ | ||
DO_NOTHING, | ||
|
||
/** Check the uniqueness of the column names without any actions. */ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I assume this would then throw an exception? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The proposed solution don't do that, and throwing of exception is happened in totally another place, this is why I could not guarantee that it happens (but for now happens) |
||
CHECK_UNIQUE, | ||
|
||
/** Check the uniqueness of the column names and repair it. */ | ||
MAKE_UNIQUE | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear that it targets duplicate names just by looking at the enum class name. Maybe call it
DuplicateNameStrategy
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In many dataframe frameworks it's named as NameRepairing, better to keep naming to reduce learning curve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "repairing" insinuates a name is "broken", e.g. unsupported characters etc. This is definitely different IMO. Can you give an example from another library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://readr.tidyverse.org/reference/read_delim.html#arguments
go to the name_repair
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see! Well, if in the future we also add "universal", which fixes name syntax too, I'm fine with the name. Then it won't be just about uniqueness