-
Notifications
You must be signed in to change notification settings - Fork 71
Added some name repairing strategies #386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Need to say, that this is a solution for one IO only to solve user needs, but it's a good idea to make it generic and move later to API module to be available for other IO, because it's not a guarantee that other IO can handle properly the empty names, for example |
* This strategy defines how the repeatable name column will be handled | ||
* during the creation new dataframe from the IO sources. | ||
*/ | ||
public enum class NameRepairStrategy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear that it targets duplicate names just by looking at the enum class name. Maybe call it DuplicateNameStrategy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In many dataframe frameworks it's named as NameRepairing, better to keep naming to reduce learning curve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "repairing" insinuates a name is "broken", e.g. unsupported characters etc. This is definitely different IMO. Can you give an example from another library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://readr.tidyverse.org/reference/read_delim.html#arguments
go to the name_repair
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see! Well, if in the future we also add "universal", which fixes name syntax too, I'm fine with the name. Then it won't be just about uniqueness
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/NameRepairStrategy.kt
Outdated
Show resolved
Hide resolved
/** No actions, keep as is. */ | ||
NO, | ||
|
||
/** Check the uniqueness of the column names without any actions. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this would then throw an exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed solution don't do that, and throwing of exception is happened in totally another place, this is why I could not guarantee that it happens (but for now happens)
dataframe-excel/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/xlsx.kt
Show resolved
Hide resolved
dataframe-excel/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/xlsx.kt
Outdated
Show resolved
Hide resolved
dataframe-excel/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/xlsx.kt
Show resolved
Hide resolved
* when the functionality will be enabled for all IO sources. | ||
*/ | ||
private fun repairNameIfRequired(nameFromCell: String, columnNameCounters: MutableMap<String, Int>, nameRepairStrategy: NameRepairStrategy): String { | ||
return when(nameRepairStrategy) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what problem with linting do you mean? I run it on my machine, there is no problems, also on TC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no space after when
, same with the if
s a bit below. No clue why the linter is not picking those up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will check twice, thank you
dataframe-excel/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/xlsx.kt
Outdated
Show resolved
Hide resolved
Thanks for the review, good points @Jolanrensen |
Fixed #342