Add convert asColumn operation as compiler plugin friendly variant of replace with #1143

koperagen · 2025-04-22T17:27:15Z

I think it's fairly common to perform column-wide operations in an async / parallel context, so having such a compiler-plugin-friendly operation is useful.
Another use case, although not as handy, is creating a ColumnGroup like:
df.convert { col }.asColumn { dataFrameOf("a" to listOf(123), "b" to listOf(321)).asColumnGroup() }

zaleslaw · 2025-04-23T11:32:37Z

docs/StardustDocs/topics/convert.md

+
+```kotlin
+df.convert { name }.asColumn { col ->
+    col.toList().parallelStream().map { it.toString() }.collect(Collectors.toList()).toColumn()


Parallel streams should be avoided in single-threaded libraries because they can introduce race conditions, synchronization issues, and unnecessary overhead.

However, in this case, the use of parallelStream is localized and safe, as it only transforms column values without affecting global state or column names.

But! Even though this use of parallelStream is logically safe, it can unexpectedly increase CPU load, especially on weaker machines. Parallel streams use the shared ForkJoinPool, which may cause performance issues if the system has limited resources or is already running other parallel tasks. This can lead to slowdowns or contention for threads, impacting the overall responsiveness of the application.
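One common way to limit the load described here is to run the parallel stream inside a small dedicated pool instead of the shared one. The sketch below uses only the JDK; `mapInBoundedPool` is a hypothetical helper, not part of the DataFrame API, and it relies on the widely used but not formally guaranteed behaviour that a parallel stream started from a task inside a `ForkJoinPool` runs in that pool.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ForkJoinPool
import java.util.stream.Collectors

// Hypothetical helper (an assumption for illustration, not a DataFrame API):
// maps values in parallel using a dedicated pool with a fixed parallelism,
// so the work does not compete for threads in the shared common pool.
fun <T, R> mapInBoundedPool(values: List<T>, parallelism: Int, transform: (T) -> R): List<R> {
    require(parallelism > 0) { "parallelism must be positive" }
    val pool = ForkJoinPool(parallelism)
    try {
        // The stream is started from a task submitted to `pool`; in practice
        // it then executes on that pool's worker threads.
        val task = Callable<List<R>> {
            values.parallelStream().map { transform(it) }.collect(Collectors.toList())
        }
        return pool.submit(task).get()
    } finally {
        // Always release the worker threads, even if the transform throws.
        pool.shutdown()
    }
}
```

The resulting list could then be turned into a column inside asColumn, the same way the snippet under review does with `toColumn()`.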

When running in Kotlin Notebooks, parallel streams can compete with notebook execution and UI rendering for limited CPU resources. This may cause lags or freezes, especially in constrained environments like containers or shared servers. Therefore, parallelism in notebooks should be used carefully to avoid degrading the interactive experience.

koperagen · 2025-04-23 (edited)

Usually I wouldn't recommend using parallel streams for trivial operations, but there are situations where they're needed. For example, I once used a library that performs IO and parses a file into a kind of AST:

df.add("data") { Library.parse(file) }

In my case, single-threaded execution took literally minutes of real time, with the CPU loaded at only 5%. Switching to parallel execution made it 20 times faster (#723). In any case, we should minimize the number of operations that the plugin can't understand. replace with is one of them; convert asColumn would be an alternative should users need it.
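As a rough illustration of the IO-bound case, the sketch below computes one value per input in parallel with plain JDK streams. `slowParse` is a hypothetical stand-in for a call like `Library.parse(file)`, and `parseAll` is an assumed helper name; neither comes from the PR.

```kotlin
import java.util.stream.Collectors

// Hypothetical stand-in for an expensive, IO-bound call such as Library.parse(file).
fun slowParse(path: String): Int {
    Thread.sleep(50) // simulates time spent waiting on IO rather than using the CPU
    return path.length
}

// Computes one value per input in parallel. Collecting an ordered stream
// into a list keeps the input order, so the values still line up with rows.
fun parseAll(paths: List<String>): List<Int> =
    paths.parallelStream().map { slowParse(it) }.collect(Collectors.toList())
```

Because the result keeps the input order, it can be attached back to the rows it was computed from. The actual speedup depends on how much of the time is spent waiting on IO.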

Jolanrensen · 2025-04-23T12:33:35Z

How is asColumn different from public fun <T, C> Convert<T, C>.to(columnConverter: DataFrame<T>.(DataColumn<C>) -> AnyBaseCol): DataFrame<T> we already have? Just an extra type R right?

Then I'd either add R to convert.to { } or deprecate to {} in favor of the better named asColumn {}. wdyt?

Jolanrensen · 2025-04-23T12:37:35Z

Actually, why don't we just add the type R to ReplaceClause.with if you want compiler plugin support for it?

core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/convert.kt

core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/replace.kt

plugins/kotlin-dataframe/src/org/jetbrains/kotlinx/dataframe/plugin/impl/api/convert.kt

koperagen · 2025-04-23T14:49:45Z

Then I'd either add R to convert.to { } or deprecate to {} in favor of the better named asColumn {}. wdyt?

The major difference is that name changes are ignored. For example, df.replace { col }.with { produceRandomColumnWithRandomName(it) } is valid code, but impossible for the plugin to interpret. So convert asColumn renames all new columns to the original name.
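The naming difference can be shown with a toy model. This is only a sketch: `Column`, `replaceWith`, and `convertAsColumn` below are hypothetical stand-ins written for illustration, not the library's types or its implementation.

```kotlin
// Toy model only: a hypothetical stand-in for a named data column.
data class Column<T>(val name: String, val values: List<T>)

// Models replace { }.with { }: the result keeps whatever name the lambda
// produced, so the resulting schema cannot be known without running the lambda.
fun <T, R> replaceWith(original: Column<T>, transform: (Column<T>) -> Column<R>): Column<R> =
    transform(original)

// Models convert { }.asColumn { }: the result is renamed back to the original
// name, so only the element type R can change, which is statically known.
fun <T, R> convertAsColumn(original: Column<T>, transform: (Column<T>) -> Column<R>): Column<R> =
    transform(original).copy(name = original.name)
```

In this model, a compile-time tool only has to track the type parameter R to describe the output of `convertAsColumn`, whereas the output name of `replaceWith` depends on runtime behaviour.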

koperagen · 2025-04-24T10:21:13Z

@Jolanrensen I think because convert { }.to { } works differently, let's postpone decision whether it should be removed or not

Jolanrensen · 2025-04-24T11:56:33Z

@koperagen ooh right I see. Actually I don't think many people would mind if we required that names cannot change for the entire convert operation. To change names, they'd need replace. That way we could make convert {}.to {} and convert {}.asColumn {} the same and replace {}.with {} different.

koperagen added the enhancement New feature or request label Apr 22, 2025

koperagen added this to the 1.0.0-Beta1 (0.16) milestone Apr 22, 2025

koperagen requested review from zaleslaw and Jolanrensen April 22, 2025 17:27

koperagen self-assigned this Apr 22, 2025

zaleslaw reviewed Apr 23, 2025

Jolanrensen requested changes Apr 23, 2025

Add convert asColumn operation as compiler plugin friendly variant of…

8756bb9

… replace with

koperagen force-pushed the convert-as-column branch from 8fc1fab to 8756bb9 Compare April 24, 2025 10:18

koperagen requested a review from Jolanrensen April 24, 2025 10:19

koperagen mentioned this pull request Apr 25, 2025

Consider changing convert { }.to { } operation #1150

Closed

koperagen merged commit 483054d into master Apr 25, 2025
4 of 5 checks passed

koperagen deleted the convert-as-column branch April 30, 2025 10:16
