Release v1.0.0-Beta2: On our way to 1.0! · Kotlin/dataframe

We are working hard to make DataFrame 1.0 and the compiler plugin happen, but KotlinConf is limiting our time, so detailed release notes will follow later.
You can track our progress for the next beta here.

Try this release in Kotlin Notebook, which is now bundled with IntelliJ IDEA 2025.1 and available in IntelliJ IDEA Community:

%useLatestDescriptors
%use dataframe(v=1.0.0-Beta2)

Known issues

The Gradle plugin might be broken: something went wrong during publishing, which results in the error "Could not find org.jetbrains.kotlinx.dataframe:symbol-processor-all:1.0.0-Beta1-dev-7097-0.12.0.429". This will be fixed as soon as possible.
In Kotlin Notebook (in K1 mode), min, max, median, and percentile might not work because of a bug in the REPL. This will be solved in K2 mode.
- #1116
median and percentile require explicit type arguments for non-numeric columns
- #1189
Using DataFrame on Android might cause compilation issues #1217
- Fixed by #1218 on 1.0.0-dev-7277+
IntelliJ support for the DataFrame Kotlin compiler plugin is on its way! But it might not be available in your IDE just yet. It will work out-of-the-box from version 2025.2 (in K2 mode) and be ready for testing once the EAP builds become available.
The documentation website might not be up-to-date yet
Example notebooks need to be updated: #1216
See the 1.0.0-Beta3 milestone

Deprecations and important notes

OpenAPI 3.0 support is turned experimental: #1115
We are in the process of deprecating the KProperties- and Column Accessor Access APIs in favor of the DataFrame Kotlin compiler plugin. A migration guide will follow.
Statistics functions have been rewritten, so some types might have changed.
dataframe-jupyter is now a separate module, which means:
- The dataframe.json descriptor has changed, so if something works unexpectedly in your notebook, add %useLatestDescriptors before %use dataframe.
- When running your notebook with your project as its dependency (and you're not using %use dataframe), make sure the notebook has access to the dataframe-core and dataframe-jupyter dependencies. You can do the latter, for instance, by adding USE { dependencies("org.jetbrains.kotlinx:dataframe-jupyter:1.0.0-Beta1") } to the notebook.
dataframe-json is now a separate module, no longer part of dataframe-core, but included with dataframe by default.
DataFrame can now read Float from JSON. Careful, this means type inference might change for you.
dataframe-csv is now included with dataframe by default. DataFrame.readCSV() is deprecated in favor of the new DataFrame.readCsv().
JDBC support is still in progress. This means that the API can still change or we could decide to not include it by default with dataframe.
Lots more smaller things; see below.
Later:
- While @DataSchema column accessor generation via the KSP/Gradle plugin will still work for now, this will also be replaced in favor of the DataFrame Kotlin compiler plugin later on. You don't need to worry about created data schemas though, they will work exactly the same :).
- Schema inference by data sample (so using @file:ImportDataSchema in .kt files, or dataframes { schema { data = } } in Gradle) is still up for debate. We will probably remove it in the future to replace it with something more stable. Remember that you can always call df.generateInterfaces().print() to get a copy-pastable data schema interface from a dataframe instance.
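
Two of the notes above, the move from DataFrame.readCSV() to DataFrame.readCsv() and the df.generateInterfaces().print() tip, can be tried together in a small program. The following is only a minimal sketch, assuming the dataframe 1.0.0-Beta2 artifact (which includes dataframe-csv by default) is on the classpath; the file name, column names, and sample values are hypothetical and chosen purely for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*
import java.io.File

fun main() {
    // Hypothetical sample data, written to a temporary file so the sketch is self-contained.
    val file = File.createTempFile("people", ".csv")
    file.writeText("name,age\nAlice,15\nBob,20\n")

    // New entry point from dataframe-csv, used here instead of the deprecated DataFrame.readCSV().
    val df = DataFrame.readCsv(file)

    // Prints a data schema interface derived from this dataframe instance,
    // which can be copy-pasted into the project.
    df.generateInterfaces().print()

    // Clean up the temporary file.
    file.delete()
}
```

The printed interface can serve as a replacement for a schema that was previously inferred from a data sample, should that feature be removed later on.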

What's Changed (GitHub autogenerated)

Change toHTML rendering to allow custom HTML content inside cells and not just text by @koperagen in #967
Clarify the situation when generated extension property causes ClassCast or NPE exceptions by @koperagen in #965
org.jetbrains.kotlinx.dataframe.io.Csv clashes with CSV on windows... by @Jolanrensen in #984
Add guide for custom SQL database support with HSQLDB by @zaleslaw in #986
Miscellaneous 0.15 fixes by @Jolanrensen in #989
0.15 example notebook by @Jolanrensen in #981
post-0.15 release documentation version bumps by @Jolanrensen in #983
Add df.properties() function to improve visibility of generated API for typed column access by @koperagen in #957
[Compiler plugin] Add warning to avoid confusion with missing extension properties by @koperagen in #979
Add kdoc about "Open in browser" from IDEA that helps to streamline working with HTML rendering by @koperagen in #968
Bumping gradle to 8.11 by @Jolanrensen in #1001
toDataFrame() column order fix with @ColumnName annotations by @Jolanrensen in #1004
Prepare compilation pipeline that hides API overloads - KProperty and ColumnAccessor, for compiler plugin workflow by @koperagen in #959
Keywords generator plugin: moved and fixed for Kotlin 2.1 by @Jolanrensen in #996
Documentation mention of experimental dataframe-csv by @Jolanrensen in #994
Convert DataFrameHtmlData to normal class by @Jolanrensen in #1009
Bump to KoDEx 0.4.0 by @Jolanrensen in #1010
Documentation for date-time pattern by @DestBro in #1015
KoDEx 0.4.2 by @Jolanrensen in #1017
Renaming all KoDEx documentation arguments to SCREAMING_SNAKE_CASE by @Jolanrensen in #1018
remove docs and deprecating df - cols by @AndreiKingsley in #1022
Rename to camelCase support in compiler plugin by @Jolanrensen in #1029
Add some AccessApiOverload annotations by @koperagen in #1031
Post 0.15 deprecations update by @Jolanrensen in #1033
Removing exceptNew: Option 1 by @Jolanrensen in #1030
Add documentation metadata and overloads for distinct API by @zaleslaw in #1023
More variations of "move", AddDsl.group, CreateDataFrameDsl String.invoke by @koperagen in #1035
Allow running Gradle tasks with -PskipKodex to publish locally faster by @koperagen in #1034
Add @CandidateForRemoval annotation by @koperagen in #1032
Sort df.compileTimeSchema() columns according to df.schema() so they're easier to compare by @koperagen in #990
[Compiler plugin] Fix toDataFrame for different visibilities by @koperagen in #1043
Map transformed calls and generated properties to their origin by @koperagen in #1041
Implicit receiver fix by @koperagen in #1048
Enhanced GroupBy support in compiler plugin by @koperagen in #1050
Add checkers that report compile time schema as info warnings to observe implicit schema generation by @koperagen in #1051
Add more operation to compiler plugin by @koperagen in #1052
Add an example notebook with analyses for the Top 12 German companies by @devcrocod in #1024
All except fix: Option 2 by @Jolanrensen in #1038
Stabilize FastDoubleParser part 1 by @Jolanrensen in #1040
[Compiler plugin] Support more GroupBy shortcuts by @koperagen in #1055
Deephaven csv as default by @Jolanrensen in #1057
add percentile and p25 and p75 to describe by @AndreiKingsley in #1060
Added KDocs for cumSum operation with Kodex by @zaleslaw in #1063
Add KDocs for the flatten operation by @zaleslaw in #1064
Unified number types by @Jolanrensen in #1070
Move toLeft -> toStart, toRight -> toEnd, add move KDocs by @AndreiKingsley in #1071
Remove public statistical extensions for Iterable by @AndreiKingsley in #1065
generalize toListImpl to support conversions into lists and sequences by @martinitus in #1046
fix camelCase by @AndreiKingsley in #1072
inlines isSingleColumnWithGroup() and fixes test for colsAtAnyDepth+colsOfKind by @Jolanrensen in #1076
JSON reading: unified numbers by @Jolanrensen in #1073
Add util classes for custom java type renderers by @koperagen in #1080
add SXSS writing by @AndreiKingsley in #1075
Extend CS DSL support in compiler plugin by @koperagen in #1079
Setting java version to 11 by @Jolanrensen in #1083
Prepare plugin to be moved to Kotlin repository by @koperagen in #1084
DynamicDataFrameBuilder improvements by @AndreiKingsley in #1082
Post Java 11 version bumps by @Jolanrensen in #1087
Fixed valuesAreComparable() and tests by @Jolanrensen in #1089
To dataframe improvements by @AndreiKingsley in #1081
[Compiler plugin] Support DataFrame.aggregate by @koperagen in #1088
Add Compiler Plugin support for statistics on GroupBy by @zaleslaw in #1077
Notebooks update by @AndreiKingsley in #1085
Geo update by @AndreiKingsley in #1086
Aggregator implementation rework by @Jolanrensen in #1078
Disable large logs in tests for FastDoubleParser by @Jolanrensen in #1094
Publish compiler plugin :core subset from another module by @koperagen in #1096
Splitting off Jupyter yet again by @Jolanrensen in #1095
Map unfold by @AndreiKingsley in #1097
Reverted back to the "good" jdk 8 wherever possible by @Jolanrensen in #1101
Move/Insert After fix by @AndreiKingsley in #1092
Mean statistics fixes by @Jolanrensen in #1091
Sum statistics and aggregator improvements by @Jolanrensen in #1103
rename impl fix by @AndreiKingsley in #1104
Set version to 1.0.0 by @Jolanrensen in #1106
Ensure a predictable order of columns after aggregation of GroupBy by @koperagen in #1110
add AccessApiOverload annotation by @AndreiKingsley in #1109
OpenAPI -> experimental by @Jolanrensen in #1115
[Compiler plugin] Support df.asGroupBy by @koperagen in #1114
concatWithKeys impl by @AndreiKingsley in #1107
Aggregator dependency injection, min/max, and skipNaN by @Jolanrensen in #1108
Support convert shortcuts by @koperagen in #1118
Overhaul for std by @Jolanrensen in #1119
Functions inline by @AndrewKis in #1123
Functions inline by @AndreiKingsley in #1111
Removed digitize functionality by @Jolanrensen in #1125
JDBC logging and safety improvements by @zaleslaw in #1117
[Compiler plugin] Remove outdated code by @koperagen in #1128
[Compiler plugin] Support unfold by @koperagen in #1127
added small note in add docs to link to insert by @Jolanrensen in #1131
Fix bug that prevented SQL Server row limits by @TomMicheline in #1136
Add compiler support for the distinct operation by @zaleslaw in #1137
[Compiler plugin] Support move to and insert at by @koperagen in #1130
reorganized read csv/tsv/delim parameters by @Jolanrensen in #1142
[Compiler plugin] Support SingleColumn.select by @koperagen in #1145
Median overhaul by @Jolanrensen in #1122
Add a constructor to create a nested dataframe from columns inplace by @koperagen in #1144
Add convert asColumn operation as compiler plugin friendly variant of replace with by @koperagen in #1143
Percentile by @Jolanrensen in #1149
CumSum by @Jolanrensen in #1152
Add support for DataFrame sum operation with tests by @zaleslaw in #1148
[Compiler plugin] join operations support by @koperagen in #1139
Json module extraction by @Jolanrensen in #1147
Add statistics on DataFrame support by @zaleslaw in #1153
Convert KDocs by @AndrewKis in #1134
Simplifies some dependencies, especially jupyter by @Jolanrensen in #1163
json module docs by @Jolanrensen in #1164
statistics documentation update by @Jolanrensen in #1165
Migrate usages of AccessOverload APIs in inline functions to public alternatives by @koperagen in #1162
Remove column accessors API from documentation by @koperagen in #1171
Deprecate @AccessApiOverload functions by @koperagen in #1172
Add a common interface for exceptions that should be reported by the plugin by @koperagen in #1173
rename to valid camel case and remove deprecated URL() by @AndrewKis in #1175
Remove some column accessor / KProperties usages from examples by @koperagen in #1174
Remove logging from the API used for the compiler plugin by @koperagen in #1177
Generate fields for markers in REPL so compiler plugin can extract the schema by @koperagen in #1154
Prepare next version of dataframe-compiler-plugin-core by @koperagen in #1182

New Contributors

@DestBro made their first contribution in #1015
@martinitus made their first contribution in #1046
@TomMicheline made their first contribution in #1136

Full Changelog: v0.15.0...v1.0.0-Beta2
