We are working hard to make DataFrame 1.0 and the compiler plugin happen, but due to KotlinConf our time is limited, so detailed release notes will be given later.
You can track our progress for the next beta here.
Try this release in Kotlin Notebook, which is now bundled with IntelliJ IDEA 2025.1 and available in IntelliJ IDEA Community:
%useLatestDescriptors
%use dataframe(v=1.0.0-Beta2)
Known issues
- The Gradle plugin might be broken... Something went wrong during publishing, resulting in `Could not find org.jetbrains.kotlinx.dataframe:symbol-processor-all:1.0.0-Beta1-dev-7097-0.12.0.429`. This will be fixed as soon as possible.
- In Kotlin Notebook (in K1 mode), min, max, median, and percentile might not work. This is because of a bug in REPL. Will be solved in K2 mode.
- `median` and `percentile` require explicit type arguments for non-numeric columns
- Using DataFrame on Android might cause compilation issues #1217
  - Fixed by #1218 on 1.0.0-dev-7277+
- IntelliJ support for the DataFrame Kotlin compiler plugin is on its way! But it might not be available in your IDE just yet. It will work out-of-the-box from version 2025.2 (in K2 mode) and be ready for testing once the EAP builds become available.
- The documentation website might not be up-to-date yet
- Example notebooks need to be updated: #1216
- See the 1.0.0-Beta3 milestone
Deprecations and important notes
- OpenAPI 3.0 support is now experimental: #1115
- We are in the process of deprecating the KProperties- and Column Accessor Access APIs in favor of the DataFrame Kotlin compiler plugin. A migration guide will follow.
- Statistics functions have been rewritten; some types might have changed
- `dataframe-jupyter` is now a separate module, which means:
  - The `dataframe.json` descriptor has changed, so if something works unexpectedly in your notebook, add `%useLatestDescriptors` before `%use dataframe`.
  - When running your notebook with your project as its dependency (and you're not using `%use dataframe`), make sure the notebook has access to the `dataframe-core` and `dataframe-jupyter` dependencies. You can do the latter, for instance, by adding `USE { dependencies("org.jetbrains.kotlinx:dataframe-jupyter:1.0.0-Beta1") }` to the notebook.
- The `dataframe-json` is now a separate module, no longer part of `dataframe-core`, but included with `dataframe` by default.
  - DataFrame can now read `Float` from JSON. Careful, this means type inference might change for you.
- `dataframe-csv` is now included with `dataframe` by default. `DataFrame.readCSV()` is deprecated in favor of the new `DataFrame.readCsv()`.
- JDBC support is still in progress. This means that the API can still change or we could decide to not include it by default with `dataframe`.
- Lots more smaller things, see below
- Later:
  - While `@DataSchema` column accessor generation via the KSP/Gradle plugin will still work for now, this will also be replaced in favor of the DataFrame Kotlin compiler plugin later on. You don't need to worry about created data schemas though, they will work exactly the same :).
  - Schema inference by data sample (so using `@file:ImportDataSchema` in .kt files, or `dataframes { schema { data = } }` in Gradle) is still up for debate. We will probably remove it in the future to replace it with something more stable. Remember that you can always call `df.generateInterfaces().print()` to get a copy-pastable data schema interface from a dataframe instance.
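As a rough illustration of two of the notes above, the sketch below reads a CSV file with the new `DataFrame.readCsv()` and then prints a generated data schema via `df.generateInterfaces().print()`. The helper `loadPeople`, the temporary file, and its sample contents are made up for illustration. The snippet assumes the `dataframe` artifact (which, per the notes above, includes `dataframe-csv` by default) is on the classpath.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.generateInterfaces
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.io.readCsv
import java.io.File

// Hypothetical helper: reads a CSV file with the new function from dataframe-csv.
// Before this release the call would have been DataFrame.readCSV(file), which is now deprecated.
fun loadPeople(file: File): DataFrame<*> = DataFrame.readCsv(file)

fun main() {
    // Illustrative sample data written to a temporary file (an assumption, not from the release).
    val file = File.createTempFile("people", ".csv")
    file.writeText("name,age\nAlice,30\nBob,25\n")

    val df = loadPeople(file)
    df.print()

    // Prints a copy-pastable data schema interface derived from this dataframe instance.
    df.generateInterfaces().print()
}
```

Parameters may differ between `readCSV()` and `readCsv()` (see "reorganized read csv/tsv/delim parameters" in the list below), so calls that pass more than the file are worth re-checking after the rename.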
What's Changed (GitHub autogenerated)
- Change toHTML rendering to allow custom HTML content inside cells and not just text by @koperagen in #967
- Clarify the situation when generated extension property causes ClassCast or NPE exceptions by @koperagen in #965
- org.jetbrains.kotlinx.dataframe.io.Csv clashes with CSV on windows... by @Jolanrensen in #984
- Add guide for custom SQL database support with HSQLDB by @zaleslaw in #986
- Miscellaneous 0.15 fixes by @Jolanrensen in #989
- 0.15 example notebook by @Jolanrensen in #981
- post-0.15 release documentation version bumps by @Jolanrensen in #983
- Add `df.properties()` function to improve visibility of generated API for typed column access by @koperagen in #957
- [Compiler plugin] Add warning to avoid confusion with missing extension properties by @koperagen in #979
- Add kdoc about "Open in browser" from IDEA that helps to streamline working with HTML rendering by @koperagen in #968
- Bumping gradle to 8.11 by @Jolanrensen in #1001
- `toDataFrame()` column order fix with `@ColumnName` annotations by @Jolanrensen in #1004
- Prepare compilation pipeline that hides API overloads - KProperty and ColumnAccessor, for compiler plugin workflow by @koperagen in #959
- Keywords generator plugin: moved and fixed for Kotlin 2.1 by @Jolanrensen in #996
- Documentation mention of experimental dataframe-csv by @Jolanrensen in #994
- Convert `DataFrameHtmlData` to normal class by @Jolanrensen in #1009
- Bump to KoDEx 0.4.0 by @Jolanrensen in #1010
- Documentation for date-time pattern by @DestBro in #1015
- KoDEx 0.4.2 by @Jolanrensen in #1017
- Renaming all KoDEx documentation arguments to SCREAMING_SNAKE_CASE by @Jolanrensen in #1018
- `remove` docs and deprecating `df - cols` by @AndreiKingsley in #1022
- Rename to camelCase support in compiler plugin by @Jolanrensen in #1029
- Add some AccessApiOverload annotations by @koperagen in #1031
- Post 0.15 deprecations update by @Jolanrensen in #1033
- Removing `exceptNew`: Option 1 by @Jolanrensen in #1030
- Add documentation metadata and overloads for `distinct` API by @zaleslaw in #1023
- More variations of "move", AddDsl.group, CreateDataFrameDsl String.invoke by @koperagen in #1035
- Allow running Gradle tasks with -PskipKodex to publish locally faster by @koperagen in #1034
- Add @CandidateForRemoval annotation by @koperagen in #1032
- Sort df.compileTimeSchema() columns according to df.schema() so they're easier to compare by @koperagen in #990
- [Compiler plugin] Fix toDataFrame for different visibilities by @koperagen in #1043
- Map transformed calls and generated properties to their origin by @koperagen in #1041
- Implicit receiver fix by @koperagen in #1048
- Enhanced GroupBy support in compiler plugin by @koperagen in #1050
- Add checkers that report compile time schema as info warnings to observe implicit schema generation by @koperagen in #1051
- Add more operation to compiler plugin by @koperagen in #1052
- Add an example notebook with analyses for the Top 12 German companies by @devcrocod in #1024
- All except fix: Option 2 by @Jolanrensen in #1038
- Stabilize FastDoubleParser part 1 by @Jolanrensen in #1040
- [Compiler plugin] Support more GroupBy shortcuts by @koperagen in #1055
- Deephaven csv as default by @Jolanrensen in #1057
- add percentile and p25 and p75 to describe by @AndreiKingsley in #1060
- Added KDocs for cumSum operation with Kodex by @zaleslaw in #1063
- Add KDocs for the `flatten` operation by @zaleslaw in #1064
- Unified number types by @Jolanrensen in #1070
- Move toLeft -> toStart, toRight -> toEnd, add move KDocs by @AndreiKingsley in #1071
- Remove public statistical extensions for `Iterable` by @AndreiKingsley in #1065
- generalize toListImpl to support conversions into lists and sequences by @martinitus in #1046
- fix camelCase by @AndreiKingsley in #1072
- inlines isSingleColumnWithGroup() and fixes test for colsAtAnyDepth+colsOfKind by @Jolanrensen in #1076
- JSON reading: unified numbers by @Jolanrensen in #1073
- Add util classes for custom java type renderers by @koperagen in #1080
- add SXSS writing by @AndreiKingsley in #1075
- Extend CS DSL support in compiler plugin by @koperagen in #1079
- Setting java version to 11 by @Jolanrensen in #1083
- Prepare plugin to be moved to Kotlin repository by @koperagen in #1084
- DynamicDataFrameBuilder improvements by @AndreiKingsley in #1082
- Post Java 11 version bumps by @Jolanrensen in #1087
- Fixed `valuesAreComparable()` and tests by @Jolanrensen in #1089
- To dataframe improvements by @AndreiKingsley in #1081
- [Compiler plugin] Support DataFrame.aggregate by @koperagen in #1088
- Add Compiler Plugin support for statistics on GroupBy by @zaleslaw in #1077
- Notebooks update by @AndreiKingsley in #1085
- Geo update by @AndreiKingsley in #1086
- Aggregator implementation rework by @Jolanrensen in #1078
- Disable large logs in tests for `FastDoubleParser` by @Jolanrensen in #1094
- Publish compiler plugin :core subset from another module by @koperagen in #1096
- Splitting off Jupyter yet again by @Jolanrensen in #1095
- Map unfold by @AndreiKingsley in #1097
- Reverted back to the "good" jdk 8 wherever possible by @Jolanrensen in #1101
- Move/Insert After fix by @AndreiKingsley in #1092
- Mean statistics fixes by @Jolanrensen in #1091
- Sum statistics and aggregator improvements by @Jolanrensen in #1103
- rename impl fix by @AndreiKingsley in #1104
- Set version to 1.0.0 by @Jolanrensen in #1106
- Ensure a predictable order of columns after aggregation of `GroupBy` by @koperagen in #1110
- add AccessApiOverload annotation by @AndreiKingsley in #1109
- OpenAPI -> experimental by @Jolanrensen in #1115
- [Compiler plugin] Support df.asGroupBy by @koperagen in #1114
- concatWithKeys impl by @AndreiKingsley in #1107
- `Aggregator` dependency injection, `min`/`max`, and `skipNaN` by @Jolanrensen in #1108
- Support convert shortcuts by @koperagen in #1118
- Overhaul for std by @Jolanrensen in #1119
- Functions inline by @AndrewKis in #1123
- Functions inline by @AndreiKingsley in #1111
- Removed digitize functionality by @Jolanrensen in #1125
- JDBC logging and safety improvements by @zaleslaw in #1117
- [Compiler plugin] Remove outdated code by @koperagen in #1128
- [Compiler plugin] Support `unfold` by @koperagen in #1127
- added small note in add docs to link to insert by @Jolanrensen in #1131
- Fix bug that prevented SQL Server row limits by @TomMicheline in #1136
- Add compiler support for the `distinct` operation by @zaleslaw in #1137
- [Compiler plugin] Support `move to` and `insert at` by @koperagen in #1130
- reorganized read csv/tsv/delim parameters by @Jolanrensen in #1142
- [Compiler plugin] Support SingleColumn.select by @koperagen in #1145
- Median overhaul by @Jolanrensen in #1122
- Add a constructor to create a nested dataframe from columns inplace by @koperagen in #1144
- Add convert asColumn operation as compiler plugin friendly variant of replace with by @koperagen in #1143
- Percentile by @Jolanrensen in #1149
- CumSum by @Jolanrensen in #1152
- Add support for DataFrame `sum` operation with tests by @zaleslaw in #1148
- [Compiler plugin] join operations support by @koperagen in #1139
- Json module extraction by @Jolanrensen in #1147
- Add statistics on DataFrame support by @zaleslaw in #1153
- Convert KDocs by @AndrewKis in #1134
- Simplifies some dependencies, especially jupyter by @Jolanrensen in #1163
- json module docs by @Jolanrensen in #1164
- statistics documentation update by @Jolanrensen in #1165
- Migrate usages of AccessOverload APIs in inline functions to public alternatives by @koperagen in #1162
- Remove column accessors API from documentation by @koperagen in #1171
- Deprecate `@AccessApiOverload` functions by @koperagen in #1172
- Add a common interface for exceptions that should be reported by the plugin by @koperagen in #1173
- rename to valid camel case and remove deprecated URL() by @AndrewKis in #1175
- Remove some column accessor / KProperties usages from examples by @koperagen in #1174
- Remove logging from the API used for the compiler plugin by @koperagen in #1177
- Generate fields for markers in REPL so compiler plugin can extract the schema by @koperagen in #1154
- Prepare next version of dataframe-compiler-plugin-core by @koperagen in #1182
New Contributors
- @DestBro made their first contribution in #1015
- @martinitus made their first contribution in #1046
- @TomMicheline made their first contribution in #1136
Full Changelog: v0.15.0...v1.0.0-Beta2