add SXSS writing #1075

Merged
merged 7 commits into from
Feb 28, 2025


@AndreiKingsley AndreiKingsley commented Feb 25, 2025

Fixes #1016.
Tested with a profiler on big datasets; it works great :).
It looks like the SXSSFWorkbook restrictions don't affect writing a DataFrame to Excel. The one exception is writing into an existing file: SXSSFWorkbook can't simply be used there, so XSSFWorkbook is used instead.
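
A minimal sketch of the choice described above, assuming Apache POI is on the classpath. `openXlsxWorkbook` is a hypothetical helper name used only for illustration; it is not the code merged in this PR.

```kotlin
import org.apache.poi.ss.usermodel.Workbook
import org.apache.poi.xssf.streaming.SXSSFWorkbook
import org.apache.poi.xssf.usermodel.XSSFWorkbook
import java.io.File

// Hypothetical helper: picks a workbook implementation for writing an .xlsx file.
fun openXlsxWorkbook(file: File, keepFile: Boolean): Workbook =
    if (keepFile && file.exists() && file.length() > 0L) {
        // Appending to an existing file: the existing content has to be loaded,
        // so the in-memory XSSFWorkbook is used.
        file.inputStream().use { fis -> XSSFWorkbook(fis) }
    } else {
        // New (or overwritten) file: the streaming SXSSFWorkbook can be used.
        SXSSFWorkbook()
    }
```

The caller is still responsible for writing the workbook out and closing it.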

WorkBookType.XLSX -> XSSFWorkbook(file.inputStream())
// Write to an existing file with `keepFile` flag
if (keepFile && file.exists() && file.length() > 0L) {
    file.inputStream().use { fis ->
Collaborator

Please check whether it's OK to use `use` here: it will close the input stream right after the workbook is created.

Collaborator Author

AFAIU Apache POI loads data from the InputStream into memory when the Workbook is created. I tested it with keepFile=true and it works as expected.
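
The reasoning can be illustrated without POI. The sketch below uses only the Kotlin standard library; `TrackingStream` and `loadEagerly` are hypothetical stand-ins (the latter plays the role of a constructor that reads everything up front), so it shows the general pattern rather than POI's actual behavior.

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream

// Hypothetical stream that records whether close() was called.
class TrackingStream(bytes: ByteArray) : ByteArrayInputStream(bytes) {
    var closed = false
        private set

    override fun close() {
        closed = true
        super.close()
    }
}

// Hypothetical eager loader: copies the whole stream into memory before returning.
fun loadEagerly(input: InputStream): ByteArray = input.readBytes()

// Loads inside `use`, then reports whether the stream was closed and what was loaded.
fun demo(): Pair<Boolean, String> {
    val stream = TrackingStream("sheet data".toByteArray())
    val loaded = stream.use { loadEagerly(it) } // `use` closes the stream when the block ends
    return stream.closed to String(loaded)      // the loaded copy does not depend on the stream
}
```

If the loader read lazily instead, closing the stream at the end of `use` would be a real problem, which is why the reviewer's question is worth checking.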

Collaborator Author

By the way, it looks like using `fis` increases memory usage:

Opening a XSSFWorkbook from a file has a lower memory footprint than opening from an InputStream

Maybe use the file directly?

Collaborator

Let's do it, sure :)

Collaborator Author

It doesn't work 🤣 🤣 🤣. Known Apache POI problem.

Collaborator

koperagen commented Feb 26, 2025

It's worth mentioning that `keepFile = true` leads to higher memory usage than `false` (the default), because XSSFWorkbook is used in that case.
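
For context, a hedged usage sketch of the two modes. It assumes the Kotlin DataFrame Excel module and Apache POI are on the classpath and that `writeExcel` accepts a `File` together with the named parameters from the KDoc in this PR. `writeTwoSheets` is a hypothetical helper for illustration.

```kotlin
import org.apache.poi.xssf.usermodel.XSSFWorkbook
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.io.writeExcel
import java.io.File

// Hypothetical helper: writes two sheets into one file and returns the sheet names.
fun writeTwoSheets(file: File): List<String> {
    val df = dataFrameOf("id", "name")(1, "a", 2, "b")

    // keepFile = false (default): the file is written from scratch.
    df.writeExcel(file, sheetName = "first")

    // keepFile = true: the existing file is loaded and a new sheet is appended,
    // which per this discussion goes through the more memory-hungry XSSFWorkbook.
    df.writeExcel(file, sheetName = "second", keepFile = true)

    // Read the result back to list the sheets.
    return file.inputStream().use { fis ->
        XSSFWorkbook(fis).use { wb ->
            (0 until wb.numberOfSheets).map { wb.getSheetName(it) }
        }
    }
}
```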

Collaborator Author

AndreiKingsley commented Feb 27, 2025

Added KDocs for `writeExcel`. Didn't find an issue for it, by the way.

* @param sheetName The name of the sheet in the Excel file. If null, the default name will be used.
* @param writeHeader A flag indicating whether to write the header row in the Excel file. Defaults to true.
* @param workBookType The [type of workbook][WorkBookType] to create (e.g., XLS or XLSX). Defaults to XLSX.
* @param keepFile If `true` and the file already exists, a new sheet will be appended instead of overwriting the file.

Collaborator

If `sheetName` already exists in the workbook, will it be overwritten? Maybe the comment needs to be adjusted.

Collaborator Author

Throws:
IllegalArgumentException – if the name is null or invalid or workbook already contains a sheet with this name

Collaborator Author

What's the best thing to write in our doc?

Collaborator

Hmm... maybe `@throws`
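
One possible shape for such a tag, sketched on a hypothetical helper (`appendSheet` is not part of the library, and the wording is only a suggestion). It relies on the POI behavior quoted above: creating a sheet with a name that already exists throws `IllegalArgumentException`.

```kotlin
import org.apache.poi.ss.usermodel.Sheet
import org.apache.poi.ss.usermodel.Workbook
import org.apache.poi.xssf.usermodel.XSSFWorkbook

/**
 * Appends a new sheet to [workbook]. Hypothetical helper, for illustration only.
 *
 * @param sheetName The name of the sheet to create.
 * @throws IllegalArgumentException if [workbook] already contains a sheet named [sheetName].
 */
fun appendSheet(workbook: Workbook, sheetName: String): Sheet =
    workbook.createSheet(sheetName) // POI itself rejects a duplicate name

// Returns true when creating a second sheet with the same name fails as documented.
fun duplicateNameIsRejected(): Boolean =
    XSSFWorkbook().use { wb ->
        appendSheet(wb, "data")
        try {
            appendSheet(wb, "data")
            false
        } catch (e: IllegalArgumentException) {
            true
        }
    }
```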

Collaborator

@Jolanrensen Jolanrensen left a comment

If you've tested all the mentioned edge cases, it looks like a nice drop-in replacement.

@AndreiKingsley AndreiKingsley merged commit 9d0fe3d into master Feb 28, 2025
6 checks passed
@AndreiKingsley AndreiKingsley deleted the sxss_writing branch February 28, 2025 04:56
Development

Successfully merging this pull request may close these issues.

Consider using SXSSWorkbook for reading xlsx files
3 participants