Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loadWorkbook removing formatting #207

Closed
mdorchuckuva opened this issue Jun 24, 2021 · 34 comments
Closed

loadWorkbook removing formatting #207

mdorchuckuva opened this issue Jun 24, 2021 · 34 comments
Labels
bug Something isn't working Style issue waiting for answer If not answered, the issue will be closed in 7 days.

Comments

@mdorchuckuva
Copy link

mdorchuckuva commented Jun 24, 2021

When using loadWorkbook, formatting that has been applied by a user in excel directly is being stripped.

With the attached excel file in my working directory, running the following code opens a file with cell C4 in black, unbolded text instead of the red, bold text as originally in the file. The formatting remains stripped if I save the wb object back out and then open from file explorer.

library(openxlsx)
wb = loadWorkbook("testopenxlsx.xlsx")
openXL(WB)

System information
Windows 10 Enterprise 10.0.19042
Have tested on R 4.0.3 and R 4.1.0, in both cases the issue persists
Openxlsx version 4.2.4
Office 365 version 2102
testopenxlsx.xlsx

@mdorchuckuva
Copy link
Author

I found that the issue no longer occurs if I back up my openxlsx version to 4.0.0, in case that helps narrow it down.

@ycphs
Copy link
Owner

ycphs commented Jun 24, 2021

In your example is you open a different workbook than the one you loaded

library(openxlsx) 
wb = loadWorkbook("testopenxlsx.xlsx")
openXL(WB)

Please try the following:

library(openxlsx) 
wb = loadWorkbook("testopenxlsx.xlsx")
openXL(wb)

@mdorchuckuva
Copy link
Author

Sorry the wb got capitalized when I pasted over for some reason. I was running the same workbook. see screenshot below and thanks so much for taking a look!

Top workbook is what opens from OpenXL, bottom workbook is the workbook that was originally saved.

openxlsx_screenshot

@JanMarvin
Copy link
Collaborator

I've seen similar issues before. What I have found is that the style order is changed by openxlsx. We always add our default style to styles.xml, which is picked here and all the other styles are re-sorted and do no longer match with the style ids in the worksheet. Iirc the styles file has no style id field and they are picked in the order of appearance. Kinda the same issue as #81.

@Seviks
Copy link

Seviks commented Jun 25, 2021

I can confirm this issue on 4.2.4. When I loadWorkbook, do my changes (or not) an then saveWorkbook, the formatting is gone.

Formats persisted in 4.2.3. After roll back to 4.2.3 the issue is gone.

Tested on:
openxlsx 4.2.4
Windows 10
R 4.1.0
Office 2019

@ycphs ycphs added bug Something isn't working Style issue labels Jun 25, 2021
@patrick-gerland
Copy link

Indeed, by upgrading to version 4.2.4 this new issue happens. Rolling back to the previous version fixes the problem.
I spent most of 1 day troubleshooting what happened to all my workbooks (>200 with multiple worksheets).

Just loading and saving back the file causes the problem with loosing the existing settings for the default column headers.
wb <- loadWorkbook(input.file)
saveWorkbook(wb, input.file, overwrite = T)

The initial wb simply contains 1 worksheet like
addWorksheet(wb, sheet = "myInput")
writeDataTable(wb, sheet = "myInput", x = myInput, colNames = TRUE, rowNames = FALSE, headerStyle = hs1, tableStyle = "TableStyleLight2", withFilter = FALSE, bandedRows = TRUE)

with
hs1 <- createStyle(fgFill = "#4F81BD", halign = "CENTER", textDecoration = "Bold", border = "Bottom", fontColour = "white")

with the newer version fontColour setting gets lost to some dark default instead of white.

@derekunderwood
Copy link

Similar issue here although format is being altered, not removed entirely. Seemed appropriate to post here but please move if separate issue.

Cells formatted single accounting underline using this option:
image

Appear to load with normal underline formatting:
openxlsx

Tested on:
openxlsx 4.2.3
Windows 10
R 3.6.3
Office 2016

underline_test.xlsx

@JanMarvin
Copy link
Collaborator

A potential fix is in #237 could any of you test this please?

@JanMarvin JanMarvin added the waiting for answer If not answered, the issue will be closed in 7 days. label Aug 2, 2021
@patrick-gerland
Copy link

patrick-gerland commented Aug 5, 2021 via email

@mdorchuckuva
Copy link
Author

Thanks so much @JanMarvin. Sorry for the delayed response here.

I built #237 on Windows. Results are shown in the screenshot. Left book is the input, right book is what opens from openxlsx::openXL. We keep font changes, bolding and underlines. We still lose merged cell, color changes, and strikethrough. Input workbook also attached for your reference.

Please let me know if I can do anything to help that involves R (do not know cpp, sorry).

image
[testopenxlsx.xlsx](https://github.com/ycphs/openxlsx/files/6941882/testopenxlsx.xlsx

@JanMarvin
Copy link
Collaborator

Thanks for testing (the pull request isn't yet merged with the master branch, I wanted to get feedback first).

@mdorchuckuva do you remember if these used to work with prior releases or did you list them for being important?

@mdorchuckuva
Copy link
Author

mdorchuckuva commented Aug 5, 2021

@JanMarvin thanks!

Yes we had a report that contained some of these features (particularly merged cells) that broke after an update to openxlsx.

I can confirm the test workbook works correctly across all items except strikethrough if I roll back to 4.0.0.

image

@mdorchuckuva
Copy link
Author

Edited above comment for clarity.

@ycphs
Copy link
Owner

ycphs commented Aug 6, 2021

The commit for version 4.0.0 is e99f707
At the previous repo (awalker89/openxlsx) at 5 Jan 2017.

It will be tricky to find the cause, but it is good to know when it worked the last time.

@JanMarvin
Copy link
Collaborator

I assume it might be a more recent regression. I'll try to look into it.

@JanMarvin
Copy link
Collaborator

Screenshot_2021-08-06_21-46-32

Hi @mdorchuckuva, I assume I've fixed it. Please have another look. I've even added <strike/> (because it was easy). Both files look identical for me. Could you please have a look?

@mdorchuckuva
Copy link
Author

@JanMarvin you looking at PR #237? I checked that PR out - can build on windows but am getting a fatal error in RStudio w/ no message or other info when I try to execute loadworkbook. Any advice? Am I in the right PR?

Initial (uninformed) guess is that it may be related to which version of rcpp I'm using - I'm on 1.0.7.

@JanMarvin
Copy link
Collaborator

JanMarvin commented Aug 6, 2021

yeah, that's the one. not sure what's going on. it works on the ci (all green on GitHub actions). No clue why it should fail locally and I assume it's related to your setup. 😞

Something like this should work

library(remotes)
install_github("ycphs/openxlsx", ref = github_pull(237))

library(openxlsx)
wb <- loadWorkbook(file = system.file("extdata", "loadExample.xlsx", package = "openxlsx"))

Edit: Okay, I can confirm the crash on windows 10.

@JanMarvin
Copy link
Collaborator

I've rebased the pull request on master and can now run it on windows (maybe the issue was caused by some unrelated pending fix or an RStudio bug?). I freshly installed R and Rstudio and a bunch of packages (usually use my Windows only for gaming 😄 ).

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] openxlsx_4.2.4.9000

loaded via a namespace (and not attached):
[1] compiler_4.1.0 tools_4.1.0    Rcpp_1.0.7     stringi_1.7.3  zip_2.2.0

@JanMarvin
Copy link
Collaborator

Should be fixed in 0fc1c08

@mdorchuckuva
Copy link
Author

@JanMarvin sorry didn't see this over the weekend. FYI I can execute the prior test code successfully after installing via remotes, w/out updating any of the referenced packages or re-installing R/Rstudio/packages. So it seems to me that the fail issue was isolated to building that PR via Rstudio on Win10, not should occur if installed normally once pushed to CRAN.

Thanks so much for your help w/ this, really appreciate it!

@JanMarvin
Copy link
Collaborator

Thanks for the feedback! I assume that a new Cran release will follow soon

@neros
Copy link

neros commented Oct 25, 2021

Thanks for the feedback! I assume that a new Cran release will follow soon

Thank you @JanMarvin and others for the fix. Is there a timeline for this fix on Cran? Having upgraded to 4.2.4, existing code breaks that uses styles (filtering data using FontColour format, specifically). This is an important fix for existing code base within the package community.

@JanMarvin
Copy link
Collaborator

Hi @neros , I'm not involved in the CRAN releasing process. That's @ycphs part, but he was involved with personal matters. For now the current master is what I consider the next release, therefore if you need it, you can build from there. Otherwise a new release will follow soon™.

@mendy-friedman
Copy link

I downloaded the latest master branch. Most of the formatting issues are now fixed, but there are still a few formatting pieces that are not carried over. Can you please look into this before the next release?

  • Tab formatting: If a tab in the original workbook has a color instead of the default gray, this color is lost upon loading and then saving the workbook. Additionally, hidden tabs become unhidden upon loading and saving the workbook.
  • Form controls (action buttons): some of these resize upon saving the workbook
  • Hidden columns: If you hide a single column, it remains hidden upon loading and saving the workbook. But if you hide all columns from column X to the end of the workbook, those columns unhide after saving.

Some screenshots are below. I can't share the full workbook for confidentiality reasons, but it should be easy to create a reproducible example based on these comments.

Original (tabs are colored, some tabs are hidden, buttons are a certain size, and some columns are hidden):
image
image
image

After Saving (tabs are all gray, and all are unhidden. Columns unhide, and buttons are wider):
image
image
image

@JanMarvin
Copy link
Collaborator

Hi @mendy-friedman , I currently don't have time to look into this and personally think these can be fixed in follow up releases. It might be a good idea to open additional issues for these and just out of curiosity, did any of these work in previous releases? For instance the hidden/unhidden part seems like it never worked and the button size is afaik unhandled too. Also it would be helpful if you could provide pre and post xlsx-files.

@JanMarvin
Copy link
Collaborator

JanMarvin commented Oct 26, 2021

Late night test. I created a simple workbook with colored tabs with MS 365 in the web editor. Downloaded the xlsx file. Ran the following lines and the resulting test.xlsx sheet had colored tabs with LibreOffice and MS 365. Meaning I couldn't reproduce the issue. Same thing happend with hidden columns. If I hid all from C:XFD in MS 365 and ran the lines below, everything looked the same (with expected errors on LibreOffice, but fine in MS 365). Therefore if something isn't working please open new issues and provide steps and files to reproduce the issues.

library(openxlsx)
fl <- "~/ColorTabs.xlsx"

wb <- loadWorkbook(fl)
saveWorkbook(wb, "/tmp/test.xlsx")

Edit: hidden sheets unhide. That's unfortunate, but nothing to serious.

@mendy-friedman
Copy link

appreciate the quick reply - I'll try to create a reproducible example that I can share. I'll reply to the separate issue that you opened.

@mendy-friedman
Copy link

In my actual use case, the worksheet tabs do lose their colors, but when I create a (simpler) reproducible example, the tabs maintain their colors. Wondering if one of the steps that I do before saving (writing new data to the workbook, etc.) is impacting this. I'll keep researching.

The issue with attributes that apply to an entire column or row being deleted is still happening unfortunately. I will open a new issue for this and share an example. Thank you for your help.

@mdorchuckuva
Copy link
Author

@mendy-friedman not a dev on this project, but a frequent user. I have experienced what you're experiencing (tabs losing colors, some columns/rows losing attributes) if there is an xml error w/ the workbook generally. when you do saveworkbook on the one you're seeing issues w/ in R are you getting any warning messages back?

do you have comments in the workbook (the new office 365 style)?

@mendy-friedman
Copy link

mendy-friedman commented Oct 27, 2021

@mdorchuckuva - I'm not getting any warning messages back. No office 365 comments in the workbook, although it does include some other attributes such as some vba code, a user form, etc.

@JanMarvin - I was able to create a reproducible example that doesn't include company confidential information. When I read in the first file and save it back out, I get the second file (the tab color of the first tab is stripped). It's an xlsm file, so I can't upload the file the normal way to this chat (only .xls and .xlsx are supported), but I was able to put the input and output files into a .zip file and uploaded that below. Let me know if you have troubles accessing the files, or if you'd like me to open this as a separate issue.
testfiles.zip

@JanMarvin
Copy link
Collaborator

Hi @mendy-friedman , long time no see. I opened the file with openxlsx2 which should be able to handle your file just fine. I ran the following code and the files look identical, but the sheets are not colored. If you want to give it a try, see the code below. I was using the current openxlsx2 main branch.

# remotes::install_github ...
library(openxlsx2)
library(curl) # for the download

### prepare
tmp <- tempdir()
input_zip <- paste0(tmp, "/Desktop.zip")

input <- paste0(tmp, "/test2.xlsm")
output <- paste0(tmp, "/test_openxlsx.xlsm")

### get file
curl::curl_download(
  "https://github.com/ycphs/openxlsx/files/7427381/Desktop.zip",
  destfile = input_zip)

unzip(input_zip, exdir = tmp)
dir(tmp)

### actual file work
#  load the workbook
wb <- wb_load(input)
# wb$open()
wb$save(output)

### compare
xl_open(input)
xl_open(output)

@mendy-friedman
Copy link

Thanks @JanMarvin - I'll check it out. I didn't even know that openxlsx2 existed (we've still been using openxlsx for our existing pipelines). I'll have to explore it. Reading the vignette the enhancements look exciting. Thanks for the heads up!

@JanMarvin
Copy link
Collaborator

Development is still quite active. Our first CRAN release 0.3 is available since last month and a bugfix 0.3.1 release shall follow this month. For all openxlsx2 questions please use the other bug tracker and discussion boards.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Jan 4, 2025
# openxlsx 4.2.7.1

* It's now possible to insert a hyperlinked image by passing a URL,
  relative or absolute file path, or mailto string to the new
  `address` parameter of `insertImage()`.

# openxlsx 4.2.7

* Fixed warning on `dataValidation(..., type = "list")`
  ([#342](ycphs/openxlsx#342))

* Added optional argument to `loadWorkbook` to decide if empty/blank
  cells should be converted to NA_character_ (the default) or left
  blank as is

* `saveWorkbook()` now succeeds when called after the user has set
  column widths for a range of columns (e.g. 1:2), saved the workbook,
  then set column widths for a new range that is inclusive of the
  previous one (e.g. 1:5)
  ([#493](ycphs/openxlsx#493)).

## Improvements

* Improve detectDates
  ([#288](ycphs/openxlsx#288))

* Preserve window size and position, also `getWindowSize()` and
  `setWindowSize()`
  ([466](ycphs/openxlsx#466))

# openxlsx 4.2.6

* Fix external links
  ([#410](ycphs/openxlsx#410))

* Do not add unneccessary sheetPr node
  ([#409](ycphs/openxlsx#409))

* Add support for `namedRegion`s having dots and other special
  characters ([#338](ycphs/openxlsx#338)).

* Add type blanks and not blanks to conditional formatting
  ([#311](ycphs/openxlsx#311))

# openxlsx 4.2.5

## Fixes

* `openxlsx_setOp()` now works with named list
  ([#215](ycphs/openxlsx#215))

* `loadWorkbook()` imports `inlineStr`. Values remain `inlineStr` when
  writing the workbook with `saveWorkbook()`. Similar `read.xlsx` and
  `readWorkbook` import `inlineStr`.

* `read.xlsx()` no longer changes random seed
  ([#183](ycphs/openxlsx#183))

* fixed a regression that caused fonts to be read in incorrectly
  ([#207](ycphs/openxlsx#207))

* add option to save as read only recommended
  ([#201](ycphs/openxlsx#201))

* fixed writing hyperlink formulas
  ([#200](ycphs/openxlsx#200))

* `write.xlsx()` now throws an error if it doesn't have write
  permissions ([#190](ycphs/openxlsx#190))

* `write.xlsx()` now again uses the default of `overwrite = TRUE` for
  saving files ([#249](ycphs/openxlsx#249))

* `as.character.formula()` exported to warn about potential conflicts
  with other packages
  ([#312](ycphs/openxlsx#312),
  [#315](ycphs/openxlsx#315))

## Improvements

* `options()` are more consistently set in functions (see:
  [#289](ycphs/openxlsx#262))

* `Workbook$show()` no longer fails when called in a 0 sheet
  workbook([#240](ycphs/openxlsx#240))

* `read.xlsx()` again accepts `.xlsm` files
([#205](ycphs/openxlsx#205),
[#209](ycphs/openxlsx#209))

* `makeHyperlinkString()` does no longer require a sheet argument
  ([#57](ycphs/openxlsx#57),
  [#58](ycphs/openxlsx#58))

* improvements in how `openxlsx` creates temporary directories (see
  [#262](ycphs/openxlsx#262))

* `writeData()` calls `force(x)` to evaluate the object before options
  are set ([#264](ycphs/openxlsx#264))

* `createComment()` now correctly handles `integers` in `width` and
  `height` ([#275](ycphs/openxlsx#275))

* `setStyles()` accepts `halign="justify"`
  ([#305](ycphs/openxlsx#305))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Style issue waiting for answer If not answered, the issue will be closed in 7 days.
Projects
None yet
Development

No branches or pull requests

8 participants