Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling encrypted spreadsheets? #101

Closed
revcom opened this issue Apr 6, 2020 · 24 comments
Closed

Handling encrypted spreadsheets? #101

revcom opened this issue Apr 6, 2020 · 24 comments
Assignees
Labels
enhancement New feature or request

Comments

@revcom
Copy link

revcom commented Apr 6, 2020

Hi there,

I'm assuming you can't use this library to read encrypted sheets with a password.
Is this feature easy to add?

Thanks...

Robert

@MaxDesiatov MaxDesiatov added the enhancement New feature or request label Apr 6, 2020
@MaxDesiatov
Copy link
Collaborator

Hi @revcom, the library currently doesn't support it, but thanks for the feature request, it definitely puts it higher on the roadmap 👍 I need to investigate how it would work and I'll post any updates here when any are available

@MaxDesiatov
Copy link
Collaborator

The most comprehensive description of how it works I found so far is this one: https://blog.didierstevens.com/2018/06/07/encrypted-ooxml-documents/

Please feel free to post if you find more information about the internals

@revcom
Copy link
Author

revcom commented Apr 6, 2020 via email

@revcom
Copy link
Author

revcom commented Apr 25, 2020 via email

@MaxDesiatov
Copy link
Collaborator

Hey @revcom, I'm glad that it worked for you, I'll have another look at the encryption support and I'll see what I can do, will keep you updated.

@MaxDesiatov
Copy link
Collaborator

MaxDesiatov commented Apr 26, 2020

@revcom do you have a sample file you could attach here or send to hello@corexlsx.org by any chance? It doesn't have to contain any real data, just empty worksheets would work. I just need to understand which of the encryption types you're likely to work with, and I also would hope it can be attached to the test suite.

@revcom
Copy link
Author

revcom commented Apr 26, 2020

Hi Max

Sample spreadsheets sent to you privately. Thanks again for your great support

Robert

@revcom revcom closed this as completed Apr 26, 2020
@MaxDesiatov MaxDesiatov reopened this Apr 27, 2020
@MaxDesiatov
Copy link
Collaborator

Thanks, I'll keep the issue open and will close it after the feature is implemented and merged, if you don't mind.

@MaxDesiatov
Copy link
Collaborator

Due to the amount of new dependencies the encryption feature requires, this was implemented in a separate library. Could you have a look at CryptoOffice and let me know how that works for you?

@revcom
Copy link
Author

revcom commented May 31, 2020 via email

@revcom
Copy link
Author

revcom commented May 31, 2020 via email

@MaxDesiatov
Copy link
Collaborator

Thanks for having a look at this so quickly. You don't have to download XMLCoder separately. If you add CryptoOffice as any other package depedency in Xcode to your application's target, it should be pulled automatically. I know that error could appear due to Xcode bugs, maybe restarting Xcode and rebuilding could help?

@revcom
Copy link
Author

revcom commented May 31, 2020 via email

@MaxDesiatov
Copy link
Collaborator

Hey Robert, is the package added via the Xcode SwiftPM integration as described here? If so, is this issue reproducible just for your single app or any other apps too? I'm not able to reproduce the issue when I create a new app in Xcode and add both CryptoOffice and CoreXLSX to the dependencies list as shown on that Apple's documentation page.

@revcom
Copy link
Author

revcom commented Jun 1, 2020 via email

@MaxDesiatov
Copy link
Collaborator

@revcom awesome, glad to hear that the issue is fixed!

In the last 0.11 version of CoreXLSX, which you already use, as it's required to make CryptoOffice working, there's a new parseWorksheetPathsAndNames function. You no longer need to call parseWorkbooks separately, here's an example::

let filteredName = "Passwords"
guard 
  let (name, worksheetPath) = try file.parseWorksheetPathsAndNames()
    .first(where: { $0.name == filteredName })
else { fatalError("sheet with name \(filteredName) not found") }

let worksheet = try file.parseWorksheet(at: worksheetPath)

@revcom
Copy link
Author

revcom commented Jun 1, 2020 via email

@MaxDesiatov
Copy link
Collaborator

I don't think there's a more concise way, there are already a few implicit loops in my version with a few flatMaps and maps around:

  /// Parse and return an array of worksheets in this XLSX file with their corresponding names.
  public func parseWorksheetPathsAndNames(
    workbook: Workbook
  ) throws -> [(name: String?, path: String)] {
    return try parseDocumentPaths().map {
      try parseDocumentRelationships(path: $0)
    }.flatMap { (path, relationships) -> [(name: String?, path: String)] in
      let worksheets = relationships.items.filter { $0.type == .worksheet }

      guard !path.isRoot else { return worksheets.map { (name: nil, path: $0.target) } }

      // .rels file has paths relative to its directory,
      // storing that path in `pathPrefix`
      let pathPrefix = path.components.dropLast().joined(separator: "/")

      let sheetIDs = Dictionary(uniqueKeysWithValues: workbook.sheets.items.compactMap { sheet in
        sheet.name.flatMap { (sheet.relationship, $0) }
      })

      return worksheets.map { (name: sheetIDs[$0.id], path: "\(pathPrefix)/\($0.target)") }
    }
  }

And it depends on whether you'd like to optimise for conciseness, readability, correctness or efficiency. My version is certainly not very efficient if you know there's only a single woksheet with such name. But this is probably not always the case with all different varieties of .xlsx files 🙂

@revcom
Copy link
Author

revcom commented Jun 1, 2020 via email

@MaxDesiatov
Copy link
Collaborator

CoreXLSXError.dataIsNotAnArchive is probably the best indicator unfortunately. It doesn't look like the cryptography library raises an error when the password is incorrect, it just tries to decrypt with a wrong password, which leads to a garbage output. Maybe there's some other way, but at least currently it doesn't look like the CryptoSwift depedency that CryptoOffice uses can give a direct indication that a password input is incorrect.

@revcom
Copy link
Author

revcom commented Jun 1, 2020 via email

@MaxDesiatov
Copy link
Collaborator

Thanks for sharing, that's certainly another possible option, I think it would be equivalent to CoreXLSXError.dataIsNotAnArchive in behaviour. Although catching that error probably would be more efficient, since str.contains would traverse the whole decrypted data blob in search of a substring, while .dataIsNotAnArchive error is thrown when ZIPFoundation fails to open the zip archive, which is what decrypted .xlsx essentialy is. And ZIPFoundation already does some tricks to efficiently check whether a given data blob is a valid zip archive.

@revcom
Copy link
Author

revcom commented Aug 12, 2020

Hi Max

Code has been working well but there seems to be a bug when adding additional rows with Excel. After a number of additions the library suddenly throws ArchiveEntryNotFound and refuses to allow access to the encoded spreadsheet data.
Excel has no problem opening and editing the same spreadsheet.

I've send you a minimal app and spreadsheet that exhibits the problem privately.

Thanks again for your help...

Robert

MaxDesiatov added a commit to CoreOffice/CryptoOffice that referenced this issue Aug 27, 2020
As reported in CoreOffice/CoreXLSX#101, the decrypted files are corrupted. This was caused by AES decryption code applying incorrect padding.
@revcom
Copy link
Author

revcom commented Aug 27, 2020

Thanks Max, the latest version seems to fix my ArchiveEntryNotFound problem. Appreciate your help (again)

Robert

@revcom revcom closed this as completed Aug 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants