This repository was archived by the owner on Nov 9, 2020. It is now read-only.

DOCM corrupt after docbleach #5

Closed
joesecurity opened this issue Apr 18, 2017 · 5 comments


joesecurity commented Apr 18, 2017

I ran a simple test with a DOCM file containing a macro:

java -jar docbleach.jar -in Doc1.docm -out out.docm -vv
[main] DEBUG xyz.docbleach.Main - Log Level: TRACE
[main] DEBUG xyz.docbleach.Main - Checking output name : out.docm
[main] DEBUG xyz.docbleach.Main - Checking input name : Doc1.docm
[main] DEBUG xyz.docbleach.BleachSession - First 8 bytes: [80, 75, 3, 4, 20, 0, 6, 0]
[main] DEBUG xyz.docbleach.BleachSession - Found bleach for this file type: Office Bleach
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - File opened
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /_rels/.rels
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-package.relationships+xml for part /_rels/.rels
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /docProps/app.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-officedocument.extended-properties+xml for part /docProps/app.xml
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /docProps/core.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-package.core-properties+xml for part /docProps/core.xml
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/_rels/document.xml.rels
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-package.relationships+xml for part /word/_rels/document.xml.rels
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/_rels/vbaProject.bin.rels
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-package.relationships+xml for part /word/_rels/vbaProject.bin.rels
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/document.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.ms-word.document.macroEnabled.main+xml for part /word/document.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Found and removed suspicious content type: 'application/vnd.ms-word.document.macroEnabled.main+xml' in '/word/document.xml' (Size: -1)
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/fontTable.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml for part /word/fontTable.xml
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/settings.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml for part /word/settings.xml
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/styles.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml for part /word/styles.xml
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/stylesWithEffects.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.ms-word.stylesWithEffects+xml for part /word/stylesWithEffects.xml
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/theme/theme1.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-officedocument.theme+xml for part /word/theme/theme1.xml
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/vbaData.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.ms-word.vbaData+xml for part /word/vbaData.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Found and removed suspicious content type: 'application/vnd.ms-word.vbaData+xml' in '/word/vbaData.xml' (Size: -1)
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/vbaProject.bin
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.ms-office.vbaProject for part /word/vbaProject.bin
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Found and removed suspicious content type: 'application/vnd.ms-office.vbaProject' in '/word/vbaProject.bin' (Size: -1)
[main] TRACE xyz.docbleach.bleach.OOXMLBleach - Part name: /word/webSettings.xml
[main] DEBUG xyz.docbleach.bleach.OOXMLBleach - Content type: application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml for part /word/webSettings.xml
[main] WARN xyz.docbleach.Main - Sanitized file has been saved, 3 potential threat(s) removed.
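
A note on the "First 8 bytes" line in this log: the values 80, 75, 3, 4 are the bytes 'P', 'K', 0x03, 0x04, the ZIP local file header signature that OOXML packages such as .docm files normally start with. The sketch below shows how such a signature check could look. The class and method names are hypothetical, and this is not taken from DocBleach's own detection code.

```java
import java.util.Arrays;

public class ZipMagicCheck {
    // ZIP local file header signature: 'P', 'K', 0x03, 0x04.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Returns true if the given header starts with the ZIP signature. */
    static boolean looksLikeZip(byte[] header) {
        if (header == null || header.length < ZIP_MAGIC.length) {
            return false;
        }
        byte[] prefix = Arrays.copyOf(header, ZIP_MAGIC.length);
        return Arrays.equals(prefix, ZIP_MAGIC);
    }

    public static void main(String[] args) {
        // The eight bytes printed in the log of this issue.
        byte[] fromLog = {80, 75, 3, 4, 20, 0, 6, 0};
        System.out.println("Looks like a ZIP/OOXML package: " + looksLikeZip(fromLog));
    }
}
```

A signature match only says the file is a ZIP container. Deciding that it is a Word document still requires looking at the parts inside, which is what the rest of the log shows.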

Word cannot open the resulting out.docm anymore:

[Image: untitled]

Any ideas what to fix?

punkeel added the bug label Apr 18, 2017

punkeel commented Apr 19, 2017

Hello there,

Thanks for your detailed issue! I have been able to reproduce this bug, and thanks to your title I found the root of the issue:
Found and removed suspicious content type: 'application/vnd.ms-word.document.macroEnabled.main+xml' in '/word/document.xml' (Size: -1)

The /word/document.xml part is mandatory in a Word document (and innocuous), but DocBleach removed it.
I wrongly assumed that removing every part whose content type contained "macro" or "vba" was safe.
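
To illustrate why that assumption breaks the document, here is a minimal sketch of a substring-based filter applied to content types that appear in the log above. It is a reconstruction for illustration only: the class and method names are hypothetical, and the real check in DocBleach may have differed (for example in case handling).

```java
import java.util.Locale;

public class NaiveContentTypeFilter {
    /** Flags any content type that merely mentions "macro" or "vba". */
    static boolean isSuspiciousBySubstring(String contentType) {
        String ct = contentType.toLowerCase(Locale.ROOT);
        return ct.contains("macro") || ct.contains("vba");
    }

    public static void main(String[] args) {
        // Content types copied from the log in this issue.
        String[] contentTypes = {
            "application/vnd.ms-word.document.macroEnabled.main+xml", // /word/document.xml
            "application/vnd.ms-word.vbaData+xml",                    // /word/vbaData.xml
            "application/vnd.ms-office.vbaProject",                   // /word/vbaProject.bin
            "application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml"
        };
        for (String ct : contentTypes) {
            System.out.println(isSuspiciousBySubstring(ct) + "  " + ct);
        }
    }
}
```

The first entry is the main document part of a .docm file. It matches only because its content type contains "macroEnabled", so a filter like this deletes the document body together with the actual macro parts.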

This should be fixed in 19db8f6.

There are now two filters: one for the relationships (a "vbaProject" relationship is removed), and another that matches the exact content types.
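
As a rough sketch of the two-filter idea (not the code from the commit), the example below keeps one set of exact content types and one relationship type. The blocked content types are the two macro-related ones from the log in this issue. The relationship type URI is the one Office uses for VBA projects, but whether DocBleach compares against exactly this string is an assumption, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TwoFilterSketch {
    // Filter 1: exact content types of parts that carry macros.
    static final Set<String> BLOCKED_CONTENT_TYPES = new HashSet<>(Arrays.asList(
        "application/vnd.ms-word.vbaData+xml",
        "application/vnd.ms-office.vbaProject"));

    // Filter 2: relationship type that points at a VBA project part
    // (assumed to be the string the filter compares against).
    static final String VBA_PROJECT_RELATION =
        "http://schemas.microsoft.com/office/2006/relationships/vbaProject";

    static boolean isBlockedContentType(String contentType) {
        return BLOCKED_CONTENT_TYPES.contains(contentType);
    }

    static boolean isBlockedRelation(String relationshipType) {
        return VBA_PROJECT_RELATION.equals(relationshipType);
    }

    public static void main(String[] args) {
        // The main document part is kept, because the match is exact.
        System.out.println(isBlockedContentType(
            "application/vnd.ms-word.document.macroEnabled.main+xml"));
        // The VBA project part is still removed.
        System.out.println(isBlockedContentType("application/vnd.ms-office.vbaProject"));
        // The relationship pointing at the VBA project is dropped as well.
        System.out.println(isBlockedRelation(VBA_PROJECT_RELATION));
    }
}
```

Removing the relationship together with the part matters because a relationship left pointing at a deleted part can itself make the package invalid.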

I still need to add a content-type mapper to "transform" macroEnabled documents into normal documents, cf. issue #6.
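
Such a mapper could be as small as a lookup table from macro-enabled content types to their macro-free counterparts. The sketch below is only an illustration of the idea, with hypothetical names. It covers the Word main document type only, and it is not the implementation tracked in issue #6.

```java
import java.util.HashMap;
import java.util.Map;

public class ContentTypeMapperSketch {
    // Macro-enabled main document type -> regular .docx main document type.
    static final Map<String, String> MACRO_TO_PLAIN = new HashMap<>();
    static {
        MACRO_TO_PLAIN.put(
            "application/vnd.ms-word.document.macroEnabled.main+xml",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml");
    }

    /** Returns the macro-free equivalent, or the input unchanged if none is known. */
    static String remap(String contentType) {
        return MACRO_TO_PLAIN.getOrDefault(contentType, contentType);
    }

    public static void main(String[] args) {
        System.out.println(remap("application/vnd.ms-word.document.macroEnabled.main+xml"));
        // Unknown types pass through untouched.
        System.out.println(remap("application/vnd.openxmlformats-officedocument.theme+xml"));
    }
}
```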

➡️ This issue should be fixed in Release v0.0.2
(linking to v0.0.3, because a regression was introduced in v0.0.2)
Could you please test on your side, and tell me if this looks fine for you too?

punkeel added this to the v0.0.2 milestone Apr 19, 2017

joesecurity commented Apr 19, 2017

With v0.0.2 I now get:

java -jar docbleach.jar -in Doc1.docm -out out.docm
plugins directory does not exist
Exception in thread "main" java.lang.IllegalArgumentException: partName
    at org.apache.poi.openxml4j.opc.OPCPackage.removePart(OPCPackage.java:1007)
    at org.apache.poi.openxml4j.opc.OPCPackage.deletePart(OPCPackage.java:1105)
    at xyz.docbleach.modules.ooxml.OOXMLBleach.sanitize(OOXMLBleach.java:179)
    at xyz.docbleach.modules.ooxml.OOXMLBleach.sanitize(OOXMLBleach.java:88)
    at xyz.docbleach.api.BleachSession.sanitize(BleachSession.java:51)
    at xyz.docbleach.cli.Main.sanitize(Main.java:68)
    at xyz.docbleach.cli.Main.main(Main.java:36)

I also get the same error with v0.0.3.


punkeel commented Apr 19, 2017

This exception (Exception in thread "main" java.lang.IllegalArgumentException: partName) should have been fixed in v0.0.3.
Are you sure you're using v0.0.3? 😉

joesecurity commented

You are right, I mixed up the versions! Thank you for your help!


punkeel commented Apr 19, 2017

Let's close this, then o/

Thanks again for your issue!

punkeel closed this as completed Apr 19, 2017