[BUG] Decoding XML values with CDATA, missing style information. #2872

Open
sh4dowb opened this issue Aug 25, 2022 · 6 comments

sh4dowb commented Aug 25, 2022

Information

  1. Apktool Version (apktool -version) - 2.6.1
  2. Operating System (Mac, Linux, Windows) - Linux
  3. APK From? (Playstore, ROM, Other) - Other

Stacktrace/Logcat

W: /tmp/.../.../res/..../strings.xml:12345: error: Error parsing XML: not well-formed (invalid token)

Incorrect XML line is as follows:

<string name="..........">&lt;! [CDATA[some text here &lt;a href=\"%1$s\">link&lt;/a> more text here]]></string>

As you can see, the > character isn't escaped, which produces an invalid XML file.
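
For context, a bare > is legal in XML character data on its own; the XML 1.0 spec only forbids the literal sequence ]]> outside a real CDATA section. Since the opening marker was written as an escaped &lt;, the trailing ]]> ends up in plain character data, which may be what the parser rejects here. A minimal sketch with Python's standard xml.etree.ElementTree (Expat-based); the sample strings and the is_well_formed helper are invented for illustration and are not taken from the private app:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the Expat-backed parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative samples (assumptions, not the real strings.xml content).
plain_gt = '<string name="demo">a > b</string>'
cdata_tail = '<string name="demo">&lt;![CDATA[text &lt;/a> more]]></string>'
escaped_tail = '<string name="demo">&lt;![CDATA[text &lt;/a&gt; more]]&gt;</string>'

for label, sample in [("plain >", plain_gt),
                      ("unescaped ]]>", cdata_tail),
                      ("escaped ]]&gt;", escaped_tail)]:
    print(label, "->", is_well_formed(sample))
```

If this reading is right, only the > that closes ]]> strictly needs escaping, although escaping every > is also valid XML.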

Steps to Reproduce

  1. apktool d app.apk
  2. apktool b app
    (also tried with aapt2, same problem)

APK

Private app

Questions to ask before submission

  1. Have you tried apktool d, apktool b without changing anything? Yes
  2. If you are trying to install a modified apk, did you resign it? -
  3. Are you using the latest apktool version? Yes

sh4dowb commented Aug 25, 2022

In ResXmlEncoders.java, replacing line 29 with return StringUtils.replace(StringUtils.replace(StringUtils.replace(str, "&", "&amp;"), "<", "&lt;"), ">", "&gt;"); will probably fix this.
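
The order of the chained replacements matters: & has to be handled first, otherwise the ampersands introduced by &lt; and &gt; get escaped a second time. A rough Python equivalent of the suggested chain, next to a deliberately wrong ordering; both function names are hypothetical and neither is Apktool code:

```python
def escape_xml_chars(text):
    # Hypothetical helper mirroring the suggested chain: "&" first, then "<", then ">".
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def escape_wrong_order(text):
    # Same replacements with "&" last: the "&" of "&lt;" and "&gt;" is escaped again.
    return text.replace("<", "&lt;").replace(">", "&gt;").replace("&", "&amp;")


sample = "a < b && c > d"
print(escape_xml_chars(sample))
print(escape_wrong_order(sample))
```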

I'm currently using this Python script to fix them:

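import glob
import re

# Matches every <string name="...">value</string> element, including multi-line values.
STRING_RE = re.compile(r'(<string name=".*?">)(.*?)(</string>)', re.DOTALL)

def escape_value(match):
    # Escape raw angle brackets in the value; existing entities such as &lt; stay as they are.
    value = match.group(2).replace('>', '&gt;').replace('<', '&lt;')
    return match.group(1) + value + match.group(3)

for fn in glob.glob("path/to/res/*/strings.xml"):
    with open(fn, encoding="utf-8") as f:
        xml = f.read()
    with open(fn, "w", encoding="utf-8") as f:
        f.write(STRING_RE.sub(escape_value, xml))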

@iBotPeaches
Owner

Thanks. I find it incredibly odd that after years and years we still haven't been fully following the XML spec for escaping characters, so before I just add another character to that array, I want to dig into this.

@iBotPeaches
Owner

Yeah, as I vaguely remembered, this is intentional. The resource format only wants < and & encoded: https://developer.android.com/guide/topics/resources/string-resource#String

The issue here is that we parse that XML file to read specific resources, and that assumption breaks down when our parser cannot understand an XML file that Android can.

So this is probably a bug, but I'm unsure of a resolution at this time, since I don't think adapting our XML parser to accept invalid XML is going to be easy.
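
One direction that keeps the "only < and & are encoded" rule while still producing well-formed XML would be to additionally escape just the > that closes a literal ]]> sequence. This is only a sketch under that assumption: escape_minimal is a hypothetical helper, not what Apktool does, and the sample value is invented.

```python
import xml.etree.ElementTree as ET


def escape_minimal(text):
    # Hypothetical helper: encode "&" and "<" as usual, then only the ">"
    # that ends a literal "]]>", which XML forbids in character data.
    escaped = text.replace("&", "&amp;").replace("<", "&lt;")
    return escaped.replace("]]>", "]]&gt;")


# Invented decoded value that contains a literal CDATA wrapper.
raw = '<![CDATA[some text <a href="x">link</a> more]]>'
document = '<string name="demo">' + escape_minimal(raw) + '</string>'

# The document parses, and the parsed text matches the original value.
round_trip_ok = ET.fromstring(document).text == raw
print(escape_minimal(raw))
print(round_trip_ok)
```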

VD171 commented Sep 27, 2022

Same for me.
Both the newline "\n" and the greater-than sign ">" are left unescaped.

@iBotPeaches
Owner

So I am going to close this because, to the best of my knowledge, it's resolved. However, there was no sample, so I have no way to fully confirm the original report.

I followed the apk-mitm issue until I hit this issue on their side: niklashigi/apk-mitm#105. It had a sample attached with the same form of error. That is now resolved with the upcoming 2.8.2 release.

➜  2872 apktool d 2872.apk -s -f
I: Using Apktool 2.8.2-22eb80-SNAPSHOT on 2872.apk
I: Loading resource table...
I: Decoding file-resources...
I: Loading resource table from file: /home/ibotpeaches/.local/share/apktool/framework/1.apk
I: Decoding values */* XMLs...
I: Decoding AndroidManifest.xml with resources...
I: Regular manifest package...
I: Copying raw classes.dex file...
I: Copying raw classes2.dex file...
I: Copying raw classes3.dex file...
I: Copying assets and libs...
I: Copying unknown files...
I: Copying original files...
I: Copying META-INF/services directory
➜  2872 apktool b 2872 --use-aapt2
I: Using Apktool 2.8.2-22eb80-SNAPSHOT
I: Copying 2872 classes.dex file...
I: Copying 2872 classes2.dex file...
I: Copying 2872 classes3.dex file...
I: Checking whether resources has changed...
I: Building resources...
I: Copying libs... (/lib)
I: Copying libs... (/kotlin)
I: Copying libs... (/META-INF/services)
I: Building apk file...
I: Copying unknown files/dir...
I: Built apk into: 2872/dist/2872.apk
➜  2872 

I will see if I can add a test case to the suite, but I vaguely remember from last time that since CDATA is not persisted into the compiled application, it's not possible to do a 1:1 comparison of the decoded and plaintext attribute.
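
The CDATA point can be illustrated with any XML parser: a CDATA section and the equivalent entity-escaped text yield the same character data, so once the value is parsed nothing records which form the source used. A small sketch with Python's xml.etree.ElementTree; the sample strings are invented:

```python
import xml.etree.ElementTree as ET

# Two source spellings of the same value (illustrative samples).
with_cdata = '<string name="demo"><![CDATA[<a>link</a>]]></string>'
with_entities = '<string name="demo">&lt;a&gt;link&lt;/a&gt;</string>'

text_from_cdata = ET.fromstring(with_cdata).text
text_from_entities = ET.fromstring(with_entities).text

# Both parse to the same text, so the CDATA wrapper cannot be recovered.
print(text_from_cdata == text_from_entities)
```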

@iBotPeaches
Owner

iBotPeaches commented Jul 30, 2023

<string name="incorrect_issue_2872"><![CDATA[<a href="https://apktool.org">Apktool</a> and more text here.]]></string>

Okay, I will leave this open for now. The above string loses its double quotes when decoded, but I'm not even sure whether I made a valid string.
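
On the validity question: the string resource guide linked earlier in the thread lists the double quote among the characters to escape with a backslash, and the line from the original report uses \" inside its CDATA text. So the unescaped quotes in the test string may be part of why they disappear; this is an assumption, not something confirmed in this thread. A sketch of the escaped form, using a hypothetical helper:

```python
def escape_android_quotes(value):
    # Hypothetical helper: backslash-escape double and single quotes,
    # following the escaping table in the Android string resource guide.
    return value.replace('"', '\\"').replace("'", "\\'")


original = '<a href="https://apktool.org">Apktool</a> and more text here.'
print(escape_android_quotes(original))
```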

@iBotPeaches iBotPeaches changed the title [BUG] Decoding XML values incorrectly, not escaping > symbol. [BUG] Decoding XML values with CDATA, missing style information. Jul 30, 2023