@base64d fails on pdf files #3061

Closed
pkoppstein opened this issue Mar 9, 2024 · 3 comments

@pkoppstein
Contributor

The good news is that replacing jq by gojq results in success.

Here's an example file:

wget https://legiscan.com/AK/text/HR2/id/1477219/Alaska-2017-HR2-Enrolled.pdf
base64 < Alaska-2017-HR2-Enrolled.pdf > /tmp/pdf.base64

Using jq:

jq -Rr '@base64d'  /tmp/pdf.base64 > /tmp/pdf.base64d.pdf        # cannot be opened by Adobe Acrobat

Replacing jq by gojq in the above line results in a file that Adobe Acrobat opens successfully.

Further details:

$ uname -a
Darwin Mac-mini.mynetworksettings.com 21.6.0 Darwin Kernel Version 21.6.0: Thu Sep 29 20:12:57 PDT 2022; root:xnu-8020.240.7~1/RELEASE_X86_64 x86_64

$ jq --version
jq-1.7.1

$ ls -l /tmp/pdf.*.pdf
-rw-r--r--  1 ....  41887 Mar  9 03:56 /tmp/pdf.base64d.gojq.pdf
-rw-r--r--  1 ....  71816 Mar  9 03:56 /tmp/pdf.base64d.pdf

@itchyny
Contributor

itchyny commented Mar 9, 2024

Looks like dup of #1931.

@emanuele6
Member

emanuele6 commented Mar 9, 2024

Yes, @base64d can only decode to utf-8 strings, not binary data.
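
A plausible reading of the size difference reported above (71816 bytes from jq versus 41887 from gojq) is that jq substitutes the Unicode replacement character U+FFFD, which takes three bytes in UTF-8, wherever the decoded bytes are not valid UTF-8. The sketch below is an illustration and not taken from the thread: it decodes the single byte 0xff (base64 `/w==`) and dumps the result as hex. With jq 1.7.1 this should show `ef bf bd` followed by the `0a` newline that `-r` appends, but treat that as an expectation to verify on your own version.

```shell
#!/bin/sh
# "/w==" is the base64 encoding of the single byte 0xff, which is not valid UTF-8.
if command -v jq >/dev/null 2>&1; then
  # -n: run without input; -r: raw output (a newline is appended after the string).
  decoded_hex=$(jq -rn '"/w==" | @base64d' | od -An -tx1 | tr -d ' \n')
else
  # Placeholder value so the script still runs where jq is missing.
  decoded_hex="jq-not-installed"
fi
echo "$decoded_hex"
```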

Note that even with gojq, which can preserve non-UTF-8 data as long as you don't perform string operations on it, you should use gojq -jR @base64d <b64 >decoded rather than -rR; otherwise it will add an extra newline (0x0a byte) at the end of the file. For PDFs, evidently, that is fine though.
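
Since @base64d only yields UTF-8 strings, one workaround is to keep the binary decoding step outside jq altogether. The following is a minimal sketch, assuming a `base64` utility that accepts `--decode` (to my knowledge both GNU coreutils and macOS do); the file names under the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Round-trip a few non-UTF-8 bytes through base64 without involving jq.
tmpdir=$(mktemp -d)

# 0xff 0xfe 0x00 0x80 is not valid UTF-8, much like the binary streams in a PDF.
printf '\377\376\000\200' > "$tmpdir/original.bin"

base64 < "$tmpdir/original.bin" > "$tmpdir/encoded.b64"
base64 --decode < "$tmpdir/encoded.b64" > "$tmpdir/decoded.bin"

# cmp -s exits 0 only if the two files are byte-for-byte identical.
if cmp -s "$tmpdir/original.bin" "$tmpdir/decoded.bin"; then
  roundtrip=identical
else
  roundtrip=different
fi
echo "$roundtrip"
```

When the base64 text sits inside a JSON document, the same idea would apply by extracting it with `jq -r` (for example from a hypothetical `.data` field) and piping the result into `base64 --decode`.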

@emanuele6 closed this as not planned Mar 9, 2024
@emanuele6 added the dup label Mar 9, 2024
@pkoppstein
Contributor Author

pkoppstein commented Mar 9, 2024

@emanuele6 - Thanks for the reminder about -j.

My implicit point was that if it's good enough for gojq, maybe it should be good enough for jq.
Or would "binary strings" /pull/2314 help?
