Handle very small pdf's #580

eric-yuan-vanta · 2023-02-10T02:43:20Z

Handle tiny PDFs, and add a test PDF smaller than 1350 bytes (the test PDF is MIT-licensed and comes from https://brendanzagaeski.appspot.com/0004.html).

Just FYI, generating an empty PDF from Google Drive results in a PDF of similar size (~700 bytes).

resolves #579

eric-yuan-vanta · 2023-02-13T19:54:19Z

Hey @sindresorhus would appreciate a review on this when you get a chance. Thanks!

eric-yuan-vanta · 2023-02-14T17:19:46Z

I re-requested review and GitHub seemed to automatically remove @Borewit. Not sure why!

Borewit · 2023-02-18T12:30:39Z

core.js

-			const buffer = Buffer.alloc(Math.min(maxBufferSize, tokenizer.fileInfo.size));
-			await tokenizer.readBuffer(buffer, {mayBeLess: true});
+			try {
+				await tokenizer.ignore(1350);


This remains a sensitive point, but it was not introduced by this PR.
Maybe we should drop support for specialized PDF formats, since text-based formats are out of scope.
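
For readers following the diff above, here is a minimal, self-contained sketch of the pattern it introduces: skip the first 1350 bytes inside a `try` block and fall back to plain PDF when the input ends early. `MemoryTokenizer`, `EndOfStreamError`, and `detectPdfFlavor` are hypothetical stand-ins written for this illustration, not the library's actual classes. The `AIPrivateData` marker and the returned `ext`/`mime` values are likewise assumptions about what the Adobe Illustrator check looks for. The stub's `ignore` is assumed to reject when fewer bytes remain than requested; a real tokenizer may behave differently. The sketch relies on Node's global `Buffer`.

```javascript
// Hypothetical stand-in for the error a tokenizer raises at end of input.
class EndOfStreamError extends Error {}

// Hypothetical in-memory tokenizer that mimics only the two calls seen in the diff.
class MemoryTokenizer {
	constructor(bytes) {
		this.bytes = bytes;
		this.position = 0;
		this.fileInfo = {size: bytes.length};
	}

	// Skip `length` bytes; assumed to reject when the input is shorter than that.
	async ignore(length) {
		if (this.position + length > this.bytes.length) {
			throw new EndOfStreamError('End-Of-Stream');
		}

		this.position += length;
	}

	// Copy the remaining bytes into `buffer`, allowing a short read if `mayBeLess` is set.
	async readBuffer(buffer, {mayBeLess = false} = {}) {
		const available = this.bytes.length - this.position;
		const count = Math.min(buffer.length, available);
		if (count < buffer.length && !mayBeLess) {
			throw new EndOfStreamError('End-Of-Stream');
		}

		buffer.set(this.bytes.subarray(this.position, this.position + count));
		this.position += count;
		return count;
	}
}

// Assumes the caller has already matched the `%PDF` header.
async function detectPdfFlavor(tokenizer) {
	try {
		await tokenizer.ignore(1350);
		const maxBufferSize = 10 * 1024 * 1024;
		const buffer = Buffer.alloc(Math.min(maxBufferSize, tokenizer.fileInfo.size));
		await tokenizer.readBuffer(buffer, {mayBeLess: true});

		// Assumed marker for Adobe Illustrator files.
		if (buffer.includes(Buffer.from('AIPrivateData'))) {
			return {ext: 'ai', mime: 'application/postscript'};
		}
	} catch (error) {
		// Only swallow the "file too small" case; rethrow anything else.
		if (!(error instanceof EndOfStreamError)) {
			throw error;
		}
	}

	// Too small for the specialized check, or no marker found: treat as a normal PDF.
	return {ext: 'pdf', mime: 'application/pdf'};
}
```

Calling `detectPdfFlavor(new MemoryTokenizer(Buffer.from('%PDF-1.1\n')))` resolves to the plain PDF result instead of rejecting, which is the behaviour this PR is after for files under 1350 bytes.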

eric-yuan-vanta changed the title from "Handle tiny pdf's" to "Handle very small pdf's" Feb 13, 2023

sindresorhus reviewed Feb 14, 2023

sindresorhus requested a review from Borewit February 14, 2023 05:58

eric-yuan-vanta requested review from sindresorhus and removed request for Borewit February 14, 2023 17:18

Borewit requested changes Feb 17, 2023

eric-yuan-vanta requested review from Borewit and removed request for sindresorhus February 17, 2023 17:05

Borewit requested changes Feb 17, 2023

eric-yuan-vanta requested a review from Borewit February 17, 2023 17:31

handle tiny pdfs

c10a194

throw errors
lint
lint
simplify
wrap all AI reads in try catch
move only ingore to try catch, early return pdf
Revert "move only ingore to try catch, early return pdf"
This reverts commit 3b90419.

eric-yuan-vanta force-pushed the min-pdf branch from 5f1435e to c10a194 Compare February 17, 2023 17:33

Borewit reviewed Feb 18, 2023

Borewit approved these changes Feb 18, 2023

Borewit merged commit edf59f8 into sindresorhus:main Feb 18, 2023

mnathsnyk mentioned this pull request Mar 15, 2024

[Snyk] Upgrade file-type from 8.1.0 to 19.0.0 mnathsnyk/nodejs-goof#7

Open
