Extend file type (updated) #603

FredrikSchaefer · 2023-07-17T06:28:29Z

Hey There :)

I came across this issue #340 and found that it's just what we need. As the initial PR is way outdated, I decided to start over from scratch.

Resolves #340.

FredrikSchaefer · 2023-07-24T04:31:39Z

Gentle nudge towards @sindresorhus. We are really looking forward to using this feature in the next official version.

sindresorhus · 2023-07-24T16:52:49Z

I'm on vacation, so I won't be able to review this for another couple of weeks.

FredrikSchaefer · 2023-07-25T05:04:49Z

Thanks a lot for squeezing in time to send change requests, even on vacation! We will make sure that we don't rely on quick action from your side. Enjoy your time off!

Borewit · 2023-07-25T08:43:56Z

I am on holiday as well, this time without a laptop. Back in 2 weeks.

FredrikSchaefer · 2023-08-28T07:44:06Z

Another gentle nudge towards @sindresorhus and @Borewit, so this PR does not get lost in the vortex of time.

FredrikSchaefer · 2023-09-12T05:34:25Z

... And yet another gentle nudge to @sindresorhus. I'd really appreciate it if you found the time for a review :-)

FredrikSchaefer · 2023-09-25T09:02:05Z

Is there anything I can do to support the review of this PR?

Borewit · 2023-09-25T12:33:22Z

I will try to find some time as well to review this PR @FredrikSchaefer. Please give me 2 weeks.

Borewit

Thank you so much for your patience @FredrikSchaefer.

I think you have done a fantastic job. I just have some minor comments.

I will be off the grid for a bit more than a week. So sorry.

Borewit · 2023-10-11T20:21:28Z

core.d.ts

+
+Custom detectors can be used to add new FileTypeResults or to modify return behaviour of existing FileTypeResult detections.
+
+If the detector returns `undefined`, the `tokenizer.position` should be 0 (unless it's a stream). That allows other detectors to parse the file.


Not sure if I agree with the "unless it's a stream". Essentially you cannot iterate to other detectors once you have taken a bite of the apple. Only peeking is allowed; if you read, you have consumed the tokenizer, which is very similar to a stream.

I fear this is an area where we can expect a lot of questions from users.
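
To make the peek/read distinction in this comment concrete, here is a minimal sketch. `SketchTokenizer` and its `peek`/`read` methods are hypothetical stand-ins written for this illustration, not the tokenizer the library uses, and they are synchronous only to keep the example short.

```javascript
// Hypothetical in-memory tokenizer, for illustration only.
class SketchTokenizer {
	constructor(bytes) {
		this.bytes = Uint8Array.from(bytes);
		this.position = 0;
	}

	// Peeking copies bytes without moving `position`.
	peek(length) {
		return this.bytes.slice(this.position, this.position + length);
	}

	// Reading returns the same bytes but advances `position`.
	read(length) {
		const chunk = this.peek(length);
		this.position += chunk.length;
		return chunk;
	}
}

const sketchTokenizer = new SketchTokenizer([0x89, 0x50, 0x4E, 0x47]);

sketchTokenizer.peek(2);
// Unchanged position: a later detector would still see the start of the file.
const positionAfterPeek = sketchTokenizer.position;

sketchTokenizer.read(2);
// Advanced position: a later detector would start at the wrong offset.
const positionAfterRead = sketchTokenizer.position;
```

Under these assumptions, a detector that only peeks can safely hand over to the next detector, while one that reads cannot.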

Hey, thanks for the comment.

Yeah, guess you're right about the lot of questions.

I just mindlessly took this information from this previous discussion.

Let me suggest a more detailed explanation here:

If the detector returns undefined, the tokenizer.position should typically be 0. This allows easy parsing by other detectors, unless subsequent custom detectors specify otherwise. Additionally, the detector shouldn't consume the tokenizer; while peeking is non-consuming, reading is.

What do you think of this?

I'm really open to anything here!

See also my other comment: https://github.com/sindresorhus/file-type/pull/603/files#r1356979704

I suggest something like this:

If the detector returns no_match, it is not allowed to read from the tokenizer (the tokenizer.position must remain 0); otherwise, following scanners will read from the wrong file offset.

If the detector returns undefined, the scanner is certain the file type cannot be determined, not even by other scanners.

no_match represents option 1 explained in here

I agree with your point that custom detectors should be able to interrupt detection. However, I see two small downsides of the suggested approach:

The wording no_match does not really make clear to me whether it means no match for this detector, or no match at all.

The standard FileTypeParser returns undefined when no file type could be recognized. Therefore, requiring the custom detectors to return something else is a bit counterintuitive.

I therefore suggest doing it the other way around:

If the detector returns undefined, it is not allowed to read from the tokenizer (the tokenizer.position must remain 0); otherwise, following scanners will read from the wrong file offset.

If the detector returns file_type_undetectable, the detector is certain the file type cannot be determined, even by other scanners. The FileTypeParser interrupts the parsing and immediately returns undefined.

Okay, one could argue that file_type_undetectable also does not clearly say whether it means file type undetectable for this detector or for all detectors, but it still makes it a bit clearer in my opinion.
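
The two return values proposed here can be sketched as a small detector loop. Everything in this block is an assumption made for illustration: `fileTypeUndetectable` is a hypothetical sentinel named after the proposal, `runDetectors` is not the library's function, and the detectors are synchronous for brevity. A later commit in this PR is titled "Added detectionImpossible to allow interruption by custom detectors", so the name that was merged may differ.

```javascript
// Hypothetical sentinel; the name follows the proposal in this thread.
const fileTypeUndetectable = Symbol('file_type_undetectable');

// Run detectors in order (simplified sketch, not the library's code).
function runDetectors(detectors, input) {
	for (const detector of detectors) {
		const result = detector(input);

		if (result === fileTypeUndetectable) {
			// Interrupt: no further detectors are consulted.
			return undefined;
		}

		if (result !== undefined) {
			// First match wins.
			return result;
		}

		// `undefined` means "no match for this detector": try the next one.
	}

	return undefined;
}

// Illustrative detectors that record the order in which they are called.
const calls = [];
const noMatch = () => {
	calls.push('noMatch');
	return undefined;
};
const givesUp = () => {
	calls.push('givesUp');
	return fileTypeUndetectable;
};
const matches = () => {
	calls.push('matches');
	return {ext: 'unicorn', mime: 'application/unicorn'};
};

const found = runDetectors([noMatch, matches], null);
const interrupted = runDetectors([givesUp, matches], null);
```

In this sketch `found` is the result of `matches`, while `interrupted` is `undefined` because `givesUp` stops the loop before `matches` is called.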

Sounds good to me. The second case can also be something like: the detector started reading but for some reason failed to determine the file type. Not ideal, but it can happen. If the detector starts reading, there is no way back.

We could also check the position after each custom scanner. It may not actually be 0, since there is also an iterated use case with the ID3 header. The position should remain unchanged.

Good idea! Just pushed a commit taking care of that check.
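
A rough sketch of what such a check could look like follows. It is not the code from the commit: `runDetectorWithPositionCheck`, the plain-object tokenizer, and the error message are hypothetical, and the detectors are synchronous for brevity. It compares against the position before the call rather than against 0, in line with the ID3 remark above.

```javascript
// Hypothetical guard, sketched from the discussion (not the merged code).
function runDetectorWithPositionCheck(detector, tokenizer) {
	const initialPosition = tokenizer.position;
	const result = detector(tokenizer);

	// A detector that found nothing must leave the tokenizer where it was.
	if (result === undefined && tokenizer.position !== initialPosition) {
		throw new Error('Detector returned undefined but changed tokenizer.position');
	}

	return result;
}

// Returns undefined without touching the tokenizer.
const wellBehaved = () => undefined;

// Returns undefined after moving the position, which the guard rejects.
const misbehaving = tokenizer => {
	tokenizer.position += 4;
	return undefined;
};

// Start at a non-zero offset, as might happen after an ID3 header was skipped.
const fakeTokenizer = {position: 10};

const okResult = runDetectorWithPositionCheck(wellBehaved, fakeTokenizer);

let errorMessage;
try {
	runDetectorWithPositionCheck(misbehaving, fakeTokenizer);
} catch (error) {
	errorMessage = error.message;
}
```

Here the first call passes the check, and the second call throws because the detector moved the position while reporting no match.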

midmarch

Almost there

Borewit · 2023-10-25T09:49:36Z

Forgive me, wrong account, same person.

Borewit

Looks good to me, thanks a lot for your effort @FredrikSchaefer

For your final approval @sindresorhus.

Borewit · 2023-11-01T07:13:06Z

@sindresorhus, kind reminder: please proceed with merging if you are happy with this one.

Fredrik added 8 commits July 14, 2023 13:11

Allow specification of custom detectors + readme update

bda3f46

Simplify logic in runCustomDetectors

6007eff

add custom detectors to fileTypeFromStream

c3dba6e

fix linting issue

fab97ae

Execute custom detectors before default ones

c7c3190

add tests

4bcddff

fix docs

733bfac

compatibility with Node.js 14 and 16

37e1e57

FredrikSchaefer mentioned this pull request Jul 17, 2023

Allow specification of custom detectors + readme update FredrikSchaefer/file-type#1

Closed

sindresorhus requested changes Jul 24, 2023

FredrikSchaefer and others added 4 commits July 25, 2023 07:05

Remove blank space

bfd18b1

Co-authored-by: Sindre Sorhus <sindresorhus@gmail.com>

Wrap custom detectors into file type options

ee4cb2c

Merge branch 'extend-file-type-updated' of github.com:FredrikSchaefer…

ad6d44f

…/file-type into extend-file-type-updated

Adjust fileTypeFromFile(...) to recent changes

29930bf

FredrikSchaefer requested a review from sindresorhus July 25, 2023 07:28

Borewit requested changes Oct 11, 2023

Borewit reviewed Oct 12, 2023

Fredrik added 3 commits October 17, 2023 14:38

Moved custom detectors from function to constructor argument

7ea6efd

Added detectionImpossible to allow interruption by custom detectors Updated doc of fileTypeFromFile

fix fileTypeStream (add back fileTypeOptions)

748ffee

Update documentation

2adec69

FredrikSchaefer requested a review from Borewit October 17, 2023 13:06

add check for illegal tokenizer position change

0d1464c

FredrikSchaefer requested a review from Borewit October 25, 2023 09:31

midmarch suggested changes Oct 25, 2023

Fredrik and others added 5 commits October 25, 2023 12:57

Rename stream(...) to toDetectingStream(...)

a926bf2

Fix error handling

5e2a0fd

Suggested changes to simplify code

f38565d

Merge pull request #2 from sindresorhus/extend-file-type-updated-sugg…

e25c294

…ested-code-changes Suggested changes to simplify code

Fix TypeScript declaration

080ac75

Borewit approved these changes Oct 25, 2023

sindresorhus requested changes Nov 3, 2023

Borewit added 4 commits November 6, 2023 21:55

Remove comments from unit tests and redundant empty line

de706c5

Make code examples executable.

331502d

Explain role of `blob` argument in function comment.

Remove empty comment lines

9d85f05

Remove unused fileTypeOptions parameter from typings

ede94d9

Borewit requested a review from sindresorhus November 6, 2023 21:05

sindresorhus requested changes Nov 7, 2023

Borewit force-pushed the extend-file-type-updated branch 3 times, most recently from e62ca5d to 4164767 Compare November 10, 2023 17:56

Adjust number code and comment style suggestions

ca6e449

Borewit force-pushed the extend-file-type-updated branch from 4164767 to ca6e449 Compare November 10, 2023 18:03

Borewit requested a review from sindresorhus November 10, 2023 18:04

Update core.d.ts

a50e37a

sindresorhus merged commit f5b232c into sindresorhus:main Nov 10, 2023
3 checks passed

mnathsnyk mentioned this pull request Mar 15, 2024

[Snyk] Upgrade file-type from 8.1.0 to 19.0.0 mnathsnyk/nodejs-goof#7

Open

clifford-snyk-github mentioned this pull request Aug 15, 2024

[Snyk] Upgrade file-type from 8.1.0 to 19.3.0 Clifford-GitHub-Organization/nodejs-goof#39

Closed

Mark-hub-2323 mentioned this pull request Aug 22, 2024

[Snyk] Upgrade file-type from 16.5.4 to 19.3.0 Mark-hub-2323/juice-shop#12

Open

spgenaic mentioned this pull request Oct 26, 2024

[Snyk] Upgrade file-type from 18.5.0 to 18.7.0 spgenaic/canvas-server#1

Open
