Improve WebM detection #486
@@ -736,7 +736,8 @@ async function _fromTokenizer(tokenizer) {
 while (children > 0) {
   const element = await readElement();
   if (element.id === 0x42_82) {
-    return tokenizer.readToken(new Token.StringType(element.len, 'utf-8')); // Return DocType
+    const rawValue = await tokenizer.readToken(new Token.StringType(element.len, 'utf-8'));
+    return rawValue.replace(/\00.*$/g, ''); // Return DocType
Is the maximum number of null characters documented anywhere? It would be nice to have a limit in place so it doesn't hang on faulty files that have too many null characters.
There is no maximum; the null characters are used as a kind of padding.
The length of the string read is already bounded by element.len.
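For illustration, the trimming step can also be written without a regular expression. This is only a minimal sketch: `stripNullPadding` is a hypothetical helper name that is not part of this PR, and it assumes the value has already been decoded to a string of at most element.len bytes. It scans the string once and cuts at the first null character, so the work stays proportional to the (already bounded) string length.

```javascript
// Hypothetical helper, not part of the PR: drop everything from the
// first null character onward, using a single linear scan.
function stripNullPadding(rawValue) {
  const end = rawValue.indexOf('\u0000');
  // No null character found: the value is not padded, return it unchanged.
  return end === -1 ? rawValue : rawValue.slice(0, end);
}

// Example: a DocType padded with trailing null characters.
const docType = stripNullPadding('webm\u0000\u0000\u0000\u0000');
// docType is now 'webm'
```

Unlike `.` in the regular expression, `slice` also discards any line breaks that follow the first null character.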
Ah, you mean checking element.len? The length can exceed what a JavaScript number can represent and is encoded in a specific way (see the VINT examples).
At that point we already assume the file is EBML and have started consuming the tokenizer and iterating through the EBML elements.
You could argue that the docType must be a relatively short value, but by then we have already matched 0x1A, 0x45, 0xDF, 0xA3 and 0x42_82. It is extremely unlikely that we reach that point without the format being EBML.
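To make the VINT remark concrete, here is a rough sketch of decoding an EBML variable-length size. `readVint` is a hypothetical name and this is not the code used in this repository. It assumes the size sits in a `Uint8Array`: the number of leading zero bits in the first byte, plus one, gives the total byte length (1 to 8), and the marker bit is cleared before the remaining bits are combined. It returns a BigInt because an 8-byte VINT carries 56 data bits, more than the 53 bits a JavaScript number can hold exactly.

```javascript
// Hypothetical sketch of EBML VINT size decoding, not the library's code.
function readVint(bytes, offset = 0) {
  const first = bytes[offset];
  if (first === undefined || first === 0) {
    // A zero first byte would mean a length above 8 bytes, which this sketch does not handle.
    throw new RangeError('Invalid or unsupported VINT');
  }
  // Math.clz32 counts leading zeros of a 32-bit value; subtract 24 to get the count for one byte.
  const length = Math.clz32(first) - 24 + 1;
  if (offset + length > bytes.length) {
    throw new RangeError('Truncated VINT');
  }
  // Clear the length marker bit in the first byte, then append the remaining bytes.
  let value = BigInt(first & (0xFF >> length));
  for (let i = 1; i < length; i++) {
    value = (value << 8n) | BigInt(bytes[offset + i]);
  }
  return {length, value};
}

// Example: 0x40 0x02 is a 2-byte VINT encoding the value 2.
const size = readVint(Uint8Array.of(0x40, 0x02));
// size.length is 2, size.value is 2n
```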
Fixes recognition of WebM format. Resolves: #485