Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect malformed UTF-8 files and refuse to engage further #53667

Merged
merged 8 commits into from
Apr 7, 2023

Conversation

RyanCavanaugh
Copy link
Member

Belt-and-suspenders on #53114.

When a file is actually binary, or at least starts with invalid UTF-8 bytes, it's not going to be a good use of our time to try to make sense of it. This PR detects U+FFFD REPLACEMENT CHARACTER in the leading character position and, if encountered, skips to the end of the file.

This fully mitigates both the crashes caused by these enormous binaries effectively fuzzing our parser, as well as the performance penalty of generating e.g. 54,000 errors on a 1.8 MB "JS" file that is actually just line noise.

The "corrupted.ts" file is the verbatim first eight bytes of a repro we got for this.

@typescript-bot typescript-bot added Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Apr 5, 2023
@jakebailey
Copy link
Member

I'm paranoid

@typescript-bot test this
@typescript-bot test top100
@typescript-bot user test this
@typescript-bot user test tsserver
@typescript-bot test tsserver top100
@typescript-bot run dt
@typescript-bot perf test this faster

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 5, 2023

Heya @jakebailey, I've started to run the diff-based top-repos suite on this PR at 0c78667. You can monitor the build here.

Update: The results are in!

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 5, 2023

Heya @jakebailey, I've started to run the extended test suite on this PR at 0c78667. You can monitor the build here.

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 5, 2023

Heya @jakebailey, I've started to run the diff-based user code test suite on this PR at 0c78667. You can monitor the build here.

Update: The results are in!

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 5, 2023

Heya @jakebailey, I've started to run the diff-based user code test suite (tsserver) on this PR at 0c78667. You can monitor the build here.

Update: The results are in!

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 5, 2023

Heya @jakebailey, I've started to run the abridged perf test suite on this PR at 0c78667. You can monitor the build here.

Update: The results are in!

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 5, 2023

Heya @jakebailey, I've started to run the parallelized Definitely Typed test suite on this PR at 0c78667. You can monitor the build here.

Update: The results are in!

@typescript-bot
Copy link
Collaborator

typescript-bot commented Apr 5, 2023

Heya @jakebailey, I've started to run the diff-based top-repos suite (tsserver) on this PR at 0c78667. You can monitor the build here.

Update: The results are in!

@typescript-bot
Copy link
Collaborator

@jakebailey Here are the results of running the user test suite comparing main and refs/pull/53667/merge:

There were infrastructure failures potentially unrelated to your change:

  • 1 instance of "Package install failed"

Otherwise...

Everything looks good!

@typescript-bot
Copy link
Collaborator

@jakebailey Here are the results of running the user test suite comparing main and refs/pull/53667/merge:

Everything looks good!

@typescript-bot
Copy link
Collaborator

@jakebailey
The results of the perf run you requested are in!

Here they are:

Comparison Report - main..53667

Metric main 53667 Delta Best Worst p-value
Angular - node (v16.17.1, x64)
Memory used 364,816k (± 0.00%) 364,801k (± 0.01%) ~ 364,770k 364,836k p=0.572 n=6
Parse Time 3.54s (± 0.91%) 3.57s (± 1.01%) ~ 3.52s 3.61s p=0.169 n=6
Bind Time 1.18s (± 1.03%) 1.18s (± 0.71%) ~ 1.17s 1.19s p=0.227 n=6
Check Time 9.53s (± 0.18%) 9.52s (± 0.43%) ~ 9.48s 9.59s p=0.334 n=6
Emit Time 7.97s (± 0.34%) 7.97s (± 0.35%) ~ 7.93s 8.00s p=0.936 n=6
Total Time 22.22s (± 0.29%) 22.24s (± 0.37%) ~ 22.13s 22.38s p=1.000 n=6
Compiler-Unions - node (v16.17.1, x64)
Memory used 192,553k (± 0.03%) 192,538k (± 0.03%) ~ 192,449k 192,635k p=0.810 n=6
Parse Time 1.59s (± 0.76%) 1.59s (± 1.67%) ~ 1.54s 1.61s p=0.681 n=6
Bind Time 0.82s (± 0.50%) 0.82s (± 0.00%) ~ 0.82s 0.82s p=0.405 n=6
Check Time 10.38s (± 0.93%) 10.30s (± 0.38%) -0.08s (- 0.80%) 10.24s 10.36s p=0.036 n=6
Emit Time 3.00s (± 0.49%) 2.98s (± 0.51%) ~ 2.97s 3.01s p=0.187 n=6
Total Time 15.78s (± 0.64%) 15.69s (± 0.42%) ~ 15.57s 15.75s p=0.109 n=6
Monaco - node (v16.17.1, x64)
Memory used 345,677k (± 0.01%) 345,688k (± 0.00%) ~ 345,674k 345,699k p=0.378 n=6
Parse Time 2.71s (± 0.69%) 2.72s (± 0.36%) ~ 2.71s 2.73s p=0.805 n=6
Bind Time 1.08s (± 0.48%) 1.09s (± 0.37%) ~ 1.08s 1.09s p=0.112 n=6
Check Time 7.77s (± 0.33%) 7.77s (± 0.15%) ~ 7.76s 7.79s p=0.807 n=6
Emit Time 4.47s (± 0.74%) 4.45s (± 0.59%) ~ 4.41s 4.48s p=0.295 n=6
Total Time 16.04s (± 0.17%) 16.03s (± 0.21%) ~ 15.97s 16.06s p=0.421 n=6
TFS - node (v16.17.1, x64)
Memory used 300,052k (± 0.00%) 300,057k (± 0.01%) ~ 300,031k 300,076k p=0.575 n=6
Parse Time 2.16s (± 0.19%) 2.16s (± 0.48%) ~ 2.15s 2.18s p=0.390 n=6
Bind Time 1.23s (± 0.80%) 1.24s (± 0.79%) ~ 1.22s 1.25s p=0.116 n=6
Check Time 7.22s (± 0.29%) 7.25s (± 0.33%) ~ 7.20s 7.27s p=0.071 n=6
Emit Time 4.33s (± 0.43%) 4.32s (± 0.41%) ~ 4.31s 4.35s p=1.000 n=6
Total Time 14.93s (± 0.15%) 14.97s (± 0.22%) +0.05s (+ 0.32%) 14.93s 15.03s p=0.019 n=6
material-ui - node (v16.17.1, x64)
Memory used 479,408k (± 0.01%) 479,433k (± 0.01%) ~ 479,382k 479,534k p=0.575 n=6
Parse Time 3.24s (± 0.30%) 3.25s (± 0.82%) ~ 3.20s 3.27s p=0.463 n=6
Bind Time 0.95s (± 0.67%) 0.96s (± 1.67%) ~ 0.95s 0.99s p=0.340 n=6
Check Time 18.20s (± 0.21%) 18.19s (± 0.23%) ~ 18.14s 18.25s p=0.748 n=6
Emit Time 0.00s (± 0.00%) 0.00s (± 0.00%) ~ 0.00s 0.00s p=1.000 n=6
Total Time 22.39s (± 0.16%) 22.40s (± 0.22%) ~ 22.33s 22.46s p=0.871 n=6
xstate - node (v16.17.1, x64)
Memory used 559,726k (± 0.01%) 559,823k (± 0.03%) ~ 559,572k 560,068k p=0.471 n=6
Parse Time 3.98s (± 0.34%) 4.00s (± 0.37%) +0.02s (+ 0.54%) 3.98s 4.02s p=0.044 n=6
Bind Time 1.74s (± 0.57%) 1.74s (± 0.30%) ~ 1.73s 1.74s p=0.504 n=6
Check Time 3.07s (± 0.34%) 3.06s (± 0.27%) -0.02s (- 0.60%) 3.05s 3.07s p=0.011 n=6
Emit Time 0.09s (± 5.53%) 0.10s (± 5.76%) ~ 0.09s 0.10s p=0.640 n=6
Total Time 8.88s (± 0.30%) 8.89s (± 0.15%) ~ 8.87s 8.91s p=0.683 n=6
System
Machine Namets-ci-ubuntu
Platformlinux 5.4.0-135-generic
Architecturex64
Available Memory16 GB
Available Memory15 GB
CPUs4 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
Hosts
  • node (v16.17.1, x64)
Scenarios
  • Angular - node (v16.17.1, x64)
  • Compiler-Unions - node (v16.17.1, x64)
  • Monaco - node (v16.17.1, x64)
  • TFS - node (v16.17.1, x64)
  • material-ui - node (v16.17.1, x64)
  • xstate - node (v16.17.1, x64)
Benchmark Name Iterations
Current 53667 6
Baseline main 6

Developer Information:

Download Benchmark

@typescript-bot
Copy link
Collaborator

Hey @jakebailey, the results of running the DT tests are ready.
Everything looks the same!
You can check the log here.

@typescript-bot
Copy link
Collaborator

@jakebailey Here are the results of running the top-repos suite comparing main and refs/pull/53667/merge:

Everything looks good!

1 similar comment
@typescript-bot
Copy link
Collaborator

@jakebailey Here are the results of running the top-repos suite comparing main and refs/pull/53667/merge:

Everything looks good!

# Conflicts:
#	src/compiler/diagnosticMessages.json
#	tests/baselines/reference/api/tsserverlibrary.d.ts
#	tests/baselines/reference/api/typescript.d.ts
src/compiler/program.ts Show resolved Hide resolved
src/compiler/diagnosticMessages.json Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants