-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustc: Fail fast when compiling a source file larger than 4 GiB #132791
Conversation
rustbot has assigned @compiler-errors. Use |
@@ -115,6 +115,10 @@ impl FileLoader for RealFileLoader { | |||
} | |||
|
|||
fn read_file(&self, path: &Path) -> io::Result<String> { | |||
if path.metadata().is_ok_and(|metadata| metadata.len() > u32::MAX.into()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we report this as an io::Error::other
rather than fataling here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this works and produces:
error: couldn't read /home/tyilo/tmp/big_source.rs: rustc does not support files larger than 4 GiB
error: aborting due to 1 previous error
@@ -115,6 +115,10 @@ impl FileLoader for RealFileLoader { | |||
} | |||
|
|||
fn read_file(&self, path: &Path) -> io::Result<String> { | |||
if path.metadata().is_ok_and(|metadata| metadata.len() > u32::MAX.into()) { | |||
eprintln!("fatal error: rustc does not support files larger than 4GB"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eprintln!("fatal error: rustc does not support files larger than 4GB"); | |
eprintln!("fatal error: rustc does not support text files larger than 4GiB"); |
rustc
supports binary files (via include_bytes!
) with no hard-coded upper limit beyond available memory, so the error message should specify only text files are affected. Also I think the right unit here is GiB
(10243): I believe GB
is usually 10003 (although it's sometimes used for 10243 making GB
somewhat ambiguous).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rustc
supports binary files (viainclude_bytes!
) with no hard-coded upper limit beyond available memory, so the error message should specify only text files are affected.
Just testing this, it doesn't seem to be true.
Also I think the right unit here is
GiB
(1024^3): I believeGB
is usually 1000^3 (although it's sometimes used for 1024^3 makingGB
somewhat ambiguous).
Changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just testing this, it doesn't seem to be true.
It is true only if the file is not valid UTF-8 (which is the case for most binary files). Due to this line of code, if the file is valid UTF-8 the full file is loaded into the source map and is subject to the 4 GiB limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok that's weird. Would it make sense to change that to use try_new_source_file
instead and also fallback to a span for the empty string, if the included files is greater than 4 GiB, but valid UTF-8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately that wouldn't fully fix the problem, as it would make the behaviour of rustc
inconsistent depending on the order source files are loaded, since the 4 GiB limit is cumulative. If two files, a.txt
and b.txt
, both exist and are both 3 GiB of valid UTF-8, and one is loaded with include_bytes!("a.txt")
and the other with include_str!("b.txt")
, then at the moment both compilation will fail regardless of which order they are loaded in as the total loaded source file size exceeds 4 GiB. If load_binary_file
is changed to use try_new_source_file
and fallback if it fails, then compilation will succeed only if include_bytes!("a.txt")
is processed after include_str!("b.txt")
: if the include_bytes!
is processed first, then try_new_source_file
will succeed and fill up 3 GiB of the source map, meaning that the later include_str!
will fail as there isn't enough space in the source map remaining.
The ultimate solution for consistent behaviour here (the 4 GiB limit only applying to files loaded as text files) would be for binary files to be tracked for dep-info separately from regular source files, meaning that binary files are not loaded into the source map at all. This would require some extra handling for expand_include_str
, however, as that is currently the only user of the span returned by load_binary_file
(and would therefore want the file to be loaded into the source map, unlike expand_include_bytes
). As this is a pre-existing issue, it's fine to leave this to a future PR.
cc #132862, which could be fixed by making the check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz also squash commit history
@@ -115,6 +115,12 @@ impl FileLoader for RealFileLoader { | |||
} | |||
|
|||
fn read_file(&self, path: &Path) -> io::Result<String> { | |||
if path.metadata().is_ok_and(|metadata| metadata.len() > SourceFile::MAX_FILE_SIZE.into()) { | |||
return Err(io::Error::other(format!( | |||
"rustc doest not support text files larger than {} bytes", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also the wording is inconsistent between here and the other errors.
@rustbot author |
c996de3
to
5caf516
Compare
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (ab3cf26): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (primary 2.3%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 796.131s -> 798.172s (0.26%) |
@@ -1843,6 +1843,8 @@ impl StableSourceFileId { | |||
} | |||
|
|||
impl SourceFile { | |||
const MAX_FILE_SIZE: u32 = u32::MAX - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the -1 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #132862
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is fail fast for files LARGER than 4 GiB
. 4GiB is 2^32B is (u32::MAX + 1) bytes, so U32::MAX is 4Gi - 1, and U32::MAX - 1 is 4Gi - 2. What is the purpose of a constant for 4GiB - 2B when preventing use of files larger than 4GiB? Is that not off by 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally this PR was for enforcing that files of size 4 GiB and larger would not work, meaning the maximum file size would be 4 GiB - 1 B = u32::MAX
bytes.
However, I then discovered #132862, so files of size 4 GiB - 1 B and larger would not work, so the maximum file size is actually 4 GiB - 2 B = u32::MAX - 1
. So no the current value is correct.
I've just tested this again to make sure. Using the current stable and nightly version of rustc
, I get the following results:
File size (bytes) | stable result | nightly result |
---|---|---|
4,294,967,294 | success | success |
4,294,967,295 | ICE (#132862) | instant failure |
4,294,967,296 | failure after reading the file to memory | instant failure |
Currently if you try to compile a file that is larger than 4 GiB,
rustc
will first read the whole into memory before failing.If we can read the metadata of the file, we can fail before reading the file.