Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: memory access out of bounds when compiling a large input file with soljson.js (but not solc) #13966

Closed
Jack-Clark opened this issue Feb 13, 2023 · 12 comments
Labels
closed due inactivity The issue/PR was automatically closed due to inactivity. high effort A lot to implement but still doable by a single person. The task is large or difficult. low impact Changes are not very noticeable or potential benefits are limited. nice to have We don’t see a good reason not to have it but won’t go out of our way to implement it. performance 🐎 should compile without error Error is reported even though it shouldn't. Source is fine. solcjs stale The issue/PR was marked as stale because it has been open for too long.

Comments

@Jack-Clark
Copy link

Running solcjs --bin --optimize Test.sol produces the following:

/home/jack/.nvm/versions/node/v18.4.0/lib/node_modules/solc/wrapper.js:173
            throw e;
            ^

RuntimeError: memory access out of bounds
    at wasm://wasm/04f226b2:wasm-function[330]:0x284db
    at wasm://wasm/04f226b2:wasm-function[41956]:0x126eebf
    at ccall (/home/jack/.nvm/versions/node/v18.4.0/lib/node_modules/solc/soljson.js:295:18)
    at /home/jack/.nvm/versions/node/v18.4.0/lib/node_modules/solc/soljson.js:299:31
    at runWithCallbacks (/home/jack/.nvm/versions/node/v18.4.0/lib/node_modules/solc/wrapper.js:169:30)
    at compileStandard (/home/jack/.nvm/versions/node/v18.4.0/lib/node_modules/solc/wrapper.js:223:20)
    at Object.compileStandardWrapper [as compile] (/home/jack/.nvm/versions/node/v18.4.0/lib/node_modules/solc/wrapper.js:229:20)
    at Object.<anonymous> (/home/jack/.nvm/versions/node/v18.4.0/lib/node_modules/solc/solc.js:215:43)
    at Module._compile (node:internal/modules/cjs/loader:1112:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1166:10)

Node.js v18.4.0

However, running solc --bin Test.sol will successfully compile the program. Furthermore, the produced binary executes successfully using evmone.

Test.txt contains the solidity code. What is interesting is that I can comment out functions f4 and f5, and solcjs will still reject the program. However, if I replace the commented out functions with repeating simple line comments e.g. // // comment then solcjs will successfully compile it. Here is comment.txt.

I see that this comment suggests a memory access error is related to exceeding the Solidity maximum contract size, however, clearly there is a discrepancy between the behaviour of solc and solcjs. I don't know if this discrepancy is intended.

$ solcjs --version
0.8.16+commit.07a7930e.Emscripten.clang

$ lsb_release -a | grep "Description"
Description:	Ubuntu 20.04.5 LTS
@cameel
Copy link
Member

cameel commented Feb 13, 2023

Thanks for the report! I can confirm that this is reproducible with node v16.19.0 and Solidity 0.8.18 as well.

The discrepancy is definitely not intended. It might be coming from the fact that soljson.js is compiled to 32-bit wasm. It can address less memory so it could just be running out of it earlier than solc, which is a 64-bit binary.

What is interesting is that I can comment out functions f4 and f5, and solcjs will still reject the program. However, if I replace the commented out functions with repeating simple line comments e.g. // // comment then solcjs will successfully compile it. Here is comment.txt.

This sounds like the issue is in processing long comments. By switching to line comments, you're basically breaking a single huge multi-line comment into lots of small comments that will be processed separately. The file is 1.6 MB so the comment can't possibly be long enough to actually exhaust available memory but maybe e.g. the regex parser we use is not particularly efficient. We do have issue reports related to that already: #12208.

Furthermore, the produced binary executes successfully using evmone.

Yeah, this part is not surprising since it's a crash during compilation. Still, good to know.

However, running solc --bin Test.sol will successfully compile the program.

Interestingly, I managed to crash solc as well but only with the new codegen:

solc Test.txt --via-ir

And then only on a release build. A recent build from develop does not crash this way, so I could not check where the stack trace leads to. It looks similar to #13496 though.

In any case, this does look like a bug. I'm going to transfer the issue to the main repo because the crash is clearly inside the binary and not in the JS code.

@cameel cameel transferred this issue from ethereum/solc-js Feb 13, 2023
@cameel cameel changed the title RuntimeError: memory access out of bounds - this occurs on large files that compile successfully with solc RuntimeError: memory access out of bounds when compiling a large multi-line comment with soljson.js (but not solc) Feb 13, 2023
@cameel cameel added bug 🐛 should compile without error Error is reported even though it shouldn't. Source is fine. solcjs medium effort Default level of effort low impact Changes are not very noticeable or potential benefits are limited. should have We like the idea but it’s not important enough to be a part of the roadmap. labels Feb 13, 2023
@Jack-Clark
Copy link
Author

@cameel out of interest, which release build could you crash using the via-ir flag?

@cameel
Copy link
Member

cameel commented Feb 13, 2023

I used the Linux release binary for 0.8.18.

@cameel
Copy link
Member

cameel commented Feb 13, 2023

Just in case the initial report wasn't clear, I can still trigger this without the large comment i.e. the Test.txt file contained a program without comments.

Oh, you mean it's still reproducible without the comment? To be honest, I only quickly triaged it, I did not investigate what exactly is the cause but from your description and the effects I observed the huge comment seemed like a reasonable guess.

@Jack-Clark
Copy link
Author

Oh, you mean it's still reproducible without the comment?

Yes with solcjs it can be reproduced using just Test.txt which contains no comments. I just thought it was surprising that a) commenting out a big chunk made no difference and then b) if I changed the comments from being commented out code e.g.

// int x;
// x+= 2
//....

to

// comment
// comment
// comment

Then the problem no longer happened.

@cameel cameel changed the title RuntimeError: memory access out of bounds when compiling a large multi-line comment with soljson.js (but not solc) RuntimeError: memory access out of bounds when compiling a large input file with soljson.js (but not solc) Feb 13, 2023
@cameel
Copy link
Member

cameel commented Feb 13, 2023

ok so that's probably not just the comment then.

I investigated this now and I think we're just running into the limits of the 32-bit memory space. It seems to be simply because of now much memory solc eats to process such a big contract. Not saying it cannot be improved to use less RAM but that would be just a matter of better optimization and not really a bugfix.

I confirmed this by trying to remove as much as possible from Test.txt. At first I was able to remove quite a few blocks and then removing anything would resolve the crash. Then I started duplicating a single line to compensate for removed code and with enough repetition the crash always returned. When copies of that lime became the only content of the file I started simplifying it, each time replicating it more and more times until I got to the simple 0; repeated 200k times. This still reproduces the error:

{
    printf 'function f() {'
    for i in {1..200000}; do
        printf '    0;\n'
    done
    printf '}'
} > input.sol
npx solcjs --bin input.sol

Further proof is that simply removing the indent from the above makes the crash go away. But then increasing the loop count to 800k brings it back again. It's just the memory overhead of processing such a big file.

It's very possible you could crash solc that way too if you gave it a file that's large enough but you'd also have to have a ton of RAM. I tried giving it ~200 MB of input but I had to kill it when it ate 60 GB RAM and was still going.

So it seems to be more of a performance issue. It's not great that wasm is so limited because we always consider it as the canonical build you could use anywhere. But we might have to just live with it until 64-bit wasm becomes a thing.

@ekpyron @axic Do you think we can do anything concrete here or should we close this as "wontfix"? tl;dr is that extremely large .sol files given as input make the wasm binary run out of memory and crash.

@cameel cameel added performance 🐎 high effort A lot to implement but still doable by a single person. The task is large or difficult. nice to have We don’t see a good reason not to have it but won’t go out of our way to implement it. and removed bug 🐛 medium effort Default level of effort should have We like the idea but it’s not important enough to be a part of the roadmap. labels Feb 13, 2023
@github-actions
Copy link

This issue has been marked as stale due to inactivity for the last 90 days.
It will be automatically closed in 7 days.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label May 15, 2023
@github-actions
Copy link

Hi everyone! This issue has been automatically closed due to inactivity.
If you think this issue is still relevant in the latest Solidity version and you have something to contribute, feel free to reopen.
However, unless the issue is a concrete proposal that can be implemented, we recommend starting a language discussion on the forum instead.

@github-actions github-actions bot added the closed due inactivity The issue/PR was automatically closed due to inactivity. label May 23, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 23, 2023
@elitex45
Copy link

@cameel Thank you so much, commenting and removing the comments actually worked!!

@julianmrodri
Copy link

Hi im facing this issue with one of our repos. Has this been addressed?

@elitex45
Copy link

elitex45 commented Nov 3, 2024

Guys, I've written article on how to solve this
https://hackmd.io/@elitex45/hardhat-memory-access-out-of-bounds-during-compilation

@cameel
Copy link
Member

cameel commented Nov 3, 2024

For anyone running into this error, the actual solution is to upgrade to 0.8.28. With #15451 memory use went down significantly, even as much as 80% in some cases.

Between 0.8.21 and 0.8.27 memory usage jumped a lot due to some artifacts not being generated on demand, which we noticed only recently (see benchmarks: #15561 (comment)). Now that I think of it, it must have been the underlying cause of this issue too. I mean, the example is quite contrived and if you make it large enough, you'll still reach the bounds of available memory eventually, but it should not have been happening that easily.

And if anyone is still having memory problems even on 0.8.28+ (in real-life, non-contrived cases), please report that as a bug. The compiler is designed to cache most artifacts but it does it in a very simplistic way and they accumulate during compilation of large projects. It has never been seen as much of a problem, but some bigger projects I've seen do need 5+ GB of RAM even with the fix from #15451. With some effort the compiler stack could be redesigned to free unused artifacts earlier, but optimizing that won't be high on our priority list unless it's actually affecting users in practice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed due inactivity The issue/PR was automatically closed due to inactivity. high effort A lot to implement but still doable by a single person. The task is large or difficult. low impact Changes are not very noticeable or potential benefits are limited. nice to have We don’t see a good reason not to have it but won’t go out of our way to implement it. performance 🐎 should compile without error Error is reported even though it shouldn't. Source is fine. solcjs stale The issue/PR was marked as stale because it has been open for too long.
Projects
None yet
Development

No branches or pull requests

4 participants