Segmentation fault in llvmir2hll due to insufficient stack space #47

Open
jazzl0ver opened this issue Dec 18, 2017 · 19 comments

Comments

@jazzl0ver

Hi. Running retdec under Windows Server 2016, getting this:

...
 -> converting function_7ca104() ( 2052.13s )
 -> converting entry_point() ( 2052.13s )
Running phase: removing functions prefixed with [__decompiler_undefined_function_] ( 2053.03s )
decompile.sh: line 947:  5356 Segmentation fault      $LLVMIR2HLL "${LLVMIR2HLL_PARAMS[@]}"
Error: Decompilation of file '/c/retdec/bin/myfile.c.backend.bc' failed

myfile.exe can be downloaded here: https://www.dropbox.com/s/jm7xnxtto3r4jxf/myfile.zip?dl=0

Please, help!

@Convery

Convery commented Dec 19, 2017

Maybe debug it as there's no real error information?

Still, seems to get a little further for me:

@jazzl0ver
Author

@Convery, how do I debug it?

@s3rvac
Member

s3rvac commented Dec 19, 2017

Thank you for the report. Segmentation faults in these phases (removing functions prefixed with and signed/unsigned types fixing) are in most cases caused by insufficient space on the stack. This may be caused either by incorrect decoding in the front-end part of the decompiler (bin2llvmir) or by invalid structuring in the back-end part of the decompiler (llvmir2hll). In both cases, the input code is too big to be decompiled. This is related to #16. The reason is that the back-end phases are implemented recursively and perform poorly when the AST generated in the back-end is deeply nested.

At the moment, the following workarounds are available:

  • Try to decompile only a part of the input binary, as suggested here.
  • Try increasing the maximal stack size before running the decompilation. On Linux, this can be done via e.g. ulimit -Ss 67108864. On Windows, this is more complicated because the maximal stack size is hard-coded into llvmir2hll at build time (Windows works differently than Linux). To change the maximal stack size on Windows, you would have to modify the /STACK:16777216 argument in src/llvmir2hlltool/CMakeLists.txt and rebuild.
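
For the Linux workaround, the same effect as ulimit can be had from a wrapper script. The following is a minimal Python sketch, not part of RetDec: the helper names (kib_to_bytes, clamp_soft_limit, run_with_stack_limit) are hypothetical, and it assumes a POSIX system where the resource module is available. Note that bash's ulimit -s takes its value in units of 1024 bytes, whereas setrlimit takes bytes.

```python
import resource
import subprocess
import sys


def kib_to_bytes(kib):
    """Convert a `ulimit -s` style value (units of 1024 bytes) to bytes."""
    return kib * 1024


def clamp_soft_limit(requested, hard):
    """Return the soft limit to apply, never exceeding the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def run_with_stack_limit(cmd, stack_bytes):
    """Run cmd in a child process whose soft stack limit is raised first."""

    def raise_stack_limit():
        # Runs in the child just before exec; the hard limit is left as-is.
        _, hard = resource.getrlimit(resource.RLIMIT_STACK)
        soft = clamp_soft_limit(stack_bytes, hard)
        resource.setrlimit(resource.RLIMIT_STACK, (soft, hard))

    return subprocess.run(cmd, preexec_fn=raise_stack_limit)


if __name__ == "__main__":
    # Example: run a trivial command with a 64 MiB stack (value is illustrative).
    result = run_with_stack_limit([sys.executable, "-c", "pass"], kib_to_bytes(65536))
    print(result.returncode)
```

The command passed in would be the decompilation script in practice. The sketch only raises the soft limit up to the existing hard limit, which an unprivileged process cannot exceed.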

For a proper fix, a rewrite of the phases will be needed (recursion -> iteration).
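
To illustrate what "recursion -> iteration" means here, the following is a minimal Python sketch, not RetDec code (RetDec's back-end is C++ and its AST classes differ). The Node class and the function names are hypothetical. It computes the depth of a tree once recursively and once with an explicit, heap-allocated stack; the second form does not consume call-stack space per nesting level.

```python
class Node:
    """A toy AST node with an arbitrary number of children."""

    def __init__(self, children=()):
        self.children = list(children)


def depth_recursive(node):
    """Depth via recursion: one call-stack frame chain per nesting level."""
    return 1 + max((depth_recursive(c) for c in node.children), default=0)


def depth_iterative(root):
    """Depth via an explicit stack kept on the heap instead of the call stack."""
    deepest = 0
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in node.children:
            stack.append((child, depth + 1))
    return deepest


def make_chain(n):
    """Build a degenerate tree of n nodes, each nested inside the next."""
    node = Node()
    for _ in range(n - 1):
        node = Node([node])
    return node
```

On a shallow tree both functions agree. On a chain of, say, 50000 nodes, the recursive version is expected to fail under CPython's default recursion limit, while the iterative one only needs more heap memory. The native analogue of that failure is the stack overflow reported in this issue.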

@s3rvac added the bug label on Dec 19, 2017
@s3rvac changed the title from "Segmentation fault" to "Segmentation fault in llvmir2hll due to insufficient stack space" on Dec 19, 2017
@jazzl0ver
Author

@s3rvac, I've doubled the stack size (up to 32 MB) and rebuilt the binary, but that didn't help:

 -> converting function_7ca104() ( 2044.19s )
 -> converting entry_point() ( 2044.20s )
Running phase: removing functions prefixed with [__decompiler_undefined_function_] ( 2045.11s )
Running phase: removing functions from standard libraries ( 2659.48s )
Running phase: removing code that is not reachable in a CFG ( 2659.48s )
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `v5_408bdb = *(IntToPtrCastExpr<ptr>((v0_408bfd + 1)))` -> s
Running phase: signed/unsigned types fixing ( 2779.31s )
decompile.sh: line 947:  1308 Segmentation fault      $LLVMIR2HLL "${LLVMIR2HLL_PARAMS[@]}"
Error: Decompilation of file '/c/retdec/bin/myfile.c.backend.bc' failed

Do you think I need to double it again?

@s3rvac
Member

s3rvac commented Dec 20, 2017

@jazzl0ver I am afraid that further increasing will not help that much. We will have to investigate this. The binary file is probably incorrectly decoded.

@jazzl0ver
Author

@s3rvac, thank you for your help! I hope you'll be able to fix it quickly.

@MerovingianByte

I'd just like to add that this problem can happen in other phases too:

 -> running SimpleCopyPropagationOptimizer ( 9.77s )
./decompile.sh: line 947:  3996 Segmentation fault      $LLVMIR2HLL "${LLVMIR2HLL_PARAMS[@]}"

Unfortunately I can't share the executable, not sure if I can mention it.

@jazzl0ver
Author

Hi. I just tried the latest sources and the issue is still there:

 -> converting entry_point() ( 1997.59s )
Running phase: removing functions prefixed with [__decompiler_undefined_function_] ( 1998.54s )
Running phase: removing functions from standard libraries ( 2587.21s )
Running phase: removing code that is not reachable in a CFG ( 2587.22s )
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `v5_408bdb = *(IntToPtrCastExpr<ptr>((v0_408bfd + 1)))` -> skipping this edge
Running phase: signed/unsigned types fixing ( 2707.97s )
retdec-decompiler.sh: line 922:  4860 Segmentation fault      "$LLVMIR2HLL" "${LLVMIR2HLL_PARAMS[@]}"
Error: Decompilation of file '/c/retdec/bin/myfile.c.backend.bc' failed

Any chance you are going to fix that soon?

@PeterMatula
Collaborator

@jazzl0ver This will take some time. It is not the kind of problem that you fix with a small modification. I'm already working on a new decoder in branch issue-116 that might help. But it may not be enough and other parts will have to be improved as well.

@jazzl0ver
Author

@PeterMatula , I appreciate your quick reply, thank you!

@s3rvac added the T-memory label on Mar 5, 2018
@jazzl0ver
Author

Hi. Just curious whether you have made any progress on this.

@PeterMatula
Collaborator

In general, the new decoder helped. But there is a huge amount of data in the text section of this sample, and we are still not as good as we should be. Also, while working on the new decoder, I came across samples that were decoded perfectly (no data interpreted as code) and were not even that complex, yet they still spent a huge amount of time in llvmir2hll. It turns out that our back-end converter (which also does code structuring) is not perfect and can really mess things up. There is an undocumented value, new, for --backend-llvmir2bir-converter that you can try if you run into this kind of trouble. But it is undocumented for a reason: it is not finished. Sometimes it works better than the default, orig, and sometimes it is worse. This is the next big thing we should fix to move forward, but it will not be easy.

@jazzl0ver
Author

Thank you, @PeterMatula ! I'll give it a try and let you know

@jazzl0ver
Author

@PeterMatula , now it looks different:

c:\retdec\bin>bash retdec-decompiler.sh myfile.exe 
...
 -> running SelfAssignOptimizer ( 4352.93s )
Warning: out of memory; trying to recover
 -> running VarDefForLoopOptimizer ( 4353.96s )
Warning: out of memory; trying to recover
 -> running VarDefStmtOptimizer ( 4354.99s )
Warning: out of memory; trying to recover
 -> running SimplifyArithmExprOptimizer ( 4356.02s )
Warning: out of memory; trying to recover
 -> running DeadCodeOptimizer ( 4357.10s )
Warning: out of memory; trying to recover
 -> running DerefToArrayIndexOptimizer ( 4358.14s )
Warning: out of memory; trying to recover
 -> running IfToSwitchOptimizer ( 4359.17s )
 -> running CCastOptimizer ( 4359.74s )
Warning: out of memory; trying to recover
 -> running CArrayArgOptimizer ( 4362.11s )
Warning: out of memory; trying to recover
Running phase: variable renaming [readable] ( 4363.14s )
LLVM ERROR: Could not acquire a cryptographic context: Provider DLL failed to initialize correctly.  (0x8009001D)
Error: Decompilation of file '/c/retdec/bin/myfile.c.backend.bc' failed

Same errors when issued with the extra option you mentioned:

c:\retdec\bin>bash retdec-decompiler.sh --backend-llvmir2bir-converter new myfile.exe 
Warning: out of memory; trying to recover
 -> running CArrayArgOptimizer ( 4489.14s )
Warning: out of memory; trying to recover
Running phase: variable renaming [readable] ( 4490.24s )
LLVM ERROR: Could not acquire a cryptographic context: An internal error occurred.  (0x80090020)
Error: Decompilation of file '/c/retdec/bin/myfile.c.backend.bc' failed

image
There's actually a lot of free memory out there..

BTW, if you were able to decompile my file, could you please send me the sources?

@silverbacknet

--max-memory always defaults to half of your physical memory now, which is exactly what's happening here. You should probably use --max-memory 60000000000 or so.
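
As a rough check of the numbers involved, here is a small Python sketch of the arithmetic. It is not RetDec code: the function names are hypothetical, and it assumes that "half of your physical memory" means plain integer halving of the byte count and that the option value is in bytes.

```python
def default_max_memory(physical_bytes):
    """Assumed default limit: half of physical memory, in bytes."""
    return physical_bytes // 2


def to_gib(num_bytes):
    """Convert a byte count to GiB for easier reading."""
    return num_bytes / 2**30


def exceeds_default(requested_bytes, physical_bytes):
    """True if an explicit limit is larger than the assumed default."""
    return requested_bytes > default_max_memory(physical_bytes)


if __name__ == "__main__":
    requested = 60_000_000_000  # the value suggested above
    physical = 64 * 2**30       # illustrative machine with 64 GiB of RAM
    print(f"requested limit: {to_gib(requested):.1f} GiB")
    print(f"assumed default: {to_gib(default_max_memory(physical)):.1f} GiB")
```

Under these assumptions, the suggested value of 60000000000 is a little under 56 GiB, so it only raises the limit on machines with less than roughly 112 GiB of physical memory.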

@jazzl0ver
Author

Thanks, @silverbacknet . Unfortunately, that didn't help:

$ bash retdec-decompiler.sh --backend-llvmir2bir-converter new --max-memory 60000000000 myfile.exe
...
 -> running EmptyArrayToStringOptimizer ( 6727.27s )
 -> running BitOpToLogOpOptimizer ( 6727.27s )
Warning: out of memory; trying to recover
 -> running SimplifyArithmExprOptimizer ( 6728.46s )
Warning: out of memory; trying to recover
 -> running UnusedGlobalVarOptimizer ( 6730.04s )
Warning: out of memory; trying to recover
 -> running DeadLocalAssignOptimizer ( 6731.06s )
Warning: out of memory; trying to recover
 -> running SimpleCopyPropagationOptimizer ( 6744.21s )
Warning: out of memory; trying to recover
 -> running CopyPropagationOptimizer ( 6750.76s )
Warning: out of memory; trying to recover
 -> running SelfAssignOptimizer ( 6757.13s )
Warning: out of memory; trying to recover
 -> running VarDefForLoopOptimizer ( 6758.22s )
Warning: out of memory; trying to recover
 -> running VarDefStmtOptimizer ( 6759.33s )
Warning: out of memory; trying to recover
 -> running SimplifyArithmExprOptimizer ( 6761.19s )
Warning: out of memory; trying to recover
 -> running DeadCodeOptimizer ( 6762.44s )
Warning: out of memory; trying to recover
 -> running DerefToArrayIndexOptimizer ( 6763.54s )
Warning: out of memory; trying to recover
 -> running IfToSwitchOptimizer ( 6764.56s )
 -> running CCastOptimizer ( 6765.00s )
Warning: out of memory; trying to recover
 -> running CArrayArgOptimizer ( 6774.60s )
Warning: out of memory; trying to recover
Running phase: variable renaming [readable] ( 6775.69s )
LLVM ERROR: Could not acquire a cryptographic context: An internal error occurred.  (0x80090020)
Error: Decompilation of file '/c/test/retdec/bin/myfile.c.backend.bc' failed

Do you think I need more RAM?

@PeterMatula
Collaborator

I think you simply will not be able to decompile this file using RetDec at the moment. And even if you did, the result would not be very good. We need to improve RetDec before it can handle something like this in reasonable time and with reasonable quality.

@jazzl0ver
Author

I see :( Hope you will continue improving it.

@s3rvac
Member

s3rvac commented Feb 15, 2020

@megacoderencoderdecoder Because you spam our issue tracker and annoy people by unnecessarily tagging them.

@avast deleted three comments on Feb 15, 2020
@avast locked the conversation as spam and limited it to collaborators on Feb 15, 2020
9 participants
@silverbacknet @jazzl0ver @Convery @s3rvac @PeterMatula @MerovingianByte and others