Silently concretizing and symbolic input via s2ecmd #391

cedrickrause · 2020-02-28T09:07:31Z

Hi everyone,

I am currently trying to retrieve information about the system calls a particular binary (and its associated libraries) are invoking. I am currently currently starting the program with the S2E_SYM_ARGS variable in bootstrap.sh and this works fine in some cases. If I for example start ls with the following line:
S2E_SYM_ARGS="1" LD_PRELOAD="${S2E_SO}" ./${TARGET} -a > /dev/null 2> /dev/null it does what I expect and the TestCaseGenerator shows me all options for ls that consist of 1 dash and 1 letter. However when I increase the size of the input that shall be symbolized from 2 bytes to something longer then it will create all kinds of combinations for option flags which are all unique, but will contain i.e. (for 4 bytes) -sf<each other 1 letter flag> and also -sd<each other 1 letter flag> and that for each combination of flags so it really explodes very fast (quite unthinkable to doe it with 11 bytes to get something like the --full-time flag). I noticed that there is a lot of silently concretizing during that execution (much more than on 2 bytes) and I figured that might be the reason, that it concretizes the symbolic bytes too early so it does not collect info about the path constraints and will then traverse the same code many times (with also earlier concretized input). The reasons for that are always either memory access to concrete code or access to " + regName + " register from libcpu helper.
That connects to the other topic of this questions: symbolic values with symbwrite. It would be great for my project to be able to take symbolic input via stdin and from the documentation this should work with symbwrite. However in the same scenario (ls $(./s2ecmd symbwrite 2)) so 2 symbolic bytes this does not generate all the options flags which I would expect, but rather only creates the pairs (0x0, 0x0) and (0x0, 0x1) and then terminates. Checking the log again I see that right after inserting the symbolic input there are messages of concretization:

5 [Node 0/0 - State 0] BaseInstructions: Inserted symbolic data @0x1390c20 of size 0x2: buffer='\x00\x00' pc=0x401fdb
KLEE: WARNING: silently concretizing (reason: memory access from concrete code)  to value 0x0
5 [Node 0/0 - State 0] Forking state 0 at pc = 0x4573ab at pagedir = 0xdded000
     state 0
     state 1
KLEE: WARNING: silently concretizing (instruction: tcg-llvm-77-7ff51a9f5909:   call void @helper_s2e_tcg_execution_handler(i64 140396890988224, i64 140690690365710)) (reason: access to cc_op register from libcpu helper)  to value 0x0

I don't quite understand why that happens. I would understand concretization right before writing the output of ls so it can display 'real' values, but during that execution I don't quite understand why that happens so often.
This to me also looks like the simplest examples (like in the documentation described at: http://s2e.systems/docs/Tutorials/BasicLinuxSymbex/s2e.so.html#what-about-other-symbolic-input) does not work as expected, or am I missing something here?
I have been battling with those problems for some time now and would greatly appreciate if you can provide me some help.
If you need some further information, just let me know.

Thanks in advance!

The text was updated successfully, but these errors were encountered:

vitalych · 2020-02-28T10:07:49Z

Hi! Thanks for the report.

Regarding path explosion with strings, unfortunately, this is not avoidable. S2E does not support symbolic strings as a first class construct. It only supports arrays of symbolic bytes. That means that string functions (e.g., strlen, strcat...) will branch on every byte. Now to mitigate that, we do have symbolic function models in s2e.so [1]. That reminds me that we'd have to include them in the testsuite, not sure how well they work (I didn't write them ;)). You could give it a shot.
As for the spurious concretization, could you please attach the project to the issue? Also something you could do on your side is to put a breakpoint where that message is printed and get the call stack in gdb.

[1] http://s2e.systems/docs/Plugins/Linux/FunctionModels.html

cedrickrause · 2020-03-04T16:08:31Z

Hi again and thank you for your answer so far.
During trying the things you mentioned I definitely found one source of silent concretization in my own plugin so I removed that. That also caused the concretization right after insertion of symbolic data. What I found now though is that when I use the FunctionModels I still do get silent concretization, even when I don't use symbwrite, but S2E_SYM_ARGS. And always the next message in the debug stream is: FunctionModels: Handling strlen(

)
I see why that may be necessary as it should then create symbolic data again afterwards, right? However using this the result is that it will go through much more possible paths that are unnecessary (i.e. /).
So I wanted to debug it with gdb and I also rebuilt S2E with the --debug flag and started it in gdb, but whenever I run it inside gdb I get the following problem that I don't get when I run it outside of gdb:

qemu-system-x86_64: /home/user/s2e/source/s2e/klee/lib/Expr/BitfieldSimplifier.cpp:317: BitfieldSimplifier::ExprBitsInfo klee::BitfieldSimplifier::doSimplifyBits(const ref<klee::Expr> &, __uint128_t): Assertion `(rbits.knownOneBits & rbits.knownZeroBits) == 0' failed.

Thread 5 "qemu-system-x86" received signal SIGABRT, Aborted.
[Switching to Thread 0x155541129700 (LWP 26504)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

Do you have an idea why I would get that?

PS: How would I attach a project, exporting it will be in .tar.xz format but apparently that's not supported to be uploaded here.

vitalych · 2020-03-04T16:16:51Z

Regarding silent concretization, a stack trace would help.
I don't know about the gdb crash however, it will require investigating.
It may be gone once this is merged: S2E/klee#11

vitalych · 2020-03-09T14:00:29Z

To attach tar.xz, just make it a tar.xz.zip

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Silently concretizing and symbolic input via s2ecmd #391

Silently concretizing and symbolic input via s2ecmd #391

cedrickrause commented Feb 28, 2020

vitalych commented Feb 28, 2020 •

edited

Loading

cedrickrause commented Mar 4, 2020 •

edited

Loading

vitalych commented Mar 4, 2020

vitalych commented Mar 9, 2020

Silently concretizing and symbolic input via s2ecmd #391

Silently concretizing and symbolic input via s2ecmd #391

Comments

cedrickrause commented Feb 28, 2020

vitalych commented Feb 28, 2020 • edited Loading

cedrickrause commented Mar 4, 2020 • edited Loading

vitalych commented Mar 4, 2020

vitalych commented Mar 9, 2020

vitalych commented Feb 28, 2020 •

edited

Loading

cedrickrause commented Mar 4, 2020 •

edited

Loading