early and comprehensive injection on Linux #47
Comments
From derek.br...@gmail.com on February 24, 2009 08:28:33 this was PR 204554 |
From qin.zhao@gmail.com on March 23, 2009 09:42:21 If we want early injection on Linux, the vdso might not be present at init |
From derek.br...@gmail.com on March 23, 2009 09:44:43 expanding on prev comment: xref issue #89 where vdso is now inside ld.so |
From derek.br...@gmail.com on April 02, 2009 13:01:52 being loaded before libc should also let us preempt libc global symbols to provide |
From bruen...@google.com on November 23, 2011 07:00:09 xref a user hitting a conflict with DR's preferred base when running firefox |
From rnk@google.com on April 22, 2012 17:12:33 I started using ptrace to accomplish this over the weekend because I thought it would be fun. I was right! =D We should talk about some of the design points in our Monday meeting. Here's sort of a summary of what I'm doing. I have a new drdeploy_inject binary on Linux (to be renamed later when it's the default injection method) that implements parts of the dr_inject.h API using a typical fork, PTRACE_TRACEME, exec(app, ...) pattern. Currently I'm just linking this code into libdynamorio.so and initializing DR as a standalone library, since that's easier. drdeploy_inject just links against libdynamorio.so and implements the typical dr_inject_* API call pattern. When we cross execve, we're at _start. At this point, I want to perform a bunch of syscalls to set up DR in the injectee's address space. I do this by generating PIC ilists in the injector, encoding them, and using ptrace to copy them over the injectee's current code. I terminate it with an int3, and set it running. When I hit the int3, I copy xax back to the injector, and put back the registers and memory to the original state. At first, I was just doing two raw syscalls:
After that, I can set XIP to (fn_ptr - get_dynamorio_dll_base()) + dr_base and let it execute in the injectee, without going through the syscall ilist generation and injection. I was hoping that would let me run enough code to initialize DynamoRIO and call into the private loader to finish the job, but it wasn't enough. Currently I am adding indirection to the Linux (only Linux!) private loader to hook the mmap/munmap/mprotect syscalls it issues and translate them into injected syscalls. This has worked fairly well and I can execute more code, but anything depending on libc still doesn't work. In particular, std*->_fileno in STDOUT broke, so I replaced it with STD*_FILENO from unistd.h, which has the typical in->0, out->1, err->2 mapping. I still can't get dynamorio_app_init() to succeed, so I'll look into it and possibly round out more private loader support for injection. Based on this work, it should be easy to solve external and internal attaching on Linux; it's just a question of where and how we ptrace. Status: Started |
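For readers following along, here is a minimal sketch of the fork, PTRACE_TRACEME, exec pattern and of the address translation mentioned above. It is not the drdeploy_inject code: `launch_stopped_at_exec` and `translate_to_injectee` are hypothetical names, and the sketch only gets as far as a child that is stopped right after execve. Copying code into the injectee and running injected syscalls is left out.

```c
#include <signal.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, request tracing in the child, then exec the app.
 * Returns the pid of a child stopped by the post-execve SIGTRAP, or -1. */
static pid_t launch_stopped_at_exec(const char *path, char *const argv[],
                                    char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced by the parent, then exec the target.
         * A traced process receives SIGTRAP once execve succeeds. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        execve(path, argv, envp);
        _exit(127); /* only reached if execve failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
        return pid; /* stopped in the new image; nothing has run yet */
    if (WIFSTOPPED(status)) {
        /* Stopped for some other reason: give up and reap the child. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
    }
    return -1;
}

/* Hypothetical helper for the (fn_ptr - local base) + remote base step:
 * maps an address inside the injector's copy of the library to the
 * corresponding address in the injectee's copy. */
static uintptr_t translate_to_injectee(uintptr_t fn_ptr, uintptr_t local_base,
                                       uintptr_t remote_base)
{
    return (fn_ptr - local_base) + remote_base;
}
```

A caller would use ptrace requests such as PTRACE_POKETEXT and PTRACE_CONT on the returned pid, and must eventually detach from or kill the child.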
From bruen...@google.com on April 23, 2012 07:23:48 first we need to decide whether we ever want to rely on ptrace. if so, the first use of ptrace should be issue #37 where we use ptrace plus the regular loader and don't need to implement early injection just yet which requires processing imports w/o using any imports. there are other ways to do that that we should consider. we'll discuss offline. |
From bruen...@google.com on July 11, 2012 17:20:49 we would like to support running DR on static executables, which the injector this case represents would solve |
From bruen...@google.com on July 11, 2012 17:22:57 xref issue #840 |
From zhao...@google.com on July 11, 2012 17:34:30 Owner: zhao...@google.com |
From bruen...@google.com on July 12, 2012 07:37:12 changed title to take into account that this is no longer a tweak to be earlier (via -z initfirst, though note that only the final LD_PRELOAD lib w/ that flag will be the first) but represents creating an injection method that works on static executables as well as taking over from the first app instruction. Summary: early and comprehensive injection on Linux |
From bruen...@google.com on July 13, 2012 06:37:46 for transparency in the application name that's displayed by various tools:
% ./original arg1 arg2
% cat /proc/
% ps ax | grep tmp
% pgrep original
So pgrep is getting the real name from somewhere, but the regular ps
The original name is in /proc/self/comm and /proc/self/status and others:
/proc/14787/comm /proc/14787/environ /proc/14787/maps /proc/14787/numa_maps /proc/14787/sched /proc/14787/smaps /proc/14787/stat /proc/14787/status
|
From bruen...@google.com on July 13, 2012 06:38:50
|
From bruen...@google.com on July 13, 2012 06:44:59 *** TODO proposal for transparency
The one thing we will not easily change is the full path in |
From rnk@google.com on July 13, 2012 07:01:58 I tried changing the "_" environment variable, and that was not reflected in /proc/pid/status either. I thought maybe the kernel was getting it from there instead of argv[0]. It must store it somewhere. |
From rnk@google.com on July 13, 2012 07:06:32 Oh, hello: http://www.kernel.org/doc/man-pages/online/pages/man2/prctl.2.html Look at PR_SET_NAME. Available since Linux 2.6.9. I bet that does what we want. |
From bruen...@google.com on July 13, 2012 08:48:09 that just sets a name, not a path, so it doesn't seem like it solves the remaining issue of the mapped path |
From rnk@google.com on July 13, 2012 09:38:36 I'm OK with libdynamorio.so showing up in /proc/pid/maps, because most people use that for debugging. I just want to make sure that "killall myapp" still works. Not sure if killall uses the kernel name or the argv[0] name. |
From bruen...@google.com on July 13, 2012 10:00:21 PR_SET_NAME can replace step 1) in comment 14. the next 2 steps are still required, and the paths in the maps remain. for PR_SET_NAME xref drmem's -prctl_whitelist option |
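A small sketch of the PR_SET_NAME idea discussed above, assuming Linux and glibc. The helper names are made up for illustration. As noted in the thread, this changes only the short kernel name (what /proc/self/comm reports for the main thread), not any mapped path.

```c
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical helper: set the kernel's short name for the calling thread.
 * The kernel keeps at most 15 characters plus a terminating NUL.
 * Returns 0 on success. */
static int set_comm_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long)name, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: read the name back with PR_GET_NAME and compare it
 * against the first 15 characters of `expected`. Returns 1 on a match. */
static int comm_name_matches(const char *expected)
{
    char buf[16] = { 0 };
    if (prctl(PR_GET_NAME, (unsigned long)buf, 0UL, 0UL, 0UL) != 0)
        return 0;
    return strncmp(buf, expected, 15) == 0;
}
```

Calling `set_comm_name("original")` early in startup would be one way to cover step 1; whether tools like killall then match is something to verify separately.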
From rnk@google.com on July 15, 2012 12:10:35 Qin, how far along are you on this? This was something I looked into using ptrace for a bit in April, but then we backed off because we didn't want to use ptrace. I think this is a fun project, so we both want to do it. :) OTOH, it would be silly to duplicate efforts. I spent a few hours looking into the approach I outlined in my email:
To "finish" libc independence I used the stdio isolation patch I sent out and ifdef'd out the rest of our libc usage (it's mostly in stackdump.c, the private loader, and the environment). I added "-static -e _start" to our link flags and took out -lc -lm -ldl. This gets rid of the PT_INTERP phdr and sets e_entry to point at _start. I defined _start in x86.asm to mov xsp into ARG1 and call a C routine. At this point, I could dump argc, argv, envp as expected, but the PLT and GOT have not been set up, so I could not make any references to exported functions or globals. Global data exports were actually really painful, because our_std* is exported. A lot of our functions have internal and external names, but some of them like the reg_* helpers do not. IMO it would be best if we could find some linker flag to avoid the GOT for internal references to exported data, but I couldn't find one. I thought -Bsymbolic was supposed to do this for functions, but it seems to still rely on the GOT/PLT. |
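To illustrate the "mov xsp into ARG1 and call a C routine" step, here is a sketch of what such a routine could do with the stack pointer it receives. It assumes the usual Linux process-entry layout (argc, then argv[], a NULL, then envp[], a NULL, then the aux vector). The type and function names are hypothetical, not DR's.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the initial process stack. */
typedef struct {
    long argc;
    char **argv;
    char **envp;
    void *auxv; /* first aux vector entry, right after envp's NULL */
} initial_stack_t;

/* sp_at_entry is the stack pointer value seen at _start. Every slot is
 * pointer-sized; the first one holds argc rather than a pointer. */
static void parse_initial_stack(void *sp_at_entry, initial_stack_t *out)
{
    char **slot = (char **)sp_at_entry;
    out->argc = (long)(intptr_t)slot[0];
    out->argv = slot + 1;
    /* envp starts after argv's NULL terminator. */
    out->envp = out->argv + out->argc + 1;
    /* The aux vector starts after envp's NULL terminator. */
    char **p = out->envp;
    while (*p != NULL)
        p++;
    out->auxv = (void *)(p + 1);
}
```

Nothing here needs the PLT or GOT, so it can run before any relocation has happened, provided the routine itself is reached through a direct call.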
From rnk@google.com on July 15, 2012 12:16:41 I forgot to mention: If we can't find such a flag, that means we'll probably need a very specialized bootstrapping loader, like the one for Windows. |
From zhao...@google.com on July 15, 2012 21:33:15 A few more notes
my suggestion would be: By doing so, the app becomes arg1; then we can change argc and either change argv or shift argv on the stack, whichever is simpler. For this issue, I think a simple drinjector would be simpler than converting libdynamorio.so into lib/exec, which is more for attach. |
From rnk@google.com on July 16, 2012 06:04:24 Last night I got to the point where libdynamorio.so could be used both as a lib and an exe, so that's more or less already done. I got our loader to relocate DR, but then I didn't need it! I tweaked the ld flags a bit and all the GOT references ended up getting resolved internally. I called dynamorio_app_init() from start and it mostly works, except for the parts that try to interpret libc's TLS segment, which makes sense. w.r.t. the app name and command line, the execve syscall takes a filename and a full argv[] array. I wonder what happens if we pass the filename of libdynamorio.so, and the original argv[]? Maybe that will fix the command line in a simpler way. We can write a simple exe in C that just calls execve("libdynamorio.so", argv, envp) to test that. |
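A sketch of the "simple exe in C" experiment, under the assumption that all we want to see is that the filename given to execve and argv[0] are independent. `run_with_argv` is a hypothetical helper; it forks first so the caller survives and can inspect the exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: exec `file` in a child with an argv whose argv[0]
 * need not match `file`. Returns the child's exit status, or -1 on error. */
static int run_with_argv(const char *file, char *const argv[],
                         char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The kernel loads `file`; argv is passed through untouched. */
        execve(file, argv, envp);
        _exit(127); /* only reached if execve failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For the real experiment, `file` would be the path to libdynamorio.so and argv the application's original argv. Note that multi-call binaries such as busybox dispatch on argv[0], so results with a renamed argv[0] depend on the target.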
From bruen...@google.com on July 16, 2012 06:33:23 We should also think about MacOS and other *nix ports: a more general injector that uses fewer Linux-specific or even ELF-specific features will save effort later. Although in some cases things are just too different to share. |
From rnk@google.com on August 10, 2012 12:50:10 When launching a process under DR with early injection from gdb, I found that it disables ASLR. This actually has weird consequences for ET_DYN ELFs, and DR gets mapped at 0x555555554000. Things crash soon afterwards, because the kernel does not perform relocations on the GOT, which is filled out by default to assume we were loaded at our preferred base. If you always debug using attach, then this won't happen, but if you start the app under gdb this will come up. The solution is to run 'set disable-randomization off'. When ELF early injection is more prevalent, this should go into our .gdbinit file and HowToDebug. Here's the kernel source code that decides where to load ELFs:
|
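A sketch of how a process could detect the situation described above at runtime. It assumes that the debugger disables randomization through the ADDR_NO_RANDOMIZE personality flag, which is my understanding of how gdb does it. `aslr_disabled` is a hypothetical name.

```c
#include <sys/personality.h>

/* Hypothetical helper: returns 1 if ADDR_NO_RANDOMIZE is set for this
 * process, 0 if it is not, and -1 if the query failed. */
static int aslr_disabled(void)
{
    /* Passing 0xffffffff queries the current persona without changing it. */
    int persona = personality(0xffffffff);
    if (persona == -1)
        return -1;
    return (persona & ADDR_NO_RANDOMIZE) ? 1 : 0;
}
```

An early-injected DR could use a check like this to print a hint about 'set disable-randomization off' instead of crashing later.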
From bruen...@google.com on September 05, 2012 22:49:41 reminder to implement control maintenance across execve (and augment suite to detect, ideally) |
From rnk@google.com on September 11, 2012 14:54:18 Another proc file that we haven't mentioned yet: /proc/self/exe This is a symlink to the filename passed to exec. I don't think there are any good ways to fake this. We may need to have some special casing around syscalls. For example, I believe Chrome calls exec on /proc/self/exe. Might be worth splitting this and early follow children as separate issues. |
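For reference, a minimal sketch of how an application reads /proc/self/exe, which is the kind of access that would need special casing. The helper name is hypothetical and a mounted /proc is assumed.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resolve /proc/self/exe into buf and NUL-terminate it.
 * Returns the path length, or -1 on error. readlink does not add a NUL,
 * so one byte of the buffer is reserved for it. */
static ssize_t read_self_exe(char *buf, size_t size)
{
    if (size == 0)
        return -1;
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

With the injection scheme discussed here, the result would presumably name the file passed to exec rather than the application, which is the transparency problem.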
From zhao...@google.com on October 25, 2012 07:56:24 Owner: rnk@google.com |
From zhao...@google.com on December 28, 2013 02:48:23 I just investigated the possibilities of using -z initfirst. Dynamic section at offset 0x3e38 contains 23 entries: It won't work with a normal program, because the loader only honors one library for initfirst. If more than one library has the initfirst flag, the last one will be remembered and called first on initialization. $ readelf -d /lib/x86_64-linux-gnu/libpthread-2.17.so Dynamic section at offset 0x17d50 contains 31 entries: Later, in call_init, pthread's init will be called first, and no other libraries' initfirst flags are remembered. Owner: zhao...@google.com |
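As a programmatic counterpart to the readelf check, here is a sketch that scans a dynamic section for the flag that -z initfirst sets (DF_1_INITFIRST in the DT_FLAGS_1 entry). The function name is hypothetical, and locating the dynamic section (for example from a PT_DYNAMIC program header) is left to the caller.

```c
#include <elf.h>
#include <link.h>

/* Hypothetical helper: walk a DT_NULL-terminated dynamic array and report
 * whether DT_FLAGS_1 carries DF_1_INITFIRST. Returns 1 if so, 0 otherwise. */
static int dynamic_has_initfirst(const ElfW(Dyn) *dyn)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_FLAGS_1 &&
            (dyn->d_un.d_val & DF_1_INITFIRST) != 0)
            return 1;
    }
    return 0;
}
```

Running this over every loaded object would show which libraries compete for the single initfirst slot the loader remembers.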
From bruen...@google.com on December 28, 2013 07:07:31 For the record, this is in comment pthread having initfirst must be new, as it was not the case when we were investigating this in the past. |
From derek.br...@gmail.com on February 24, 2009 11:25:45
for LD_PRELOAD we'll start by using -z initfirst. for that we need libc
independence ( issue #48 /PR 206369) and to directly read our env vars off the
stack.
we should also directly read the elf aux vector (PR 289138).
xref issue #37 /PR 248204: ptrace injection
Original issue: http://code.google.com/p/dynamorio/issues/detail?id=47
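A sketch of reading env vars and the ELF aux vector directly, without libc, as proposed in the description. It assumes the caller already has envp and the aux vector pointer from the initial stack. All names are hypothetical, and `aux_entry_t` is a local stand-in for the native auxv entry (a type and a value, each one machine word).

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical libc-free getenv: returns a pointer to the value of `name`
 * inside envp, or NULL if no "name=value" entry exists. */
static const char *stack_getenv(char **envp, const char *name)
{
    for (; *envp != NULL; envp++) {
        const char *e = *envp;
        const char *n = name;
        while (*n != '\0' && *e == *n) {
            e++;
            n++;
        }
        /* Match only if the whole name was consumed and '=' follows. */
        if (*n == '\0' && *e == '=')
            return e + 1;
    }
    return NULL;
}

/* Local stand-in for one aux vector entry. */
typedef struct {
    unsigned long a_type;
    unsigned long a_val;
} aux_entry_t;

/* Hypothetical lookup: scans until AT_NULL. Returns 1 and stores the value
 * if `type` is present, 0 otherwise. */
static int auxv_lookup(const aux_entry_t *auxv, unsigned long type,
                       unsigned long *val)
{
    for (; auxv->a_type != AT_NULL; auxv++) {
        if (auxv->a_type == type) {
            *val = auxv->a_val;
            return 1;
        }
    }
    return 0;
}
```

Looking up AT_SYSINFO_EHDR this way gives the vDSO base when the kernel provides one, which relates to the vdso comments in this thread.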