Want to know how to use wrapper and ld-preload together. #609

Open
kenluobo opened this issue Nov 30, 2024 · 8 comments

Comments

@kenluobo

kenluobo commented Nov 30, 2024

How does the wrapper call ld-preload, and how does ld-preload call the wrapper?

Environment:
OS: Ubuntu 20.04
Architecture: x86-64 (AMD)
Bear: 3.1.5 debug
Build system: bitbake
Project URL: https://gitee.com/jiangwei0512/poky.git
Build command: cd poky && source oe-init-build-env && bitbake go-helloworld

Additional context
Hello, I was recently reading the code of the Bear project. When I got to the wrapper and ld-preload, there was something I didn't quite understand.
For the bitbake project, when I pass either --force-wrapper or --force-preload, no compilation commands are collected.
With the default settings, however, compilation commands are collected. I want to know how the wrapper and ld-preload work together.
I also noticed that the wrapper is a dynamically linked executable, so it is itself affected by ld-preload at run time, yet this does not cause a circular call.
I want to know how the wrapper calls ld-preload and how ld-preload calls the wrapper, so that all compile commands can be intercepted.

I also asked this in the Gitter chat room, but earlier questions from other people there seem to have gone unanswered, so I am raising it here as well.

@rizsotto
Owner

https://github.com/rizsotto/Bear/wiki/Troubleshooting#how-it-works

I've written this wiki page to explain the internals. I hope it answers the questions you had.

@kenluobo
Author

kenluobo commented Nov 30, 2024

I wrote a demo with three parts: intercept, libhook.so and wrapper.

  1. intercept runs the build command, e.g. intercept gcc -c -o test.o test.c
  • set LD_PRELOAD
  • std::system("${build-cmd}")
  2. libhook.so rewrites the call so that wrapper runs ${build-cmd}, e.g. wrapper gcc -c -o test.o test.c
  3. wrapper runs ${build-cmd}; wrapper is a static executable (it loads no dynamic libraries)
  • set CC, CXX, CPP...; before: CC=gcc, after: CC=wrapper gcc
  • set PATH, inserting my-gcc-symbol-link-dir at the front of PATH
  • insert or merge LD_PRELOAD
  • record the compile commands
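
For readers following along, the environment steps of the wrapper in the list above could look roughly like this. It is only a sketch of the demo as described, not Bear's code: the helper names (`wrap_compiler`, `prepend_path`, `merge_preload`, `prepare_environment`) are hypothetical, and falling back to `cc` when `CC` is unset is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: turn CC=gcc into CC="wrapper gcc".
std::string wrap_compiler(const std::string& wrapper, const std::string& compiler) {
    return wrapper + " " + compiler;
}

// Hypothetical helper: put `dir` at the front of a PATH-style value.
std::string prepend_path(const std::string& dir, const std::string& path) {
    return path.empty() ? dir : dir + ":" + path;
}

// Hypothetical helper: make sure `lib` is listed in an LD_PRELOAD-style value,
// keeping any libraries that were already requested. Entries may be
// separated by ':' or ' '.
std::string merge_preload(const std::string& lib, const std::string& current) {
    if (current.empty()) return lib;
    std::size_t start = 0;
    while (start <= current.size()) {
        std::size_t end = current.find_first_of(": ", start);
        if (end == std::string::npos) end = current.size();
        if (current.compare(start, end - start, lib) == 0) return current;  // already present
        start = end + 1;
    }
    return lib + ":" + current;
}

// Apply the three steps to the current process, so that children inherit them.
// Assumption: "cc" is used when CC is not set.
void prepare_environment(const std::string& wrapper,
                         const std::string& link_dir,
                         const std::string& hook_lib) {
    const char* cc = std::getenv("CC");
    const char* path = std::getenv("PATH");
    const char* preload = std::getenv("LD_PRELOAD");
    setenv("CC", wrap_compiler(wrapper, cc ? cc : "cc").c_str(), 1);
    setenv("PATH", prepend_path(link_dir, path ? path : "").c_str(), 1);
    setenv("LD_PRELOAD", merge_preload(hook_lib, preload ? preload : "").c_str(), 1);
}
```

Note that all of this only affects processes that inherit the environment. A child that is started with a fresh environment sees none of these variables.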

I don't know why my demo does not behave like Bear. It fails in the bitbake build system. I found that one subprocess cleans the environment, so LD_PRELOAD is not passed on to the next process.

@rizsotto
Owner

Would it be possible to run Bear in verbose mode to see in the logs what it does? (Maybe against a smaller project with a few files.)

Also, the library preload mode does have limitations: it does not work with statically linked compilers; it does not work if the build cleans up the environment variables; it might not work if the build manipulates the LD_PRELOAD variable; and it does not work if the compiler calls happen in a container (Bazel does that).

@kenluobo
Author

kenluobo commented Nov 30, 2024

Sorry, when --verbose is specified, bitbake may get interrupted, apparently because of the output written to standard output or standard error.
I am confused about two things:
① bitbake calls bitbake-server, and bitbake-server clears all environment variables, including ld-preload. Surprisingly, strace shows that bitbake-server and all subsequent child processes were still started with ld-preload set, and the subprocesses were called through the wrapper.
② The wrapper is dynamically linked. Yet when it executes posix_spawn, ld-preload does not intercept it again, which would otherwise result in a circular call.

@kenluobo
Author

kenluobo commented Dec 1, 2024

The intercept demo I wrote sets up ld-preload, and ld-preload then captures sube (the subprocess being executed) by hooking the execve function. ld-preload runs wrapper sube, using the statically linked wrapper to record sube's information. wrapper starts sube with posix_spawn and passes ld-preload on to sube. Next, sube calls sub-next-e (the child of sube), but before doing so sube cleans up ld-preload. Since ld-preload has been removed, the monitoring chain is broken.
But I found that Bear can still continue to monitor and pass ld-preload on. This is amazing, but even after reading the code of wrapper-session, preload-session and the wrapper main in the intercept module, I still don't understand why this happens. 🤔
If you can guess the reason, I would be grateful for an answer. 😄

@rizsotto
Owner

rizsotto commented Dec 1, 2024

About how the wrapper is not statically linked and still is not called through the pre-loaded exec method: the process spawn code has a version which looks up the libc symbol directly.
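
As an illustration of looking up the libc symbol directly, here is a minimal sketch, not taken from Bear's source. It assumes glibc on Linux, where the C library's soname is `libc.so.6`; the names `spawn_fn`, `resolve_libc_spawn` and `spawn_and_wait` are made up for this example.

```cpp
#include <dlfcn.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern "C" char** environ;

// Signature of posix_spawn, used to call the resolved symbol.
using spawn_fn = int (*)(pid_t*, const char*,
                         const posix_spawn_file_actions_t*,
                         const posix_spawnattr_t*,
                         char* const[], char* const[]);

// Look up posix_spawn inside libc itself. Asking the libc handle, instead of
// relying on normal symbol resolution, skips a definition with the same name
// that an LD_PRELOAD library placed earlier in the lookup order.
// Assumption: the C library can be opened as "libc.so.6" (glibc).
spawn_fn resolve_libc_spawn() {
    void* libc = dlopen("libc.so.6", RTLD_LAZY);
    if (libc == nullptr) return nullptr;
    return reinterpret_cast<spawn_fn>(dlsym(libc, "posix_spawn"));
}

// Start `path` with `argv` through the resolved function, wait for it, and
// return its exit status (or -1 on any failure).
int spawn_and_wait(const char* path, char* const argv[]) {
    spawn_fn real_spawn = resolve_libc_spawn();
    if (real_spawn == nullptr) return -1;
    pid_t pid = 0;
    if (real_spawn(&pid, path, nullptr, nullptr, argv, environ) != 0) return -1;
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The call itself goes straight to libc, so a preloaded hook for the same function in the calling process is not entered again. The child still inherits `environ`, so LD_PRELOAD stays in effect for whatever the child executes next. On glibc versions before 2.34 this needs to be linked with `-ldl`.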

About bitbake having a server and a client: I am not familiar with that tool, but I have seen Gradle, which has a similar concept. Such tools do not work together with Bear. The reason is that the server process which executes the build does not receive the environment variables needed for the reporting. And if it does receive them, it caches them, and subsequent runs have a different address to report the executions to, so those reports will be lost. If there is an option to force the build to run as a child process and not under a server, that might have a chance to work.

The current version of Bear uses a supervisor process. The supervisor decides which intercept mode will be used, then sets up the environment and executes the build process. During the build the wrapper executable is called. (This is the case for both the pre-load and the compiler interpose mode.) The wrapper process connects to the supervisor and asks how to execute the received command. (See the proto file.) The answer says which process to execute and which environment variables shall be set. That guarantees the child processes will be intercepted too.
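
To make the question-and-answer step between wrapper and supervisor more concrete, here is a small in-process sketch. It does not reproduce Bear's proto file or its transport. The types `Request`, `Answer` and `Session`, the function `resolve`, and the variable name `REPORT_ADDRESS` are all hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// What a wrapper would send: the command it was asked to run and the
// environment it was started with (hypothetical shape).
struct Request {
    std::vector<std::string> argv;
    std::map<std::string, std::string> env;
};

// What the supervisor would answer: which program to execute and with
// which environment (hypothetical shape).
struct Answer {
    std::string executable;
    std::vector<std::string> argv;
    std::map<std::string, std::string> env;
};

// Supervisor-side state: the variables every intercepted child must carry,
// e.g. LD_PRELOAD and the address to report to (hypothetical shape).
struct Session {
    std::map<std::string, std::string> required_env;
};

// Decide how the wrapper should run the command: keep the command as it is,
// and overwrite or re-add every variable the interception depends on.
Answer resolve(const Session& session, const Request& request) {
    Answer answer;
    answer.executable = request.argv.empty() ? "" : request.argv.front();
    answer.argv = request.argv;
    answer.env = request.env;
    for (const auto& entry : session.required_env) {
        answer.env[entry.first] = entry.second;
    }
    return answer;
}
```

The point of the sketch is that the environment of the next process is decided by the supervisor's answer and not only by what the wrapper happened to inherit. It only helps for processes that are started through the wrapper in the first place.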

@kenluobo
Author

kenluobo commented Dec 1, 2024

Thank you very much for your answer🌹. I understand now that posix_spawn in the wrapper uses the original libc symbol, which avoids the circular call problem. 😄
There is one more point: sube clears the ld-preload variable and uses execve to call sub-next-e. How can the wrapper and supervisor continue to monitor the sub-next-e process? 🤔

@rizsotto
Owner

rizsotto commented Dec 1, 2024

If sube clears the environment variables (including LD_PRELOAD), then that's it. I am not aware of a way to work around that. Both the pre-load and the compiler wrapper intercept modes have this "blind spot" and are not able to intercept those child processes.

Alternatively, ptrace can be a method to intercept these process executions. Bear has not implemented this. (It might in the next version.) It also has strengths and weaknesses. (It does not work within Docker, and it makes the execution slow.)
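
The blind spot can be reproduced with a few lines. This is a sketch with made-up helper names (`run_shell`, `child_sees`), and it assumes `/bin/sh` exists. A child started with an explicit environment only sees what that environment contains, so a variable such as LD_PRELOAD is gone as soon as a parent leaves it out.

```cpp
#include <spawn.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>

// Run `sh -c script` with exactly the environment given in `envp` and
// return the exit status of the shell (or -1 on any failure).
int run_shell(const char* script, char* const envp[]) {
    char* const argv[] = {const_cast<char*>("sh"), const_cast<char*>("-c"),
                          const_cast<char*>(script), nullptr};
    pid_t pid = 0;
    if (posix_spawn(&pid, "/bin/sh", nullptr, nullptr, argv, envp) != 0) return -1;
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// True if a child started with `envp` sees a non-empty variable `name`
// (for example LD_PRELOAD).
bool child_sees(const char* name, char* const envp[]) {
    const std::string script = std::string("test -n \"${") + name + "}\"";
    return run_shell(script.c_str(), envp) == 0;
}
```

Calling `child_sees("LD_PRELOAD", envp)` with an `envp` that contains only a terminating null pointer models a sube that wipes the environment before calling sub-next-e: the dynamic loader of the new process has nothing to preload, so no hook runs in it.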
