Interpose by means of a filter #1027

Closed
iapaddler opened this issue Jul 7, 2022 · 2 comments

@iapaddler

The goal is to remove the need for a user to manually preload, run ldscope or attach. This could be done automatically by means of a list of processes to interpose and possibly a list of processes not to interpose.

Contextually there are two paths: preload when a process starts and attach after a process has started. It is likely that both are needed at some level.

Initial research efforts will focus on the ability to interpose any and all processes and apply a filter as provided. The ability to attach to a process after it has started is fairly well understood. There are numerous paths by which this can be accomplished.

@iapaddler

Configured an Alpine container to preload libscope for every process started after login.

docker run --name musl --privileged -v /home/ubuntu/appscope/:/opt/appscope -it alpine:latest ash
docker exec -it musl ash -l

The "-l" is needed because it causes /etc/profile to be executed. It is not needed on Ubuntu.

Added to /etc/profile:

export LD_LIBRARY_PATH=/tmp/libscope-1.1.1/
export LD_PRELOAD=/opt/appscope/lib/linux/x86_64/libscope.so

We could document this as a way to enable preload of everything in a Docker musl libc environment. There are no init.d scripts defined in this container. We probably need to create an Alpine VM in order to investigate OpenRC init in Alpine.

@iapaddler

iapaddler commented Jul 8, 2022

A design approach:

We want to start from the position of giving preference to preload on process start over attach after process start. When attaching after process start we lose visibility of startup and incur a number of limitations. Delineating these is probably better done in a different communication.

Precedence to preload implies that preload from the loader is used only to load libscope.so. Currently the loader effectively performs loading of libscope.so and interposition of functions because libscope.so is loaded before dependent libs. That behavior will need to change. We currently support the ability to interpose by means of GOT hooking. That is used in an attach operation. It is not a new capability. We will need to disable interposition from the loader and adopt a GOT hooking mechanism for most interpositions.

This shift in interposition approach enables the ability to load libscope.so and choose, based on a user supplied filter, which process will be interposed.

The pathname of the process to be started, as supplied to an exec function, will be compared to a list of processes that should be scoped. There are suggestions that both an enable list and a deny list should be provided. At this point we are not addressing specifics of the config. Suffice it to say that a list of processes is available to libscope. We expect this list to be provided by Edge. It has yet to be determined what default behavior to take if a process list is not available. Coming soon.
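As a minimal sketch of the comparison described above, the following C function checks an exec pathname against an enable list and a deny list. Everything here is an assumption made for illustration: the names `should_scope` and `list_matches`, the rule that an entry containing a `/` is compared as a full path while a bare name is compared against the basename, the rule that the deny list wins, and the choice of "not scoped" when nothing matches. The actual config format and default behavior have not been decided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return the final path component ("/usr/sbin/cron" -> "cron"). */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* True if exec_path matches any entry. Entries containing '/' are compared
 * as full paths; bare names are compared against the basename (assumed rule). */
static bool list_matches(const char *exec_path,
                         const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const char *entry = list[i];
        const char *subject =
            strchr(entry, '/') ? exec_path : base_name(exec_path);
        if (strcmp(entry, subject) == 0)
            return true;
    }
    return false;
}

/* Hypothetical filter: deny wins over allow; no match means "not scoped". */
bool should_scope(const char *exec_path,
                  const char *const *allow, size_t n_allow,
                  const char *const *deny, size_t n_deny)
{
    assert(exec_path != NULL);
    if (list_matches(exec_path, deny, n_deny))
        return false;
    return list_matches(exec_path, allow, n_allow);
}
```

With an allow list of `nginx` and `/usr/bin/curl`, this sketch would scope `/usr/sbin/nginx` and `/usr/bin/curl` but not `/opt/bin/curl`, since the full-path entry only matches that exact path.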

At lib constructor time, before the app starts, the executable path is compared to a list and if the process to be started matches, then all the existing functions defined by libscope will be interposed, connections established and events emitted. This reflects current libscope behavior.

It is expected that Edge will provide a default config in a path that libscope checks. That is beyond the scope of this design discussion at this point. The AppScope team will provide Edge with the definition of an appropriate default config.

At lib constructor time, if the app does not match an entry in the process list, functions will not be interposed. The use of GOT hooking makes this possible. In this case libscope will interpose only execve (and possibly execv as needed, with the same behavior). This enables tracking of child processes created by a process that is not scoped. No connections for events, metrics or logs will be made, no data is emitted and no activity is tracked. The library is loaded, but nothing except execve is interposed or otherwise interacted with. This supports the scenario where a process does not match a process list entry but a child process does.
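The constructor-time decision could be sketched as a pure function that maps the executable path to one of two modes. This is only an illustration: `preload_mode`, `choose_mode`, `hooks_to_install`, `opens_connections` and the `ALL_HOOKS` table are hypothetical names, the hook table is a small stand-in rather than libscope's real set of interposed functions, and exact string comparison stands in for whatever matching rule the config ends up defining.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    MODE_EXEC_ONLY, /* preloaded, not scoped: only execve is hooked */
    MODE_SCOPED     /* preloaded and scoped: every hook is installed */
} preload_mode;

/* Hypothetical hook table; execve is deliberately first. */
static const char *const ALL_HOOKS[] = {
    "execve", "open", "read", "write", "connect"
};
#define N_ALL_HOOKS (sizeof(ALL_HOOKS) / sizeof(ALL_HOOKS[0]))

/* Decide the mode from the executable path and the process list. */
preload_mode choose_mode(const char *exe_path,
                         const char *const *process_list, size_t n)
{
    assert(exe_path != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(exe_path, process_list[i]) == 0)
            return MODE_SCOPED;
    }
    return MODE_EXEC_ONLY;
}

/* Number of leading ALL_HOOKS entries to install via GOT hooking. */
size_t hooks_to_install(preload_mode mode)
{
    return mode == MODE_SCOPED ? N_ALL_HOOKS : 1;
}

/* Connections for events, metrics and logs exist only when scoped. */
bool opens_connections(preload_mode mode)
{
    return mode == MODE_SCOPED;
}
```

In a shared library, a function like `choose_mode` could be called from a routine marked `__attribute__((constructor))`, with the executable path obtained on Linux from, for example, `readlink("/proc/self/exe", ...)`. Whether libscope does it that way is not stated here.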

Note that a distinction is now being made between preloading and interposing or, in colloquial terms, scoping. A process can be preloaded and not scoped. Where a process list entry matches, a process can be preloaded and scoped.

At a high level, there are two categories to be addressed in order to preload potential processes defined in a process list: 1) services and 2) interactive apps. Services are those processes started when the OS boots and controlled by a startup mechanism such as systemd or initd. Interactive processes are those started by a user who has logged in to a system and is executing commands.

There is existing capability in the scope CLI, the service command, to enable systemd and initd services to be scoped. A new CLI command will be added to accept and parse a process list. The process list entries will be compared to services defined on the host. Where there is a match, for example /usr/lib/systemd/system/cron.service matches a process entry for cron, the service will be scoped by means of the CLI service capability. As part of the new CLI command the user will be presented with the option to restart a newly scoped service, attach to an existing service or take no action at that point. The new CLI command will require root perms. It is expected that Edge will execute the new CLI command when it provides a process list.
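To make the service-matching step concrete, here is a small sketch of how a systemd unit path might be compared with a process list entry, using the cron example above. The function name `unit_matches_entry` is hypothetical, and the rule (the unit file name must be exactly the entry followed by `.service`) is an assumption. It says nothing about how the scope CLI itself is or will be implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if unit_path names "<entry>.service", so that
 * "/usr/lib/systemd/system/cron.service" matches the entry "cron". */
bool unit_matches_entry(const char *unit_path, const char *entry)
{
    assert(unit_path != NULL && entry != NULL);

    /* Look only at the file name part of the unit path. */
    const char *slash = strrchr(unit_path, '/');
    const char *file = slash ? slash + 1 : unit_path;

    const char *suffix = ".service";
    size_t file_len = strlen(file);
    size_t entry_len = strlen(entry);
    size_t suffix_len = strlen(suffix);

    /* The file name must be the entry followed by the suffix, nothing else. */
    if (file_len != entry_len + suffix_len)
        return false;
    return strncmp(file, entry, entry_len) == 0 &&
           strcmp(file + entry_len, suffix) == 0;
}
```

An exact-name rule like this avoids treating `crond.service` as a match for `cron`. A real implementation might instead inspect the unit's `ExecStart` line, which this sketch does not attempt.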

Interactive apps will be preloaded by one of a few potential mechanisms. Note that a process is not scoped unless there is a match with a process list entry. There are at least three scenarios to consider when preloading interactive apps: 1) local login, 2) remote login and 3) container start. It is not clear at this point whether a single approach is effective in all cases.

glibc has long supported an /etc/ld.so.preload file that causes every process to preload the entries in that file. This is not supported by musl libc, so it is not considered a viable approach.

Scoping the sshd service works well and supports remote logins: all processes started by that logged-in session will be preloaded. There is a great demo of this. Processes started from a local login can be preloaded in one of a couple of ways. We are investigating scoping the login service. We have also had success adding two lines to an /etc/profile file, which has proven to enable preloading of libscope for all processes started from a local login and from a container start. Research and POC efforts are in progress to define a specific approach.

For reference and future consideration, the approach defined here makes it possible to consider several states of a preloaded process. Up to now there has been one state when scope is enabled: a process has libscope loaded, functions are interposed, connections are established for events, metrics and logs, and a periodic thread is executing. The approach delineated here defines a second state: preloaded, one function interposed, no connections established and no periodic thread enabled.

We can now start to consider various states. For example, a process could be preloaded with a connection established to Edge or another remote source. In this state no events are emitted, yet the process can be remotely managed such that a user could request that data start being emitted. Likewise, a process could be preloaded, with all functions interposed, a connection to remote management established and no data emitted. A user could then request that data start to flow, which would include all network activity currently in progress, for example. Conversely, a process could be preloaded, with no connection established and a periodic thread executing. This would allow the periodic thread to respond to changes in config from the local file system in lieu of a remote connection. There are numerous options as we move forward.
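One way to reason about these states is as combinations of independent capabilities. The sketch below models them as bit flags in C. All names (`ST_*`, `STATE_*`, `emits_data`, `request_emit`) are hypothetical, and the transition rule in `request_emit` is an assumption: a request to start emitting can only reach a process that has either a management connection or a periodic thread watching local config.

```c
#include <assert.h>
#include <stdbool.h>

/* Independent capabilities of a preloaded process (hypothetical names). */
enum {
    ST_PRELOADED = 1 << 0, /* libscope is loaded */
    ST_EXEC_HOOK = 1 << 1, /* execve is interposed */
    ST_ALL_HOOKS = 1 << 2, /* every function is interposed */
    ST_DATA_CONN = 1 << 3, /* connections for events, metrics, logs */
    ST_MGMT_CONN = 1 << 4, /* connection to Edge or other remote manager */
    ST_PERIODIC  = 1 << 5, /* periodic thread is running */
    ST_EMITTING  = 1 << 6  /* data is being emitted */
};

/* The single state that exists today when scope is enabled. */
#define STATE_SCOPED (ST_PRELOADED | ST_EXEC_HOOK | ST_ALL_HOOKS | \
                      ST_DATA_CONN | ST_PERIODIC | ST_EMITTING)
/* The second state defined by this design. */
#define STATE_PRELOAD_ONLY (ST_PRELOADED | ST_EXEC_HOOK)
/* Possible future states taken from the discussion above. */
#define STATE_MANAGED_IDLE (STATE_PRELOAD_ONLY | ST_MGMT_CONN)
#define STATE_ARMED (STATE_PRELOAD_ONLY | ST_ALL_HOOKS | ST_MGMT_CONN)
#define STATE_LOCAL_WATCH (STATE_PRELOAD_ONLY | ST_PERIODIC)

bool emits_data(unsigned state)
{
    return (state & ST_EMITTING) != 0;
}

/* Assumed rule: the request is honored only if the process has a channel
 * through which it could arrive; otherwise the state is unchanged. */
unsigned request_emit(unsigned state)
{
    assert(state & ST_PRELOADED);
    if (!(state & (ST_MGMT_CONN | ST_PERIODIC)))
        return state;
    return state | ST_ALL_HOOKS | ST_DATA_CONN | ST_EMITTING;
}
```

Treating each capability as a separate flag keeps the existing scoped state and the new preload-only state expressible in the same model, and lets further combinations be added without redefining either.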

Ref: https://github.com/criblio/appscope/discussions/960
