
Multiple systemd units with the same name #192

Closed
mskarbek opened this issue Jan 5, 2022 · 14 comments · Fixed by #199
Labels: question (Further information is requested)

Comments

@mskarbek

mskarbek commented Jan 5, 2022

I have just started playing with parca and this is the first thing that came up. ;)

Use case: multiple containers with systemd inside. The cgroup v2 structure looks like this:

/sys/fs/cgroup/machine.slice/machine-libpod_pod_<some_hash>.slice/libpod-<some_hash>.scope/container/system.slice
/sys/fs/cgroup/machine.slice/machine-libpod_pod_<other_hash>.slice/libpod-<other_hash>.scope/container/system.slice
/sys/fs/cgroup/machine.slice/machine-libpod_pod_<another_hash>.slice/libpod-<another_hash>.scope/container/system.slice

Each container runs a systemd unit with the same name. Obviously, from the container's perspective these are separate systemd instances, but from the host's perspective, with a single parca-agent, there is a problem: I can't just use --systemd-units=my-app.service, because parca-agent will try to find that service under /sys/fs/cgroup/system.slice. (BTW, the current default is /sys/fs/cgroup/systemd/system.slice, which is wrong for cgroup v2. Shouldn't this be autodetected?)
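
For what it's worth, detecting the unified hierarchy looks straightforward; here is a minimal sketch of what I mean (my own illustration, not parca-agent's actual code):

package main

import (
	"fmt"
	"os"
)

// defaultSystemSlice picks the default system.slice path depending on whether
// the host runs the unified cgroup v2 hierarchy.
func defaultSystemSlice() string {
	// On a pure cgroup v2 host, /sys/fs/cgroup/cgroup.controllers exists.
	if _, err := os.Stat("/sys/fs/cgroup/cgroup.controllers"); err == nil {
		return "/sys/fs/cgroup/system.slice"
	}
	// Fall back to the legacy v1/hybrid layout with a named systemd hierarchy.
	return "/sys/fs/cgroup/systemd/system.slice"
}

func main() {
	fmt.Println(defaultSystemSlice())
}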

Can this be supported, or should I run a separate agent for each container, with --systemd-cgroup-path= pointing at that container's system.slice inside machine.slice?
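
To make that concrete, this is roughly the set of paths I would expect the agent (or each per-container agent) to end up scanning; just an illustration based on the podman layout above, not a proposal for the actual implementation:

package main

import (
	"fmt"
	"path/filepath"
)

// findUnitCgroups lists every per-container cgroup directory for the given
// unit under the podman layout shown above.
func findUnitCgroups(unit string) ([]string, error) {
	pattern := filepath.Join(
		"/sys/fs/cgroup/machine.slice",
		"machine-libpod_pod_*.slice",
		"libpod-*.scope",
		"container/system.slice",
		unit,
	)
	return filepath.Glob(pattern)
}

func main() {
	paths, err := findUnitCgroups("fake-service.service")
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}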

Env: RHEL 8 and podman.

@kakkoyun
Member

kakkoyun commented Jan 5, 2022

@mskarbek Thanks for reaching out. Which version of parca-agent is this?

We recently fixed some cgroup v2 issues; the fix is in the main branch but hasn't been released yet. #178

cc @derekparker

@kakkoyun added the question (Further information is requested) label on Jan 5, 2022
@mskarbek
Author

mskarbek commented Jan 5, 2022

@kakkoyun current main (d20ec7a), built locally on RHEL 8 and Fedora 34.
RHEL 8:

  • make-4.2.1-10.el8.x86_64
  • gcc-8.5.0-4.el8_5.x86_64
  • golang-1.16.12-1.module+el8.5.0+13637+960c7771.x86_64
  • clang-12.0.1-4.module+el8.5.0+13246+cefb5d4c.x86_64
  • llvm-12.0.1-2.module+el8.5.0+12488+254d2a07.x86_64
  • coreutils-8.30-12.el8.x86_64
  • binutils-2.30-108.el8_5.1.x86_64
  • elfutils-0.185-1.el8.x86_64
  • elfutils-devel-0.185-1.el8.x86_64
  • zlib-devel-1.2.11-17.el8.x86_64
  • kernel-4.18.0-348.7.1.el8_5.x86_64

Fedora 34:

  • make-4.3-5.fc34.x86_64
  • gcc-11.2.1-1.fc34.x86_64
  • golang-1.16.8-1.fc34.x86_64
  • clang-12.0.1-1.fc34.x86_64
  • llvm-12.0.1-1.fc34.x86_64
  • coreutils-8.32-30.fc34.x86_64
  • binutils-2.35.2-6.fc34.x86_64
  • elfutils-0.186-1.fc34.x86_64
  • elfutils-devel-0.186-1.fc34.x86_64
  • zlib-devel-1.2.11-26.fc34.x86_64
  • kernel-5.14.20-200.fc34.x86_64

Currently, I'm trying to get parca-agent to run properly on both systems. Both are configured to use only cgroup v2. The above is what I observed within the first few minutes.

@mskarbek
Author

mskarbek commented Jan 5, 2022

./parca-agent --http-address=":7071" --node=systemd-test --systemd-units=fake-service.service --kubernetes=false --store-address=127.0.0.1:7070 --insecure --log-level=debug
ts=2022-01-05T19:52:57.548707165Z caller=main.go:100 msg=starting... node=systemd-test store=127.0.0.1:7070                                                                   
level=debug ts=2022-01-05T19:52:57.54876692Z caller=main.go:101 msg="parca-agent initialized" version= commit= date= builtBy= config="{debug :7071 systemd-test map[] 127.0.0.1:7070   true false 1 false  [fake-service.service] /tmp  10s /sys/fs/cgroup/systemd/system.slice}"
level=debug ts=2022-01-05T19:52:57.549231852Z caller=discoverymanager.go:186 msg="Starting provider" provider=systemd/0 subs=[systemd]                                        
level=debug ts=2022-01-05T19:52:57.549755489Z caller=main.go:288 msg="starting batch write client"                                                                            
level=debug ts=2022-01-05T19:52:57.549802659Z caller=main.go:321 msg="starting discovery manager"                                                                             
level=debug ts=2022-01-05T19:52:57.549839388Z caller=main.go:332 msg="starting target manager"                                                                                
level=debug ts=2022-01-05T19:52:58.550529698Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1                                                    
level=debug ts=2022-01-05T19:52:59.550545606Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1                                                    
level=debug ts=2022-01-05T19:53:00.550308653Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1                                                    
level=debug ts=2022-01-05T19:53:01.550362306Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1                                                    
level=debug ts=2022-01-05T19:53:02.549495442Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1                                                    
level=debug ts=2022-01-05T19:53:02.550740194Z caller=targetmanager.go:222 msg="reconciling targets"
level=debug ts=2022-01-05T19:53:02.551075251Z caller=profile.go:160 labels="{__cgroup_path__=\"/sys/fs/cgroup/system.slice/fake-service.service/\", node=\"systemd-test\", systemd_unit=\"fake-service.service\"}" msg="starting cgroup profiler"
level=debug ts=2022-01-05T19:53:02.555007175Z caller=targetmanager.go:137 msg="profiler ended with error" error="open cgroup: open /sys/fs/cgroup/system.slice/fake-service.service/: no such file or directory" labels="{__name__=\"parca_agent_cpu\", node=\"systemd-test\", systemd_unit=\"fake-service.service\"}"
./parca-agent --http-address=":7071" --node=systemd-test --systemd-units=fake-service.service --kubernetes=false --store-address=127.0.0.1:7070 --insecure --systemd-cgroup-path="/sys/fs/cgroup/machine.slice/machine-libpod_pod_d289e2358b60fc95dc8679a1cee0b8d15869036a3deb5f6ad8d6e147a0407b7e.slice/libpod-d928f15e90168eafe453899b1f4a4a14c66b19dbbb3635368d8b77e3bd5f7498.scope/container/system.slice" --log-level=debug
ts=2022-01-05T19:53:52.356892145Z caller=main.go:100 msg=starting... node=systemd-test store=127.0.0.1:7070
level=debug ts=2022-01-05T19:53:52.356983826Z caller=main.go:101 msg="parca-agent initialized" version= commit= date= builtBy= config="{debug :7071 systemd-test map[] 127.0.0.1:7070   true false 1 false  [fake-service.service] /tmp  10s /sys/fs/cgroup/machine.slice/machine-libpod_pod_d289e2358b60fc95dc8679a1cee0b8d15869036a3deb5f6ad8d6e147a0407b7e.slice/libpod-d928f15e90168eafe453899b1f4a4a14c66b19dbbb3635368d8b77e3bd5f7498.scope/container/system.slice}"
level=debug ts=2022-01-05T19:53:52.357529245Z caller=discoverymanager.go:186 msg="Starting provider" provider=systemd/0 subs=[systemd]
level=debug ts=2022-01-05T19:53:52.357951951Z caller=main.go:288 msg="starting batch write client"
level=debug ts=2022-01-05T19:53:52.358014194Z caller=main.go:332 msg="starting target manager"
level=debug ts=2022-01-05T19:53:52.35806054Z caller=main.go:321 msg="starting discovery manager"
level=debug ts=2022-01-05T19:53:53.358608191Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1
level=debug ts=2022-01-05T19:53:54.358402241Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1
level=debug ts=2022-01-05T19:53:55.358399128Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1
level=debug ts=2022-01-05T19:53:56.358369889Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1
level=debug ts=2022-01-05T19:53:57.358488731Z caller=systemd.go:79 discovery=systemd msg="running systemd manager" units=1
level=debug ts=2022-01-05T19:53:57.358597482Z caller=targetmanager.go:222 msg="reconciling targets"
level=debug ts=2022-01-05T19:53:57.358767635Z caller=profile.go:160 labels="{__cgroup_path__=\"/sys/fs/cgroup/system.slice/fake-service.service/\", node=\"systemd-test\", systemd_unit=\"fake-service.service\"}" msg="starting cgroup profiler"
level=debug ts=2022-01-05T19:53:57.360957189Z caller=targetmanager.go:137 msg="profiler ended with error" error="open cgroup: open /sys/fs/cgroup/system.slice/fake-service.service/: no such file or directory" labels="{__name__=\"parca_agent_cpu\", node=\"systemd-test\", systemd_unit=\"fake-service.service\"}"

fake-service is a Go binary used for testing - https://github.com/nicholasjackson/fake-service
I have 5 podman containers on the host, each with its own fake-service.service systemd unit.

@derekparker
Contributor

So as I understand it, this isn't an issue with cgroup v2 support but more an issue of using systemd within separate containers and then having parca-agent detect that cgroup fs hierarchy under the /sys/fs/cgroup/machine.slice prefix, correct?

Does supplying the --systemd-cgroup-path= parameter solve the issue? If so, that may be the best solution for now. I think running systemd within a container is somewhat of an edge case, so I'm not sure whether we'd support it explicitly.

@mskarbek
Author

mskarbek commented Jan 6, 2022

As illustrated by the debug logs above, using --systemd-cgroup-path= does not help. In fact, it appears that the path is completely ignored by the agent.

@derekparker
Contributor

As illustrated by the debug logs above, using --systemd-cgroup-path= does not help. In fact, it appears that the path is completely ignored by the agent.

Ah, well, yes... that's a problem! I can take a shot at wiring that parameter up and see if we can use it to solve your problem at least partially!
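
Roughly, the plumbing I have in mind looks like the sketch below (SystemdConfig and CgroupPath are placeholder names for this illustration, not necessarily what will land in the PR):

package discovery

import "path/filepath"

// SystemdConfig and CgroupPath are hypothetical names used only for this sketch.
type SystemdConfig struct {
	Units      []string
	CgroupPath string // value of --systemd-cgroup-path
}

// unitCgroupPath builds the cgroup directory for a unit, preferring the
// user-provided base path over a built-in default.
func (c *SystemdConfig) unitCgroupPath(unit string) string {
	base := c.CgroupPath
	if base == "" {
		base = "/sys/fs/cgroup/system.slice" // hypothetical default
	}
	return filepath.Join(base, unit)
}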

derekparker added a commit to derekparker/parca-agent that referenced this issue Jan 6, 2022
This patch wires up the --systemd-cgroup-path flag and plumbs the
user-provided value into the systemd config, and subsequently into
the systemd discoverer, to be used when reconciling unit files.

Fixes parca-dev#192
@derekparker
Contributor

@mskarbek could you please test this patch #199 and let me know if it helps solve your problem?

@mskarbek
Author

mskarbek commented Jan 6, 2022

@derekparker the patch does not apply on top of d20ec7a, and current main (af1ca89) fails to build for me right now. I'll resume testing once this is resolved.

# sigs.k8s.io/json/internal/golang/encoding/json
../../go/pkg/mod/sigs.k8s.io/json@v0.0.0-20211020170558-c049b76a60c6/internal/golang/encoding/json/encode.go:1249:12: sf.IsExported undefined (type reflect.StructField has no field or method IsExported)
../../go/pkg/mod/sigs.k8s.io/json@v0.0.0-20211020170558-c049b76a60c6/internal/golang/encoding/json/encode.go:1255:18: sf.IsExported undefined (type reflect.StructField has no field or method IsExported)
make: *** [Makefile:64: dist/parca-agent] Error 2

@derekparker
Contributor

@mskarbek hm, I'm having no problem building current main at the moment. Can you try clearing your module / build cache?

@mskarbek
Author

mskarbek commented Jan 7, 2022

@derekparker:

[marcin@t14 projects]# rm -rf ~/go/pkg/mod/
[marcin@t14 projects]# rm -rf parca-agent/
[marcin@t14 projects]# git clone https://github.com/parca-dev/parca-agent.git
Cloning into 'parca-agent'...
remote: Enumerating objects: 2145, done.
remote: Counting objects: 100% (804/804), done.
remote: Compressing objects: 100% (298/298), done.
remote: Total 2145 (delta 611), reused 544 (delta 489), pack-reused 1341
Receiving objects: 100% (2145/2145), 6.92 MiB | 3.81 MiB/s, done.
Resolving deltas: 100% (1196/1196), done.
[marcin@t14 projects]# cd parca-agent/
[marcin@t14 parca-agent]# git submodule init 
Submodule '3rdparty/libbpf' (https://github.com/libbpf/libbpf) registered for path '3rdparty/libbpf'
[marcin@t14 parca-agent]# git submodule update 
Cloning into '/home/marcin/projects/parca-agent/3rdparty/libbpf'...
Submodule path '3rdparty/libbpf': checked out '7c382f0df9bcdda688cfed372e5c42eeee23b50c'
[marcin@t14 parca-agent]# make
(...)
# sigs.k8s.io/json/internal/golang/encoding/json
/home/marcin/go/pkg/mod/sigs.k8s.io/json@v0.0.0-20211020170558-c049b76a60c6/internal/golang/encoding/json/encode.go:1249:12: sf.IsExported undefined (type reflect.StructField has no field or method IsExported)
/home/marcin/go/pkg/mod/sigs.k8s.io/json@v0.0.0-20211020170558-c049b76a60c6/internal/golang/encoding/json/encode.go:1255:18: sf.IsExported undefined (type reflect.StructField has no field or method IsExported)
make: *** [Makefile:64: dist/parca-agent] Error 2

@derekparker
Contributor

@mskarbek what version of Go are you using?

@mskarbek
Author

mskarbek commented Jan 7, 2022

@derekparker golang-1.16.12-1.module+el8.5.0+13637+960c7771.x86_64 - it is a RHEL 8 package for 1.16.12

@derekparker
Contributor

@derekparker golang-1.16.12-1.module+el8.5.0+13637+960c7771.x86_64 - it is a RHEL 8 package for 1.16.12

Ah, that's the issue. That dependency uses IsExported, which was added in Go 1.17: https://pkg.go.dev/reflect#StructField.IsExported.

I'll update our go.mod to require Go 1.17.
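
For reference, the change should just be a bump of the go directive in go.mod, roughly (module path assumed from the repository name):

module github.com/parca-dev/parca-agent

go 1.17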

@mskarbek
Author

mskarbek commented Jan 7, 2022

This will rule out building on RHEL 8 until 8.6 is released (May), which is a little problematic. I'll try to work around that.
