@yanglong1010
Contributor

hi,

Our project uses the java-profiler code, and our Java process hit a "Too many open files" error. After tracing it with strace -k -e trace=openat -f java-pid, I believe I have found the cause. With a small modification, I have verified that the problem goes away. Would you accept a small code contribution to java-profiler?

[WARN] perf_event_open for TID 30704 failed: Too many open files

ulimit -n
256

lsof -p 30499|grep '/proc/30499/task'|wc -l
202

lsof -p 30499|grep '/proc/30499/task'|head
java 30499 root 23r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 24r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 51r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 52r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 53r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 54r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 55r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 56r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 57r DIR 0,3 0 61771716 /proc/30499/task
java 30499 root 58r DIR 0,3 0 61771716 /proc/30499/task
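
For illustration, here is a minimal sketch of the pattern the lsof output points to, assuming the leaked descriptors are directory handles on /proc/<pid>/task that get opened during thread enumeration and are not closed on every path. It is not the actual patch: DirHandle, listNumericEntries, listThreadIds and countOpenFds are hypothetical names, and the code is Linux-only because it reads /proc.

```cpp
#include <dirent.h>

#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical RAII wrapper: the directory handle is closed on every exit
// path, so repeated enumeration cannot accumulate open descriptors.
class DirHandle {
  public:
    explicit DirHandle(const char* path) : _dir(opendir(path)) {}
    ~DirHandle() {
        if (_dir != nullptr) closedir(_dir);
    }
    DirHandle(const DirHandle&) = delete;
    DirHandle& operator=(const DirHandle&) = delete;

    DIR* get() const { return _dir; }

  private:
    DIR* _dir;
};

// Returns the purely numeric entry names of a directory such as
// /proc/self/task (thread IDs) or /proc/self/fd (descriptor numbers).
std::vector<int> listNumericEntries(const char* path) {
    std::vector<int> ids;
    DirHandle dir(path);
    if (dir.get() == nullptr) return ids;  // e.g. EMFILE or no /proc

    struct dirent* entry;
    while ((entry = readdir(dir.get())) != nullptr) {
        char* end = nullptr;
        long id = std::strtol(entry->d_name, &end, 10);
        // Skip ".", ".." and anything that is not entirely digits.
        if (end != entry->d_name && *end == '\0') {
            ids.push_back(static_cast<int>(id));
        }
    }
    return ids;  // DirHandle's destructor calls closedir() here
}

std::vector<int> listThreadIds() {
    return listNumericEntries("/proc/self/task");
}

// Counts this process's open descriptors. The handle used for the listing
// itself is included, which is the same on every call.
std::size_t countOpenFds() {
    return listNumericEntries("/proc/self/fd").size();
}
```

Calling countOpenFds() before and after a loop of listThreadIds() calls should give the same number. If the closedir() call were missing, the count would grow by one per call until the ulimit -n limit of 256 shown above is reached.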

@richardstartin
Contributor

Hi @yanglong1010, it's interesting that you're running the wallclock profiler without the tracer-guided thread filter. We were considering removing the fallback mode. Are you running the profiler without dd-java-agent, or are you setting -Ddd.profiling.ddprof.wall.context.filter=false?

@yanglong1010
Contributor Author

Yes, we are not using dd-java-agent; we use a Java agent that we wrote ourselves.

I noticed that you record the span ID in the wall clock sample event, which is a very good idea. I ported that part of the code into our project to get a similar continuous profiling effect. In continuous mode we set the thread filter and everything works fine. The other mode is a one-off, short profiling session that does not set the thread filter, so it falls into the fallback mode.

@richardstartin
Contributor

@yanglong1010 Thank you so much for this patch. This is a public mirror of a repository we do new work in internally, so I can't merge your patch here, but I will apply your commit manually and sync it here later.

Further patches are very welcome, and if you ever want to discuss features or give feedback, please email me at richard [dot] startin [at] datadoghq.com

@richardstartin
Contributor

@yanglong1010 your commit was merged here 6f95dc4

r1viollet added a commit that referenced this pull request Oct 23, 2024
Avoid shifting a negative value
This was raised by the ASan nightly runs.

> Task :ddprof-test:testClasses
ddprof-lib/src/main/cpp/vmStructs.cpp:816:34: runtime error: left shift of negative value -1
    #0 0x7f6f6bd20631 in ScopeDesc::readInt()
    #1 0x7f6f6bce54a6 in ScopeDesc::decode(int)
    #2 0x7f6f6bce7fb0 in StackWalker::walkVM(void*, ASGCT_CallFrame*, int, void const*, void const*)
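
As background on the sanitizer report: left-shifting a negative signed integer is undefined behavior in C++ before C++20. A common way to avoid it is to do the shift in unsigned arithmetic and convert back afterwards. The sketch below shows that general technique only. shiftLeftSigned is a hypothetical helper, not the code in vmStructs.cpp, and it is not necessarily how the referenced commit fixed the issue.

```cpp
#include <cstdint>

// Hypothetical helper: shifts a signed 32-bit value left without undefined
// behavior by performing the shift on the unsigned representation.
// Precondition: bits < 32.
// The conversion back to int32_t is modular in C++20 and is
// implementation-defined (modular on mainstream compilers) before that.
inline int32_t shiftLeftSigned(int32_t value, unsigned bits) {
    return static_cast<int32_t>(static_cast<uint32_t>(value) << bits);
}
```

With this helper, shiftLeftSigned(-1, 4) yields -16 without triggering the "left shift of negative value" diagnostic.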