
Profiling doesn't log any data #1061

Closed
dlienko opened this issue Sep 27, 2018 · 13 comments
Labels: bug Something is not working.
Milestone: unplanned

dlienko commented Sep 27, 2018

Do you want to request a feature or report a bug?
A bug

What is the current behavior?
When profiling is enabled via an environment variable, the file where the data is supposed to be written is created, but it stays empty.

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

/ # echo $PROFILING
memory
/ # 
/ # ls /tmp
profile002855020
/ # ls /tmp/profile002855020/
mem.pprof
/ # tail /tmp/profile002855020/mem.pprof 
/ # 
/ # 
/ # ls -l /tmp/profile002855020/mem.pprof 
-rw-r--r--    1 root     root             0 Sep 26 16:48 /tmp/profile002855020/mem.pprof

The same happens for PROFILING=cpu.

Which version of the software is affected?

/ # hydra version
Version:    v1.0.0-beta.8
Git Hash:   93dcbcf3b9e0726c03b45b7e74ec9ca4c89eab03
Build Time: 2018-08-10T10:50:16Z

The build is the one from Docker Hub.

I've noticed in the runtime/pprof documentation that CPU profiling may not work on Linux systems if an app is built in a certain way: https://golang.org/pkg/runtime/pprof/#StartCPUProfile

Unfortunately, I don't know whether hydra is built with those options. Also, there is no such remark for the memory profiling option.

michalwojciechowski commented Sep 27, 2018

Hello @aeneasr,

I would like to join the discussion and extend the scope of this issue. I have read https://community.ory.sh/t/high-availability-and-scaling/664/10 and https://www.ory.sh/docs/guides/master/performance/1-hydra. Is there a way to expose more detailed metrics that could point us to the exact location of possible bottlenecks, possibly in SQL connection management? We are seeing a degradation of hydra's response times once the rate exceeds 50 requests/s; the investigation is still ongoing, though. Any advice or additional hints on performance investigation will be greatly appreciated. Thank you!

aeneasr (Member) commented Sep 27, 2018

@aaslamin is currently introducing OpenTracing, which could probably help. But without any details about which endpoint(s) are affected, it is hard to give advice. Are you aware of the SQL config options that are available (max_con etc.)?

aeneasr (Member) commented Sep 27, 2018

Regarding the original issue, please make sure to actually stop the command at some point; I think the data is only flushed then. As for the build commands, you can check the Dockerfile itself, or compile the binary yourself and see whether it works then.
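For background, and not specific to hydra: with Go's runtime/pprof, a CPU profile is only finalized and written out when StopCPUProfile is called, and a heap profile is only written when it is explicitly requested, so an empty file is consistent with the profiler never being stopped cleanly. A minimal sketch of that behavior:

package main

import (
    "log"
    "os"
    "runtime/pprof"
)

func main() {
    f, err := os.Create("/tmp/cpu.pprof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    // The profile is only finalized and written here; if the process is
    // killed before StopCPUProfile runs, /tmp/cpu.pprof stays empty.
    defer pprof.StopCPUProfile()

    doWork()
}

func doWork() {
    // Stand-in for the real workload being profiled.
    for i := 0; i < 100000000; i++ {
        _ = i * i
    }
}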

aeneasr (Member) commented Sep 28, 2018

Hm, interesting, I can't get it working even with regular compilation. Not sure what's going on...

aeneasr (Member) commented Sep 28, 2018

I think the flushing doesn't work properly, or maybe there is some racy behavior; it would be helpful if someone could help investigate!

@michalwojciechowski

Re the max_conn param: according to SetMaxOpenConns, the default value is unlimited. I will let you know once we gather metrics from the underlying DB to see whether that is actually the case. Also, you can expect more detailed insight from us about the distribution of traffic across the REST endpoints in the next few days. Stay tuned!
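For reference, these knobs live on Go's database/sql connection pool. A minimal sketch with illustrative values follows; the Postgres driver import and the numbers are assumptions, and how hydra itself exposes these settings may differ by version:

package main

import (
    "database/sql"
    "log"
    "os"
    "time"

    _ "github.com/lib/pq" // Postgres driver; an illustrative choice
)

func openDB(dsn string) (*sql.DB, error) {
    db, err := sql.Open("postgres", dsn)
    if err != nil {
        return nil, err
    }
    // database/sql defaults: unlimited open connections, 2 idle connections,
    // and connections are reused forever.
    db.SetMaxOpenConns(20)                  // cap concurrent connections to the DB
    db.SetMaxIdleConns(4)                   // keep a few warm connections around
    db.SetConnMaxLifetime(30 * time.Minute) // recycle connections periodically
    return db, db.Ping()
}

func main() {
    db, err := openDB(os.Getenv("DATABASE_URL"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
}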

aeneasr (Member) commented Sep 28, 2018

Nice! It would be great if we could track this independently; feel free to create an issue regarding this any time.

aeneasr (Member) commented Oct 5, 2018

OK, so my gut feeling here tells me that the graceful HTTP server handler may cause issues with gracefully shutting down profiling. Not sure about this yet, though.

@michalwojciechowski, maybe #1069 is something that fixes your issues.

aaslamin (Contributor) commented Oct 20, 2018

@michalwojciechowski, in the tests that you are running, are you hitting the token endpoint heavily? If so, what you are seeing is expected given how expensive bcrypt is. The token endpoint is the most costly endpoint and must be protected via rate limiting given how CPU-intensive it is.

You can adjust the work factor for bcrypt via an environment variable to see if that makes a difference.
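To get a feel for how much the work factor matters, here is a small standalone sketch using golang.org/x/crypto/bcrypt. The secret value and the cost range are made up, and the exact name of the environment variable (reportedly BCRYPT_COST, default 10) should be checked against the docs for your hydra version. Each +1 on the cost roughly doubles the CPU time per hash, which is why a busy token endpoint becomes CPU-bound:

package main

import (
    "fmt"
    "time"

    "golang.org/x/crypto/bcrypt"
)

func main() {
    secret := []byte("some-oauth2-client-secret") // made-up value
    // Each +1 on the bcrypt cost roughly doubles the CPU time spent per hash.
    for _, cost := range []int{8, 10, 12} {
        start := time.Now()
        if _, err := bcrypt.GenerateFromPassword(secret, cost); err != nil {
            panic(err)
        }
        fmt.Printf("cost=%d took %s\n", cost, time.Since(start))
    }
}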

aeneasr added this to the unplanned milestone on Oct 25, 2018
aeneasr added the bug label on Oct 25, 2018
@kminehart (Contributor)

Quoting @aeneasr: "my gut feeling here tells me that the graceful HTTP server handler may cause issues with gracefully shutting down profiling"

I did some playing around with the profiling, and that's what I found.

In ory/x/profilex/profile.go, changing:

return profile.Start(profile.CPUProfile)

to

return profile.Start(profile.CPUProfile, profile.NoShutdownHook)

seemed to do the trick. I'll make a PR as soon as I've verified that the profiling data shows up as expected. While the file is no longer empty, I haven't been able to get any actual data to show up in it, which could be related to my environment or to not letting it run long enough.
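For readers following along, a rough sketch of what that option changes (illustrative only, not the actual ory/x/profilex or hydra code): by default, github.com/pkg/profile installs its own interrupt handler that stops the profile and then calls os.Exit(0); with profile.NoShutdownHook no such handler is installed, and the caller owns the call to Stop(), which is what actually writes the profile to disk.

package main

import "github.com/pkg/profile"

func main() {
    // NoShutdownHook: pkg/profile does not install its own interrupt handler
    // (the default hook stops the profile and then calls os.Exit(0), which can
    // collide with an application that traps signals itself). The caller must
    // therefore make sure Stop() runs, e.g. via defer, because Stop() is what
    // flushes cpu.pprof to disk.
    defer profile.Start(profile.CPUProfile, profile.NoShutdownHook).Stop()

    doWork()
}

func doWork() {
    // Stand-in for the real workload.
    sum := 0
    for i := 0; i < 100000000; i++ {
        sum += i
    }
    _ = sum
}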

aeneasr (Member) commented Mar 17, 2019

Oh right, then it doesn't hook into the shutdown. I'm not sure why that hook is in there, though; it has to have a reason?

aeneasr (Member) commented Mar 17, 2019

Oh, I see: it might have something to do with .Stop() not actually being called because, well, we're doing a SIGINT with Ctrl+C or on shutdown. I think this should fix it then.
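To make that concrete, a hedged sketch of tying Stop() into a process that traps SIGINT for graceful shutdown (the names are illustrative, not hydra's actual wiring): if nothing traps the signal, the Go runtime terminates the process without running deferred functions, so a deferred Stop() never fires and the profile is never flushed.

package main

import (
    "os"
    "os/signal"
    "syscall"

    "github.com/pkg/profile"
)

func main() {
    p := profile.Start(profile.MemProfile, profile.NoShutdownHook)

    sig := make(chan os.Signal, 1)
    signal.Notify(sig, os.Interrupt, syscall.SIGTERM)

    go serve() // hypothetical stand-in for the long-running HTTP server

    <-sig    // wait for Ctrl+C / docker stop
    p.Stop() // flushes mem.pprof; skipped entirely if the signal is not trapped
}

func serve() { select {} }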

aeneasr (Member) commented Mar 26, 2019

This has been fixed on master by @kminehart.

aeneasr closed this as completed on Mar 26, 2019