Performance Improvements #6

Closed
saviorand opened this issue Jan 4, 2024 · 8 comments

Comments

@saviorand
Collaborator

Parallelization and performance optimizations

@crunchy-vonage

I appreciate that you may not have optimised yet, but FYI, I get approximately:

  • 50 req/s with mojo lightbug.🔥
  • 100 req/s compiled

whereas Python Flask does about 1000 req/s on a single core.
Performance profile attached:
[image: perf]

@crunchy-vonage

Never mind, it's just something in the Welcome handler.
With my own handler, I get 2700 req/s.
[image: perf]

@saviorand
Collaborator Author

Woah, nice! Thanks for testing! Yes, the welcome handler serves an html page with an image, which might be slower. Can I ask how you're profiling this? The charts look sick

@crunchy-vonage

The profile was taken with Linux's built-in kernel profiler and the "perf" user-mode tool; I couldn't find a profiler specifically for Mojo yet. This technique does have the advantage of showing all user-mode and kernel-mode activity, i.e. the libc and CPython work.

I suspect there is a lot of memory allocation or copying happening in the welcome handler, but I'm not all that familiar with Mojo and haven't found a technique to profile memory allocation.

I'm also suspicious that the use of Python sockets might be suboptimal, but what do I know?

The flame graph was made following https://www.brendangregg.com/perf.html

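# Fetch the FlameGraph scripts
git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph
# Sample the running lightbug process at 99 Hz, with call graphs, for 60 seconds
sudo perf record -F99 -g -p `pgrep lightbug` -- sleep 60
# Fold the recorded stacks and render them as an SVG flame graph
sudo perf script | ./stackcollapse-perf.pl > out.perf-folded
./flamegraph.pl out.perf-folded > perf.svg
google-chrome perf.svg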
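
As a rough companion to the commands above, the folded file (out.perf-folded) can also be summarised as text. This is a minimal sketch, assuming the usual folded-stack layout of one line per stack, with frames separated by semicolons and followed by a sample count. The function name `top_leaf_frames` and the sample stacks are hypothetical and not taken from the profile in this thread.

```python
from collections import Counter


def top_leaf_frames(folded_lines, n=5):
    """Sum sample counts per leaf (innermost) frame from folded-stack lines."""
    totals = Counter()
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # Assumed layout: "frame1;frame2;...;leaf COUNT". Frame names may
        # contain spaces, so split on the last space only.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip malformed lines
        leaf = stack.split(";")[-1]
        totals[leaf] += int(count)
    return totals.most_common(n)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        "lightbug;serve;handle;memcpy 40",
        "lightbug;serve;handle;malloc 25",
        "lightbug;serve;recv 10",
        "lightbug;serve;respond;memcpy 15",
    ]
    for frame, samples in top_leaf_frames(sample):
        print(f"{samples:6d}  {frame}")
```

To run it on a real profile, pass the open file instead of the sample list, for example `top_leaf_frames(open("out.perf-folded"))`. Leaf frames with high counts are where the sampled time was spent directly, which is roughly what the widest top-level boxes in the flame graph show.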

@crunchy-vonage

You might also enjoy perf top.

@crunchy-vonage

Yeah, 1500 req/s with the base64 image removed.
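
One plausible contributor, though not something measured in this thread, is sheer response size: base64 encoding turns every 3 raw bytes into 4 output characters, so an inlined image makes the page about a third larger than the image itself. The sketch below only illustrates that arithmetic; the 30,000-byte payload is a hypothetical stand-in, not the actual welcome-page image.

```python
import base64


def base64_size(raw_len):
    """Length in bytes of the padded base64 encoding of raw_len bytes."""
    return 4 * ((raw_len + 2) // 3)


raw = bytes(30_000)  # hypothetical image payload, not the real one
encoded = base64.b64encode(raw)

# The formula agrees with what the standard library actually produces.
assert len(encoded) == base64_size(len(raw))
print(len(raw), "raw bytes ->", len(encoded), "base64 bytes")
```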

@saviorand
Collaborator Author

@crunchy-vonage we're actually doing external_calls to C in the Mojo server implementation in the sys folder (this one is enabled by default), not talking to Python! Python is only invoked in the separate Python implementation in the python folder.

@saviorand saviorand changed the title from "[EPIC] Performance Improvements" to "Performance Improvements" Apr 14, 2024
@saviorand
Collaborator Author

I've made some improvements in #40, getting roughly 9.5k requests per second now with wrk (10468 requests in a 1.10 s run). wrk is the tool used, among other things, for the TechEmpower benchmarks. I have a fork for potential submission here, but the performance is not satisfying enough yet, and we don't even have the JSON serialization support needed to submit it to the listing. It would be cool if we could make an entry at some point, though.

Running 1s test @ http://localhost:8080
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.67ms   11.33ms  78.56ms   94.60%
    Req/Sec     9.56k     2.40k   11.29k    72.73%
  Latency Distribution
     50%   53.00us
     75%   58.00us
     90%   98.00us
     99%   66.70ms
  10468 requests in 1.10s, 1.59MB read
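
For reference, the overall throughput follows from the last line of this output: total requests divided by elapsed time. Below is a minimal sketch of that arithmetic, assuming the summary line has the shape shown above with the duration given in seconds. The helper name `wrk_throughput` is hypothetical.

```python
import re


def wrk_throughput(summary_line):
    """Return requests per second from a wrk summary line such as
    '10468 requests in 1.10s, 1.59MB read'."""
    # Only durations expressed in seconds are handled in this sketch.
    match = re.search(r"(\d+) requests in ([\d.]+)s\b", summary_line)
    if match is None:
        raise ValueError(f"unrecognised wrk summary line: {summary_line!r}")
    requests = int(match.group(1))
    seconds = float(match.group(2))
    return requests / seconds


rate = wrk_throughput("  10468 requests in 1.10s, 1.59MB read")
print(f"{rate:.0f} req/s")  # prints: 9516 req/s
```

That figure is in line with the 9.56k average reported on the Req/Sec row.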

@saviorand saviorand moved this to Backlog in Lightbug's Roadmap May 28, 2024
@saviorand saviorand moved this from Backlog to Done in Lightbug's Roadmap Jul 22, 2024
@saviorand saviorand closed this as completed by moving to Done in Lightbug's Roadmap Jul 22, 2024
@github-project-automation github-project-automation bot moved this from Backlog to Done in Lightbug's Roadmap Jul 22, 2024
@saviorand saviorand moved this from Done to Backlog in Lightbug's Roadmap Jul 22, 2024