Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance Worker Coordinator Async Profiler #710

Open
wants to merge 40 commits into
base: main
Choose a base branch
from

Conversation

IanHoang
Copy link
Collaborator

@IanHoang IanHoang commented Dec 10, 2024

Description

OSB comes with a profiler, called Async Profiler, that can analyze performance calls in OSB's worker coordinator. In order to use the Async Profiler, users can run the tests with --enable-worker-coordinator-profiling and profiled results will be outputted to ~/.benchmark/logs/profile.log once the benchmark completes. The profiler is helpful when determining if custom runners or param sources are bottlenecking OSB performance.

The profiler currently outputs all profiled lines to ~/.benchmark/logs/profile.log. This PR makes it easier to interpret the profile.log by labeling which output pertains to which task and client (if using more than one client) and gives users the options to sort the outputted lines by sort types (or column names). This will also help users who are investigating performance bottlenecks in the code.

Example profile.log that was not sorted

=== Profile Start for client id [2] and task [warmup-indices] ===

Clock type: CPU
Ordered by: totaltime, desc

name                                                                                                                                          ncall     tsub      ttot      tavg
...
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/__init__.py:260 TransmitIntent.tx_done      5         0.000019  0.000254  0.000051
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/__init__.py:306 TransmitIntent.expired      27        0.000087  0.000252  0.000009
/usr/lib64/python3.9/typing.py:352 _SpecialForm.__getitem__                                                                                   2/1       0.000007  0.000238  0.000119
/usr/lib64/python3.9/asyncio/selector_events.py:481 _UnixSelectorEventLoop.sock_connect                                                       1         0.000027  0.000238  0.000238
/usr/lib64/python3.9/concurrent/futures/thread.py:180 ThreadPoolExecutor._adjust_thread_count                                                 1         0.000025  0.000237  0.000237
/usr/lib64/python3.9/typing.py:472 _SpecialForm.Optional                                                                                      1         0.000009  0.000234  0.000234
..ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/TCPTransport.py:1009 TCPTransport._next_XMIT_5  4         0.000014  0.000231  0.000058
/usr/lib64/python3.9/asyncio/sslproto.py:630 SSLProtocol._on_handshake_complete                                                               1         0.000032  0.000224  0.000224
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/timing.py:179 ExpirationTimerView.__init__            165       0.000190  0.000223  0.000001
..-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/__init__.py:244 TransmitIntent.completionCallback  5         0.000036  0.000219  0.000044
/usr/lib64/python3.9/contextlib.py:114 _GeneratorContextManager.__enter__                                                                     25        0.000053  0.000219  0.000009
..ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/TCPTransport.py:670 TCPTransport._finishIntent  4         0.000059  0.000214  0.000053
/usr/lib64/python3.9/typing.py:841 _SpecialGenericAlias.copy_with                                                                             2         0.000013  0.000209  0.000105
/usr/lib64/python3.9/asyncio/base_events.py:741 _UnixSelectorEventLoop.call_soon                                                              17        0.000055  0.000202  0.000012
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/aiohttp/client_reqrep.py:279 ClientRequest.__init__                   1         0.000087  0.000197  0.000197
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/aiohttp/connector.py:908 TCPConnector._resolve_host                   1         0.000041  0.000196  0.000196
../ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/asyncTransportBase.py:43 exclusive_processing  51        0.000093  0.000187  0.000004
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/TCPTransport.py:1061 _socketFile            53        0.000120  0.000187  0.000004
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/aiohttp/client_reqrep.py:1040 RawClientResponse.start                 1         0.000033  0.000185  0.000185
/home/ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/aiohttp/connector.py:829 TCPConnector.__init__                        1         0.000081  0.000184  0.000184
..pensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/TCPTransport.py:652 TCPTransport._scheduleTransmitActual  1         0.000017  0.000180  0.000180
/usr/lib64/python3.9/typing.py:434 _SpecialForm.Union                                                                                         1         0.000010  0.000178  0.000178
../ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/TCPTransport.py:885 TCPTransport._next_XMIT_2  4         0.000030  0.000177  0.000044
/usr/lib64/python3.9/contextlib.py:123 _GeneratorContextManager.__exit__                                                                      26        0.000063  0.000177  0.000007
../ec2-user/opensearch-benchmark/.venv/lib64/python3.9/site-packages/thespian/system/transport/__init__.py:140 ResultCallback.resultCallback  10/5      0.000042  0.000170  0.000017

Example of profile.log ouput sorted on ncalls field

=== Profile start for client id [2] and task [warmup-indices] ===

Clock type: CPU
Ordered by: ncall, desc

name                                                                                                                                          ncall     tsub      ttot      tavg
/usr/lib64/python3.9/asyncio/base_events.py:1923 _UnixSelectorEventLoop.get_debug                                                             44        0.000037  0.000037  0.000001
/usr/lib64/python3.9/asyncio/base_events.py:694 _UnixSelectorEventLoop.time                                                                   29        0.000054  0.000081  0.000003
src/propcache/_helpers_c.pyx:35 __get__                                                                                                       27/22     0.000061  0.000112  0.000004
/usr/lib64/python3.9/asyncio/base_events.py:513 _UnixSelectorEventLoop._check_closed                                                          27        0.000023  0.000023  0.000001
/usr/lib64/python3.9/asyncio/events.py:78 Handle._run                                                                                         23        0.000061  0.005928  0.000258
/usr/lib64/python3.9/asyncio/events.py:31 TimerHandle.__init__                                                                                23        0.000078  0.000106  0.000005
/usr/lib64/python3.9/typing.py:713 _GenericAlias.__setattr__                                                                                  19        0.000042  0.000113  0.000006
/usr/lib64/python3.9/typing.py:665 _is_dunder                                                                                                 19        0.000043  0.000070  0.000004
/usr/lib64/python3.9/asyncio/base_events.py:770 _UnixSelectorEventLoop._call_soon                                                             18        0.000069  0.000156  0.000009
/usr/lib64/python3.9/selectors.py:452 EpollSelector.select                                                                                    17        0.000137  0.000272  0.000016
/usr/lib64/python3.9/asyncio/base_events.py:741 _UnixSelectorEventLoop.call_soon                                                              17        0.000055  0.000205  0.000012
/usr/lib64/python3.9/asyncio/selector_events.py:592 _UnixSelectorEventLoop._process_events                                                    17        0.000036  0.000085  0.000005
/usr/lib64/python3.9/asyncio/base_events.py:1830 _UnixSelectorEventLoop._run_once                                                             16        0.000243  0.006659  0.000416

Issues Resolved

#709

Testing

  • New functionality includes testing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
IanHoang and others added 8 commits December 10, 2024 16:16
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <51065478+IanHoang@users.noreply.github.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
…pace

Signed-off-by: Ian Hoang <ianhoang16@gmail.com>
@IanHoang IanHoang changed the title Enhance worker coordinator profiler Enhance Worker Coordinator Async Profiler Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant