-
Notifications
You must be signed in to change notification settings - Fork 417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add event tracing and ETDumps to executor_runner #5027
base: main
Are you sure you want to change the base?
Add event tracing and ETDumps to executor_runner #5027
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5027
Note: Links to docs will display an error until the docs builds have been completed. ❌ 31 New FailuresAs of commit fde5862 with merge base 01d526f (): NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@pytorchbot label 'partner: arm' |
@pytorchbot label ciflow/trunk |
Can't add following labels to PR: ciflow/trunk. Please ping one of the reviewers for help. |
Hi @GregoryComer. Would it be possible to run the CI on your side to see if the issue from the previous PR is still occurring? I'm having a hard time understanding where this comes from. |
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
2 similar comments
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
- Enabled via EXECUTORCH_ENABLE_EVENT_TRACER - Add flag 'etdump_path' to specify the file path for the ETDump file - Add flag 'num_executions' for number of iterations to run - Create and pass event tracer 'ETDumpGen' - Save ETDump to disk - Update docs to reflect the changes Signed-off-by: Benjamin Klimczak <benjamin.klimczak@arm.com> Change-Id: I7e8e8b7f21453bb8d88fa2b9c2ef66c532f3ea46
3288eda
to
b09d09e
Compare
Hi @dbort . Sorry for dragging you into this, but I saw your comment on |
I don't see a CI failure anymore |
@digantdesai To me |
@digantdesai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Yeah I see, from the main CMakeList, and qnn does have |
Any update on this? |
Hi @digantdesai, I'm still hoping for some pointer from @dbort or you as I'm struggling to reproduce it locally and can't really make sense of the error. |
@digantdesai will you have a look at this one since it touches code outside arm delegate. Thx! |
The error shows up when running this script If you have a linux machine, can you follow https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html and see if the script fails? |
@cccclai I finally managed to reproduce the issue by running script |
Ah I remember that. @dbort and @Olivia-liu, any thought on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that I missed your mentions of me for so long! Thanks @cccclai for pointing me to this.
Just to double check, is this the error you're seeing?
gmake[2]: *** No rule to make target '../third-party/flatcc/lib/libflatccrt.a', needed by 'executor_runner'. Stop.
I think I remember @Olivia-liu looking into this, and worked around it by building in release mode. (hence
-DCMAKE_BUILD_TYPE=Release \ |
|
||
Result<Method> method = program->load_method(method_name, &memory_manager); | ||
EventTracer* event_tracer_ptr = nullptr; | ||
#ifdef ET_EVENT_TRACER_ENABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is already so long and complex, I'd like to factor out these ifdefs if possible.
You could create a class to encapsulate the event tracing, like
class TraceManager {
public:
TraceManager();
EventTracer* get_event_tracer();
Error write_etdump_to_file(const char* filename);
};
If tracing is enabled, the ctor could create the ETDump (a field), get_event_tracer can return a pointer to it, and write_to_file can open the file and write the contents. If tracing is disabled, the class is basically empty, returning a null tracer and just returning Error::NotSupported when asked to write.
Then main() can say
TraceManager tracer;
program->load_method(..., tracer->get_event_tracer());
...
if (tracer->get_event_tracer() != nullptr) {
status = tracer->write_etdump_to_file(FLAGS_etdump_path.c_str());
ET_CHECK_MSG(status == Error::Ok, ...);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Let me know if the implementation looks ok.
Also for what it's worth, I'm trying to merge dvidelabs/flatcc#306 into upstream flatcc to let us remove |
Thanks @dbort , this is the error exactly. I'm not sure about the workaround to use release mode. The command used in the ci script is already using |
Ok, thanks for looking into that.
That means that there's some kind of race condition. Based on your PR, it looks like executor_runner has a proper dependency on libflatccrt; otherwise I would have expected a use-without-dependency situation. ...except maybe it doesn't. This PR adds a dep on "${FLATCCRT_LIB}" from the top-level CMakeLists.txt, but when I look for a place that sets FLATCCRT_LIB I only see it in the cmake config file at executorch/build/executorch-config.cmake Lines 51 to 55 in 538bfaf
afaik, that config isn't included in the top-level cmake system. That file is used to point to an already-built version of the core ET libs from external projects, like
If
then executor_runner wouldn't properly depend on libflatccrt.a. A parallel build could cause that lib to be coincidentally built earlier with -j16, "fixing" the problem, while -j2 would be less likely to do so. Could you try printing the value of FLATCCRT_LIB from the top-level CMakeLists.txt to see if it's empty? |
Though theoretically executor_runner shouldn't even need to know about libflatccrt: it should inherit the dep from the PUBLIC section of executorch/devtools/CMakeLists.txt Lines 179 to 183 in 538bfaf
But in this case, you could try updating this PR to use |
Hi @dbort . I tried fixing the flatccrt dependency as suggested, but without any effect:
(I did this both in I did find a new workaround though, which should be more stable than just removing the I feel like the flatccrt issue is not related to my change so I will open an issue for it. I can push the above workaround in a separate PR. I will be off from tomorrow until next year, but I really hope we can find a solution together to get this PR merged. |
- Raise a CMake error if event tracing is enabled without the devtools - Re-factoring of the changes in the portable executor_runner - Minor fix in docs Change-Id: Ia50fef8172f678f9cbe2b33e2178780ff983f335 Signed-off-by: Benjamin Klimczak <benjamin.klimczak@arm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! All issues be fixed now.
Change-Id: I0ebb22636cdd64aea24bcee51cba05496ed78b1f
Re-upload of #4502 to discuss with @GregoryComer.