VITIS-13434 Provide more error details for AIE async errors via xrt::error #8736

Merged
merged 92 commits into from
Feb 4, 2025


zhangchiming
Collaborator

Problem solved by the commit
Initial check-in that adds more error information (NPU error ID, column, and row) from the driver to xrt::error.

Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered
N/A

How problem was solved, alternative solutions (if any) and why they were rejected
Add a second 64-bit error code to the existing xrt::error_impl. The device driver populates this new field.
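
As an illustration only, the sketch below shows one way a second 64-bit code could carry the NPU error ID, column, and row next to the existing code. The bit layout, the names `npu_error_details`, `pack_ex_error`, `decode_ex_error`, and `error_record`, and the use of zero to mean "no details reported" are all assumptions made for this example. The real layout is defined by the NPU driver and is not given in this PR description.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// HYPOTHETICAL layout of the extra 64-bit error code (not taken from XRT):
//   bits  0..31 : NPU error id
//   bits 32..39 : row
//   bits 40..47 : column
struct npu_error_details {
  std::uint32_t error_id;
  std::uint8_t row;
  std::uint8_t col;
};

// Pack the three fields into one 64-bit value using the assumed layout.
constexpr std::uint64_t
pack_ex_error(std::uint32_t id, std::uint8_t row, std::uint8_t col)
{
  return static_cast<std::uint64_t>(id)
       | (static_cast<std::uint64_t>(row) << 32)
       | (static_cast<std::uint64_t>(col) << 40);
}

// Unpack the fields again; the inverse of pack_ex_error.
constexpr npu_error_details
decode_ex_error(std::uint64_t ex)
{
  return {
    static_cast<std::uint32_t>(ex & 0xFFFFFFFFu),
    static_cast<std::uint8_t>((ex >> 32) & 0xFFu),
    static_cast<std::uint8_t>((ex >> 40) & 0xFFu)
  };
}

// Hypothetical stand-in for an error object that holds both codes.
struct error_record {
  std::uint64_t error_code = 0;     // existing 64-bit error code
  std::uint64_t ex_error_code = 0;  // new code; assumed to stay 0 when the
                                    // driver does not report extra details

  bool has_details() const { return ex_error_code != 0; }

  std::string details_string() const
  {
    if (!has_details())
      return "no extra error details";

    const auto d = decode_ex_error(ex_error_code);
    return "NPU error id " + std::to_string(d.error_id)
         + ", col " + std::to_string(d.col)
         + ", row " + std::to_string(d.row);
  }
};
```

With this layout, a record whose extra code was never set reports "no extra error details". That matches the idea that an older driver leaves the new field unused and existing behavior is unchanged.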

Risks (if any) associated with the changes in the commit
Low, since the new code path will not be used until the NPU device driver code is updated with new error reporting.

What has been tested and how, request additional testing if necessary
Compiled and tested on Windows 11 with both the existing driver and the upcoming driver that reports the extra error code.

Documentation impact (if any)
N/A

zhangchiming and others added 30 commits June 5, 2024 20:48
Signed-off-by: Chiming Zhang <Chiming.Zhang@amd.com>
2) avoid crash from getting hip mem object from null pointer
3) allow setting 0 for kernel arguments

…tation.

…t char*.

Signed-off-by: Chiming <chimingz@amd.com>
Collaborator

@stsoe stsoe left a comment


The implementation here is clean and good, but I have a few clarifying questions.

src/runtime_src/core/common/api/xrt_error.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_error.cpp (resolved)
src/runtime_src/core/common/api/xrt_error.cpp (resolved)
src/runtime_src/core/common/query_requests.h (outdated, resolved)
Chiming added 2 commits February 2, 2025 22:52
Signed-off-by: Chiming <chimingz@amd.com>
Chiming added 2 commits February 2, 2025 22:59
Signed-off-by: Chiming <chimingz@amd.com>
@zhangchiming zhangchiming requested a review from stsoe February 3, 2025 07:08
@stsoe stsoe changed the title Provide more error details for AIE async errors via xrt::error VITIS-13434 Provide more error details for AIE async errors via xrt::error Feb 4, 2025
@stsoe stsoe merged commit b97de64 into Xilinx:master Feb 4, 2025
20 checks passed
2 participants