Rearranged the rotm kernel to adapt to the architecture. #5053

Closed
wants to merge 2 commits into from

tingboliao

Hi, martin-frbg:
Following the suggestion in PR #5038, we rearranged the rotm kernel to fit the architecture. We also wrote test cases and used them to verify functionality and performance on the K230 and K1 platforms.
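
For reviewers who want a reference point when checking the kernel's results, here is a minimal scalar sketch of what rotm computes under the standard BLAS definition (applying the modified Givens matrix H to each pair of elements). It is not the RVV kernel from this PR: the name `srotm_ref` and the restriction to unit strides are assumptions made for illustration only.

```c
#include <stddef.h>

/* Scalar reference for single-precision rotm, unit strides only.
 * Hypothetical helper, not part of OpenBLAS.
 * param[0] is the flag; param[1..4] hold h11, h21, h12, h22. */
void srotm_ref(size_t n, float *x, float *y, const float param[5])
{
    const float flag = param[0];
    float h11, h21, h12, h22;

    if (flag == -2.0f) {
        return;                      /* H is the identity: nothing to do */
    } else if (flag == -1.0f) {      /* full matrix taken from param */
        h11 = param[1]; h21 = param[2];
        h12 = param[3]; h22 = param[4];
    } else if (flag == 0.0f) {       /* unit diagonal is implied */
        h11 = 1.0f;     h21 = param[2];
        h12 = param[3]; h22 = 1.0f;
    } else {                         /* flag == 1: fixed off-diagonal (1, -1) */
        h11 = param[1]; h21 = -1.0f;
        h12 = 1.0f;     h22 = param[4];
    }

    for (size_t i = 0; i < n; i++) {
        const float w = x[i];
        const float z = y[i];
        x[i] = h11 * w + h12 * z;
        y[i] = h21 * w + h22 * z;
    }
}
```

A vectorized kernel can be checked against this reference by running both on the same inputs for each of the four flag values and comparing the outputs.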

The latest performance data are shown below.
Parameter setting: OPENBLAS_LOOPS = 10000.

K230 [C908, vlen = 128] @ 1.6GHz:

| Case | Scalar (MFlops) | Optimized RVV (MFlops) |
|---|---|---|
| srotm.goto | 872.52 | 1545.43 |
| drotm.goto | 797.53 | 1410.64 |

K1 [C908, vlen = 256] @ 1.6GHz:

| Case | Scalar (MFlops) | Optimized RVV (MFlops) |
|---|---|---|
| srotm.goto | 896.42 | 1512.47 |
| drotm.goto | 819.14 | 1576.13 |

In the data above, higher values indicate better performance.
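
To make the comparison easier to read, the speedup of the optimized RVV kernel over the scalar one can be derived from the reported numbers; the ratios work out to roughly 1.7x to 1.9x. The sketch below only restates the figures from this PR; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* One benchmark result, copied from the figures reported in this PR. */
struct rotm_bench_row {
    const char *platform;
    const char *kernel;
    double scalar_mflops;
    double rvv_mflops;
};

static const struct rotm_bench_row rotm_rows[] = {
    {"K230", "srotm", 872.52, 1545.43},
    {"K230", "drotm", 797.53, 1410.64},
    {"K1",   "srotm", 896.42, 1512.47},
    {"K1",   "drotm", 819.14, 1576.13},
};

/* Speedup of the optimized RVV kernel relative to the scalar kernel. */
double rotm_speedup(const struct rotm_bench_row *row)
{
    return row->rvv_mflops / row->scalar_mflops;
}
```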

Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
@tingboliao tingboliao closed this Jan 7, 2025
@tingboliao tingboliao reopened this Jan 7, 2025
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
@tingboliao tingboliao closed this Jan 8, 2025