decomp: add prefetch for matched seq on aarch64 #3164
match is the source pointer for the sequence copy that follows. It is only updated when extDict is needed, which is a low-probability case, so it can be prefetched early to reduce cache misses. Benchmarks on various Arm platforms showed an uplift of 1% to 14% with gcc-11/clang-14.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: If201af4799d2455d74c79f8387404439d7f684ae
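To illustrate the idea in the description, here is a minimal, self-contained sketch rather than the actual patch. The names `SKETCH_PREFETCH`, `sketch_seq`, and `sketch_exec_sequence` are hypothetical and are not zstd identifiers. The sketch ignores the extDict case and assumes the match always lies inside the current output buffer. The only real API used is the GCC/Clang builtin `__builtin_prefetch`. The hint is issued as soon as the match address is known, so the cache line can be fetched while the literals are being copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: emit a read-prefetch hint only on aarch64 builds
 * with GCC or Clang; on every other target it compiles to nothing. */
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
#  define SKETCH_PREFETCH(ptr) __builtin_prefetch((ptr), 0, 3)
#else
#  define SKETCH_PREFETCH(ptr) ((void)(ptr))
#endif

/* Hypothetical sequence: litLength literal bytes, then matchLength bytes
 * copied from `offset` bytes behind the end of those literals. */
typedef struct {
    size_t litLength;
    size_t matchLength;
    size_t offset;
} sketch_seq;

/* Executes one sequence at `op` and returns the new output position.
 * `outStart` is the beginning of the output buffer (no extDict here). */
static unsigned char* sketch_exec_sequence(unsigned char* op,
                                           const unsigned char** litPtr,
                                           sketch_seq seq,
                                           const unsigned char* outStart)
{
    unsigned char* const oLitEnd = op + seq.litLength;
    const unsigned char* const match = oLitEnd - seq.offset;
    size_t i;

    /* Assumption of this sketch: the match starts inside the output buffer. */
    assert(seq.offset > 0 && (size_t)(oLitEnd - outStart) >= seq.offset);

    /* The match address is already known, so hint the cache now and let
     * the literal copy below overlap with the memory fetch. */
    SKETCH_PREFETCH(match);

    memcpy(op, *litPtr, seq.litLength);
    *litPtr += seq.litLength;

    /* Byte-wise copy so overlapping matches (offset < matchLength) work. */
    for (i = 0; i < seq.matchLength; i++) {
        oLitEnd[i] = match[i];
    }
    return oLitEnd + seq.matchLength;
}
```

Because a prefetch is only a hint, the output is identical with or without it; any benefit or regression shows up purely in timing, which is why the per-platform benchmarks matter.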
Benchmark changes on Arm N1/A72/A57 with gcc-11:
[benchmark table not preserved]
Benchmark changes on Arm N1/A72/A57 with clang-14:
[benchmark table not preserved]
This change may lead to a regression on x86. My test results on a Xeon 5218:
[benchmark table not preserved]
I believe this is a rather good policy. There are a few quirks, such as overlapped prefetch orders when decoding in |