Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace hint_common_parent_position() by backwards accumulator updates #5870

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ces42
Copy link
Contributor

@ces42 ces42 commented Feb 8, 2025

Only calls to evaluate() now trigger NNUE accumulator updates. To make sure that we are likely to find parent positions from which to update the accumulators we perform a backwards NNUE update whenever we compute the accumulator from scratch for some position.

passed STC
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 39680 W: 10474 L: 10164 D: 19042
Ptnml(0-2): 171, 4068, 11042, 4398, 161
https://tests.stockfishchess.org/tests/view/67a27f26eb183d11c65945be

passed LTC
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 337308 W: 86408 L: 85550 D: 165350
Ptnml(0-2): 276, 30551, 106126, 31441, 260
https://tests.stockfishchess.org/tests/view/67a287efeb183d11c65945ee

then simplified:
STC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 28608 W: 7641 L: 7413 D: 13554
Ptnml(0-2): 132, 3036, 7744, 3256, 136
https://tests.stockfishchess.org/tests/view/67a4703719f522d3866d3345

LTC
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 200226 W: 51026 L: 50990 D: 98210
Ptnml(0-2): 170, 18468, 62799, 18508, 168
https://tests.stockfishchess.org/tests/view/67a4f255229c1a170cc08964

The version in this PR is a bit different from the simplified version, but it's compile-time changes only.

Copy link

github-actions bot commented Feb 8, 2025

clang-format 18 needs to be run on this PR.
If you do not have clang-format installed, the maintainer will run it when merging.
For the exact version please see https://packages.ubuntu.com/noble/clang-format-18.

(execution 13316780261 / attempt 1)

@xu-shawn
Copy link
Contributor

xu-shawn commented Feb 8, 2025

Congratulations! Just FYI, an LTC is not strictly required for non-functional speedups.

@ces42
Copy link
Contributor Author

ces42 commented Feb 8, 2025

well vondele told me to run LTC on discord, because of the size of the change. It's also not really a non-functional change since it changes the algorithm for computing evaluations, so it is conceivable that this would be slower at higher depth.

@xu-shawn
Copy link
Contributor

xu-shawn commented Feb 8, 2025

Thread sanitizer seems to have failed. Maybe this has something to do with the shared root state?

@ces42
Copy link
Contributor Author

ces42 commented Feb 8, 2025

Yeah, I'm looking at it.

@mstembera
Copy link
Contributor

Congrats! Can you say a bit about why the small accumulator can't be handled the same way as the big one?

src/position.h Outdated
int rule50;
int pliesFromNull;
int16_t rule50;
int16_t pliesFromNull;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these 16 bit now?

@vondele
Copy link
Member

vondele commented Feb 8, 2025

can you please squash the PR, also we'll need to understand why CI is / was failing, in particular if thread sanitizer is flagging something.

@ces42
Copy link
Contributor Author

ces42 commented Feb 9, 2025

Thread sanitizer seems to have failed. Maybe this has something to do with the shared root state?

Yes, that seems to be the issue. Or at least nodes where st->previous->next != st when doing SMP, which I assume are the children of the root node? Do you know of a good way to detect these? Adding the condition st->previous->next == st to the loop looking for a computed parent fixed the issue. But I am slightly worried about race conditions where the condition is OK when the parent node is searched for, but then the next-pointer of the parent is changed by another thread before the NNUE update is performed. Do you know of a better way to detect root nodes?

Incidentally, why do some root nodes not have st->previous == nullptr?

@ces42
Copy link
Contributor Author

ces42 commented Feb 9, 2025

Congrats! Can you say a bit about why the small accumulator can't be handled the same way as the big one?

Since there are much fewer smallnet evaluations (only 20% or so I think) the heuristic "computing the accumulator of an interior node has zero effective cost because we will be able to reuse it often" does not work for the small accumulator (The old code that does hints for the small accumulator is also definitively "wrong" in this sense). But the small accumulator is also 24x cheaper to compute so there's basically nothing to optimize there.

@ces42
Copy link
Contributor Author

ces42 commented Feb 9, 2025

OK I have fixed the SMP issue. Sorry for the mess. This should be ready for merging now. I ran an SMP test on 8074a0b

SMP STC
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 45464 W: 11977 L: 11662 D: 21825
Ptnml(0-2): 57, 5054, 12196, 5367, 58
https://tests.stockfishchess.org/tests/view/67a7d1d00b25c820efc8bc2d

I have added two more commits since that test, but the first one (reverting 5aedb70) should be a strict speedup if anything and the second one is just changing the type of two ints and should be elo neutral (cf this test).

should I run any tests on the final version?

I'm not squashing commits yet for reference purposes.

@vondele vondele requested a review from Sopel97 February 9, 2025 07:20
@@ -539,12 +529,13 @@ class FeatureTransformer {
const IndexType offsetR0 = HalfDimensions * removed[0];
auto* columnR0 = reinterpret_cast<const vec_t*>(&weights[offsetR0]);

if (removed.size() == 1)
if ((Forward && removed.size() == 1)
|| (Backwards && added.size() == 1)) // added.size() == removed.size() == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the code from these comments to asserts inside the ifs

{
if (FeatureSet::requires_refresh(st, Perspective)
|| (!Big && (gain -= FeatureSet::update_cost(st) < 0)) || !st->previous || st->previous->next != st)
goto refresh;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like this goto, as removing it would only involve a boolean flag and break.

@mstembera
Copy link
Contributor

mstembera commented Feb 12, 2025

This https://tests.stockfishchess.org/tests/view/67abfe1ca04df5eb8dbebf0b passed. I can open a PR after this gets merged but whatever is easiest or more expedient is fine by me.

@Disservin
Copy link
Member

@ces42 will you resolve the goto comment as well?

@Disservin
Copy link
Member

fyi needs a rebase

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants