Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static field addresses are not deterministic when doing SPMI replay #53773

Open
SingleAccretion opened this issue Jun 5, 2021 · 4 comments
Open
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Milestone

Comments

@SingleAccretion
Copy link
Contributor

SingleAccretion commented Jun 5, 2021

Reproduction steps:

1. Construct a trivial app that accesses a static:

public class Program
{
    static int _global;

    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    private static int Main()
    {
        return _global;
    }
}

2. Do an SPMI collection for that app.
3. Add some printf logging for the static address' value to morph.cpp here (or use -jitoption JitDump=*, but it did not work for me for some reason).
4. Replay using superpmi.exe, twice, log the output to two files.
5. git diff them.
6. You should see something like the following:

-Static field address is 2902599013844, method is Main
+Static field address is 2220318337444, method is Main

This can cause spurious diffs, because CSE uses these values, as was seen in #53748. The commit I reproduced this on is 44f050a.

cc @BruceForstall

category:eng-sys
theme:super-pmi
skill-level:expert
cost:medium
impact:medium

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Jun 5, 2021
@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented Jun 5, 2021

Debugging it some more, it looks like the root cause is this code:

if (value.fieldValue != (DWORD)-1)
{
temp = (void*)GetFieldAddress->GetBuffer(value.fieldValue);
cr->recAddressMap((void*)value.fieldAddress, temp, toCorInfoSize(repGetFieldType(field, nullptr, nullptr)));
}
else
{
temp = (void*)value.fieldAddress;
}
return temp;

SPMI tries to record values of static fields (all of them, even though only readonly ones are looked at by the Jit), so the value.fieldValue != (DWORD)-1 branch is hit practically always, and the changed temp returned.

This is necessary, since we must have the Jit look for the value now stored in the SPMI's map (replay_address), the original address will almost certainly make no sense in the replay process (in other words, dereferencing it will cause a segfault).

Looks like solving this might require extending the Jit-EE interface.

@BruceForstall
Copy link
Member

I'll have to think about/debug this further. But typically what we would have to do is improve the recording side, in MethodContext::recGetFieldAddress().

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Jun 7, 2021
@JulieLeeMSFT JulieLeeMSFT added this to the 6.0.0 milestone Jun 7, 2021
@JulieLeeMSFT JulieLeeMSFT added the Priority:3 Work that is nice to have label Jun 7, 2021
@BruceForstall
Copy link
Member

It seems like this is a problem in CSE for using the field address for any purpose that might cause diffs, not SuperPMI. E.g., it seems like multiple non-SuperPMI runs could also see asm diffs if the actual value of the address could affect codegen.

So I don't think this is a SuperPMI problem. SuperPMI does generally return the same addresses to the JIT as during record, assuming the JIT won't deref them, but things like the target code buffer address obviously must be able to change.

@jakobbotsch
Copy link
Member

It seems like this is a problem in CSE for using the field address for any purpose that might cause diffs, not SuperPMI. E.g., it seems like multiple non-SuperPMI runs could also see asm diffs if the actual value of the address could affect codegen.

I think it is the case even without CSE using the field addresses, e.g. ARM64 can sometimes encode field addresses in fewer instructions depending on the specifics of their alignment/constant values. And of course there's a bunch of other things like static readonly fields that can affect the codegen depending on runtime specifics.

I think the problem here is that the contract is poorly specified. JIT assumes it can deref pointers to fields, and to make that work SPMI needs to return something else than what it saw at runtime. But JIT is also using the field address for codegen purposes. I think we need a new JIT-EE method to query the data for a field to make this work reliably (as a bonus, it may potentially help R2R/NAOT too).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Projects
None yet
Development

No branches or pull requests

4 participants