Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update native File.Exists #9223

Merged
merged 14 commits into from
Sep 23, 2024
Merged

Update native File.Exists #9223

merged 14 commits into from
Sep 23, 2024

Conversation

JaynieBai
Copy link
Member

@JaynieBai JaynieBai commented Sep 18, 2023

Fixes #4272

Context

The root of the problem here is that our custom Windows-only implementation of File.Exists behaves differently than the regular one in BCL

Changes Made

Rewrite the WindowsFileSystem.FileExists implementation like this:

#if NETFRAMEWORK
            return Microsoft.IO.File.Exists(path);
#else
            return File.Exists(path);
#endif

Testing

ProjectItemSpecTooLong()

Notes

@AR-May
Copy link
Member

AR-May commented Oct 26, 2023

@JaynieBai Could you comment on this PR and fill the description?

@JaynieBai
Copy link
Member Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@JaynieBai JaynieBai marked this pull request as ready for review November 1, 2023 05:46
Copy link
Contributor

@f-alizada f-alizada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking good!
One comment left probably nit

src/Build.UnitTests/BackEnd/MSBuild_Tests.cs Show resolved Hide resolved
@JanKrivanek JanKrivanek requested a review from f-alizada March 27, 2024 12:26
@JaynieBai JaynieBai changed the title Enable test ProjectItemSpecTooLong Shorten the path always when actually just passing the long paths Mar 29, 2024
@JaynieBai JaynieBai changed the title Shorten the path always when actually just passing the long paths Shorten the path always when actually passing the long paths Mar 29, 2024
@f-alizada f-alizada requested a review from AR-May April 9, 2024 10:26
@ladipro
Copy link
Member

ladipro commented Apr 9, 2024

It looks like the root of the problem here is that our custom Windows-only implementation of File.Exists behaves differently than the regular one in BCL. Here it is:

[SupportedOSPlatform("windows")]
internal static bool FileExistsWindows(string fullPath)
{
WIN32_FILE_ATTRIBUTE_DATA data = new WIN32_FILE_ATTRIBUTE_DATA();
bool success = GetFileAttributesEx(fullPath, 0, ref data);
return success && (data.fileAttributes & FILE_ATTRIBUTE_DIRECTORY) == 0;
}

BCL API accepts paths like ..\..\..\..\..\..\..\..\..\..\..\..\..\mydir\myfile while ours does not. I think we should make sure that the fix is done at the right layer (or if it needs to be done at all). What are our guarantees when it comes to such paths? Are there other places where we currently have a similar Windows-only problem? Does the proposed fix of always normalizing paths with GetFullPathNoThrow have a negative perf impact on Linux and Mac where we likely don't need it?

@JaynieBai, can you please check why exactly FileSystems.Default.FileExists(projectPath) is failing without the proposed fix? Is it because the argument is longer than MAX_PATH, because it contains unbalanced ..'s, or because it contains any ..'s at all?

@JaynieBai JaynieBai closed this Apr 12, 2024
@JaynieBai JaynieBai reopened this Apr 12, 2024
@JaynieBai
Copy link
Member Author

JaynieBai commented Apr 12, 2024

It looks like the root of the problem here is that our custom Windows-only implementation of File.Exists behaves differently than the regular one in BCL. Here it is:

[SupportedOSPlatform("windows")]
internal static bool FileExistsWindows(string fullPath)
{
WIN32_FILE_ATTRIBUTE_DATA data = new WIN32_FILE_ATTRIBUTE_DATA();
bool success = GetFileAttributesEx(fullPath, 0, ref data);
return success && (data.fileAttributes & FILE_ATTRIBUTE_DIRECTORY) == 0;
}

BCL API accepts paths like ..\..\..\..\..\..\..\..\..\..\..\..\..\mydir\myfile while ours does not. I think we should make sure that the fix is done at the right layer (or if it needs to be done at all). What are our guarantees when it comes to such paths? Are there other places where we currently have a similar Windows-only problem? Does the proposed fix of always normalizing paths with GetFullPathNoThrow have a negative perf impact on Linux and Mac where we likely don't need it?

@JaynieBai, can you please check why exactly FileSystems.Default.FileExists(projectPath) is failing without the proposed fix? Is it because the argument is longer than MAX_PATH, because it contains unbalanced ..'s, or because it contains any ..'s at all?

I Find FileSystems.Default.FileExists calls the unmanage function GetFileAttributesEx. So write the following code and find when file is prefixed with different length of "..\". The outputs of File.Exists and GetFileAttributesEx are different. Not sure the reason now. @f-alizada Could you have a look?

static void Main(string[] args)
{
    int[] numbers = new int[] {7, 8, 50, 57, 101, 250 };
    for(int j=0 ; j<numbers.Length; j++)
    {
        string file = null;
        for (int i = 0; i < numbers[j]; i++)
        {
            file += "..\\";
        }
        file += "Users\\file.tmp";
        var test = File.Exists(file);
        Console.WriteLine("FileLength:" + file.Length);
        Console.WriteLine("File.Exists Output:" + test);
        WIN32_FILE_ATTRIBUTE_DATA data = new WIN32_FILE_ATTRIBUTE_DATA();
        var result = GetFileAttributesEx(file, 0, ref data);
        Console.WriteLine("FileSystems.Default.FileExists Output: " + result);
        if (!result)
        {
            int error = Marshal.GetLastWin32Error();
            Console.WriteLine($"Error {error} occurred while getting file attributes.");
        }
    }
}

[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
[return: MarshalAs(UnmanagedType.Bool)]
internal static extern bool GetFileAttributesEx(String name, int fileInfoLevel, ref WIN32_FILE_ATTRIBUTE_DATA lpFileInformation);

/// <summary>
/// Contains information about a file or directory; used by GetFileAttributesEx.
/// </summary>
[StructLayout(LayoutKind.Sequential)]
public struct WIN32_FILE_ATTRIBUTE_DATA
{
    internal int fileAttributes;
    internal uint ftCreationTimeLow;
    internal uint ftCreationTimeHigh;
    internal uint ftLastAccessTimeLow;
    internal uint ftLastAccessTimeHigh;
    internal uint ftLastWriteTimeLow;
    internal uint ftLastWriteTimeHigh;
    internal uint fileSizeHigh;
    internal uint fileSizeLow;
}

.NET Framework Output

FileLength:35
File.Exists Output:False
FileSystems.Default.FileExists Output: False
Error 3 occurred while getting file attributes.
FileLength:38
File.Exists Output:True
FileSystems.Default.FileExists Output: True
FileLength:164
File.Exists Output:True
FileSystems.Default.FileExists Output: True
FileLength:185
File.Exists Output:True
FileSystems.Default.FileExists Output: False
Error 3 occurred while getting file attributes.
FileLength:317
File.Exists Output:True
FileSystems.Default.FileExists Output: False
Error 3 occurred while getting file attributes.
FileLength:764
File.Exists Output:True
FileSystems.Default.FileExists Output: False
Error 3 occurred while getting file attributes.

.Net Output

FileLength:35
File.Exists Output:False
FileSystems.Default.FileExists Output: False
Error 0 occurred while getting file attributes.
FileLength:38
File.Exists Output:False
FileSystems.Default.FileExists Output: False
Error 0 occurred while getting file attributes.
FileLength:164
File.Exists Output:True
FileSystems.Default.FileExists Output: True
FileLength:185
File.Exists Output:True
FileSystems.Default.FileExists Output: False
Error 0 occurred while getting file attributes.
FileLength:317
File.Exists Output:True
FileSystems.Default.FileExists Output: False
Error 0 occurred while getting file attributes.
FileLength:764
File.Exists Output:True
FileSystems.Default.FileExists Output: False
Error 0 occurred while getting file attributes.

@AR-May
Copy link
Member

AR-May commented Jun 12, 2024

I agree with @ladipro here that we need to make sure a change happens on a correct abstraction layer and isolated only to the failing scenario, given the perf concerns.

@rainersigwald do you have some knowledge why we have a custom Windows-only implementation of File.Exists here?

@JanKrivanek
Copy link
Member

Synced with @AR-May on this - our conclusion: The root of the problem is in our custom implementation of FileExists (FileExistsWindows). The suggested course of action (@JaynieBai):

  • Compare performance of standard File.Exists and our custom FileExistsWindows (you can use BenchmarkDotNet for this) - and see if there is a significant difference (report the results here)
  • If there are no significant differences - let's replace our custom impl. with the standard impl
  • If there is a difference - ping the team - we'll need to have different (more specific) bug created for fixing the FileExistsWindows

@MichalPavlik
Copy link
Member

MichalPavlik commented Aug 29, 2024

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.4037/23H2/2023Update/SunValley3)
AMD Ryzen 7 5700X, 1 CPU, 16 logical and 8 physical cores
[Host] : .NET Framework 4.8.1 (4.8.9261.0), X64 RyuJIT VectorSize=256
DefaultJob : .NET Framework 4.8.1 (4.8.9261.0), X64 RyuJIT VectorSize=256

Method Mean Error StdDev Allocated
Native 59.18 us 1.025 us 1.052 us -
Managed 61.09 us 0.804 us 0.752 us 96 B

No big difference in time, but managed implementation allocates.

@JanKrivanek
Copy link
Member

Thanks @MichalPavlik! We should probably file a bug with Runtime team.

I'm wondering how does this allocation influence overall build perf. @JaynieBai - can you run OrchardCore build with current version (custom FileExists impl) vs version where we'd switch to File.Exists?
If the change is not measurable I'd still vote to get off of our custom implementation and adopt the File.Exists

@MichalPavlik
Copy link
Member

MichalPavlik commented Sep 3, 2024

@JanKrivanek, it was a little bit misleading as I forgot to test Microsoft.IO.Redist implementation.

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.4037/23H2/2023Update/SunValley3)
AMD Ryzen 7 5700X, 1 CPU, 16 logical and 8 physical cores
[Host] : .NET Framework 4.8.1 (4.8.9261.0), X64 RyuJIT VectorSize=256
DefaultJob : .NET Framework 4.8.1 (4.8.9261.0), X64 RyuJIT VectorSize=256

Method Mean Error StdDev Allocated
Native 58.10 us 0.648 us 0.574 us -
Managed 60.87 us 0.870 us 0.814 us 96 B
Managed_IORedist 59.39 us 0.984 us 0.920 us -

This one doesn't allocate :)
And I can share also result from my VM. Although it runs on virtual disk drive, maybe the ReFS/DevDrive has some impact :)

Method Mean Error StdDev Allocated
Native 26.13 us 0.483 us 0.452 us -
Managed 28.64 us 0.206 us 0.172 us 96 B
Managed_IORedist 27.41 us 0.525 us 0.605 us -

@MichalPavlik
Copy link
Member

MichalPavlik commented Sep 5, 2024

@JanKrivanek, @JaynieBai Replacing the FileExistsWindows should resolve #9986. Any chance to use MS.IO.Redist version for file existence check? Frankly, I would replace all P/Invokes in WindowsFileSystem, but as new PR for different issue :)

@JanKrivanek
Copy link
Member

I have no strong preference between the 2 (MS.IO.Redist vs System.IO) - but I'm definitely in favor of getting rid the custom implementation.

@MichalPavlik
Copy link
Member

MichalPavlik commented Sep 6, 2024

@JaynieBai, please rewrite the WindowsFileSystem.FileExists implementation with something like this:

#if NETFRAMEWORK
            return Microsoft.IO.File.Exists(path);
#else
            return File.Exists(path);
#endif

This simple focused change is enough for now. We will talk about the rest of P/Invokes later.

@JaynieBai
Copy link
Member Author

@MichalPavlik Microsoft.Build.UnitTests.XMakeAppTests.ConfigurationInvalid failed since not find the project file when build with copied MSBuild in the temp folder.
But actually, the project file is there, and build with RunnerUtilities.ExecBootstrapedMSBuild from the bootstrap MSBuild.exe , it succeeds.

@MichalPavlik
Copy link
Member

MichalPavlik commented Sep 18, 2024

This was tricky. The reason why is MS.IO.Redist returning false in this case is due to IOException swallowing inside the MS.IO.File.Exists method. When the method is called inside of a MSBuild copy with modified app.config, an exception is thrown:

System.IO.FileLoadException: 'Could not load file or assembly 'System.Memory, Version=4.0.1.1, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)'

This exception is silently ignored as FileLoadException is a subtype of IOException. The current version of System.Memory assembly distributed with MSBuild is 4.0.1.2. Adding bindingRedirect to the configContent resolves the issue, but it's a workaround that we would need to maintain. Or we would have to extract the assemblyBinding from existing app.config and inject to the generated one.

@rainersigwald Is this test critical enough to have such a complicated functional test? If so, then we have these options:

  • Make the test more complex by adding current assemblyBinding to the created app.config
  • Try to test this differently - using new temporary AppDomain (with modified config) if possible, or unit test if possible
  • Using old System.IO.File.Exists (which allocates 96 bytes) without modifying the test
  • Something else :)

I would personally prefer to find a way how to simplify the test if we really need this one. These kind of tests can fail on magnitude of reasons, and investigation is costly.

Edit: Bug report created - dotnet/maintenance-packages#117

@MichalPavlik
Copy link
Member

No opinions so far. Let's use the simplest way :) @JaynieBai, please replace the configContent assignment with this:

XElement configRuntimeElement = XDocument.Load(RunnerUtilities.PathToCurrentlyRunningMsBuildExe + ".config").Root.Element("runtime");

string configContent = $@"<?xml version =""1.0""?>
                            <configuration>
                                <configSections>
                                    <section name=""msbuildToolsets"" type=""Microsoft.Build.Evaluation.ToolsetConfigurationSection, Microsoft.Build, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a"" />
                                    <foo/>
                                </configSections>
                                <startup>
                                    <supportedRuntime version=""v4.0""/>
                                </startup>
                                <foo/>
                                <msbuildToolsets default=""X"">
                                <foo/>
                                    <toolset toolsVersion=""X"">
                                        <foo/>
                                    <property name=""MSBuildBinPath"" value=""Y""/>
                                    <foo/>
                                    </toolset>
                                <foo/>
                                </msbuildToolsets>
                                {configRuntimeElement}
                            </configuration>";

It adds all assembly redirects to the config file and the Exists method will work as expected.

@MichalPavlik MichalPavlik self-assigned this Sep 19, 2024
@rainersigwald
Copy link
Member

Ah sorry. I like that plan @MichalPavlik.

@JaynieBai JaynieBai changed the title Shorten the path always when actually passing the long paths Update native File.Exists Sep 20, 2024
@MichalPavlik MichalPavlik merged commit 642eed5 into main Sep 23, 2024
10 checks passed
@MichalPavlik MichalPavlik deleted the jennybai/issue4247 branch September 23, 2024 05:52
@Ligtorn
Copy link

Ligtorn commented Dec 15, 2024

FileLoadException

#9223 (comment)

I am not sure if the teams working on this component is also part of the team for Azure DevOps agents. But I would just let you know, that I just reported an issues happening in Azure DevOps self hosted agents, because of this recent change to MS.IO.Redist. https://developercommunity.visualstudio.com/t/MSBuild-Task-stopped-finding-MSBuildexe/10813148

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MSBuild task tests fail when long-paths enabled
8 participants