Building a .NET 6/.NET 7 iOS project on an M2 ARM64 agent hangs/freezes #17825
Comments
@vincentcastagna have you ever seen this on an M1 machine? or have you never tried on M1?
Hi @vincentcastagna. We have added the "need-info" label to this issue, which indicates that we have an open question for you before we can take further action. This issue will be closed automatically in 7 days if we do not hear back from you by then - please feel free to re-open it if you come back to this issue after that time.
I don't believe we tried on an M1 machine with a .NET 6/.NET 7 iOS project (only Xamarin.iOS). We will try asap and give feedback here.
Not sure if this is the same thing, but this hangs during the AOT compilation every time (running on an M1):
Works fine for …
Related, is there a reason that the AOT process on an arm64 machine has to run through x64 emulation? I see it's using Microsoft.NETCore.App.Runtime.AOT.osx-x64.Cross.iossimulator-arm64. I don't see any published packages for ...AOT.osx-arm64...
@rolfbjarne We have tested on an M1 machine. The behavior is exactly the same: sometimes it builds successfully, sometimes it just hangs. It seems to happen about half the time, exactly like on the M2.
Just time constraints. We're fixing it for .NET 8 (dotnet/runtime#74175).
I think this is a different issue, because I believe this is just the build taking very long since a few things add up:
If you add … this might also work for your device builds (for different reasons: we've seen LLVM run into infinite loops in the past), so could you try and see if you notice any difference?
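The exact setting suggested above was not preserved in this capture of the thread. As a minimal sketch, assuming the suggestion was the interpreter switch that skips AOT compilation for simulator builds (an assumption, along with the project file name), it could look like this:

```sh
# Sketch only: MyApp.csproj is a placeholder, and MtouchInterpreter is an
# assumed property for using the interpreter instead of AOT on simulator builds.
dotnet build MyApp.csproj \
  -f net7.0-ios \
  -c Release \
  -p:RuntimeIdentifier=iossimulator-arm64 \
  -p:MtouchInterpreter=all
```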
@rolfbjarne Not sure what extra info you would need? I can provide it. I forgot to mention that those agents work fine with ANY Xamarin project: the build is super fast and never fails. Only our .NET 6/.NET 7 builds randomly hang.
@rolfbjarne - you were right on all counts. I just had to wait about 6 minutes instead of the normal 5 to 10 seconds. Setting …
@vincentcastagna - Sorry, I didn't mean to hijack this thread. Just thought it could be useful. Not sure if that's what's happening on your build agents or not. Thanks.
@vincentcastagna I'm assuming you only see this when building in Azure DevOps, and never locally? One theory is that something pops up a permission dialog for some reason, and that blocks the build until it times out. Unfortunately these issues can be hard to track down unless you can access the build bot remotely (and catch it while the build is stuck). One idea might be to make the build as verbose as possible; that should pinpoint a bit better exactly where it stops, and this is done by passing /v:diagnostic to the dotnet command:
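The command shown in the original comment was not captured here; a minimal sketch of what it might look like, assuming the project file from the example repository linked at the bottom of this issue:

```sh
# /v:diagnostic raises MSBuild verbosity; /bl additionally writes a binary log.
dotnet build src/app/ApplicationTemplate.Mobile/ApplicationTemplate.Mobile.csproj \
  -f net7.0-ios \
  -c Release \
  /v:diagnostic \
  /bl:build-net7.0-ios.binlog
```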
Could you do this and see what it shows?
@rolfbjarne here are the logs with /v:diagnostic; you can see the instruction at the top of the logs. I don't see a real difference with or without this instruction. I have access to the agent's machine, and I have never seen a permission dialog popping up though ... not even in the CLI logs or elsewhere.
Because right after …
Oh my bad, I misunderstood your previous comment. I will provide logs with the verbosity level set to diagnostic asap.
@vincentcastagna I'm sorry I didn't answer earlier, but unfortunately I don't have any good ideas. I see you're building the 'Release' configuration; does the same thing happen if you build 'Debug'? If so, one idea might be to turn off LLVM (by setting …)
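The property name after "by setting" was not preserved here. A hedged sketch, assuming LLVM is toggled through the MtouchUseLlvm MSBuild property and using a placeholder project name:

```sh
# Try a Debug build first:
dotnet build MyApp.csproj -f net7.0-ios -c Debug
# Or keep Release but disable LLVM (MtouchUseLlvm is an assumed property name):
dotnet build MyApp.csproj -f net7.0-ios -c Release -p:MtouchUseLlvm=false
```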
We already tried deactivating LLVM when I created the issue, but just in case, I retried. The behavior remains the same: sometimes it goes through, sometimes it just hangs.
What about a debug build that's not signed, so something like this (i.e. …)?
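The command the comment refers to was not preserved. A sketch under the assumption that code signing can be skipped with an EnableCodeSigning=false property (the property name is an assumption, as is the project file name):

```sh
# Unsigned Debug build; EnableCodeSigning=false is assumed to skip the codesign step.
dotnet build MyApp.csproj -f net7.0-ios -c Debug -p:EnableCodeSigning=false
```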
If you happen to have a way to run something on the machine with the stuck process, then …
I have run a dotnet stack report -p for each msbuild process I found running (using pstree) once a build hangs. I don't see much information here, but hopefully this will be useful to you:
@filipnavara I tried to run …
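For reference, a minimal sketch of that kind of workflow (the process-matching pattern and the <pid> placeholder are illustrative, not taken from the thread):

```sh
# Install the diagnostics tool once:
dotnet tool install --global dotnet-stack

# Find MSBuild/dotnet worker processes once a build hangs:
pgrep -fl "MSBuild|dotnet"

# Dump managed stacks of a stuck process (replace <pid> with an actual process id):
dotnet-stack report -p <pid>
```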
Both of the stack traces contain …
There are two possible explanations for this. Either I messed up and it's trying to dump itself in a loop, or some process is stuck so badly that not even the diagnostic pipes work. The former is not very likely since I tested that very same command locally. The latter would likely imply hitting some .NET runtime bug (and there's only one thread-suspension bug that comes to mind, which was fixed in .NET 7, IIRC)...
Thank you for your quick answer.
As you saw, I found two msbuild processes ... could it be that they wait on each other, creating an endless waiting loop? Any advice on how to confirm that, or on how to find other processes that msbuild might be waiting on? I decided to let …
I'll also try to target the latest .NET 7.
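One possible way to poke at that from the stuck machine, sketched with placeholder pids (pstree is not built into macOS but is available via Homebrew; sample and lsof are built in):

```sh
# Show the subtree of the first MSBuild worker process to see what it spawned:
pstree $(pgrep -f MSBuild.dll | head -n 1)

# Sample the native stacks of a stuck process for 5 seconds:
sample <pid> 5 -file msbuild-sample.txt

# List open files/pipes/sockets the stuck process is holding:
lsof -p <pid>
```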
I think I'm running into this issue also. I recently moved from Microsoft-hosted agents to a self-hosted M2 Max Mac Studio. Changing nothing in the pipeline definition, the command line I tried running …
At this point I believe this is either a bug in msbuild or in the runtime, not in any of our MSBuild logic, so I'm moving this issue to dotnet/msbuild.
This issue was moved to dotnet/msbuild#8970
The MSBuild team analyzed this, and found that a potential culprit is that we're not limiting parallelization of AOT processes to the number of CPUs, so we can end up with hundreds of concurrent processes competing for resources. Ref: dotnet/msbuild#8970 (comment)
So I'm reopening this issue to fix the parallelization problem. Note: this may not turn out to be the actual culprit, but it's a good thing to fix anyway.
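A conceptual illustration of the idea (not the actual MSBuild fix; the glob and the compile command are placeholders): cap concurrent AOT compiler invocations at the machine's CPU count instead of starting one process per assembly all at once.

```sh
# Conceptual sketch only; "aot-compile" and the glob are placeholders.
NCPU=$(sysctl -n hw.ncpu)   # logical CPU count on macOS
ls obj/linked/*.dll | xargs -P "$NCPU" -I {} echo "aot-compile {}"
```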
…of processors. This might fix xamarin#17825, but even if it doesn't, it's a good thing to do to not overload machines. Ref: xamarin#17825
The fix to limit parallelization has been merged, and I'm closing this tentatively. I'll try to get the fix in a service release for .NET 7 (it's too late for the next one, but it'll likely be in the one after that). Otherwise it'll also be in .NET 8 RC 2 (not RC 1, too late for that too). Feel free to reopen this issue if the hangs/freezes persist even with the fix.
…ers to the number of processors. This might fix xamarin#17825, but even if it doesn't, it's a good thing to do to not overload machines. Ref: xamarin#17825 Backport of xamarin#18793.
…mpilers to the number of processors. (xamarin#18793) This might fix xamarin#17825, but even if it doesn't, it's a good thing to do to not overload machines. Ref: xamarin#17825
Steps to Reproduce
We don't face the issue on x64 on-prem agents, or even on hosted ones.
There is no real consistency as to when the build hangs or not; it depends on the run.
We already tried removing the trimmer, which doesn't seem to have any effect. With or without it, the behavior is the same.
Expected Behavior
Build should never hang
Actual Behavior
Build sometimes hangs and never completes, until the timeout is reached
Environment
AGENT CAPABILITIES:
Build Logs
Working build logs =>
iOS BUILD - OK.txt
Hanging build logs =>
iOS BUILD - HANGS.txt
MSBUILD BINLOG (seems corrupted ...)
build-net7.0-ios.zip
Example Project (If Possible)
https://github.com/nventive/UnoApplicationTemplate/blob/dev/vica/make-usage-new-agents-net7.0/src/app/ApplicationTemplate.Mobile/ApplicationTemplate.Mobile.csproj