Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update perf proj from arcade #38127

Merged
merged 7 commits into from
Jun 23, 2020
Merged

Conversation

DrewScoggins
Copy link
Member

No description provided.

@ghost
Copy link

ghost commented Jun 18, 2020

Tagging subscribers to this area: @safern, @ViktorHofer
Notify danmosemsft if you want to be subscribed.

@DrewScoggins DrewScoggins requested a review from ooooolivia June 18, 2020 23:34
<HelixWorkItem Include="Crossgen2 Composite Framework R2R">
<PayloadDirectory>$(WorkItemDirectory)\ScenarioCorrelation</PayloadDirectory>
<Command>$(Python) %HELIX_CORRELATION_PAYLOAD%\performance\src\scenarios\crossgen2\test.py crossgen2 --composite %HELIX_CORRELATION_PAYLOAD%\performance\src\scenarios\crossgen2\framework-r2r.dll.rsp --core-root %HELIX_CORRELATION_PAYLOAD%\Core_Root</Command>
<Timeout>1:00</Timeout>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 1 hour timeout? This leads to very long hangs blocking a machine for long periods of time.

I also see the other workitems we send have 4 hours timeouts. Can we modify this for the purpose of PR runs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For crossgen2 composite workitem, 1 hour timeout also takes care of CI runs where we run 5 iterations, while we only have 1 iteration for PR runs and each iteration takes ~8min. For the rest of the workitems we will be able to shorten the timeout once the partition is enabled. @DrewScoggins Do you have an estimate of the timeout value after partition?

Copy link
Member

@safern safern Jun 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened: #38136 to follow up.

The reason why this concerns me is that if we enter a deadlock like: #38138 the machine is blocked for that period of time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the micro benchmarks we should be able to set the timeout to about 10 minutes once we get this partition change in. A run take about 3-5 minutes, but I want to leave some wiggle room. These are of course Helix timeouts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, Helix timeouts are the ones we care about reducing on PRs for these micro benchmarks. 10-15 mins sounds good to me.

@DrewScoggins
Copy link
Member Author

Added the timeout, but just need to make sure that it works. If it does I will check this in here and also check-in this change on the arcade side.

@@ -113,13 +113,23 @@
<!--
Partition the Microbenchmarks project, but nothing else
-->
<ItemGroup Condition="$(TargetCsproj.Contains('MicroBenchmarks.csproj'))">
<ItemGroup Condition="$(TargetCsproj.Contains('MicroBenchmarks.csproj')) and $($HelixSourcePrefix) == 'pr'">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo:

Suggested change
<ItemGroup Condition="$(TargetCsproj.Contains('MicroBenchmarks.csproj')) and $($HelixSourcePrefix) == 'pr'">
<ItemGroup Condition="$(TargetCsproj.Contains('MicroBenchmarks.csproj')) and $(HelixSourcePrefix) == 'pr'">

</HelixWorkItem>
</ItemGroup>

<ItemGroup Condition="$(TargetCsproj.Contains('MicroBenchmarks.csproj')) and $($HelixSourcePrefix) == 'official'">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

Suggested change
<ItemGroup Condition="$(TargetCsproj.Contains('MicroBenchmarks.csproj')) and $($HelixSourcePrefix) == 'official'">
<ItemGroup Condition="$(TargetCsproj.Contains('MicroBenchmarks.csproj')) and $(HelixSourcePrefix) == 'official'">

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, just found that. Doing a run to make sure it works now then going to check-in

<PreCommands Condition="'$(Compare)' == 'true'">$(WorkItemCommand) --bdn-artifacts $(BaselineArtifactsDirectory) --bdn-arguments="--anyCategories $(BDNCategories) $(ExtraBenchmarkDotNetArguments) $(BaselineCoreRunArgument) --partition-count $(PartitionCount) --partition-index %(HelixWorkItem.Index)"</PreCommands>
<Command>$(WorkItemCommand) --bdn-artifacts $(ArtifactsDirectory) --bdn-arguments="--anyCategories $(BDNCategories) $(ExtraBenchmarkDotNetArguments) $(CoreRunArgument) --partition-count $(PartitionCount) --partition-index %(HelixWorkItem.Index)"</Command>
<PostCommands Condition="'$(Compare)' == 'true'">$(DotnetExe) run -f $(_Framework) -p $(ResultsComparer) --base $(BaselineArtifactsDirectory) --diff $(ArtifactsDirectory) --threshold 2$(Percent) --xml $(XMLResults);$(FinalCommand)</PostCommands>
<Timeout>45</Timeout>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be simpler if you introduce an intermediate variable then you wouldn't need to duplicate these code. You could just set the timeout on that variable and use it when creating the HelixWorkItem that way you would only have one HelixWorkItem declaration for Microbenchmarks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

@@ -64,6 +64,14 @@
<WorkItemCommand>$(WorkItemCommand) $(CliArguments)</WorkItemCommand>
</PropertyGroup>

<PropertyGroup Condition="$(HelixSourcePrefix) == 'pr'">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to also escape $(HelixSourcePrefix) for the comparison to be correct:

Suggested change
<PropertyGroup Condition="$(HelixSourcePrefix) == 'pr'">
<PropertyGroup Condition="'$(HelixSourcePrefix)' == 'pr'">

<WorkItemTimeout>15</WorkItemTimeout>
</PropertyGroup>

<PropertyGroup Condition="$(HelixSourcePrefix) == 'official'">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

<WorkItemTimeout>0:15</WorkItemTimeout>
</PropertyGroup>

<PropertyGroup Condition="'$(HelixSourcePrefix)'' == 'official'">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If for some reason HelixSourcePrefix is not pr nor official it will use the default HelixSDK timeout, maybe we want to be intentional about it. Something like:

<PropertyGroup>
   <WorkItemTimeout>0:45</WorkItemTimeout>
  <WorkItemTimeout  Condition="'$(HelixSourcePrefix)' != 'official'">0:15</WorkItemTimeout>
</PropertyGroup>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@@ -64,21 +64,54 @@
<WorkItemCommand>$(WorkItemCommand) $(CliArguments)</WorkItemCommand>
</PropertyGroup>

<PropertyGroup Condition="'$(HelixSourcePrefix)'' == 'pr'">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra '

@DrewScoggins
Copy link
Member Author

@safern I think this is good to go.

Copy link
Member

@safern safern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DrewScoggins thanks for including the timeout change here. Did we validate that Helix actually honored the new timeout?

@DrewScoggins
Copy link
Member Author

Yes, I did. It was listed at 900 seconds.

@DrewScoggins DrewScoggins merged commit 7c2e2e7 into dotnet:master Jun 23, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants