Update SuperPMI artifact logging #77685

AaronRobinsonMSFT · 2022-10-31T16:36:08Z

ghost · 2022-10-31T16:36:23Z

Tagging subscribers to this area: @hoyosjs
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	AaronRobinsonMSFT
Assignees:	-
Labels:	`area-Infrastructure-coreclr`
Milestone:	8.0.0

BruceForstall · 2022-10-31T17:24:40Z

If System.JobAttempt works, then there is no need for "continueOnError: true" (because if we can't upload an artifact, then it really is an error). Note that in the SuperPMI pipelines, this error isn't fatal (but is reported in the UI as an error).

However, the documentation for System.JobAttempt says that it is not available in templates:

https://learn.microsoft.com/en-us/azure/devops/pipelines/build/variables?view=azure-devops&tabs=yaml

and this YAML is in a template. So does it work?

Also, the other SuperPMI pipelines have the same issue, not just this one.

Finally, every other pipeline that uses "PublishPipelineArtifact@1" (which is all of them?) simply sets "continueOnError: true", which means they fail to upload artifacts for the second and subsequent runs, and ignore those failures. They could also benefit from a change like this that properly uploads artifacts, if it works.

riarenas · 2022-10-31T17:33:10Z

> However, the documentation for System.JobAttempt says that it is not available in templates:

That usually means the variable is not available at template evaluation time, like to resolve conditionals on whether certain parts of YAML should execute.
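A minimal sketch of that distinction, assuming a hypothetical publish step inside a template (the `platformName` parameter, the display name, and the log path are illustrative and not taken from this PR). The `${{ }}` expression is expanded when the template is processed, while the `$(System.JobAttempt)` macro is only substituted at runtime, just before the task executes:

```yaml
# Hypothetical template fragment; the parameter name and paths are assumptions.
parameters:
- name: platformName
  type: string
  default: 'linux_x64'

steps:
- task: PublishPipelineArtifact@1
  displayName: Publish logs (illustrative)
  inputs:
    targetPath: $(Build.SourcesDirectory)/artifacts/log
    # ${{ parameters.platformName }} is resolved at template expansion time.
    # $(System.JobAttempt) is left in place and replaced at runtime (1, 2, ...).
    artifactName: Logs_${{ parameters.platformName }}_Attempt$(System.JobAttempt)
```

Under this reading, only uses that need the attempt number during template expansion, such as a `${{ if ... }}` conditional, would be affected by the documented restriction.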

The source-build template follows the approach of using the attempt just like this suggested change:

runtime/eng/common/templates/steps/source-build.yml, line 112 in a299124:

    artifactName: BuildLogs_SourceBuild_${{ parameters.platform.name }}_Attempt$(System.JobAttempt)

That template uploads its logs as:

riarenas · 2022-10-31T17:41:44Z

> Note that in the SuperPMI pipelines, this error isn't fatal (but is reported in the UI as an error).

That doesn't seem to be the case for this pipeline, so I would check others for the same bug. In https://dev.azure.com/dnceng-public/public/_build/results?buildId=65930&view=results the only failures were log upload failures, and they caused the build to fail.

BruceForstall · 2022-10-31T17:59:20Z

> That doesn't seem to be the case for this pipeline, so I would check others for the same bug. https://dev.azure.com/dnceng-public/public/_build/results?buildId=65930&view=results 's only failures were log upload failures, and they caused the build to fail.

OK, my definition of "fail" here is a little different: the "real work" of the pipeline completes, and subsequent steps all get run, but the job still reports as a failure.

> That usually means the variable is not available at template evaluation time, like to resolve conditionals on whether certain parts of YAML should execute.

That makes sense.

But we should actually test it with (1) a successful run where the log file name is now different, and (2) a failing run where we force a retry and see two log files.

Then, we should change all SuperPMI pipelines (and other pipelines) to the same system.

AaronRobinsonMSFT · 2022-10-31T18:10:58Z

I am going to change all uses of PublishPipelineArtifact@1 in the SuperPMI pipelines to append `_Attempt$(System.JobAttempt)` to the artifact name and mark them as `continueOnError: true`.

Any other changes? I'd prefer to avoid continually kicking the CI, so let's iterate on expectations before I submit an update.
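For reference, the change described above would look roughly like the following for a single publish step. This is a sketch only: the display name, `targetPath`, and the artifact name prefix are placeholders rather than the actual values in the SuperPMI templates.

```yaml
steps:
- task: PublishPipelineArtifact@1
  displayName: Publish SuperPMI logs (placeholder name)
  inputs:
    # Placeholder path; the real templates define their own log location.
    targetPath: $(Build.SourcesDirectory)/artifacts/spmi_logs
    # Appending the attempt number gives each re-run a distinct artifact name,
    # so attempt 2 no longer collides with the artifact published by attempt 1.
    artifactName: SuperPMI_Logs_Attempt$(System.JobAttempt)
  # Subsequent steps still run if the upload fails for some other reason.
  continueOnError: true
```

With this naming, a first run and a forced retry would be expected to publish two separate artifacts ending in `_Attempt1` and `_Attempt2`.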

BruceForstall · 2022-10-31T21:10:02Z

The changes LGTM as-is.

I don't think the continueOnError: true is necessary, but since everybody else uses it for PublishPipelineArtifact@1 I guess it's fine. If included, it means that any failure to upload a log file will not show up as a job failure. Maybe that's ok, or maybe it will hide problems that should be investigated. Given that the upload failure shouldn't happen, it seems like it should be investigated. At least to make sure that we're generating the correct file name that we expect to upload. (If there's some transient network issue on upload, that's out of our control.)

AaronRobinsonMSFT · 2022-10-31T22:23:02Z

Thanks @BruceForstall.

Can I get a sign off from someone?

BruceForstall · 2022-10-31T22:26:24Z

> Can I get a sign off from someone?

I'm happy to sign off after it is tested.

AaronRobinsonMSFT · 2022-10-31T22:27:33Z

@BruceForstall What's the best way to make SuperPMI fail?

BruceForstall · 2022-10-31T22:34:48Z

> @BruceForstall What's the best way to make SuperPMI fail?

I think you first want to test the success case. Note that the runtime-coreclr superpmi-diffs pipeline and runtime-coreclr superpmi-replay pipelines did not run on this PR because they only run if a file in the JIT directory changes. You probably want to trigger them manually (azp run might not work).

As for failure, this will cause it for superpmi replays:

https://github.com/dotnet/runtime/compare/main...BruceForstall:FailSpmi?expand=1

AaronRobinsonMSFT · 2022-10-31T22:40:16Z

/azp list

azure-pipelines · 2022-10-31T22:40:23Z

CI/CD Pipelines for this repository: runtime-coreclr outerloop runtime-coreclr jitstress runtime-coreclr jitstressregs runtime-coreclr jitstress2-jitstressregs runtime-coreclr gcstress0x3-gcstress0xc runtime-coreclr gcstress-extra runtime-coreclr r2r-extra runtime-coreclr jitstress-isas-x86 runtime-coreclr jitstress-isas-arm runtime-coreclr jitstressregs-x86 runtime-coreclr libraries-jitstressregs runtime-coreclr libraries-jitstress2-jitstressregs runtime-coreclr r2r runtime-coreclr runincontext runtime-coreclr crossgen2 runtime-libraries-coreclr outerloop runtime-libraries-coreclr outerloop-windows runtime-libraries-coreclr outerloop-linux runtime-libraries-coreclr outerloop-osx runtime runtime-libraries enterprise-linux runtime-libraries stress-http runtime-libraries stress-ssl runtime-dev-innerloop runtime-coreclr crossgen2 outerloop coreclr-release-outerloop-nightly runtime-coreclr crossgen2-composite runtime-jit-experimental runtime-coreclr libraries-jitstress dotnet-linker-tests runtime-coreclr ilasm runtime-coreclr crossgen2-composite gcstress runtime-staging runtime-coreclr pgo runtime-coreclr libraries-pgo Antigen runtime-community Fuzzlyn runtime-coreclr superpmi-replay runtime-wasm runtime-coreclr superpmi-diffs runtime-coreclr superpmi-asmdiffs-checked-release runtime-extra-platforms jit-cfg runtime-wasm-perf runtime-llvm runtime-coreclr jitstress-random runtime-coreclr libraries-jitstress-random runtime-android-grpc-client-tests runtime-wasm-libtests runtime-wasm-non-libtests runtime-android runtime-androidemulator runtime-ioslike runtime-ioslikesimulator runtime-linuxbionic runtime-maccatalyst runtime-coreclr pgostress

AaronRobinsonMSFT · 2022-10-31T22:41:59Z

/azp run runtime-coreclr superpmi-replay
/azp run runtime-coreclr superpmi-diffs
/azp run runtime-coreclr superpmi-asmdiffs-checked-release

azure-pipelines · 2022-10-31T22:42:05Z

No pipelines are associated with this pull request.

AaronRobinsonMSFT · 2022-10-31T22:48:33Z

@BruceForstall Can you help me trigger this? I am signed into AzDO, but am not being shown anything that allows me to trigger this run.

BruceForstall · 2022-10-31T23:09:08Z

I triggered superpmi-replay manually by going to https://dev.azure.com/dnceng-public/public/_build?definitionId=150&_a=summary, hitting "Run Pipeline", then using "refs/pull/77685/head" as the "Branch".

There's no need, IMO, to run the others: it would be the same testing, and there's no need to burn all those cycles.

BruceForstall · 2022-11-01T01:15:22Z

The success case worked, artifacts here:

https://dev.azure.com/dnceng-public/public/_build/results?buildId=68726&view=artifacts&pathAsName=false&type=publishedArtifacts

AaronRobinsonMSFT · 2022-11-01T15:00:38Z

@BruceForstall Yep. Seems to work as expected - https://dev.azure.com/dnceng-public/public/_build/results?buildId=68877&view=artifacts&pathAsName=false&type=publishedArtifacts

AaronRobinsonMSFT · 2022-11-01T15:01:02Z

I will revert my breaking trigger.

This reverts commit edf035a.

BruceForstall

Thanks for doing the testing.

Update SuperPMI artifact logging

6ae455a

AaronRobinsonMSFT added the area-Infrastructure-coreclr label Oct 31, 2022

AaronRobinsonMSFT added this to the 8.0.0 milestone Oct 31, 2022

AaronRobinsonMSFT requested a review from BruceForstall October 31, 2022 16:36

ghost assigned AaronRobinsonMSFT Oct 31, 2022

Review feedback

252b581

Introduce failure for testing

edf035a

AaronRobinsonMSFT added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Nov 1, 2022

build-analysis bot mentioned this pull request Nov 1, 2022

Tracking issue for CI build timeouts #76454

Closed

Revert "Introduce failure for testing"

a36e0c8

This reverts commit edf035a.

runfoapp bot mentioned this pull request Nov 1, 2022

Infra improvements for Helix #68176

Closed

BruceForstall approved these changes Nov 1, 2022


AaronRobinsonMSFT merged commit 0e6fa62 into dotnet:main Nov 1, 2022

AaronRobinsonMSFT deleted the update_superpmi_artifact_logging branch November 1, 2022 18:07

BruceForstall mentioned this pull request Nov 4, 2022

superpmi-replay pipeline fails when re-run in CI due to the log artifact already existing #77295

Closed

ghost locked as resolved and limited conversation to collaborators Dec 1, 2022
