Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust handling of "no space left on device" error #715

Closed
wants to merge 3 commits into from

Conversation

Skgland
Copy link
Contributor

@Skgland Skgland commented Dec 13, 2023

  1. prefere NoSpace reason over DependsOn
  • the later is nosiy and hides a simple problem in a long log
  1. Change NoSpace from spuriouse to non-spuriouse
  • to keep/move these crates into the retry-regressed-list.txt as the DependsOn failiures, that were really NoSpace failures, appear to be frequently retried

Notes rearding 1. this somewhat mudles the hirarchy of failure reasons,
but I didn't think NoSpace should be necessarily be above CompilerError(_) and CompilerDiagnosticChange

Regarding 2. while technically spuriouse I think it is reasonable to treat NoSpace as a non-spuriouse regression for practical reasonse.

  • visibility as spuriouse failiures are a lot less visible
  • retry-regressed-list.txt, without adding a secondary retry-spurious-regressed-list.txt or splitting spuriouse failiures into retry-able and non-retry-able spuriouse failiures

This somewhat undose my fix for #700 from #713, but at least it should now be its own group/category.

- noisy to prefer "depends on" over "no space"
This is to keep these cases on the "retry-regressed-list.txt" for easy retry.

Alternatively spurious-regressions could be split into
- those that should be retries and
- those that should not
and the former could be included in the "retry-regressed-list.txt"
@Skgland
Copy link
Contributor Author

Skgland commented Dec 13, 2023

An examplary crater run would be https://crater-reports.s3.amazonaws.com/pr-118120/index.html.
A lot of regressed: dependencies are due to no space left on device and were retried.
While 12 no space left on device regressions hiding under spuriouse-regressions were not.

@Skgland
Copy link
Contributor Author

Skgland commented Dec 13, 2023

rust-lang/rust#116088 looks similar https://crater-reports.s3.amazonaws.com/pr-116088/index.html, but a lot larger.
With 123 no space left on device under spurious-regressions and a lot of regressed: dependencies hiding no space left on device regressions.
After checking the first 50 entries and then random entries every touchpad scroll distance all of the checked regressed: dependencies regressions are no space left on device related

@Skgland
Copy link
Contributor Author

Skgland commented Dec 13, 2023

rust-lang/rust#116088 (comment)

Let's sieve those out...

made me think that the reason for the retries might be to get the no space left on device errors out of the regressed list to see the actual regression, rather than re-running the spurious failiurs to possibly find hidden regressions?

In that case the first commit should be sufficent as, they would be no space left on device would be hidden under spurious-regressions, and not cluttering up the non-spurious regresssions.

I think my preference would be to re-run (keeping all commits here), to ensure no regressions are over shadowed by no space left on device, especially as re-running appears to rather consitenly resolve them.

@Skgland
Copy link
Contributor Author

Skgland commented Jan 11, 2024

@Mark-Simulacrum you appear the most active on this repo, any feedback regarding this?

  1. I think prefereing NoSpace over DependsOn would be good as otherwise the fix for Disk full errors should be spurious  #700 in detect new spuroius error and fix a regression considered spurious #713 is bearly effective, as most no space left on device errors appear to appear while compiling dependencies.

  2. Given the frequency with which they occure I am uncertain whether hiding these errors between the spuriouse regressions is ideal.
    Currently the NoSpace errors that are classified as DependsOn get retried and usually succeed on the second iteration, giving a second chance to detect regressions.
    That would be lost if they are moved to spurious regresssions as they will be no longer visible in the regressed list and also will be absent from the retry-regressed-list.txt

@Skgland Skgland marked this pull request as draft May 31, 2024 18:24
@Skgland
Copy link
Contributor Author

Skgland commented May 31, 2024

I will open a new PR rather than rewriting this from scratch

@Skgland Skgland closed this May 31, 2024
bors added a commit that referenced this pull request Aug 25, 2024
prioritize `NoSpace` error more

Replacement for #715

## adjust the prioritization of `NoSpace`

After this it will be prioritized above
  - `DependsOn`
  - `CompilerError`
  - `NetworkAccess`
  - `CompilerDiagnosticChange`

The later two are already spurious-regressions so only the reason will change.
The former two would previously result in a regression but now would be spurious-regressions.
I think this change is reasonable as no space will likely be the cause for the compiler error or the error in the dependency.

## include the spurious regressions in the `retry-regressed-list.txt`

Spurious failures might hide actual regressions, this gives them another chance at showing actual regressions.
Further more spurious regressions are usually not too many and re-running will in a lot of cases resolve the spurious failure.

## Sort some `CommandError::IO(_)` into as better `FailureReason`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant