Update tests to use dacapo-23.9-RC3-chopin #241

Merged
merged 9 commits into from
Sep 14, 2023

Conversation

qinsoon
Member

@qinsoon qinsoon commented Sep 13, 2023

The PR changes the CI tests:

  • Use dacapo-23.9-RC3-chopin for testing
  • Work around the out-of-disk error using maximize-build-space
  • Properly handle the return code of the running step when a pipe is used to redirect its output.
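
The last item touches on a general shell behaviour: by default a pipeline reports the exit status of its last command, so a failing command piped into something like tee can look successful. The following is a minimal sketch of that behaviour, assuming bash; `false` and `tee /dev/null` are stand-ins for the real test command and log redirection, which are not shown in this conversation, and `pipefail` is one common remedy rather than necessarily the one this PR used.

```shell
#!/usr/bin/env bash

# Stand-in pipeline: `false` plays the failing test command,
# `tee /dev/null` plays the output redirection.
run_pipeline() {
    false | tee /dev/null
}

# Default behaviour: the pipeline's status is that of the last command (tee).
run_pipeline && status_without=0 || status_without=$?

# With pipefail, the pipeline fails if any command in it fails.
set -o pipefail
run_pipeline && status_with=0 || status_with=$?
set +o pipefail

echo "without pipefail: $status_without"
echo "with pipefail:    $status_with"
```

An alternative in bash is to inspect `${PIPESTATUS[0]}` right after the pipeline, which gives the status of the first command without changing shell options.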

@qinsoon
Member Author

qinsoon commented Sep 13, 2023

Every job was skipped because of the out-of-disk error.

@qinsoon
Member Author

qinsoon commented Sep 13, 2023

Most tests work fine, but some seem to fail randomly because a disk runs out of space. One example is release-h2o in this run: https://github.com/mmtk/mmtk-openjdk/actions/runs/6167427566/job/16739054614?pr=241.

It was skipped at the step that fetches dacapo. The error message is:

Unhandled exception. System.IO.IOException: No space left on device : '/home/runner/runners/2.309.0/_diag/Worker_20230913-034413-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at System.Diagnostics.TraceSource.Flush()
   at GitHub.Runner.Common.TraceManager.Dispose(Boolean disposing)
   at GitHub.Runner.Common.TraceManager.Dispose()
   at GitHub.Runner.Common.HostContext.Dispose(Boolean disposing)
   at GitHub.Runner.Common.HostContext.Dispose()
   at GitHub.Runner.Worker.Program.Main(String[] args)
System.IO.IOException: No space left on device : '/home/runner/runners/2.309.0/_diag/Worker_20230913-034413-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
System.IO.IOException: No space left on device : '/home/runner/runners/2.309.0/_diag/Worker_20230913-034413-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Common.Tracing.Error(Exception exception)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)

But the output of the step before fetching dacapo suggested that we should have enough disk space:

Filesystem                   Size  Used Avail Use% Mounted on
/dev/root                     84G   82G  1.8G  98% /
tmpfs                        3.4G  172K  3.4G   1% /dev/shm
tmpfs                        1.4G  1.1M  1.4G   1% /run
tmpfs                        5.0M     0  5.0M   0% /run/lock
/dev/sdb15                   105M  6.1M   99M   6% /boot/efi
/dev/sda1                     14G   13G  218M  99% /mnt
tmpfs                        694M   12K  694M   1% /run/user/1001
/dev/mapper/buildvg-buildlv   55G  900K   55G   1% /home/runner/work/mmtk-openjdk/mmtk-openjdk

In a successful run, this is the disk usage after fetching and unzipping dacapo. It seems dacapo uses ~15G.

Filesystem                   Size  Used Avail Use% Mounted on
/dev/root                     84G   82G  1.8G  98% /
tmpfs                        3.4G  172K  3.4G   1% /dev/shm
tmpfs                        1.4G  1.1M  1.4G   1% /run
tmpfs                        5.0M     0  5.0M   0% /run/lock
/dev/sdb15                   105M  6.1M   99M   6% /boot/efi
/dev/sda1                     14G   13G  215M  99% /mnt
tmpfs                        694M   12K  694M   1% /run/user/1001
/dev/mapper/buildvg-buildlv   55G   15G   40G  28% /home/runner/work/mmtk-openjdk/mmtk-openjdk

I am guessing the log file for the runner ('/home/runner/runners/2.309.0/_diag/Worker_20230913-034413-utc.log') is stored on a different file system, and that file system is the one actually running out of disk space. I changed maximize-build-space to leave more space for the root and temp file systems. I will see if that works.
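
One way to check a guess like this is to ask df which file system a given path lives on. The sketch below assumes a POSIX `df` and `awk`; the helper name `fs_of` is made up for illustration, and `/` and the current directory are used as example paths (on the runner, the paths of interest would be the `_diag` log directory and the working directory).

```shell
#!/usr/bin/env bash

# fs_of PATH: print the mount point and the available space (in KiB)
# of the file system that holds PATH.
# Note: mount points containing spaces are not handled by this sketch.
fs_of() {
    df -Pk "$1" | awk 'NR == 2 { print $6, $4 }'
}

# Compare where two paths live; different mount points mean
# free space in one does not help the other.
for path in / "$PWD"; do
    echo "$path -> $(fs_of "$path")"
done
```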

@qinsoon qinsoon marked this pull request as ready for review September 14, 2023 01:19
Collaborator

@k-sareen k-sareen left a comment

LGTM. Thanks! The only thing is we should gather the new MMTk MarkCompact minheap values and use them, but that's not urgent. We can circle back to it.

@k-sareen k-sareen merged commit 7a40312 into mmtk:master Sep 14, 2023
39 of 45 checks passed
@qinsoon
Member Author

qinsoon commented Sep 14, 2023

Just a note: maximize-build-space creates a separate file system with roughly 50G of available disk space. That file system is mounted at the working directory /home/runner/work/mmtk-openjdk/mmtk-openjdk, so the 50G is only available to the working directory. We may still get random out-of-disk errors if another file system runs out of space. For example, the runner writes its logs to the file system /dev/root. To work around the issue, I set root-reserve-mb in maximize-build-space to 6000 (6G) to leave enough space for other uses. It seems to be working fine, but we should keep an eye on it.
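
To keep an eye on this, a job could check the free space on the root file system before doing heavy work. This is only a sketch, not something this PR adds: the helper names `avail_mb` and `check_free` and the thresholds are hypothetical, and it assumes a POSIX `df` and `awk`.

```shell
#!/usr/bin/env bash

# avail_mb PATH: print the space available (in MiB) on the file system holding PATH.
avail_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# check_free PATH MIN_MB: warn and return non-zero if fewer than MIN_MB MiB are available.
check_free() {
    local path=$1 min_mb=$2 have
    have=$(avail_mb "$path")
    if [ "$have" -lt "$min_mb" ]; then
        echo "warning: only ${have} MiB free on the file system of ${path} (want >= ${min_mb})" >&2
        return 1
    fi
    echo "ok: ${have} MiB free on the file system of ${path}"
}

# Example: report on the root file system with an illustrative 1000 MiB threshold.
check_free / 1000 || true
```

The check covers only the file system that holds the given path, so it would need to be run once per mount point of interest (for example / and the working directory).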

2 participants