Skip to content

Conversation

@martin-g
Copy link
Member

Proposed Changes

Add workflow_call event to regression.yml Github Actions workflow. This way it is possible to re-use it from another workflow.

Add a new workflow - scheduled-arm64.yml. It is a scheduled workflow that just executes regressions.yml on a self-hosted runner.

Related Work

su2code/Docker-Builds#1 - A PR that updates the build and test Docker images to Ubuntu 20.04 so that they could be used on both AMD64 and ARM64.

PR Checklist

  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags, or simply --warnlevel=2 when using meson).
  • My contribution is commented and consistent with SU2 style.
  • I have added a test case that demonstrates my contribution, if necessary.
  • I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp) , if necessary.

@pr-triage pr-triage bot added the PR: draft label Apr 28, 2022
# workflows: ["Regression and Unit Post-Validation on Linux ARM64"]
# types: [requested]
# branches:
# - 'develop'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lines will be uncommented later to lock the caller only to scheduled-arm64.yml

matrix:
config_set: [BaseMPI, ReverseMPI, ForwardMPI, BaseNoMPI, ReverseNoMPI, ForwardNoMPI, BaseOMP, ReverseOMP, ForwardOMP]
#config_set: [BaseMPI, ReverseMPI, ForwardMPI, BaseNoMPI, ReverseNoMPI, ForwardNoMPI, BaseOMP, ReverseOMP, ForwardOMP]
config_set: [BaseMPI]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporarily reduced the config_set to reduce the build time. Once it works I will re-enable the complete list again.

restore-keys: ${{ matrix.config_set }}
- name: Build
uses: docker://su2code/build-su2:20191105
uses: docker://ghcr.io/martin-g/su2code/build-su2:2
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to su2code/Docker-Builds#1. I need to use the Docker image on the self-hosted Linux ARM64 runner and this requires updating the base image to Ubuntu 20.04 that is multi-platform.

The url to the Docker image will be updated to an official one once su2code/Docker-Builds#1 is merged.

chmod a+x install/bin/*
- name: Run Tests in Container
uses: docker://su2code/test-su2:20200303
uses: docker://ghcr.io/martin-g/su2code/test-su2:1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as with the build-su2 image.


on:
schedule:
- cron: "*/2 * * * *"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every 2 minutes to be able to see the result quickly. Once everything works the schedule will be once per day or something similar.

@martin-g
Copy link
Member Author

Initially I tried to work on this task in my personal fork - https://github.com/martin-g/SU2/pull/1.
The problem there was that compileSU2.sh clones https://github.com/su2code/SU2 and tries to find the PR ref in it and fails.
Now this problem should be gone but the expected issue is that the self-hosted runner won't be found because it is not yet registered for this Github repository.
@pcarruscag Please send me the preferred hardware specs (CPU cores, RAM, disk, ...) at martin.grigorov at gmail dot com and I will reserve a VM!

@martin-g
Copy link
Member Author

martin-g commented May 5, 2022

@pcarruscag Could you please enable the Github Actions checks for this PR ?
They are disabled because it is my first PR in this repository.
Thank you!

@pcarruscag pcarruscag changed the title Build on self hosted aarch64 runner scheduled 2 [WIP] Build on self hosted aarch64 runner scheduled 2 May 10, 2022
martin-g added 4 commits May 11, 2022 09:30
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
…st-su2

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Workaround for:

[9/816] Compiling C object externals/cgns/hdf5/libsu2hdf5.a.p/H5system.c.o
FAILED: externals/cgns/hdf5/libsu2hdf5.a.p/H5system.c.o
ccache cc -Iexternals/cgns/hdf5/libsu2hdf5.a.p -Iexternals/cgns/hdf5 -I../externals/cgns/hdf5 -I/usr/lib/x86_64-linux-gnu/openmpi/include -I/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -fdiagnostics-color=always -Wall -Winvalid-pch -Wextra -Wpedantic -Werror -std=c99 -O3 -fPIC -pthread -Wno-unused-result -Wno-unused-parameter -Wno-unused-variable -Wno-unused-but-set-variable -Wno-sign-compare -Wno-stringop-overflow -Wno-discarded-qualifiers -Wno-error=maybe-uninitialized -Wno-error=unused-function -Wno-error=unused-label -Wno-pedantic -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE -MD -MQ externals/cgns/hdf5/libsu2hdf5.a.p/H5system.c.o -MF externals/cgns/hdf5/libsu2hdf5.a.p/H5system.c.o.d -o externals/cgns/hdf5/libsu2hdf5.a.p/H5system.c.o -c ../externals/cgns/hdf5/H5system.c
In file included from /usr/include/string.h:495,
                 from ../externals/cgns/hdf5/H5private.h:58,
                 from ../externals/cgns/hdf5/H5system.c:33:
In function ‘strncat’,
    inlined from ‘H5_build_extpath’ at ../externals/cgns/hdf5/H5system.c:858:13:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:136:10: error: ‘__builtin_strncat’ output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
  136 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../externals/cgns/hdf5/H5system.c:33:
../externals/cgns/hdf5/H5system.c: In function ‘H5_build_extpath’:
../externals/cgns/hdf5/H5private.h:1448:28: note: length computed here
 1448 | #define HDstrncat(X, Y, Z) strncat(X, Y, Z)
      |                            ^~~~~~~~~~~~~~~~
../externals/cgns/hdf5/H5system.c:858:13: note: in expansion of macro ‘HDstrncat’
  858 |             HDstrncat(full_path, new_name, HDstrlen(new_name));
      |             ^~~~~~~~~
cc1: all warnings being treated as errors
[10/816] Compiling C object externals/cgns/hdf5/libsu2hdf5.a.p/H5timer.c.o
[11/816] Compiling C object externals/cgns/hdf5/H5detect.p/H5detect.c.o

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Related-to: #1568

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@martin-g
Copy link
Member Author

martin-g commented May 11, 2022

The Github Actions checks passed after making two changes:

  1. efe98fe
    I needed to lower the warnlevel due to problems in externals/cgns/hdf5.
    Maybe the HDF5 files should be updated to a newer version ?! But it is not clear to me which version has been used in Add HDF5 backend support for CGNS in SU2 #1500.
    @MicK7 Do you have an idea how to fix this ?

  2. 2614022
    A workaround recommended at Issue building python wrapper with CGNS support (problems with static/dynamic HDF5 dependencies) #1568 (comment)

  3. The regression tests fail because the new test-su2 Docker image is private. I will need someone with higher privileges than me to make it public.

martin-g added 7 commits May 13, 2022 09:46
…uled-2

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
…lly"

This reverts commit 574b15d.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
The values depend on the used type and version of GCC/G++ compiler

Confirmed-by: Pedro Gomes

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@martin-g martin-g closed this May 19, 2022
@martin-g martin-g reopened this May 19, 2022
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@martin-g
Copy link
Member Author

@pcarruscag The build passes now after updating the regression tests expectations!

Should I continue working on this PR for the ARM64 part or should we merge it (without the new regressions-arm64.yml) and do the rest in another PR ?

Also what is your opinion on the change for the warnlevel ? Without it it is not possible to compile HDF5

@pcarruscag
Copy link
Member

If you can continue on this one for bit more while we wrap up #1560 (which also had some changes in some regressions) it would help us.
You'll need to disable specific CGNS or HDF5 warnings in the respective meson.build files (in externals/cgns/...), it took some time to get our code warning-free and we shouldn't go back on that.

@martin-g
Copy link
Member Author

martin-g commented May 20, 2022

The regression tests fail on Ubuntu 20.04 aarch64:

==================================================================
Summary of the parallel tests
python version: 3.8.10 (default, Mar 15 2022, 12:22:08) 
[GCC 9.4.0]
  passed - discadj_naca0012
  passed - discadj_cylinder3D
  passed - discadj_arina2k
  passed - ea_naca64206
  passed - discadj_rans_naca0012_sa
  passed - discadj_rans_naca0012_sst
  passed - discadj_incomp_NACA0012
  passed - discadj_incomp_cylinder
  passed - discadj_incomp_turb_NACA0012_sa
  passed - discadj_incomp_turb_NACA0012_sst
  passed - discadj_axisymmetric_rans
  passed - unsteady_cylinder
  passed - unsteady_cylinder_windowed_average_AD
* FAILED - unsteady_cylinder_windowed_average
  passed - unsteady_cylinder_DT_1ST
  passed - pitchingNACA0012
  passed - transonic_stator
* FAILED - discadj_fea
  passed - discadj_heat
  passed - discadj_fsi
* FAILED - discadj_fsi_airfoil
  passed - discadj_cht
  passed - da_sp_pinArray_cht_2d_dp_hf
  passed - da_sp_pinArray_cht_2d_mf
  passed - da_unsteadyCHT_cylinder
  passed - discadj_topol_optim
  passed - unsteady_NACA0012_restart_adjoint
  passed - unsteady_NACA0012_adjoint_only
  passed - restart_shape_optimization
* FAILED - dyn_discadj_fsi
* FAILED - grad_smooth_sob
Tests failed

It is the same version of Python and GCC as on the Github Actions hosted runner.
We will need logic to differentiate the expected results on x86_64 and aarch64 :-/

…als/

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@martin-g
Copy link
Member Author

martin-g commented May 20, 2022

@pcarruscag I've reverted the warnlevel back to =3 and added some exclusions for errors and warnings in externals/**!

But there are also some problems in SU2's code (we should see them as build errors in the new checks).

With

diff --git a/meson.build b/meson.build
index 4c95a97047..1a34b8ac1d 100644
--- a/meson.build
+++ b/meson.build
@@ -22,6 +22,12 @@ if build_machine.system() != 'windows'
     default_warning_flags += ['-Wno-empty-body']
   endif
   default_warning_flags += ['-Wno-unused-parameter',
+    '-Wno-error=cast-function-type',
+    '-Wno-cast-function-type',
+    '-Wno-error=stringop-truncation',
+    '-Wno-stringop-truncation',
+    '-Wno-error=missing-field-initializers',
+    '-Wno-missing-field-initializers',
                             '-Wno-deprecated-declarations']

the build passes locally but I guess those should be fixed instead of suppressed.

martin-g added 2 commits May 20, 2022 14:33
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
martin-g and others added 9 commits June 16, 2022 14:22
Add reference file for fd_sp_pinArray_cht_2d_dp_hf for Linux ARM64

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Update test expectations for restart_shape_optimization

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@martin-g martin-g marked this pull request as ready for review June 23, 2022 09:38
@pr-triage pr-triage bot removed the PR: draft label Jun 23, 2022
@martin-g
Copy link
Member Author

@pcarruscag The PR is ready for review!
There are two regression tests which always fail - https://github.com/su2code/SU2/runs/7019382870?check_suite_focus=true
polar_naca0012 in serial_regression.py and parallel_regression.py. The computed values are different in each run :-/

@martin-g
Copy link
Member Author

Since I used the Github UI to merge the latest updates in develop branch now it is not possible to rebase+squash the commits in this branch.
Unless you have a better idea I plan to just copy the contents of the modified files in this PR to a new branch and open a new PR.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@pcarruscag
Copy link
Member

Usually we dont squash-merge code PRs, so it's fine.

martin-g added 6 commits June 27, 2022 23:27
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
it will be added in a separate PR

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
…s job

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Add a new workflow that is triggered only on push to `develop`, i.e.
after merging a PR.
It reuses regression.yml but requires a runner with 'ARM64' label, i.e.
self-hosted one

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@pcarruscag pcarruscag changed the title [WIP] Build on self hosted aarch64 runner scheduled 2 Build SU2 and run regressions on self hosted aarch64 runner scheduled Jun 28, 2022
@martin-g martin-g merged commit 40e82f4 into su2code:develop Jun 30, 2022
@martin-g martin-g deleted the build-on-self-hosted-aarch64-runner-scheduled-2 branch June 30, 2022 05:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants