Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intrepid2: Disable test #13707

Merged

Conversation

sebrowne
Copy link
Contributor

@sebrowne sebrowne commented Jan 7, 2025

@trilinos/intrepid2

Motivation

Repeated compile failures building Intrepid2_unit-test_Discretization_Basis_HGRAD_TRI_Cn_FEM_test_02_CUDA_DFAD_DFAD (example). With the approach of the AT2 CUDA build being transitioned to production, we need the line clean.

Testing

Same paradigm as used elsewhere.

I did some diagnosis on this, since it was apparently succeeding to build with the AT1 configuration/machine, and I didn't want to disable it if at all possible. The load numbers on the AT2 machine show the CPU spiking to over 100%, but only for 1-2 minutes. When I checked one of the AT1 machines, I saw the same spike, but only to about 60%. This makes sense, since the parallelism for these machines is higher, given that they have more CPUs. To me, this indicates that something (small number) are using a lot of CPU resources, as opposed to the entire build hitting the machine too hard. I think the trade-off of disabling this test vs. slowing down the build of all other tests is justified (though unfortunate). We shall see what this does to the load numbers. Some sort of randomized compile order could also help avoid this type of issue happening repeatedly in the future.

Memory usage looked fine on both machines.

AT1:
at1_gpu_load
AT2:
at2_gpu_load

In line with previous changes to disable building some of these tests,
disable one that is consistently failing to compile with AT2.

I did some diagnosis on this, since it was apparently succeeding to
build with the AT1 configuration/machine, and I didn't want to disable
it if necessary.  The load numbers on the AT2 machine show the CPU
spiking to over 100%, but only for 1-2 minutes.  When I checked one of
the AT1 machines, I saw the same spike, but only to about 60%.  This
makes sense, since the parallelism for these machines is higher, given
that they have more CPUs.  I think the trade-off of disabling this test
vs. slowing down the build of all other tests is justified (though
unfortunate).  We shall see what this does to the load numbers.  Some
sort of randomized compile order could also help avoid this type of
issue happening repeatedly in the future.

Signed-off-by: Samuel E. Browne <sebrown@sandia.gov>
@sebrowne sebrowne added the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Jan 7, 2025
@sebrowne sebrowne requested a review from a team as a code owner January 7, 2025 15:38
@sebrowne sebrowne requested a review from a team January 7, 2025 15:45
@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 971
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_gcc

  • Build Num: 1021
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 1022
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_clang

  • Build Num: 1020
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_cuda

  • Build Num: 1019
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_intel

  • Build Num: 940
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 1019
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Using Repos:

Repo: TRILINOS (sebrowne/Trilinos)
  • Branch: disable-intrepid2-test
  • SHA: d108844
  • Mode: TEST_REPO

Pull Request Author: sebrowne

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 971
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_gcc

  • Build Num: 1021
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 1022
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_clang

  • Build Num: 1020
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_cuda

  • Build Num: 1019
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_intel

  • Build Num: 940
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 1019
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 13707
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/sebrowne/Trilinos
TRILINOS_SOURCE_SHA d108844
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 70edefb


CDash Test Results for PR# 13707.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
WARNING: NO REVIEWERS HAVE BEEN REQUESTED FOR THIS PULL REQUEST!

@trilinos-autotester
Copy link
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@sebrowne sebrowne requested a review from mperego January 8, 2025 12:31
@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO REVIEWS HAVE BEEN PERFORMED ON THIS PULL REQUEST!

@trilinos-autotester
Copy link
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

Copy link
Contributor

@mperego mperego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm. Thanks for the plot on CPU resources.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ mperego ]!

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@trilinos-autotester trilinos-autotester merged commit cc1fe7a into trilinos:develop Jan 8, 2025
15 of 17 checks passed
@trilinos-autotester
Copy link
Contributor

Merge on Pull Request# 13707: IS A SUCCESS - Pull Request successfully merged

@trilinos-autotester trilinos-autotester removed the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Jan 8, 2025
@trilinos-autotester trilinos-autotester deleted the disable-intrepid2-test branch January 8, 2025 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants