Add ssh debugging for Github Actions CI. #749

pratikvn · 2021-04-21T12:29:27Z

This PR adds SSH debugging capability to the GitHub Actions builds. It does so by opening an SSH connection to the runner machine with the https://github.com/marketplace/actions/debugging-with-tmate action.

This can be really helpful for debugging Windows and OSX builds when we don't have access to Windows/OSX machines.

Once this PR has been merged into develop (our default branch), any branch that is based on develop will be able to use this SSH debugging capability.

For example, suppose develop already contains this PR. If you create a new branch called debug-ci, push it, and see that some build fails, you can go to Actions -> <Specific workflow (OSX/Windows)> on the left and you will see something like this:

Here you trigger the workflow manually: if you enter debug_enabled in the box and click Run workflow, it runs the Debug over SSH (tmate) job and prints an ssh server@domain address that you can ssh into directly. An example view for the OSX job:

To make things easier with the Windows jobs, I also split them up into Windows-{MSVC-CUDA, Reference, MinGW, CygWin}, so that you can manually trigger only the ones you need.
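The manual trigger described above could be wired up roughly as follows. This is a minimal sketch, not the exact workflow from this PR: the workflow and job names, the runner image, the checkout step, and the action version tags are assumptions. Only the debug_enabled input, the Debug over SSH (tmate) step name, and the tmate action itself come from the description.

```yaml
# Sketch of a manually triggerable workflow; names and versions are illustrative.
name: OSX-build

on:
  push:
  # Adds the "Run workflow" button and the input box in the Actions tab.
  workflow_dispatch:
    inputs:
      debug_enabled:
        description: 'Enter debug_enabled to start a tmate SSH session'
        required: false
        default: ''

jobs:
  osx-debug:
    runs-on: macos-latest
    steps:
      - name: Checkout the latest code
        uses: actions/checkout@v2

      # Runs only for manual triggers where the box contains "debug_enabled".
      - name: Debug over SSH (tmate)
        uses: mxschmitt/action-tmate@v3
        if: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.debug_enabled == 'debug_enabled' }}
```

Guarding the step with both the event name and the input value keeps regular push builds from ever opening an SSH session.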

+ Adds ssh debugging capability with tmate for Windows builds.

yhmtsai

LGTM in general.
I think we need to require an SSH key; otherwise, I guess anyone can connect over SSH and read the environment variables.
Could you check whether the SSH session can access the token or any GitHub secrets?

.github/workflows/windows-cygwin.yml

.github/workflows/osx.yml

.github/workflows/windows-msvc-cuda.yml

yhmtsai · 2021-04-21T13:23:22Z

One more question: do you know how the job finishes?
Does it end when it exceeds the GitHub Actions time limit, or can logging out of the SSH session finish it?

pratikvn · 2021-04-21T13:31:04Z

I checked if the environment/repository secrets are leaked and it does not seem to be the case. But we can also set

with:
        limit-access-to-actor: true

for the step and allow SSH access only with an SSH key. I did not see any need for this right now, but I am open to adding it.

You can finish the job by logging out or by running touch continue in the project root directory.

By default, the session is bounded by the GitHub Actions timeout limit, but we can also set a timeout manually if we want to. I think the GitHub Actions timeout limit is sufficient for us.
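If a manual limit is ever wanted, one option is the standard timeout-minutes key on the step. The fragment below is a sketch that belongs inside a job's steps list; the 30-minute value and the action version tag are arbitrary assumptions, not values from this PR.

```yaml
# Fragment of a job's steps list; the 30-minute cap is an arbitrary example value.
- name: Debug over SSH (tmate)
  uses: mxschmitt/action-tmate@v3
  # GitHub Actions stops this step after 30 minutes, even if nobody
  # logs out or creates the continue file.
  timeout-minutes: 30
```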

upsj · 2021-04-21T13:39:41Z

As far as I have seen so far, the output logs are restricted in visibility, so if I'm logged out, I can't see the ssh connection information. I guess that's limited to the organization members or similar?

upsj · 2021-04-21T13:40:45Z

Is there something we can do about the ordering of the jobs? After the separation into different files, they are all over the place 😄

pratikvn · 2021-04-21T14:42:37Z

~~Yes, the logs should be restricted to collaborators only~~ Edit: Logs are visible to all signed in users. Also the manual triggers for the debug builds can only be done by people with write access to the repo.

Regarding the job ordering, I am not sure how to fix that. It seems to be ordered differently every time I see it, depending on which jobs are running :)

yhmtsai · 2021-04-21T14:56:38Z

@pratikvn That's good. Thanks for checking.
Limiting the visibility of the logs to collaborators should be good enough.

pratikvn · 2021-04-21T16:22:57Z

Regarding the ordering of the jobs, it seems to be alphabetical, at least when the jobs are in the same state (pending, running, completed or failed), and in that order between states. I could rename all jobs to have a number before them, so they would look like 1-OnNewPR... , 2-OnSyncPR , 3-OSX... and so on. But I am not sure it is necessary, because after completion the jobs should have a reasonable order.

pratikvn · 2021-04-22T07:39:00Z

So, after some checking and verification with @yhmtsai , we discovered the following:

+ The debug through SSH (which is a manual trigger) can only be started by collaborators.
+ Logs are visible to everyone who is logged in to GitHub. Therefore, if someone triggers a debug-ssh build and the SSH connection link is printed, everyone who is logged in to GitHub will be able to see it. They will also be able to ssh into the connection.
+ If the workflow does not have any secrets, then none are leaked in the SSH session. Therefore, if this is being used, we must not use any secrets in the workflow.

Given the above, I think it is still reasonably secure to use this in our Windows and OSX builds because:

+ We do not use any secrets in these workflows.
+ The manual triggering of the builds provides an additional layer of security, so that only collaborators are able to trigger these workflows.

If anyone still has some concerns, please let me know.

P.S: I tried to add another layer of security by allowing SSH connections only for those in the collaborator list, by trying to limit the access to the actor. But I am having some weird rate limiting issues with that. We can probably get back to this at a later point.

Co-authored-by: Yuhsiang M. Tsai <19565938+yhmtsai@users.noreply.github.com>

upsj · 2021-04-22T07:52:28Z

In that case, I would prefer if we used limit-access-to-actor: true, since (I guess) we all have SSH keys set up in our GitHub accounts?

pratikvn · 2021-04-22T11:35:06Z

So, the root of the rate limiting issue seems to be that Github actions currently has very few MacOS machines, probably all with the same IP address. For this reason, you face issues when trying to connect via ssh with a public key, because github has rate limitations based on IP. This issue does not seem to occur for the Windows and Linux machines.

So, I added the SSH + public key requirement for the Windows jobs, but left the OSX job without the public key requirement for now. If they fix the issues on their end, we should be able to use limit-access-to-actor for the OSX jobs as well. For now, I think it should be okay to use them without the restriction of a public key.
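The resulting split between the two kinds of workflows might look like the fragments below. This is a sketch under assumptions: the version tag is illustrative, and the two steps would live in separate workflow files rather than side by side.

```yaml
# Fragments from two separate workflow files; the version tag is illustrative.

# Windows workflows: only the user who triggered the run can connect,
# using an SSH key registered in their GitHub account.
- name: Debug over SSH (tmate)
  uses: mxschmitt/action-tmate@v3
  with:
    limit-access-to-actor: true

# OSX workflow: no key restriction for now because of the rate limiting issue.
- name: Debug over SSH (tmate)
  uses: mxschmitt/action-tmate@v3
```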

pratikvn · 2021-04-22T20:52:58Z

Just to document the issue with OSX. It has been observed by others as well. mxschmitt/action-tmate#69

Here the issue reporter mentions that the problem might be that the Actions CI has a limited number of Mac machines and hence runs into rate limiting issues. I don't really want to pass the GITHUB_TOKEN variable through, so for now I am leaving it as it is. If the issue gets resolved on the GitHub Actions side in the future, we can add back limit-access-to-actor for the OSX jobs as well.

Ginkgo release 1.4.0

The Ginkgo team is proud to announce the new Ginkgo minor release 1.4.0. This release brings most of the Ginkgo functionality to the Intel DPC++ ecosystem which enables Intel-GPU and CPU execution. The only Ginkgo features which have not been ported yet are some preconditioners.

Ginkgo's mixed-precision support is greatly enhanced thanks to:

1. The new Accessor concept, which allows writing kernels featuring on-the-fly memory compression, among other features. The accessor can be used as header-only, see the [accessor BLAS benchmarks repository](https://github.com/ginkgo-project/accessor-BLAS/tree/develop) as a usage example.
2. All LinOps now transparently support mixed-precision execution. By default, this is done through a temporary copy which may have a performance impact but already allows mixed-precision research.

Native mixed-precision ELL kernels are implemented which do not see this cost. The accessor is also leveraged in a new CB-GMRES solver which allows for performance improvements by compressing the Krylov basis vectors. Many other features have been added to Ginkgo, such as reordering support, a new IDR solver, Incomplete Cholesky preconditioner, matrix assembly support (only CPU for now), machine topology information, and more!

Supported systems and requirements:

+ For all platforms, cmake 3.13+
+ C++14 compliant compiler
+ Linux and MacOS
  + gcc: 5.3+, 6.3+, 7.3+, all versions after 8.1+
  + clang: 3.9+
  + Intel compiler: 2018+
  + Apple LLVM: 8.0+
  + CUDA module: CUDA 9.0+
  + HIP module: ROCm 3.5+
  + DPC++ module: Intel OneAPI 2021.3. Set the CXX compiler to `dpcpp`.
+ Windows
  + MinGW and Cygwin: gcc 5.3+, 6.3+, 7.3+, all versions after 8.1+
  + Microsoft Visual Studio: VS 2019
  + CUDA module: CUDA 9.0+, Microsoft Visual Studio
  + OpenMP module: MinGW or Cygwin.

Algorithm and important feature additions:

+ Add a new DPC++ Executor for SYCL execution and other base utilities [#648](#648), [#661](#661), [#757](#757), [#832](#832)
+ Port matrix formats, solvers and related kernels to DPC++. For some kernels, also make use of a shared kernel implementation for all executors (except Reference). [#710](#710), [#799](#799), [#779](#779), [#733](#733), [#844](#844), [#843](#843), [#789](#789), [#845](#845), [#849](#849), [#855](#855), [#856](#856)
+ Add accessors which allow multi-precision kernels, among other things. [#643](#643), [#708](#708)
+ Add support for mixed precision operations through apply in all LinOps. [#677](#677)
+ Add incomplete Cholesky factorizations and preconditioners as well as some improvements to ILU. [#672](#672), [#837](#837), [#846](#846)
+ Add an AMGX implementation and kernels on all devices but DPC++. [#528](#528), [#695](#695), [#860](#860)
+ Add a new mixed-precision capability solver, Compressed Basis GMRES (CB-GMRES). [#693](#693), [#763](#763)
+ Add the IDR(s) solver. [#620](#620)
+ Add a new fixed-size block CSR matrix format (for the Reference executor). [#671](#671), [#730](#730)
+ Add native mixed-precision support to the ELL format. [#717](#717), [#780](#780)
+ Add Reverse Cuthill-McKee reordering [#500](#500), [#649](#649)
+ Add matrix assembly support on CPUs. [#644](#644)
+ Extends ISAI from triangular to general and spd matrices. [#690](#690)

Other additions:

+ Add the possibility to apply real matrices to complex vectors. [#655](#655), [#658](#658)
+ Add functions to compute the absolute of a matrix format. [#636](#636)
+ Add symmetric permutation and improve existing permutations. [#684](#684), [#657](#657), [#663](#663)
+ Add a MachineTopology class with HWLOC support [#554](#554), [#697](#697)
+ Add an implicit residual norm criterion. [#702](#702), [#818](#818), [#850](#850)
+ Row-major accessor is generalized to more than 2 dimensions and a new "block column-major" accessor has been added. [#707](#707)
+ Add an heat equation example. [#698](#698), [#706](#706)
+ Add ccache support in CMake and CI. [#725](#725), [#739](#739)
+ Allow tuning and benchmarking variables non intrusively. [#692](#692)
+ Add triangular solver benchmark [#664](#664)
+ Add benchmarks for BLAS operations [#772](#772), [#829](#829)
+ Add support for different precisions and consistent index types in benchmarks. [#675](#675), [#828](#828)
+ Add a Github bot system to facilitate development and PR management. [#667](#667), [#674](#674), [#689](#689), [#853](#853)
+ Add Intel (DPC++) CI support and enable CI on HPC systems. [#736](#736), [#751](#751), [#781](#781)
+ Add ssh debugging for Github Actions CI. [#749](#749)
+ Add pipeline segmentation for better CI speed. [#737](#737)

Changes:

+ Add a Scalar Jacobi specialization and kernels. [#808](#808), [#834](#834), [#854](#854)
+ Add implicit residual log for solvers and benchmarks. [#714](#714)
+ Change handling of the conjugate in the dense dot product. [#755](#755)
+ Improved Dense stride handling. [#774](#774)
+ Multiple improvements to the OpenMP kernels performance, including COO, an exclusive prefix sum, and more. [#703](#703), [#765](#765), [#740](#740)
+ Allow specialization of submatrix and other dense creation functions in solvers. [#718](#718)
+ Improved Identity constructor and treatment of rectangular matrices. [#646](#646)
+ Allow CUDA/HIP executors to select allocation mode. [#758](#758)
+ Check if executors share the same memory. [#670](#670)
+ Improve test install and smoke testing support. [#721](#721)
+ Update the JOSS paper citation and add publications in the documentation. [#629](#629), [#724](#724)
+ Improve the version output. [#806](#806)
+ Add some utilities for dim and span. [#821](#821)
+ Improved solver and preconditioner benchmarks. [#660](#660)
+ Improve benchmark timing and output. [#669](#669), [#791](#791), [#801](#801), [#812](#812)

Fixes:

+ Sorting fix for the Jacobi preconditioner. [#659](#659)
+ Also log the first residual norm in CGS [#735](#735)
+ Fix BiCG and HIP CSR to work with complex matrices. [#651](#651)
+ Fix Coo SpMV on strided vectors. [#807](#807)
+ Fix segfault of extract_diagonal, add short-and-fat test. [#769](#769)
+ Fix device_reset issue by moving counter/mutex to device. [#810](#810)
+ Fix `EnableLogging` superclass. [#841](#841)
+ Support ROCm 4.1.x and breaking HIP_PLATFORM changes. [#726](#726)
+ Decreased test size for a few device tests. [#742](#742)
+ Fix multiple issues with our CMake HIP and RPATH setup. [#712](#712), [#745](#745), [#709](#709)
+ Cleanup our CMake installation step. [#713](#713)
+ Various simplification and fixes to the Windows CMake setup. [#720](#720), [#785](#785)
+ Simplify third-party integration. [#786](#786)
+ Improve Ginkgo device arch flags management. [#696](#696)
+ Other fixes and improvements to the CMake setup. [#685](#685), [#792](#792), [#705](#705), [#836](#836)
+ Clarification of dense norm documentation [#784](#784)
+ Various development tools fixes and improvements [#738](#738), [#830](#830), [#840](#840)
+ Make multiple operators/constructors explicit. [#650](#650), [#761](#761)
+ Fix some issues, memory leaks and warnings found by MSVC. [#666](#666), [#731](#731)
+ Improved solver memory estimates and consistent iteration counts [#691](#691)
+ Various logger improvements and fixes [#728](#728), [#743](#743), [#754](#754)
+ Fix for ForwardIterator requirements in iterator_factory. [#665](#665)
+ Various benchmark fixes. [#647](#647), [#673](#673), [#722](#722)
+ Various CI fixes and improvements. [#642](#642), [#641](#641), [#795](#795), [#783](#783), [#793](#793), [#852](#852)

Related PR: #857

Release 1.4.0 to master: same release notes as the Ginkgo release 1.4.0 entry above. Related PR: #866

pratikvn added 2 commits April 21, 2021 13:17

Add Debug CI setup to OSX-build.

4aee152

Move Windows builds to separate files, add debug

e1a8b14

+ Adds ssh debugging capability with tmate for Windows builds.

pratikvn requested review from upsj, Slaedr, thoasm, yhmtsai and tcojean April 21, 2021 12:29

pratikvn self-assigned this Apr 21, 2021

yhmtsai approved these changes Apr 21, 2021


upsj approved these changes Apr 21, 2021


Fix typo

6dc2537

Co-authored-by: Yuhsiang M. Tsai <19565938+yhmtsai@users.noreply.github.com>

Restrict windows ssh-debugging to collaborators

f0615c7

pratikvn added the 1:ST:ready-to-merge label and removed the 1:ST:ready-for-review label Apr 22, 2021

pratikvn merged commit bef0374 into develop Apr 23, 2021

pratikvn deleted the debug-ci branch April 23, 2021 08:44

upsj mentioned this pull request Nov 7, 2022

Update the github action badge #1187

Merged
