Flying-dragon-boxing
Collaborator

In CUDA 12.9, NVIDIA moved the NVTX header from `nvToolsExt.h` to `nvtx3/nvToolsExt.h` and made NVTX a header-only library. This PR fixes the resulting compile error when the CUDA version is greater than 12.9.
See the changelog of CUDA 12.9 for details.
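
For illustration only, here is a minimal sketch of one way a source file could pick whichever NVTX header is present. It is not necessarily how this PR implements the fix. The macro `DEMO_NVTX_HEADER` and the helper `selected_nvtx_header` are hypothetical names introduced for this example, and the sketch assumes a compiler that provides `__has_include` (standard since C++17, and available earlier as an extension in GCC and Clang).

```cpp
#include <cassert>
#include <string>

// Prefer the nvtx3/ location named in the PR description; fall back to the
// legacy top-level header when only that one is available.
#if defined(__has_include)
#  if __has_include(<nvtx3/nvToolsExt.h>)
#    include <nvtx3/nvToolsExt.h>
#    define DEMO_NVTX_HEADER "nvtx3/nvToolsExt.h"
#  elif __has_include(<nvToolsExt.h>)
#    include <nvToolsExt.h>
#    define DEMO_NVTX_HEADER "nvToolsExt.h"
#  endif
#endif

// Neither header was found (for example, on a machine without the CUDA toolkit).
#ifndef DEMO_NVTX_HEADER
#  define DEMO_NVTX_HEADER "none"
#endif

// Hypothetical helper: reports which header the preprocessor selected.
inline std::string selected_nvtx_header() {
    const std::string header = DEMO_NVTX_HEADER;
    assert(!header.empty());  // one of the branches above always defines it
    return header;
}
```

Checking for the header itself avoids hard-coding a CUDA version threshold, so the same source can build against toolkits on either side of the change.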

zhangzh-pku and others added 30 commits May 1, 2023 01:27
* run INPUT.Default() in every process in InputParaTest (deepmodeling#3490)

Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com>

* add blas support for FindLAPACK.cmake (deepmodeling#3497)

* more unittest of QO: towards orbital selection (deepmodeling#3499)

* Fix: fix bug in mulliken charge calculation (deepmodeling#3503)

* fix phase

* fix case test

* Refactor: namespace Conv_Coulomb_Pot_K (deepmodeling#3446)

* Refactor: namespace Conv_Coulomb_Pot_K

* Refactor: namespace Conv_Coulomb_Pot_K

---------

Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com>

* enable the computation of all zeros in one function call (deepmodeling#3449)

Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com>

* replace ios.eof() by ios.good() to avoid meeting badbit and failbit in reading STRU (deepmodeling#3506)

* Build: add ccache to accelerate the testing process (deepmodeling#3509)

* Build: add ccache to accelerate the testing process

* Update test.yml

* Update test.yml

* Update test.yml

* Docs: to avoid the misunderstanding in docs (deepmodeling#3518)

* to avoid the misunderstanding in docs

* Update docs/quick_start/hands_on.md

Co-authored-by: Chun Cai <amoycaic@gmail.com>

---------

Co-authored-by: Chun Cai <amoycaic@gmail.com>

* Docs: fix a missing dependency in conda build env (deepmodeling#3508)

* Feature: Add ENABLE_RAPIDJSON option to control the output of abacus.json (deepmodeling#3519)

Add ENABLE_RAPIDJSON option to control the output of abacus.json

* Feature: add python wrapper for math sphbes (deepmodeling#3475)

* recommit for review

* add python wrapper

* remove timer since performance tests are added

* Feature: support segment split in kline mode in KPT file and `out_band` band output precision control, `8` as default (deepmodeling#3493)

* add precision control

* correct serial version of nscf_band function

* fix issue 3482

* update unit and integrated test

* update document

* correct unittest and make compatible with false and true

* fix: bug in Autotest.sh when result.ref has no totaltimeref (deepmodeling#3523)

* Fix : unit test of module_xc (deepmodeling#3524)

* Fix: omit small magnetic moments to avoid numerical instability (deepmodeling#3530)

* update deltalambda

* avoid numerical error in orbMulP

* add constraint on Mi

* change case reference value

* Fix: fix multiple compiler warnings (deepmodeling#3515)

* Fix: add noreturn attribute to warning_quit

* Add type conversion

* fix string literal

* fix small number truncation

* Fix system call returned value not checked

* fix missing bracket

* Refactor parameter_pool.cpp and parameter_pool.h

* remove duplicated return statements

* Change WARNING_QUIT occurrences in tests

* Add warning message to help debug UT

* output the default precision flag (deepmodeling#3496)

Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com>

* Build: Improving CMake performance for finding LibXC and ELPA (deepmodeling#3478)

* Fix for finding LibXC and ELPA

* For compatibility to previous routines

* syntax fix for FindELPA.cmake

* Update cmake/FindELPA.cmake

Co-authored-by: Chun Cai <amoycaic@gmail.com>

* Using CMake interface as default for finding LibXC

* update docs

* fix for FindLibxc: changing incompatible if statement

* fix for FindLibxc: changing incompatible if statement

* fix for FindLibxc: changing incompatible if statement

* update docs for installing pkg-config

* Update FindLibxc.cmake

* Update FindLibxc.cmake

* remove previous LibXC routine in CMakeLists.txt

Co-authored-by: Chun Cai <amoycaic@gmail.com>

* Update easy_install.md with Makefile-built LibXC supported

* Update easy_install.md to include different behavior in different version on finding ELPA

---------

Co-authored-by: Chun Cai <amoycaic@gmail.com>

* Docs: correct some docs about mp2 smearing method (deepmodeling#3533)

* correct some docs about mp2 smearing method

* add docs about mv method

* Feature : printing band density (deepmodeling#3501)

Co-authored-by: wenfei-li <liwenfei@gmail.com>
Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com>

* add some docs for PR#3501 (deepmodeling#3537)

* Feature: enable restart charge density mixing during SCF (deepmodeling#3542)

* add a new parameter mixing_restart

* do not update rho if iter==mixing_restart

* do not update rho if iter==mixing_restart-1

* reset mix and rho_mdata if iter==mixing_restart

* fix SCF exit directly since drho=0 if iter=GlobalV::MIXING_RESTART

* re-set_mixing in eachiterinit for PW and LCAO

* enable SCF restarts in esolver_ks::RUN

* add some UnitTests

* add some Docs

* new inputs added

* Update input-main.md (deepmodeling#3551)

Solve the format problem mentioned in issue 3543

* Build: fix compatibility issue against toolchain install (deepmodeling#3540)

* Fix for finding LibXC and ELPA

* For compatibility to previous routines

* syntax fix for FindELPA.cmake

* Update cmake/FindELPA.cmake

Co-authored-by: Chun Cai <amoycaic@gmail.com>

* Using CMake interface as default for finding LibXC

* update docs

* fix for FindLibxc: changing incompatible if statement

* fix for FindLibxc: changing incompatible if statement

* fix for FindLibxc: changing incompatible if statement

* update docs for installing pkg-config

* Update FindLibxc.cmake

* Update FindLibxc.cmake

* remove previous LibXC routine in CMakeLists.txt

Co-authored-by: Chun Cai <amoycaic@gmail.com>

* Update easy_install.md with Makefile-built LibXC supported

* Update easy_install.md to include different behavior in different version on finding ELPA

* fix compatibility issue against toolchain

* Change default ELPA install routine to old one

---------

Co-authored-by: Chun Cai <amoycaic@gmail.com>

* Test: Configure performance tests for math libraries (deepmodeling#3511)

* add performance test of sphbes functions.

* fix benchmark cmake errors

* add dependencies for docker

* update docs

* add performance tests for sphbes

* add google benchmark

* rewrite benchmark tests in fixtures

* disable internal testing in benchmark

* merge benchmark into integration test

---------

Co-authored-by: StarGrys <771582678@qq.com>

* Configure Makefile Compiling, fix typos

* Fix Makefile Intel toolchains compile errors

* Fix even more PEXSI related Makefile compiling issues

* Update hsolver_pw.cpp (deepmodeling#3556)

when use_uspp==false, overlap matrix should be E.

* Fix: cuda build target (deepmodeling#3276)

* Fix: cuda build target

* Update CMakeLists.txt

---------

Co-authored-by: Denghui Lu <denghuilu@pku.edu.cn>

---------

Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com>
Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com>
Co-authored-by: Haozhi Han <haozhi.han@outlook.com>
Co-authored-by: Zhao Tianqi <hongriTianqi@users.noreply.github.com>
Co-authored-by: PeizeLin <78645006+PeizeLin@users.noreply.github.com>
Co-authored-by: jinzx10 <jzx016@hotmail.com>
Co-authored-by: Chun Cai <amoycaic@gmail.com>
Co-authored-by: Peng Xingliang <91927439+pxlxingliang@users.noreply.github.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Wenfei Li <38569667+wenfei-li@users.noreply.github.com>
Co-authored-by: Denghui Lu <denghuilu@pku.edu.cn>
Co-authored-by: YI Zeping <18586016708@163.com>
Co-authored-by: wenfei-li <liwenfei@gmail.com>
Co-authored-by: jingan-181 <78459531+jingan-181@users.noreply.github.com>
Co-authored-by: StarGrys <771582678@qq.com>
Co-authored-by: Haozhi Han <haozhi.han@stu.pku.edu.cn>
Revert "Modify inputs and update to latest version"
@mohanchen added the "GPU & DCU & HPC" and "Compile & CICD & Docs & Dependencies" labels on Jun 26, 2025
@mohanchen
Collaborator

Nice job

@mohanchen mohanchen merged commit 37bc08a into deepmodeling:develop Jun 26, 2025
14 checks passed
zgn-26714 pushed a commit to zgn-26714/abacus-develop that referenced this pull request Oct 10, 2025
* feat pexsi

* fix : diag not completed

* feat

* feat: pexsi hsolver

* CMake building implemented

* Works

* adapt to the new container

* Turn off USE_PEXSI

* Update LibRI to 553c91c

* modify include files

* namespace-ize

* new inputs added

* Configure Makefile Compiling, fix typos

* Fix Makefile Intel toolchains compile errors

* Fix even more PEXSI related Makefile compiling issues

* Modify inputs and update to latest version (#2)

* Revert "Modify inputs and update to latest version"

* Update FindPEXSI.cmake to fix Comments

* Fix CI errors

* Fix CI Errors and Merge with Upstream

* Resolve Pull Request Reviews

* Fix parallel communication related issue

* Fix vars in Makefile.vars, add input tests and comments for pexsi vars

* Fix nspin > 1 cases

* Improvement: take calculated mu as new initial guess, may slightly improve performance

* Fix mistakes in the last commit

* Fix: params and features
- set default pexsi_temp
- fix md in pexsi

* fix empty lines

* Fix: move params to pexsi_solver, rename USE_PEXSI to ENABLE_PEXSI

* Docs: added docs for pexsi inputs

* Fix unit test issues in input_conv

* Change default pexsi_npole from 80 to 40

* Place pexsi_EDM in DensityMatrix, set size of pexsi_dm = 1 when GlobalV::NSPIN==4, and add comments for dmToRho

* A unit test added for DiagoPexsi

* modify for changed gint interface

* correct nspin related behaviors

* add efermi passthrough

* Revert "add efermi passthrough"

This reverts commit d7b402d.

* commits to resolve conversations related to codes

* DM and EDM pointers in pexsi now handled by diagopexsi, and copying h s matrices no longer needed

* add pexsi examples

* fix pexsi unit test (original version shouldn't run)

* add building docs for pexsi

* set cxx standard to c++14, which is required in make_unique

* Fix: Fix typo related to pexsi

* update to PPEXSIDFTDriver2

* default npoints to 1, so single core pexsi will work

* Fix Compile errors

* refactor to abandon `pdiagh`

* Fix mu_buffer and nspin

* Updates with latest

* Refactor: in ESolver_KS_PW, calculate deband in iter_finish, not in hamilt2density

* Fix: make files consistent with upstream

* Refactor

* Refactor

* Refactor

* Refactor

* Refactor

* Refactor: fix unit test

* Refactor: fix unit test

* Refactor: fix unit test

* Refactor: fix unit test

* Refactor: Remove set kvec funcs in `K_Vectors`

* Refactor: Remove final_scf

* Refactor: Fix kvecc2d/d2c

* Fix: Tests

* Fix: Tests

* Fix: Tests

* Fix: Tests

* Refactor: Final?

* Fix

* Fix

* Fix

* Fix

* Fix: Compile Error on CUDA > 12.9

* Fix: Compile Error on CUDA > 12.9

---------

Co-authored-by: zhangzhihao <1900017707@pku.edu.cn>
Co-authored-by: zhangzh-pku <64026312+zhangzh-pku@users.noreply.github.com>
Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com>
Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com>
Co-authored-by: Haozhi Han <haozhi.han@outlook.com>
Co-authored-by: Zhao Tianqi <hongriTianqi@users.noreply.github.com>
Co-authored-by: PeizeLin <78645006+PeizeLin@users.noreply.github.com>
Co-authored-by: jinzx10 <jzx016@hotmail.com>
Co-authored-by: Chun Cai <amoycaic@gmail.com>
Co-authored-by: Peng Xingliang <91927439+pxlxingliang@users.noreply.github.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Wenfei Li <38569667+wenfei-li@users.noreply.github.com>
Co-authored-by: Denghui Lu <denghuilu@pku.edu.cn>
Co-authored-by: YI Zeping <18586016708@163.com>
Co-authored-by: wenfei-li <liwenfei@gmail.com>
Co-authored-by: jingan-181 <78459531+jingan-181@users.noreply.github.com>
Co-authored-by: StarGrys <771582678@qq.com>
Co-authored-by: Haozhi Han <haozhi.han@stu.pku.edu.cn>
Co-authored-by: Mohan Chen <mohan.chen.chen.mohan@gmail.com>
Labels

Compile & CICD & Docs & Dependencies (Issues related to compiling ABACUS)
GPU & DCU & HPC (GPU and DCU and HPC related any issues)

4 participants