clean library and header dependencies #930

Merged
amcadmus merged 3 commits into deepmodeling:devel on Aug 10, 2021

Member

@njzjz njzjz commented Aug 6, 2021

Fix #926.

@njzjz njzjz linked an issue Aug 6, 2021 that may be closed by this pull request
@codecov-commenter

codecov-commenter commented Aug 6, 2021

Codecov Report

Merging #930 (fa73dc9) into devel (4ced020) will decrease coverage by 11.13%.
The diff coverage is n/a.

❗ Current head fa73dc9 differs from pull request most recent head 56959ca. Consider uploading reports for the commit 56959ca to get more accurate results

@@             Coverage Diff             @@
##            devel     #930       +/-   ##
===========================================
- Coverage   75.41%   64.28%   -11.14%     
===========================================
  Files          85        5       -80     
  Lines        6729       14     -6715     
===========================================
- Hits         5075        9     -5066     
+ Misses       1654        5     -1649     
Impacted Files Coverage Δ
deepmd/infer/model_devi.py
source/op/_prod_virial_se_r_grad.py
deepmd/infer/__init__.py
deepmd/model/model_stat.py
deepmd/__init__.py
deepmd/fit/dipole.py
deepmd/utils/compat.py
deepmd/descriptor/se_a_ebd.py
deepmd/model/__init__.py
deepmd/__about__.py
... and 62 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ced020...56959ca. Read the comment docs.

Member Author

@njzjz njzjz left a comment


Current dependencies:
image

@amcadmus amcadmus requested review from galeselee and denghuilu August 6, 2021 13:00
@amcadmus amcadmus requested a review from galeselee August 8, 2021 00:17
Member

@denghuilu denghuilu left a comment


An error occurred when I tried to compile and run the DeePMD-kit Python interface:

root deepmd-kit $ pip install .
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Processing /root/dp-devel/deepmd-kit
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: python-hostlist>=1.21 in /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages (from deepmd-kit==2.0.0b4.dev28+g2684f11) (1.21)
Requirement already satisfied: typing-extensions in /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages (from deepmd-kit==2.0.0b4.dev28+g2684f11) (3.7.4.3)
Requirement already satisfied: scipy in /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages (from deepmd-kit==2.0.0b4.dev28+g2684f11) (1.5.4)
Requirement already satisfied: dargs>=0.2.6 in /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/dargs-0.2.6-py3.6.egg (from deepmd-kit==2.0.0b4.dev28+g2684f11) (0.2.6)
Requirement already satisfied: numpy in /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages (from deepmd-kit==2.0.0b4.dev28+g2684f11) (1.19.5)
Requirement already satisfied: pyyaml in /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/PyYAML-5.4.1-py3.6-linux-x86_64.egg (from deepmd-kit==2.0.0b4.dev28+g2684f11) (5.4.1)
Building wheels for collected packages: deepmd-kit
  Building wheel for deepmd-kit (PEP 517) ... done
  Created wheel for deepmd-kit: filename=deepmd_kit-2.0.0b4.dev28+g2684f11-cp36-cp36m-linux_x86_64.whl size=1624618 sha256=d480952a7cbb436be83f76e2825688d2955accde84723c7e5b4ac44733c16bc1
  Stored in directory: /root/.cache/pip/wheels/d7/0d/2a/b6f1653be91ad8dd465c1560d36da90e510c5d488bc4fd6563
Successfully built deepmd-kit
Installing collected packages: deepmd-kit
Successfully installed deepmd-kit-2.0.0b4.dev28+g2684f11
root deepmd-kit $ dp -h
WARNING:tensorflow:From /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
  File "/root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/env.py", line 176, in get_module
    module = tf.load_op_library(str(module_file))
  File "/root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 57, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libdeepmd_op_cuda.so: cannot open shared object file: No such file or directory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/dp-devel/tensorflow_venv/bin/dp", line 5, in <module>
    from deepmd.entrypoints.main import main
  File "/root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/__init__.py", line 3, in <module>
    import deepmd.utils.network as network
  File "/root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/utils/__init__.py", line 2, in <module>
    from .data import DeepmdData
  File "/root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/utils/data.py", line 10, in <module>
    from deepmd.env import GLOBAL_NP_FLOAT_PRECISION
  File "/root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/env.py", line 253, in <module>
    op_module = get_module("libop_abi")
  File "/root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/env.py", line 224, in get_module
    )) from e
RuntimeError: This deepmd-kit package is inconsitent with TensorFlowRuntime, thus an error is raised when loading libop_abi.You need to rebuild deepmd-kit against this TensorFlowruntime.

@njzjz
Member Author

njzjz commented Aug 9, 2021

@denghuilu I cannot reproduce the error. Can you check with ldd?

(dpdev) [jz748@localhost deepmd-kit]$ ldd /home/jz748/anaconda3/envs/dpdev/lib/python3.8/site-packages/deepmd/op/libop_abi.so
        linux-vdso.so.1 (0x00007ffd9b1ef000)
        libdeepmd.so => /home/jz748/anaconda3/envs/dpdev/lib/python3.8/site-packages/deepmd/op/libdeepmd.so (0x00007f80e4561000)
        libtensorflow_framework.so.2 => not found
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f80e4329000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f80e41e5000)
        libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f80e419f000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f80e4184000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f80e4161000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f80e3f92000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f80e3f8b000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f80e3f80000)
        libdeepmd_op_cuda.so => /home/jz748/anaconda3/envs/dpdev/lib/python3.8/site-packages/deepmd/op/libdeepmd_op_cuda.so (0x00007f80e3c5f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f80e46e2000)
dp -h
2021-08-08 20:06:38.747579: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
WARNING:tensorflow:From /home/jz748/anaconda3/envs/dpdev/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
usage: dp [-h] {config,transfer,train,freeze,test,compress,doc-train-input,model-devi,convert-from} ...

DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics

optional arguments:
  -h, --help            show this help message and exit

Valid subcommands:
  {config,transfer,train,freeze,test,compress,doc-train-input,model-devi,convert-from}
    config              fast configuration of parameter file for smooth model
    transfer            pass parameters to another model
    train               train a model
    freeze              freeze the model
    test                test the model
    compress            compress a model
    doc-train-input     print the documentation (in rst format) of input training parameters.
    model-devi          calculate model deviation
    convert-from        convert lower model version to supported version

@denghuilu
Member

@njzjz

root deepmd-kit $ ls /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/op/
_gelu.py              libdeepmd.so    _prod_force_grad.py       _prod_virial_grad.py       __pycache__               _tabulate_grad.py
__init__.py           libop_abi.so    _prod_force_se_a_grad.py  _prod_virial_se_a_grad.py  _soft_min_force_grad.py
libdeepmd_op_cuda.so  libop_grads.so  _prod_force_se_r_grad.py  _prod_virial_se_r_grad.py  _soft_min_virial_grad.py

root deepmd-kit $ ldd /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/op/libop_abi.so
        linux-vdso.so.1 (0x00007ffec3ed7000)
        libdeepmd.so => /root/dp-devel/tensorflow_venv/lib/python3.6/site-packages/deepmd/op/libdeepmd.so (0x00007f7630ef1000)
        libtensorflow_framework.so.2 => not found
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7630b68000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f76307ca000)
        libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f763059b000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7630383000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7630164000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f762fd73000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f762fb6f000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f762f967000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f76314a8000)
        libdeepmd_op_cuda.so => not found
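
The two `ldd` listings in this thread differ in one respect: both report `libtensorflow_framework.so.2` as not found, but only this one also fails to resolve `libdeepmd_op_cuda.so`. A minimal Python sketch for pulling the unresolved entries out of `ldd` output follows. The helper names `unresolved_libraries` and `run_ldd` are hypothetical and not part of deepmd-kit, and the parsing assumes the `name => not found` line format shown above.

```python
import re
import subprocess
from typing import List

# Matches lines such as "    libfoo.so => not found" in ldd output.
_NOT_FOUND = re.compile(r"^\s*(\S+)\s+=>\s+not found\s*$")


def unresolved_libraries(ldd_output: str) -> List[str]:
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        match = _NOT_FOUND.match(line)
        if match:
            missing.append(match.group(1))
    return missing


def run_ldd(path: str) -> str:
    """Run ldd on a shared object and return its stdout (Linux only)."""
    result = subprocess.run(
        ["ldd", path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Three lines copied from the listing above.
    sample = (
        "        libdeepmd.so => /root/dp-devel/tensorflow_venv/lib/python3.6/"
        "site-packages/deepmd/op/libdeepmd.so (0x00007f7630ef1000)\n"
        "        libtensorflow_framework.so.2 => not found\n"
        "        libdeepmd_op_cuda.so => not found\n"
    )
    print(unresolved_libraries(sample))
    # prints ['libtensorflow_framework.so.2', 'libdeepmd_op_cuda.so']
```

On a real installation, `unresolved_libraries(run_ldd(path_to_libop_abi))` would give the same kind of list, which makes the two environments easy to compare.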

@njzjz
Member Author

njzjz commented Aug 9, 2021

Maybe related to https://stackoverflow.com/q/61479487/9567349, but it's unclear to me why I don't have this error...

@denghuilu
Member

denghuilu commented Aug 9, 2021

> Maybe related to https://stackoverflow.com/q/61479487/9567349, but it's unclear to me why I don't have this error...

I only see this problem when building with DP_VARIANT=cuda, which may help in reproducing it.

@njzjz
Member Author

njzjz commented Aug 9, 2021

@denghuilu what is your cmake version?

@njzjz
Member Author

njzjz commented Aug 9, 2021

@denghuilu Please check if 56959ca works.

@amcadmus amcadmus requested a review from denghuilu August 9, 2021 05:42
@denghuilu
Member

> @denghuilu what is your cmake version?

cmake version 3.10.2

Member

@denghuilu denghuilu left a comment


Everything works well on my local workstation now.

@amcadmus amcadmus merged commit 06d591d into deepmodeling:devel Aug 10, 2021
@njzjz njzjz deleted the dep branch August 10, 2021 01:03
gzq942560379 pushed a commit to HPC-AI-Team/deepmd-kit that referenced this pull request Sep 2, 2021
* clean library and header dependencies

Fix deepmodeling#926.

* fix typo in rocm

* set INSTALL_RPATH for libraries
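
The last commit listed above sets `INSTALL_RPATH` for the libraries. One way to inspect whether a built shared object carries an RPATH or RUNPATH entry is to read its dynamic section with `readelf -d` and parse the result. The following is a sketch under assumptions: the function name `rpath_entries` is hypothetical, the sample line is illustrative rather than taken from a deepmd-kit build, and the regular expression assumes the GNU binutils output format.

```python
import re
from typing import List

# Matches the RPATH/RUNPATH rows printed by `readelf -d`, for example:
#  0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN]
_RPATH_ENTRY = re.compile(
    r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]"
)


def rpath_entries(readelf_dynamic_output: str) -> List[str]:
    """Return the search directories found in RPATH/RUNPATH rows."""
    entries = []
    for _tag, value in _RPATH_ENTRY.findall(readelf_dynamic_output):
        # Several directories may be joined with ':' in a single entry.
        entries.extend(part for part in value.split(":") if part)
    return entries


if __name__ == "__main__":
    # Illustrative input only; not captured from a real build.
    sample = (
        " 0x000000000000001d (RUNPATH)            "
        "Library runpath: [$ORIGIN:/opt/lib]"
    )
    print(rpath_entries(sample))
    # prints ['$ORIGIN', '/opt/lib']
```

An empty list would mean the library has no embedded search path, so the dynamic loader falls back to its default locations and `LD_LIBRARY_PATH`.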
njzjz pushed a commit to njzjz/deepmd-kit that referenced this pull request Sep 21, 2023
…#930)

- skip cell_type if from_poscar is True
- use from_poscar = jdata.get('from_poscar', False) to get from_poscar
Development

Successfully merging this pull request may close these issues.

[Feature Request] rebuild library dependencies
5 participants