[BUG] _check-whether-"sel"-is-big-enough not working | kit-2.0-beta3_ #1029

Vibsteamer · 2021-08-25T04:08:07Z

Summary
The check for whether "sel" is large enough does not work during training. Users can train with extremely small "sel" values without any "WARNING" being printed.
However, the performance gain in training keeps increasing as "sel" is decreased to rather small values (which may mean that "sel" does take effect).
The "rcut" in this test is fairly large (12.0); I am not sure whether some limit on counting neighbor atoms was exceeded and led to the problem.
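
For reference, the kind of check described above can be sketched independently of DeePMD-kit. The snippet below is a minimal brute-force illustration, not the library's implementation: the function names `max_neighbors_per_type` and `check_sel` are hypothetical, and it assumes a single frame with a 3x3 cell matrix whose rows are the lattice vectors. It counts, per neighbor type, the largest number of atoms found within "rcut" of any atom, including periodic images when "rcut" is larger than the box, and reports the types for which that number exceeds the corresponding "sel" entry.

```python
import itertools

import numpy as np


def max_neighbors_per_type(coords, types, cell, rcut, ntypes):
    """Largest neighbor count of each type within rcut of any atom (with PBC).

    Hypothetical helper for illustration; brute force, so only suitable
    for small frames.
    """
    coords = np.asarray(coords, dtype=float)
    types = np.asarray(types, dtype=int)
    cell = np.asarray(cell, dtype=float)  # rows are lattice vectors

    # Wrap atoms into the cell so the image count below is sufficient.
    frac = coords @ np.linalg.inv(cell)
    coords = (frac - np.floor(frac)) @ cell

    # Number of periodic images needed along each lattice vector so that
    # every neighbor within rcut is covered, even if rcut exceeds the box.
    volume = abs(np.linalg.det(cell))
    heights = [
        volume / np.linalg.norm(np.cross(cell[(i + 1) % 3], cell[(i + 2) % 3]))
        for i in range(3)
    ]
    nimg = [int(np.ceil(rcut / h)) for h in heights]

    counts = np.zeros((len(coords), ntypes), dtype=int)
    for shift in itertools.product(*(range(-n, n + 1) for n in nimg)):
        shifted = coords + np.asarray(shift) @ cell
        # dist[i, j] is the distance from atom i to this image of atom j.
        dist = np.linalg.norm(coords[:, None, :] - shifted[None, :, :], axis=-1)
        within = dist < rcut
        if shift == (0, 0, 0):
            np.fill_diagonal(within, False)  # an atom is not its own neighbor
        for t in range(ntypes):
            counts[:, t] += within[:, types == t].sum(axis=1)
    return counts.max(axis=0)


def check_sel(max_nbrs, sel):
    """Return (type_index, max_neighbors, sel_entry) for each type whose
    maximum neighbor count exceeds its sel entry."""
    return [
        (t, int(n), int(s))
        for t, (n, s) in enumerate(zip(max_nbrs, sel))
        if n > s
    ]
```

A non-empty result from `check_sel` corresponds to the situation in which a "WARNING" would be expected, for example when it is called with the per-type maxima of a frame from the data set and `sel = [10, 10, 10]`.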

Deepmd-kit version, installation way, input file, running commands, error log, etc.

DEEPMD INFO    installed to:         /tmp/pip-req-build-qqv2ggzp/_skbuild/linux-x86_64-3.9/cmake-install
DEEPMD INFO    source :              v2.0.0.b3
DEEPMD INFO    source brach:         HEAD
DEEPMD INFO    source commit:        de428e3
DEEPMD INFO    source commit at:     2021-07-04 22:12:13 +0800
DEEPMD INFO    build float prec:     double
DEEPMD INFO    build with tf inc:    /opt/deepmd-kit-2.0.0.b3/lib/python3.9/site-packages/tensorflow/include;/opt/deepmd-kit-2.0.0.b3/lib/python3.9/site-packages/tensorflow/include
DEEPMD INFO    build with tf lib:

Platform : ALI - ehpc - beijing

Steps to Reproduce
Please use the attached input ("rcut": 12.0, "sel": [10, 10, 10]) and data to train.

Further Information, Files, and Links
input

{
    "model": {
        "descriptor": {
            "type": "se_e2_a",
            "sel": [
                10,
                10,
                10
            ],
            "rcut_smth": 2.0,
            "rcut": 12.0,
            "neuron": [
                25,
                50,
                100
            ],
            "resnet_dt": false,
            "axis_neuron": 12,
            "type_one_side": true,
            "seed": 1801819940,
            "_activation_function": "tanh"
        },
        "fitting_net": {
            "neuron": [
                240,
                240,
                240
            ],
            "resnet_dt": true,
            "_coord_norm": true,
            "_type_fitting_net": false,
            "seed": 2375417769,
            "_activation_function": "tanh"
        },
        "type_map": [
            "Mg",
            "Al",
            "Cu"
        ]
    },
    "learning_rate": {
        "type": "exp",
        "start_lr": 0.001,
        "decay_steps": 80000,
        "stop_lr": 5e-8,
        "_decay_rate": 0.95
    },
    "loss": {
        "start_pref_e": 0.02,
        "limit_pref_e": 2,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0.01,
        "limit_pref_v": 1
    },
    "training": {
        "training_data": {
            "systems": "../../data/",
            "batch_size": "auto"
        },
        "validation_data": {
            "systems": "../../data/",
            "batch_size": 1,
            "numb_btch": 4,
            "_comment": "that's all"
        },
        "numb_steps": 16000000,
        "seed": 3982377700,
        "_comment": "that's all",
        "disp_file": "lcurve.out",
        "disp_freq": 2000,
        "numb_test": 1,
        "save_freq": 2000,
        "save_ckpt": "model.ckpt",
        "disp_training": true,
        "time_training": true,
        "profiling": false,
        "profiling_file": "timeline.json"
    }
}

data
https://pan.baidu.com/s/139Rsf7qHF6x750xMekAyhQ
uaxz

njzjz · 2021-08-25T04:24:06Z

The warning was added back in #914, which is available in v2.0.0.b4. The check runs once before training starts.

As for runtime checking, I may ask @denghuilu whether it is effective. If not, we may not need it during training. For MD simulations, however, we could consider adding an extra option to check it every N time steps.
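
The "check every N time steps" idea can be sketched as follows. This is only an illustration under stated assumptions: `check_sel_every` and the `count_max_neighbors` callback are hypothetical names, not an existing DeePMD-kit or MD-engine option, and the callback is assumed to return the per-type maximum neighbor counts for a given step.

```python
import warnings


def check_sel_every(n_steps, check_interval, sel, count_max_neighbors):
    """Compare neighbor counts against sel on every check_interval-th step.

    count_max_neighbors(step) is a hypothetical callback returning one
    maximum neighbor count per atom type for that step. Returns a list of
    (step, type_index, count, sel_entry) for every violation found.
    """
    violations = []
    for step in range(0, n_steps, check_interval):
        counts = count_max_neighbors(step)  # only evaluated on checked steps
        for t, (n, limit) in enumerate(zip(counts, sel)):
            if n > limit:
                warnings.warn(
                    f"step {step}: type {t} has {n} neighbors, sel is {limit}"
                )
                violations.append((step, t, n, limit))
    return violations
```

A larger `check_interval` lowers the overhead of the check but can miss steps on which the neighbor count briefly exceeds "sel".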

Vibsteamer · 2021-08-25T04:26:50Z

Thank you. It may not be needed during training.
Solved.
Best~

Vibsteamer added the bug label Aug 25, 2021

Vibsteamer changed the title ~~[BUG] _Replace With Suitable Title_~~ [BUG] _check-whether-"sel"-is-big-enough not working | kit-2.0-beta3_ Aug 25, 2021

Vibsteamer closed this as completed Aug 25, 2021

njzjz added a commit to njzjz/deepmd-kit that referenced this issue Sep 21, 2023

check data instead of tasks (deepmodeling#1031)

8c24997

Fixes deepmodeling#1029. Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>

njzjz added a commit to njzjz/deepmd-kit that referenced this issue Sep 21, 2023

assign type_map for all systems (#1033)

c7e69cc

Fix deepmodeling#1029. Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
