[BUG] _check-whether-"sel"-is-big-enough not working | kit-2.0-beta3_ #1029

Closed
Vibsteamer opened this issue Aug 25, 2021 · 2 comments
@Vibsteamer
Contributor

Summary
The check for whether "sel" is large enough does not work during training. Users can train with extremely small values without getting a "WARNING".
However, the performance gain in training keeps increasing as "sel" is decreased to rather small values, which may mean that "sel" itself does take effect.
The "rcut" in this test is rather large (12.0). I am not sure whether some limit on counting neighbor atoms was exceeded and led to the problem.

Deepmd-kit version, installation way, input file, running commands, error log, etc.

DEEPMD INFO    installed to:         /tmp/pip-req-build-qqv2ggzp/_skbuild/linux-x86_64-3.9/cmake-install
DEEPMD INFO    source :              v2.0.0.b3
DEEPMD INFO    source brach:         HEAD
DEEPMD INFO    source commit:        de428e3
DEEPMD INFO    source commit at:     2021-07-04 22:12:13 +0800
DEEPMD INFO    build float prec:     double
DEEPMD INFO    build with tf inc:    /opt/deepmd-kit-2.0.0.b3/lib/python3.9/site-packages/tensorflow/include;/opt/deepmd-kit-2.0.0.b3/lib/python3.9/site-packages/tensorflow/include
DEEPMD INFO    build with tf lib:

Platform : ALI - ehpc - beijing

Steps to Reproduce
Train with the attached input ("rcut": 12.0, "sel": [10, 10, 10]) and data.

Further Information, Files, and Links
input

{
    "model": {
        "descriptor": {
            "type": "se_e2_a",
            "sel": [
                10,
                10,
                10
            ],
            "rcut_smth": 2.0,
            "rcut": 12.0,
            "neuron": [
                25,
                50,
                100
            ],
            "resnet_dt": false,
            "axis_neuron": 12,
            "type_one_side": true,
            "seed": 1801819940,
            "_activation_function": "tanh"
        },
        "fitting_net": {
            "neuron": [
                240,
                240,
                240
            ],
            "resnet_dt": true,
            "_coord_norm": true,
            "_type_fitting_net": false,
            "seed": 2375417769,
            "_activation_function": "tanh"
        },
        "type_map": [
            "Mg",
            "Al",
            "Cu"
        ]
    },
    "learning_rate": {
        "type": "exp",
        "start_lr": 0.001,
        "decay_steps": 80000,
        "stop_lr": 5e-8,
        "_decay_rate": 0.95
    },
    "loss": {
        "start_pref_e": 0.02,
        "limit_pref_e": 2,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0.01,
        "limit_pref_v": 1
    },
    "training": {
        "training_data": {
            "systems": "../../data/",
            "batch_size": "auto"
        },
        "validation_data": {
            "systems": "../../data/",
            "batch_size": 1,
            "numb_btch": 4,
            "_comment": "that's all"
        },
        "numb_steps": 16000000,
        "seed": 3982377700,
        "_comment": "that's all",
        "disp_file": "lcurve.out",
        "disp_freq": 2000,
        "numb_test": 1,
        "save_freq": 2000,
        "save_ckpt": "model.ckpt",
        "disp_training": true,
        "time_training": true,
        "profiling": false,
        "profiling_file": "timeline.json"
    }
}
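
For a rough sense of scale, the expected neighbor count for this input can be estimated from the volume of the cutoff sphere. This is only a back-of-the-envelope sketch: the number density is an assumed value (roughly that of fcc Al) and is not taken from the attached data, and `expected_neighbors` is a hypothetical helper, not part of DeePMD-kit.

```python
import math


def expected_neighbors(rcut: float, number_density: float) -> float:
    """Average number of atoms inside a sphere of radius rcut in a uniform system."""
    return 4.0 / 3.0 * math.pi * rcut**3 * number_density


rcut = 12.0         # "rcut" from the input above, in Angstrom
sel = [10, 10, 10]  # "sel" from the input above

# ASSUMPTION: about 0.06 atoms per cubic Angstrom (close to fcc Al).
# This value is illustrative and does not come from the attached data set.
density = 0.06

n_expected = expected_neighbors(rcut, density)
print(f"expected ~{n_expected:.0f} neighbors vs sum(sel) = {sum(sel)}")
```

Under that assumed density the estimate is several hundred neighbors, more than ten times the 30 slots that "sel" provides, so a working check would be expected to warn for this input.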

data
https://pan.baidu.com/s/139Rsf7qHF6x750xMekAyhQ
Extraction code: uaxz

@Vibsteamer Vibsteamer added the bug label Aug 25, 2021
@Vibsteamer Vibsteamer changed the title [BUG] _Replace With Suitable Title_ [BUG] _check-whether-"sel"-is-big-enough not working | kit-2.0-beta3_ Aug 25, 2021
@njzjz
Member

njzjz commented Aug 25, 2021

The warning was added back in #914 and is available in v2.0.0.b4. The check runs once before training.
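
The idea of a one-time check before training can be sketched as follows. This is not the code from #914: it is a minimal NumPy illustration with hypothetical function names, and it assumes an orthorhombic periodic box whose edges are all at least twice "rcut", so that the minimum-image convention finds every neighbor.

```python
import numpy as np


def max_neighbors_per_type(coords, types, box, rcut, ntypes):
    """For each neighbor type, return the largest count found around any atom.

    ASSUMPTION: orthorhombic periodic box with every edge >= 2 * rcut.
    """
    coords = np.asarray(coords, dtype=float)
    types = np.asarray(types)
    box = np.asarray(box, dtype=float)
    delta = coords[:, None, :] - coords[None, :, :]
    delta -= box * np.round(delta / box)  # minimum-image convention
    within = np.linalg.norm(delta, axis=-1) < rcut
    np.fill_diagonal(within, False)       # an atom is not its own neighbor
    counts = np.stack(
        [(within & (types[None, :] == t)).sum(axis=1) for t in range(ntypes)],
        axis=1,
    )                                     # shape (natoms, ntypes)
    return counts.max(axis=0)


def check_sel(max_counts, sel):
    """Return one message per type whose neighbor count exceeds sel."""
    return [
        f"type {t}: up to {n} neighbors, but sel is {s}"
        for t, (n, s) in enumerate(zip(max_counts, sel))
        if n > s
    ]


# Toy system: 4 x 4 x 4 simple cubic lattice, two alternating atom types.
grid = np.arange(4)
coords = np.array([(i, j, k) for i in grid for j in grid for k in grid], dtype=float)
types = coords.sum(axis=1).astype(int) % 2

max_counts = max_neighbors_per_type(coords, types, box=[4.0, 4.0, 4.0], rcut=1.5, ntypes=2)
messages = check_sel(max_counts, sel=[10, 20])
for message in messages:
    print("WARNING:", message)
```

In this toy lattice an atom has at most 12 neighbors of each type within the cutoff, so only the first "sel" entry (10) is reported as too small.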

For runtime checking, I may ask @denghuilu whether it is effective. If not, we may not need it during training. For MD simulations, however, we could consider adding an extra option to check it every N time steps.
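
A check every N time steps could look roughly like the sketch below. Everything here is hypothetical: `frames_with_sel_check` and its parameters are illustrative names, not an existing DeePMD-kit or MD-engine option, and the "frames" are stand-in lists of per-type neighbor counts rather than real configurations.

```python
import warnings


def frames_with_sel_check(frames, count_neighbors, sel, check_every):
    """Yield frames unchanged, comparing neighbor counts to sel every check_every steps."""
    for step, frame in enumerate(frames):
        if step % check_every == 0:
            for t, (n, s) in enumerate(zip(count_neighbors(frame), sel)):
                if n > s:
                    warnings.warn(f"step {step}: type {t} has {n} neighbors, sel is {s}")
        yield frame


# Stand-in data: each "frame" is already a list of per-type neighbor counts,
# so the counting function is simply the identity.
fake_frames = [[5, 5], [50, 5], [5, 5], [50, 5]]
sel = [10, 10]


def count_warnings(check_every):
    """Run through all frames and return how many warnings were raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        for _ in frames_with_sel_check(fake_frames, lambda f: f, sel, check_every):
            pass
    return len(caught)


print(count_warnings(1))  # checks every step
print(count_warnings(2))  # checks steps 0 and 2 only
```

With this stand-in data, checking every step catches both overflowing frames, while checking every second step catches none of them. That is the trade-off behind such an option: a larger N costs less but can miss short-lived overflows.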

@Vibsteamer
Contributor Author

Thank you, it may not be needed during training.
Solved.
Best~

njzjz added a commit to njzjz/deepmd-kit that referenced this issue Sep 21, 2023
Fixes deepmodeling#1029.

Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
njzjz added a commit to njzjz/deepmd-kit that referenced this issue Sep 21, 2023
Fix deepmodeling#1029.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>