
Conversation


@jiongwalai jiongwalai commented Jun 11, 2025

Update NvNMD training code:

  1. support different FPGA devices: vu9p (Bohrium and NvNMD web) and vu13p (coming soon)
  2. set "stripped_type_embedding": true, "smooth_type_embdding": true, "set_davg_zero": false in nvnmd-v1
  3. support setting the seed in the nvnmd dict

Summary by CodeRabbit

  • New Features

    • CLI option to initialize training from a frozen model; device selection support (vu9p/vu13p).
  • New/Changed Behavior

    • Descriptor recovery pipeline updated with an additional descriptor component and propagated recovery mask; mapping mode adjusted.
  • Improvements

    • Deep-copying of configs/data, explicit numeric precision, seed propagation, finer mapping resolution, version-aware data handling and packing.
  • Bug Fixes

    • Updated tests and expected outputs to match revised behavior.
  • Chores

    • Logging, validation and minor cleanups.

Contributor

coderabbitai bot commented Jun 11, 2025

📝 Walkthrough

Adds a train CLI flag to initialize from a frozen model; replaces descriptor quantization handling with a recovery pipeline (introducing a new k component and a recovered_switch); extends MapTable to support k and a higher-resolution s2g mapping; introduces device- and version-aware NVNMD configuration with deep-copy semantics; makes a minor type-embedding flag change; and updates tests to the new expected outputs.

Changes

Cohort / File(s) Summary
CLI / Entrypoints
deepmd/main.py, deepmd/tf/nvnmd/entrypoints/train.py
Added -f/--init-frz-model CLI arg and propagated init_frz_model into train_nvnmd; replaced some dict.copy() with copy.deepcopy() when preparing training/freeze job data.
Descriptor: recovery & filtering
deepmd/tf/nvnmd/descriptor/se_atten.py, deepmd/tf/descriptor/se_atten.py
Removed descrpt2r4; added build_recovered and descrpt2shkr producing recovered descriptor + k/recovered_switch; filter_lower_R42GR signature extended to accept recovered_switch and applies it to two-body terms; se_atten now calls build_recovered under quantize+enable conditions and forwards the switch.
Mapping / MapTable changes
deepmd/tf/nvnmd/entrypoints/mapt.py
Added k and k_grad to u2s and gradient builders, increased s2g resolution (N→8192, N2→32), changed default Gs_Gt_mode to 2 and adjusted shift logic, and enforced GLOBAL_NP_FLOAT_PRECISION for numpy arrays.
Version/device config & deep copies
deepmd/tf/nvnmd/data/data.py, deepmd/tf/nvnmd/utils/config.py, deepmd/tf/nvnmd/entrypoints/wrap.py
Replaced shallow copies with copy.deepcopy(), added/propagated device (vu9p/vu13p), adjusted dscp/ctrl fields (seed, M3, NSTEP, NSTDM, SUB_VERSION, NBIT_NSTEP, set_prefix, descriptor flags), made shape/packing changes for v1 (including k/k_grad), and ensured explicit numeric dtypes.
Type embed minor change
deepmd/tf/utils/type_embed.py
Sets self.use_tebd_bias = True when nvnmd_cfg.enable is true.
Argument definitions
deepmd/utils/argcheck_nvnmd.py
Added device argument to nvnmd_args (str, default "none", doc mentions vu9p/vu13p).
Tests
source/tests/tf/test_nvnmd_entrypoints.py
Updated test setup to concatenate map parts and adjusted numerous expected numeric outputs (mappings, descriptors, energies, byte arrays) to match new behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Inputs as Descriptor Inputs
  participant Build as build_recovered
  participant SE as DescrptSeAtten._pass_filter
  participant Filter as filter_lower_R42GR
  participant Next as Downstream Embedding

  Note over Inputs,Build: When nvnmd_cfg.enable && quantize_descriptor
  Inputs->>Build: descrpt, avg/std, params
  Build-->>SE: recovered_descrpt, recovered_switch
  SE->>Filter: inputs_i, nei_type_vec, recovered_switch
  Filter-->>Next: filtered_embedding
  Note over Filter,Next: recovered_switch masks two-body contributions before filtering

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay attention to descriptor recovery math and shapes in deepmd/tf/nvnmd/descriptor/se_atten.py.
  • Verify MapTable k/gradient integration and s2g resolution changes in deepmd/tf/nvnmd/entrypoints/mapt.py.
  • Ensure deep-copy and device logic consistency across data.py, config.py, and wrap code paths.

Possibly related PRs

  • fix: consistent DPA-1 model #4320 — High overlap: also modifies the descriptor pipeline replacing descrpt2r4 with a recovered-descriptor flow and integrating recovered_switch.

Suggested reviewers

  • njzjz
  • wanghan-iapcm

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name: Title check
Status: ❓ Inconclusive
Explanation: The pull request title "update NvNMD training code" is generic and does not clearly convey the main changes. While it references NvNMD, it lacks specificity about what was updated: the raw summary shows substantial changes across multiple files, including support for different FPGA devices (vu9p/vu13p), descriptor recovery mechanisms, mapping table updates, and new configuration defaults.
Resolution: Consider a more specific title, such as "Add FPGA device support and descriptor recovery to NvNMD training" or "Support device-specific configuration and quantized descriptor recovery in NvNMD", to better communicate the main objectives of the changeset in the commit history.
✅ Passed checks (1 passed)
Check name: Description check
Status: ✅ Passed
Explanation: Check skipped because CodeRabbit's high-level summary is enabled.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🔭 Outside diff range comments (1)
deepmd/main.py (1)

243-248: 🛠️ Refactor suggestion

CLI breaking change: -f short option repurposed

-f used to map to --finetune; it now maps to --init-frz-model, while --finetune is moved to -t.
Existing user scripts will break silently.

Consider either:

  1. Keeping the original mapping and adding a new short flag (e.g. -F) for init-frz-model, or
  2. Providing both aliases (-f & -t) for --finetune and issuing a deprecation warning for -f.

This guards backwards compatibility.
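Option 2 above can be sketched with a custom argparse action that keeps `-f` working but warns; the action and flag wiring here are illustrative, not the actual deepmd-kit code:

```python
import argparse
import warnings


class DeprecatedAliasAction(argparse.Action):
    """Store the value, warning when the deprecated -f spelling is used."""

    def __call__(self, parser, namespace, values, option_string=None):
        if option_string == "-f":
            warnings.warn(
                "-f is deprecated for --finetune; use -t or --finetune",
                FutureWarning,
                stacklevel=2,
            )
        setattr(namespace, self.dest, values)


parser = argparse.ArgumentParser()
# both short spellings map to --finetune; only -f triggers the warning
parser.add_argument(
    "-t", "-f", "--finetune", dest="finetune", action=DeprecatedAliasAction
)
parser.add_argument("--init-frz-model", dest="init_frz_model")

args = parser.parse_args(["-t", "pretrained.pb"])
```

Existing scripts that pass `-f` keep working, and the warning gives users a release cycle to migrate.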

🧹 Nitpick comments (7)
deepmd/main.py (1)

781-787: Duplicate option definition – risk of diverging behaviour

The same --init-frz-model / -f flag is re-defined for the train-nvnmd sub-parser.
Keeping two independent definitions invites accidental drift (e.g. help text, default).

Refactor into a small helper that registers the shared mutually-exclusive group for both sub-parsers, ensuring consistency in one place.
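Such a helper could look like the following; the option names and group contents are hypothetical stand-ins for the real sub-parser setup:

```python
import argparse


def add_init_model_group(parser: argparse.ArgumentParser) -> None:
    """Register the mutually exclusive init options once, shared by sub-parsers."""
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-i", "--init-model", type=str,
                       help="initialize from a checkpoint")
    group.add_argument("-f", "--init-frz-model", type=str,
                       help="initialize from a frozen model")


root = argparse.ArgumentParser()
sub = root.add_subparsers(dest="command")
# both sub-parsers get identical flags from the single helper
for name in ("train", "train-nvnmd"):
    add_init_model_group(sub.add_parser(name))

args = root.parse_args(["train-nvnmd", "-f", "frozen.pb"])
```

Help text, defaults, and the mutual-exclusion rule now live in one place, so the two sub-parsers cannot drift apart.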

deepmd/tf/nvnmd/entrypoints/train.py (1)

70-75: Seed injection is duplicated – centralise to avoid divergence

seed is picked from jdata_nvnmd_ twice. Extract once into a local variable and reuse:

-        "descriptor": {
-            "seed": jdata_nvnmd_.get("seed", 1),
+        seed = jdata_nvnmd_.get("seed", 1)
+        "descriptor": {
+            "seed": seed,
...
-        "fitting_net": {"seed": jdata_nvnmd_.get("seed", 1)},
+        "fitting_net": {"seed": seed},

Reduces chance of the two defaults diverging in future edits.

deepmd/tf/nvnmd/entrypoints/wrap.py (3)

140-141: Consider keeping f-string formatting for better readability.

The change from f-strings to % formatting is a step backward in terms of Python best practices. F-strings are more readable and performant.

-                log.info("%s: %d x % d bit" % (k, h, w * 4))
-                FioTxt().save("nvnmd/wrap/h%s.txt" % (k), d)
+                log.info(f"{k}: {h} x {w * 4} bit")
+                FioTxt().save(f"nvnmd/wrap/h{k}.txt", d)

469-473: Specify explicit dtype for numpy arrays.

For clarity and consistency, specify the dtype explicitly when creating numpy arrays.

-            nrs = np.zeros(nr)
-            ncs = np.zeros(nc)
-            wrs = np.zeros([nr, nc])
-            wcs = np.zeros([nr, nc])
+            nrs = np.zeros(nr, dtype=GLOBAL_NP_FLOAT_PRECISION)
+            ncs = np.zeros(nc, dtype=GLOBAL_NP_FLOAT_PRECISION)
+            wrs = np.zeros([nr, nc], dtype=GLOBAL_NP_FLOAT_PRECISION)
+            wcs = np.zeros([nr, nc], dtype=GLOBAL_NP_FLOAT_PRECISION)

548-549: Remove commented debug code.

Commented debug code should be removed to keep the codebase clean.

-        # k = 2**23
-        # print(dsws[0][42] * k)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

455-455: Fix typo in comment.

-# N+1 ranther than N for calculating defference
+# N+1 rather than N for calculating difference
deepmd/tf/nvnmd/utils/config.py (1)

170-177: Remove redundant logging statements.

There are three consecutive log statements for self.config["dscp"] which seems excessive. Consider keeping only one or making them conditional on debug level.

-log.info(self.config["dscp"])
 dp_in = {"type_map": fioObj.get(jdata, "type_map", [])}
 self.config["dpin"] = fioObj.update(dp_in, self.config["dpin"])
 #
-log.info(self.config["dscp"])
 self.init_net_size()
 self.init_value()
 log.info(self.config["dscp"])
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6042c6 and 0e723bb.

📒 Files selected for processing (11)
  • deepmd/main.py (1 hunks)
  • deepmd/tf/descriptor/se_atten.py (4 hunks)
  • deepmd/tf/nvnmd/data/data.py (13 hunks)
  • deepmd/tf/nvnmd/descriptor/se_atten.py (4 hunks)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (11 hunks)
  • deepmd/tf/nvnmd/entrypoints/train.py (5 hunks)
  • deepmd/tf/nvnmd/entrypoints/wrap.py (12 hunks)
  • deepmd/tf/nvnmd/utils/config.py (18 hunks)
  • deepmd/tf/utils/type_embed.py (1 hunks)
  • deepmd/utils/argcheck_nvnmd.py (2 hunks)
  • source/tests/tf/test_nvnmd_entrypoints.py (9 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
deepmd/tf/descriptor/se_atten.py (1)
deepmd/tf/nvnmd/descriptor/se_atten.py (1)
  • build_recovered (79-123)
deepmd/tf/nvnmd/data/data.py (1)
data/raw/copy_raw.py (1)
  • copy (11-71)
deepmd/tf/nvnmd/entrypoints/mapt.py (4)
deepmd/tf/nvnmd/utils/config.py (1)
  • init_from_jdata (55-81)
deepmd/tf/utils/sess.py (1)
  • run_sess (12-50)
deepmd/tf/infer/deep_eval.py (2)
  • sess (311-314)
  • sess (1223-1226)
deepmd/utils/data.py (1)
  • avg (379-394)
🪛 Ruff (0.11.9)
deepmd/tf/nvnmd/descriptor/se_atten.py

202-202: Local variable M3 is assigned to but never used

Remove assignment to unused variable M3

(F841)

deepmd/tf/nvnmd/utils/config.py

55-55: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🔇 Additional comments (21)
deepmd/tf/nvnmd/entrypoints/train.py (2)

156-160: Deep-copying template dicts – good, but templates should be immutable

Using copy.deepcopy avoids side-effects – nice!
Consider making jdata_cmd_train / jdata_cmd_freeze constants (UPPER_CASE) and never mutating them; deep-copy whenever you need a writable version.

Also applies to: 193-197


147-151:

✅ Verification successful

Passing init_frz_model to generic train() may raise TypeError

jdata_cmd_train now contains init_frz_model, and the dict is forwarded verbatim:

train(**jdata)

If deepmd.tf.entrypoints.train.train() has not been updated to accept this kw-arg, runtime will fail.

Expected: parameter list includes init_frz_model or **kwargs.
If absent, either update the signature or remove the key before the call.


🏁 Script executed:

#!/bin/bash
# Verify `train()` signature contains the new parameter
rg -n --context 2 $'def +train\\(' deepmd/tf/entrypoints | head

Length of output: 269


🏁 Script executed:

#!/bin/bash
# Show the full signature of train() to verify accepted parameters
sed -n '50,100p' deepmd/tf/entrypoints/train.py

Length of output: 1516


No action needed: train() already accepts init_frz_model

The train() signature in deepmd/tf/entrypoints/train.py (lines 55–68) includes both init_frz_model and **kwargs, so forwarding init_frz_model via train(**jdata) will not raise a TypeError.

source/tests/tf/test_nvnmd_entrypoints.py (1)

475-489: Updated golden numbers only – ensure underlying logic is unchanged

Large blocks of expected values were replaced without explanation in the test docstrings.
Verify that they stem from intentional algorithmic changes and not from an unnoticed regression or random seed drift.

Also applies to: 791-805, 576-704

deepmd/tf/nvnmd/entrypoints/wrap.py (2)

519-546: Good implementation of k component handling.

The addition of the k component to the mapping tables is implemented correctly, and the use of copy.deepcopy prevents reference issues when appending to lists. The kkk factor appropriately handles type boundaries.


566-566: Dimension update correctly reflects the addition of k component.

The change from 2 to 3 dimensions properly accounts for the new k component alongside the existing s and h components.

deepmd/tf/descriptor/se_atten.py (1)

719-731: Proper integration of the new descriptor recovery mechanism.

The conditional block correctly checks for NVNMD quantization before calling build_recovered, and the returned recovered_switch is appropriately stored for later use in filtering.

deepmd/tf/nvnmd/descriptor/se_atten.py (2)

79-124: Well-implemented descriptor recovery function.

The build_recovered function properly normalizes the s and h components, applies filtering operations, and returns both the normalized descriptor and the k component as a smoothing switch. The use of tf.ensure_shape provides good runtime validation.


241-252: Verify the formula change is intentional.

The computation has changed from G = Gs * Gt to G = Gs * Gt + Gs. This adds a residual connection which may be intentional, but please confirm this mathematical change is correct for the NVNMD algorithm.

The modulation of two_embd by recovered_switch is correctly implemented for smoothing.
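The two formulas can be contrasted numerically; this is a plain NumPy sketch of the expressions under discussion with arbitrary shapes, not the TensorFlow implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
Gs = rng.random((4, 8))      # two-body embedding
Gt = rng.random((4, 8))      # type-embedding factor
switch = np.zeros((4, 1))    # recovered_switch at the cutoff boundary

# old formula: G = Gs * Gt
G_old = Gs * Gt

# new formula: the two-body term is masked by the switch, and the
# residual Gs term means G falls back to Gs alone as switch -> 0
G_new = Gs * (Gt * switch) + Gs
```

With the switch at zero the type-embedding contribution vanishes smoothly instead of cutting off abruptly, which is the apparent motivation for the residual term.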

deepmd/tf/nvnmd/data/data.py (6)

1-1: Good use of deep copy for configuration data isolation.

The change from shallow copy to deep copy ensures that nested dictionaries are properly isolated, preventing unintended mutations when configurations are modified. This is particularly important for configuration data that contains nested structures.

Also applies to: 125-127, 261-263, 335-336, 404-405


15-15: Seed parameters correctly added for reproducibility.

The addition of "seed": 1 to descriptor and fitting network configurations ensures reproducible training runs, which aligns with the PR objectives.

Also applies to: 45-45, 149-149, 179-179


158-158: Verify the device-specific parameter adjustments.

The following changes were made:

  • Added "M3": 2 parameter to v1 configurations
  • Incremented SUB_VERSION from 1 to 2
  • Reduced NSTDM from 128 to 64 and NSTDM_M2 from 4 to 2 in v1_ni256
  • Added "NBIT_NSTEP": 8

Please confirm these parameter adjustments are correct for the vu9p device configuration and document the rationale for these specific values.

Also applies to: 210-210, 266-268, 273-273, 253-253


294-294: Device field correctly added for FPGA support.

The addition of "device": "vu9p" supports the PR objective of enabling different FPGA devices (vu9p and vu13p).

Also applies to: 363-363


344-346: Descriptor configuration properly updated with granular controls.

The replacement of "tebd_input_mode": "strip" with three boolean flags ("stripped_type_embedding": True, "smooth_type_embdding": True, "set_davg_zero": False) provides more precise control over the embedding behavior, as specified in the PR objectives.


331-331: Training data configuration correctly extended.

The addition of "set_prefix": "set" to the training data configuration is appropriate for dataset organization.

Also applies to: 400-400

deepmd/tf/nvnmd/entrypoints/mapt.py (3)

94-104: New mode 2 implementation for recovered switch handling.

The change from Gs_Gt_mode = 1 to Gs_Gt_mode = 2 introduces a new calculation mode that uses xyz_scatter * two_embd * recovered_switch + xyz_scatter with zero shifts for both Gs and Gt. This appears to integrate with the new k tensor (recovered switch) functionality.


407-414: New k tensor (recovered switch) correctly implemented.

The implementation adds a new mapping component k calculated as k = -kk^3 + 1 where kk = 1 - rmin * s, clipped to [0,1]. This recovered switch function and its gradients are properly integrated into the mapping table structure alongside the existing s and h components.

Also applies to: 139-147
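The switch described above is easy to check in isolation. A NumPy sketch of the stated formula (the function name is ours; rmin stands in for the configured inner-cutoff parameter):

```python
import numpy as np


def recovered_switch(s: np.ndarray, rmin: float) -> np.ndarray:
    """k = -kk^3 + 1 with kk = 1 - rmin * s, clipped to [0, 1]."""
    kk = 1.0 - rmin * s
    return np.clip(-kk**3 + 1.0, 0.0, 1.0)


s = np.linspace(0.0, 2.0, 201)
k = recovered_switch(s, rmin=1.0)
```

The cubic rises monotonically from k = 0 at s = 0 to k = 1 at s = 1/rmin and the clip holds it at 1 beyond that, so the switch turns the two-body term on smoothly.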


550-551: Mapping resolution appropriately increased for finer control.

The s2g mapping resolution has been doubled (N: 4096→8192, N2: 16→32) and the warning threshold increased from 16.0 to 32.0. These changes provide finer granularity in the mapping table, which should improve accuracy for the new device configurations.

Also applies to: 153-153, 554-555

deepmd/tf/nvnmd/utils/config.py (4)

47-47: Device-specific configuration properly implemented.

The code correctly handles device-specific parameters for "vu9p" and "vu13p" devices:

  • NSTEP values are set based on sel thresholds (128, 160, 192, 224, 256)
  • NSTDM parameters differ between devices (vu13p: 32, vu9p: 64)

Please verify these device-specific values are correct for the respective FPGA hardware.

Also applies to: 61-61, 219-247, 286-295


366-381: Sel parameter validation and rounding correctly implemented.

The code properly validates and rounds the sel parameter to hardware-supported values (128, 160, 192, 224, 256) for version 1 configurations. This ensures compatibility with FPGA implementations.
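A minimal sketch of that rounding, assuming round-up-to-nearest-supported semantics (the exact rule in config.py may differ):

```python
SUPPORTED_SEL = (128, 160, 192, 224, 256)


def round_sel(sel: int) -> int:
    """Round sel up to the nearest FPGA-supported neighbor count."""
    for cap in SUPPORTED_SEL:
        if sel <= cap:
            return cap
    raise ValueError(
        f"sel={sel} exceeds the maximum supported value {SUPPORTED_SEL[-1]}"
    )
```

Any requested sel lands on a hardware-valid bucket, and an out-of-range request fails loudly rather than silently truncating.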


2-2: Deep copy usage ensures proper configuration isolation.

The consistent use of copy.deepcopy() for configuration dictionaries prevents unintended mutations when configurations are modified, maintaining data integrity across different configuration variants.

Also applies to: 50-50, 140-144, 149-153, 438-438


355-356: Warning threshold correctly updated to match mapping changes.

The increase from 16.0 to 32.0 is consistent with the expanded mapping resolution in mapt.py and allows for a wider valid range of switch function values.

Comment on lines +190 to +191
if nvnmd_cfg.enable:
self.use_tebd_bias = True
Contributor


🛠️ Refactor suggestion

Implicitly overriding use_tebd_bias may break caller expectations

self.use_tebd_bias is force-set to True whenever nvnmd_cfg.enable is True, regardless of the value passed in the constructor.
This silently discards a user-supplied use_tebd_bias=False, which is surprising and makes the public parameter meaningless as soon as NVNMD is on.

-if nvnmd_cfg.enable:
-    self.use_tebd_bias = True
+if nvnmd_cfg.enable and not self.use_tebd_bias:
+    log.debug("nvnmd enabled → enabling type-embedding bias")
+    self.use_tebd_bias = True

At minimum, emit a warning or honour the original flag unless NVNMD strictly requires the bias.
Document the behaviour in the class docstring as well.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if nvnmd_cfg.enable:
self.use_tebd_bias = True
if nvnmd_cfg.enable and not self.use_tebd_bias:
log.debug("nvnmd enabled → enabling type-embedding bias")
self.use_tebd_bias = True
🤖 Prompt for AI Agents
In deepmd/tf/utils/type_embed.py around lines 190 to 191, the code forcibly sets
self.use_tebd_bias to True when nvnmd_cfg.enable is True, ignoring any
user-provided value for use_tebd_bias. To fix this, modify the logic to either
honor the original use_tebd_bias value unless NVNMD strictly requires it to be
True, or emit a warning when overriding it. Additionally, update the class
docstring to clearly document this behavior so users understand when and why
use_tebd_bias might be changed automatically.

Comment on lines +12 to 13
doc_device = "hardware used by model, vu9p or vu13p"
doc_net_size_file = (
Contributor


🛠️ Refactor suggestion

device argument lacks value validation

A new "device" option is introduced but any string is currently accepted. Down-stream code (e.g. NvnmdConfig) seems to assume "vu9p" / "vu13p" or "none".

-Argument("device", str, optional=False, default="none", doc=doc_device),
+Argument(
+    "device",
+    str,
+    optional=False,
+    default="none",
+    doc=doc_device,
+    choices=["none", "vu9p", "vu13p"],   # enforce valid values
+),

Fail-fast validation here prevents cryptic errors later in the pipeline.

Also applies to: 30-31

🤖 Prompt for AI Agents
In deepmd/utils/argcheck_nvnmd.py around lines 12-13 and 30-31, the "device"
argument accepts any string but downstream code expects only "vu9p", "vu13p", or
"none". Add validation logic to check if the provided "device" value is one of
these allowed strings and raise an error immediately if not. This fail-fast
validation will prevent obscure errors later in the pipeline.
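The fail-fast check can be as small as the following; the constant and function names are hypothetical, not the dargs API:

```python
ALLOWED_DEVICES = ("none", "vu9p", "vu13p")


def validate_device(device: str) -> str:
    """Reject unsupported FPGA device names before they reach the config layer."""
    if device not in ALLOWED_DEVICES:
        raise ValueError(
            f"unsupported device {device!r}; expected one of {ALLOWED_DEVICES}"
        )
    return device
```

Raising at argument-parse time gives the user the allowed values up front instead of a cryptic shape or key error deep in NvnmdConfig.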

NBIT_FLTE = nbit["NBIT_FLTE"]
NBIT_NSTEP = nbit["NBIT_NSTEP"]
NIX = dscp["NIX"]
print(dscp)
Contributor


⚠️ Potential issue

Replace print statement with proper logging.

Debug print statements should use the logging framework for consistency and control.

-            print(dscp)
+            log.debug("Descriptor configuration: %s", dscp)
📝 Committable suggestion


Suggested change
print(dscp)
log.debug("Descriptor configuration: %s", dscp)
🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/wrap.py at line 333, replace the print statement
with a logging call. Import the logging module if not already done, then use an
appropriate logging level such as logging.debug or logging.info to log the
variable dscp instead of printing it directly. This ensures consistent and
configurable output handling.

jiongwalai and others added 2 commits June 12, 2025 07:45
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
deepmd/tf/nvnmd/descriptor/se_atten.py (2)

98-99: Superfluous subtraction

(h - 0) / std_h is equivalent to h / std_h and just adds noise to the expression.

-    h_norm = (h - 0) / std_h
+    h_norm = h / std_h

127-133: Doc-string and dead code drift

The doc-string still says the routine maps to “s and h”, but the code now produces k as well.
Also, ndescrpt is computed but never used.

Please update the documentation and remove the unused variable to keep the intent clear.

🧰 Tools
🪛 Ruff (0.11.9)

132-132: Local variable ndescrpt is assigned to but never used

Remove assignment to unused variable ndescrpt

(F841)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e723bb and e39c09e.

📒 Files selected for processing (2)
  • deepmd/tf/nvnmd/descriptor/se_atten.py (4 hunks)
  • deepmd/tf/nvnmd/entrypoints/wrap.py (12 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/tf/nvnmd/entrypoints/wrap.py
🔇 Additional comments (1)
deepmd/tf/nvnmd/descriptor/se_atten.py (1)

197-204: New parameter not reflected in API contract

filter_lower_R42GR() now expects recovered_switch, yet the function doc-string and upstream callers were not updated.
For clarity and to avoid silent breakage:

  1. Document the new argument in the doc-string.
  2. Audit all call-sites to make sure the extra tensor is always supplied.

Comment on lines +103 to +110
s_norm = op_module.flt_nvnmd(s_norm)
log.debug("#s: %s", s_norm)
s_norm = tf.ensure_shape(s_norm, [None, 1])
with tf.variable_scope("h", reuse=True):
h_norm = op_module.flt_nvnmd(h_norm)
log.debug("#h: %s", h_norm)
h_norm = tf.ensure_shape(h_norm, [None, 1])
# merge into [sji, hji*xji, hji*yji, hji*zji]
Contributor


🛠️ Refactor suggestion

reuse=True may fail on the first graph build

tf.variable_scope(..., reuse=True) assumes the scope already exists.
When this function is executed in a fresh graph the variables/ops are not yet defined and TensorFlow will raise a ValueError.
Use reuse=tf.AUTO_REUSE (TF >= 1.4) or create the scope without reuse on the first pass.

🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/descriptor/se_atten.py around lines 103 to 110, the use of
tf.variable_scope with reuse=True can cause a ValueError if the scope does not
exist yet during the first graph build. To fix this, change the reuse argument
to tf.AUTO_REUSE to allow automatic reuse or create the variable scope without
reuse on the first pass to ensure the scope is created before reuse is applied.

Comment on lines +79 to +81
def build_recovered(
descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth, filter_precision
):
Contributor


⚠️ Potential issue

Remove or use the filter_precision parameter

filter_precision is accepted by build_recovered() but never referenced inside the function, which is misleading and may confuse future maintainers.

-def build_recovered(
-    descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth, filter_precision
-):
+def build_recovered(
+    descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth
+):

Either drop the argument (and update all call-sites) or apply it where you originally intended.

📝 Committable suggestion


Suggested change
def build_recovered(
descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth, filter_precision
):
def build_recovered(
descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth
):
...
🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/descriptor/se_atten.py around lines 79 to 81, the function
build_recovered() accepts a parameter filter_precision that is never used inside
the function. To fix this, either remove the filter_precision parameter from the
function signature and update all places where build_recovered() is called to no
longer pass this argument, or if filter_precision was intended to be used,
incorporate it appropriately within the function logic.


codecov bot commented Jun 15, 2025

Codecov Report

Attention: Patch coverage is 80.39216% with 50 lines in your changes missing coverage. Please review.

Project coverage is 84.71%. Comparing base (2b32af5) to head (9f13087).

Files with missing lines Patch % Lines
deepmd/tf/nvnmd/utils/config.py 56.32% 38 Missing ⚠️
deepmd/tf/nvnmd/entrypoints/train.py 16.66% 5 Missing ⚠️
deepmd/tf/nvnmd/entrypoints/wrap.py 93.33% 3 Missing ⚠️
deepmd/tf/nvnmd/descriptor/se_atten.py 97.01% 2 Missing ⚠️
deepmd/tf/nvnmd/entrypoints/mapt.py 94.28% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4800      +/-   ##
==========================================
- Coverage   84.72%   84.71%   -0.01%     
==========================================
  Files         699      699              
  Lines       68182    68286     +104     
  Branches     3542     3541       -1     
==========================================
+ Hits        57768    57850      +82     
- Misses       9280     9304      +24     
+ Partials     1134     1132       -2     

☔ View full report in Codecov by Sentry.

@wanghan-iapcm wanghan-iapcm requested a review from njzjz June 17, 2025 05:56
@wanghan-iapcm
Collaborator

@jiongwalai please fix the precommit issues.

# change the configuration according to the max_nnei
jdata_config_v0_ni128 = jdata_config_v0.copy()
jdata_config_v0_ni256 = jdata_config_v0.copy()
# change the configuration accordng to the max_nnei
Member


There seems to be some meaningless changes like this

Author


There may be differences between the parameters jdata_config_v0_ni128 and jdata_config_v0_ni256, which are used for different training configurations. I think they should be retained.

Member


Why did you change "according" to "accordng"?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

149-149: Minor formatting issue in block comment.

The static analysis tool flagged excessive leading '#' characters in the block comment.

-        ## u2s
+        # u2s
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8215320 and d6a1fe9.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (12 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
deepmd/tf/nvnmd/entrypoints/mapt.py (1)
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4219
File: deepmd/utils/learning_rate.py:48-53
Timestamp: 2024-10-15T22:22:24.889Z
Learning: Methods in `deepmd/utils/learning_rate.py` that return NumPy scalar types should have return type annotations using the corresponding NumPy types, such as `np.float64`.
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)

🔇 Additional comments (9)
deepmd/tf/nvnmd/entrypoints/mapt.py (9)

10-10: Good addition of explicit precision control.

The import of GLOBAL_NP_FLOAT_PRECISION aligns with the PR objectives to improve numerical precision and consistency throughout the NVNMD pipeline.


95-97: Clear documentation of the new Gs_Gt_mode behavior.

The updated comments clearly explain the new mode 2 behavior with recovered_switch and the corresponding shift parameter values.


140-148: Excellent implementation of 'k' tensor mapping coefficients.

The addition of 'k' tensor mapping coefficients follows the same pattern as 's' and 'h' tensors, maintaining consistency in the codebase. This supports the enhanced mapping capabilities mentioned in the PR objectives.


206-206: Consistent precision handling in numpy array initialization.

The use of GLOBAL_NP_FLOAT_PRECISION ensures consistent numerical precision across the mapping operations.


408-415: Well-implemented 'k' tensor calculation.

The mathematical implementation of the 'k' tensor is correct:

  • kk = 1 - rmin * s
  • k = -kk³ + 1
  • Proper clipping to [0, 1] range

This follows a cubic polynomial form that complements the existing 's' and 'h' tensors.
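The cubic switch described in the comment above can be sketched in NumPy. This is a minimal illustration of the stated formulas, not the actual `mapt.py` code; the parameter name `rmin` and the zeroing of the gradient outside the clipped region are assumptions.

```python
import numpy as np

def compute_k(s, rmin):
    """k tensor per the review: kk = 1 - rmin * s, k = -kk**3 + 1, clipped to [0, 1]."""
    s = np.asarray(s, dtype=np.float64)
    kk = 1.0 - rmin * s
    k = -kk**3 + 1.0
    return np.clip(k, 0.0, 1.0).reshape(-1, 1)

def compute_k_grad(s, rmin):
    """Analytic gradient dk/ds = 3 * rmin * (1 - rmin*s)**2, zeroed where k is clipped."""
    s = np.asarray(s, dtype=np.float64)
    kk = 1.0 - rmin * s
    g = 3.0 * rmin * kk**2
    # Assumption: gradient is zero wherever k was clipped (kk outside [0, 1]).
    return np.where((kk < 0.0) | (kk > 1.0), 0.0, g).reshape(-1, 1)
```

With `rmin = 0.5`, `k` rises smoothly from 0 at `s = 0` to 1 at `s = 2` and stays at 1 beyond, consistent with a recovery/switch function.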


426-435: Consistent gradient calculation for 'k' tensor.

The gradient calculations for the 'k' tensor follow the same pattern as 's' and 'h' tensors, ensuring mathematical consistency in the mapping table generation.


607-627: Good addition of pylint disable comments.

The pylint disable comments for no-explicit-dtype are appropriate since the dtype is implicitly determined by the input tensors in these TensorFlow operations.


555-560: Mapping resolution update validated by test_nvnmd_entrypoints.py

  • In deepmd/tf/nvnmd/entrypoints/mapt.py you’ve doubled
    • N: 4096→8192
    • N2: 16→32
    • warning threshold: (smax–smin) > 16.0→32.0
  • The test source/tests/tf/test_nvnmd_entrypoints.py now uses idx = [..., 4096, 8192] (and even 16384 in one case), confirming the code handles the larger tables correctly.

The change is intentional and covered by tests. Note that doubling N and N2 doubles the mapping‐table size (and related TensorFlow placeholders), which will increase memory usage and computation time.


154-154: Double-check s2g mapping intervals in cfg_s2g
The two segments in deepmd/tf/nvnmd/entrypoints/mapt.py now read:

cfg_s2g = [
    [s[0],   s[256],    s[1]  - s[0],  0,   256],
    [s[0],   s[8192],   s[32] - s[0],  256, 512],
]

Please ensure these index choices and step sizes match the desired higher-resolution mapping ranges.

• File to inspect:

  • deepmd/tf/nvnmd/entrypoints/mapt.py (cfg_s2g definition)
    • Verify in downstream use:
  • descriptor/se_a.py & descriptor/se_atten.py (table_info = cfg_s2g)
  • wrap.py & mapping2 calls in entrypoints
  • test expectations in source/tests/tf/test_nvnmd_entrypoints.py
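A toy lookup illustrating how two-segment mapping configs of this shape could resolve a value to a table row. The field order `[x_start, x_end, step, row_start, row_end]` is my reading of the segment layout and the numeric segments below are invented for the demo; neither is confirmed against `mapt.py`.

```python
def s_to_index(s_val, cfg):
    """Map a scalar s to a table row using piecewise-uniform segments."""
    for x0, x1, dx, i0, i1 in cfg:
        if x0 <= s_val < x1:
            idx = i0 + int((s_val - x0) / dx)
            return min(idx, i1 - 1)  # clamp to the segment's last row
    raise ValueError("s value outside mapping range")

# Hypothetical two-segment table: fine resolution near 0, coarse beyond.
cfg_s2g_demo = [
    [0.0, 1.0, 1.0 / 256, 0, 256],     # 256 rows over [0, 1)
    [1.0, 33.0, 32.0 / 256, 256, 512],  # 256 rows over [1, 33)
]
```

Doubling the upper bound of the second segment (as in the diff under review) doubles the covered range for the same 256 rows, which is why the warning threshold moves from 16.0 to 32.0 in step.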

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6a1fe9 and 8fdbee8.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (13 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (2)
deepmd/tf/nvnmd/utils/config.py (1)
  • init_from_jdata (55-81)
deepmd/tf/utils/sess.py (1)
  • run_sess (12-50)
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)


[error] 615-615: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)


[error] 621-621: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)


[error] 631-631: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)

🪛 Ruff (0.11.9)
deepmd/tf/nvnmd/entrypoints/mapt.py

615-615: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)


621-621: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)


631-631: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)

🔇 Additional comments (14)
deepmd/tf/nvnmd/entrypoints/mapt.py (14)

10-10: LGTM: Import addition for numerical precision consistency.

The import of GLOBAL_NP_FLOAT_PRECISION aligns with the explicit dtype usage throughout the file.


81-81: Minor: Consistent with other method signatures.

Removing the return type annotation is consistent with other methods in the class.


140-148: LGTM: Consistent implementation of k component mapping.

The new k component mapping follows the same pattern as s and h components, maintaining consistency in the codebase.


206-206: LGTM: Explicit dtype usage for numerical consistency.

Using GLOBAL_NP_FLOAT_PRECISION ensures consistent floating-point precision throughout the codebase.


262-262: Minor: Consistent with method signature style.

Removing the return type annotation is consistent with the class's method signature style.


403-403: LGTM: Updated return signature to include k component.

The return signature correctly includes the third component for consistency with the new k mapping tensor.


408-415: LGTM: Well-implemented k component calculation.

The k component calculation follows a logical pattern:

  • Uses the relationship kk = 1 - rmin * s
  • Applies cubic transformation: k = -kk³ + 1
  • Clips values to [0, 1] range
  • Maintains consistent tensor reshaping

The implementation is mathematically sound and consistent with the existing codebase patterns.


426-426: LGTM: Consistent integration of k component gradients.

The k component and its gradients are properly integrated into the gradient calculation pipeline, following the same pattern as s and h components.

Also applies to: 433-435


460-462: LGTM: Improved numerical precision consistency.

Using GLOBAL_NP_FLOAT_PRECISION for array initialization ensures consistent floating-point precision throughout the computation pipeline.

Also applies to: 466-468


480-482: LGTM: Proper initialization of k component values.

The k component values are properly initialized to zero for boundary conditions, consistent with the s and h component handling.

Also applies to: 490-492


494-496: LGTM: Version-specific handling for consistency.

The special handling for version 1 to set s[tt][0] = 0 ensures consistent behavior across different versions.


555-556: LGTM: Consistent parameter scaling.

The parameter changes are well-coordinated:

  • N increased from 4096 to 8192 (2x)
  • N2 increased from 16 to 32 (2x)
  • Warning threshold increased from 16.0 to 32.0 (2x)

This maintains the same precision ratio while supporting higher resolution.

Also applies to: 559-559


568-574: LGTM: Consistent array initialization with explicit precision.

The array initialization uses GLOBAL_NP_FLOAT_PRECISION for consistency, and the logic correctly handles the updated parameter values.

Also applies to: 578-582


154-154: Verify memory and performance impact of doubled s2g intervals

We’ve updated the second interval in cfg_s2g from

[s[0], s[4096], s[16] - s[0], 256, 512]

to

[s[0], s[8192], s[32] - s[0], 256, 512]

which doubles the resolution for the s→g mapping. While tests in
source/tests/tf/test_nvnmd_entrypoints.py (lines 143/592) now expect indices up to 8192, please confirm:

• deepmd/tf/nvnmd/entrypoints/mapt.py (around line 152) – that build_map_coef and mapping2 still perform within acceptable memory and runtime limits with the finer grid
• That generated outputs match the newly updated expected values in the entrypoint tests
• No unintended regressions in other downstream pipelines relying on s→g mappings

type_embedding_nei = tf.tile(
-    tf.reshape(type_embedding, [1, padding_ntypes, -1]),
+    tf.reshape(
+        type_embedding, [1, padding_ntypes, -1], dtype=GLOBAL_TF_FLOAT_PRECISION
Contributor

⚠️ Potential issue

Fix undefined GLOBAL_TF_FLOAT_PRECISION constant.

The code uses GLOBAL_TF_FLOAT_PRECISION but it's not imported, causing static analysis errors. This will result in runtime errors.

Add the missing import at the top of the file:

from deepmd.tf.env import (
    GLOBAL_NP_FLOAT_PRECISION,
+   GLOBAL_TF_FLOAT_PRECISION,
    op_module,
    tf,
)

Alternatively, if this constant doesn't exist in the codebase, replace the usages with the appropriate TensorFlow precision type (e.g., tf.float64 or tf.float32).

Also applies to: 621-621, 631-631

🧰 Tools
🪛 Ruff (0.11.9)

615-615: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)

🪛 Flake8 (7.2.0)

[error] 615-615: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)

🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/mapt.py at lines 615, 621, and 631, the constant
GLOBAL_TF_FLOAT_PRECISION is used but not defined or imported, causing errors.
Fix this by either importing GLOBAL_TF_FLOAT_PRECISION from its module at the
top of the file if it exists, or replace all occurrences of
GLOBAL_TF_FLOAT_PRECISION with the appropriate TensorFlow dtype such as
tf.float32 or tf.float64 to ensure the code runs correctly.

Comment on lines +95 to +97
# 2 : xyz_scatter = xyz_scatter * two_embd * recovered_switch + xyz_scatter;
# Gs + 0, Gt + 0
self.Gs_Gt_mode = 2
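The mode-2 update quoted above is a simple gated residual. A standalone NumPy sketch (function and variable names are illustrative, not the deepmd implementation):

```python
import numpy as np

def apply_two_side_embedding_mode2(xyz_scatter, two_embd, recovered_switch):
    """Mode 2: xyz_scatter = xyz_scatter * two_embd * recovered_switch + xyz_scatter.

    When recovered_switch -> 0 the type-embedding contribution vanishes and
    the plain xyz_scatter passes through; when it -> 1 the full two-sided
    embedding is applied.
    """
    return xyz_scatter * two_embd * recovered_switch + xyz_scatter
```

Because the update reduces to the identity at `recovered_switch = 0`, both `shift_Gs` and `shift_Gt` can be 0 in this mode, matching the comment.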
Contributor

💡 Verification agent

🧩 Analysis chain

Verify the impact of changing Gs_Gt_mode default from 1 to 2.

The default mode has been changed from 1 to 2, which affects the shift calculations (shift_Gs and shift_Gt both become 0). This is a significant behavioral change that should be verified.

Also applies to: 102-104


🏁 Script executed:

#!/bin/bash
# Description: Search for any references to Gs_Gt_mode in tests or documentation
# Expected: Find usage patterns and test expectations for the new default

echo "=== Searching for Gs_Gt_mode usage ==="
rg -A 5 -B 5 "Gs_Gt_mode"

echo "=== Searching for shift_Gs and shift_Gt usage ==="
rg -A 3 -B 3 "shift_G[st]"

Length of output: 3729


Action Required: Document and Test the New Default Gs_Gt_mode=2

The change of the default from mode 1→2 is not exercised by existing tests (they all set mapObj.Gs_Gt_mode = 2 explicitly) nor documented, so downstream users may be unaware or untested against the new behavior. Please:

  • deepmd/tf/nvnmd/entrypoints/mapt.py (around line 97): update any in-code comments or docstrings to reflect that the default is now 2.
  • source/tests/tf/test_nvnmd_entrypoints.py: add a unit test for MapTable.build_map() without overriding Gs_Gt_mode, to verify shift_Gs = shift_Gt = 0 and the expected output.
  • CLI entrypoints or configuration loaders that instantiate MapTable without manually setting Gs_Gt_mode: confirm they still behave correctly under mode 2 or expose a flag to select the mode.
  • Documentation/README: note the behavioral change in the release notes or API reference.
🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/mapt.py around lines 95 to 97, update the in-code
comments and any relevant docstrings to clearly state that the default value of
Gs_Gt_mode is now 2 instead of 1. In source/tests/tf/test_nvnmd_entrypoints.py,
add a unit test for MapTable.build_map() that does not override Gs_Gt_mode,
verifying that shift_Gs and shift_Gt are both 0 and the output matches
expectations under mode 2. Review CLI entrypoints and configuration loaders that
create MapTable instances without setting Gs_Gt_mode explicitly to ensure they
function correctly with the new default or add an option to select the mode.
Finally, update the project documentation or release notes to document this
default behavior change for downstream users.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

95-97: Document the behavioral change from mode 1 to mode 2.

The default Gs_Gt_mode has been changed from 1 to 2, which affects the shift calculations (both shift_Gs and shift_Gt become 0). This is a significant behavioral change that should be documented.

🧹 Nitpick comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

607-609: Improve type casting clarity.

The double casting tf.cast(..., filter_precision) where the inner argument already has dtype=filter_precision is redundant and could be simplified.

-last_type = tf.cast(
-    tf.zeros([1, ebd_type.shape[1]], dtype=filter_precision), filter_precision
-)
+last_type = tf.zeros([1, ebd_type.shape[1]], dtype=filter_precision)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dfb1fa and e20b2d4.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (12 hunks)
🧰 Additional context used
🧠 Learnings
deepmd/tf/nvnmd/entrypoints/mapt.py (1)
Learnt from: HydrogenSulfate
PR: deepmodeling/deepmd-kit#4414
File: deepmd/pd/train/training.py:66-66
Timestamp: 2024-11-29T12:15:22.226Z
Learning: The function `nvprof_context` is defined in `deepmd/pd/utils/utils.py`, so importing it in `deepmd/pd/train/training.py` is correct.
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)

🔇 Additional comments (7)
deepmd/tf/nvnmd/entrypoints/mapt.py (7)

10-10: Good addition for precision consistency.

Adding GLOBAL_NP_FLOAT_PRECISION import ensures consistent floating-point precision throughout the numpy operations.


154-154: Confirm the s2g mapping resolution increase.

The configuration parameter has been updated from s[16] to s[32], doubling the resolution. This aligns with the N2 increase from 16 to 32 in the run_s2g method.


408-415: Review the k tensor formula implementation.

The k tensor implementation uses the formula k = -kk³ + 1 where kk = 1 - rmin * s. This appears to be a recovery/switch function. The clipping to [0.0, 1.0] range is appropriate.


559-560: Document the warning threshold increase.

The warning threshold for the range of s has been increased from 16.0 to 32.0, which aligns with the doubled resolution. This change should be documented in the code comments or docstring.


460-462: Excellent use of GLOBAL_NP_FLOAT_PRECISION for consistency.

The explicit use of GLOBAL_NP_FLOAT_PRECISION in numpy array creation ensures numerical precision consistency throughout the codebase.

Also applies to: 466-468, 571-574, 578-581


555-557: Assess s2g Resolution Performance and Memory Impact

I didn’t find any existing benchmarks or memory-usage tests covering the change from N=4096→8192 and N2=16→32. Doubling the mapping resolution will increase both compute time and memory footprint, so please:

  • Manually verify performance (throughput, latency) on representative workloads.
  • Measure peak memory usage on target hardware.
  • Add or update any performance tests or benchmarks in source/tests/.
  • Document the expected impact on resource requirements in your README or developer guide.

File to review:

  • deepmd/tf/nvnmd/entrypoints/mapt.py (lines 555–557)

140-148: k tensor mapping verified

Downstream components correctly integrate the new k and k_grad tensors using the same patterns as s and h. No further changes needed.

• deepmd/tf/nvnmd/descriptor/se_atten.py – handles k, applies flt_nvnmd(k) and shapes it alongside s and h.
• deepmd/tf/nvnmd/entrypoints/wrap.py – concatenates maps["k"] and maps["k_grad"] when assembling descriptors for version > 0.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

95-97: Document and verify the new default mode behavior.

The default Gs_Gt_mode has been changed from 1 to 2, which is a significant behavioral change. The comments clearly explain the new mode, but this change should be verified for backward compatibility.

🧹 Nitpick comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

149-149: Fix comment formatting style issue.

The static analysis tool reports excessive leading '#' characters for the block comment. Consider using a standard comment format.

Apply this fix:

-        ## u2s
+        # u2s
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e20b2d4 and 9e0e6a8.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (12 hunks)
🧰 Additional context used
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)

🔇 Additional comments (17)
deepmd/tf/nvnmd/entrypoints/mapt.py (17)

10-10: LGTM: Import addition is correct.

The import of GLOBAL_NP_FLOAT_PRECISION is properly added and used consistently throughout the file for numerical precision.


81-81: LGTM: Removed unnecessary return type annotation.

The return type annotation removal from __init__ is appropriate since constructors don't return meaningful values.


102-104: LGTM: Shift values correctly set for mode 2.

The shift values (shift_Gs = 0, shift_Gt = 0) are correctly set to match the documented behavior for mode 2.


140-148: LGTM: 'k' component mapping added consistently.

The addition of the 'k' component mapping follows the same pattern as the existing 's' and 'h' components, maintaining consistency in the codebase.


154-154: LGTM: s2g mapping intervals doubled.

The s2g mapping configuration intervals have been doubled (4096→8192, 16→32), which should provide higher precision mapping.


206-206: LGTM: Improved precision consistency.

Using GLOBAL_NP_FLOAT_PRECISION for dtype ensures numerical precision consistency throughout the codebase.


262-262: LGTM: Removed unnecessary return type annotation.

The return type annotation removal is appropriate since the method has no implementation (just pass).


408-415: LGTM: 'k' component calculation added correctly.

The 'k' component calculation is mathematically sound and follows the same pattern as the existing 's' and 'h' components, including proper reshaping and clipping.


426-426: LGTM: Updated to handle 'k' component.

The function call correctly unpacks three components including the new 'k' component, maintaining consistency with the build_u2s changes.


433-435: LGTM: 'k' gradient calculation added consistently.

The gradient calculation for the 'k' component follows the same pattern as the existing 's' and 'h' gradient calculations.


460-462: LGTM: Improved precision and added clarifying comment.

Using GLOBAL_NP_FLOAT_PRECISION ensures consistency, and the comment clarifies why N+1 is used for calculating differences.


466-470: LGTM: Improved reference calculation with better precision.

Using np.float64 for dtype and updating the comment improves code clarity and maintains numerical precision.


480-496: LGTM: 'k' component initialization added consistently.

The initialization of 'k' component values follows the same pattern as 's' and 'h' components, properly handling edge cases and version-specific logic.


555-559: LGTM: s2g mapping parameters doubled for higher precision.

The mapping parameters have been consistently doubled (N: 4096→8192, N2: 16→32, warning threshold: 16.0→32.0), which should provide improved mapping precision.


568-574: LGTM: Improved precision and formatting.

Using GLOBAL_NP_FLOAT_PRECISION ensures consistency, and the improved formatting enhances code readability.


578-582: LGTM: Reference calculation improved with better precision.

Using np.float64 for dtype and improved formatting enhances both precision and readability.


607-609: LGTM: Explicit dtype specification added.

Adding explicit dtype specification to tf.zeros and casting to filter_precision improves precision control and consistency.

             sl.append(s)
             hl.append(h)
-            return sl, hl
+            return sl, hl, sl
Contributor

⚠️ Potential issue

Fix potential bug: 'sl' returned twice instead of third component.

The return statement return sl, hl, sl returns sl twice, which appears to be a bug. Based on the version 1 implementation that follows, this should return three distinct components.

Consider this fix:

-            return sl, hl, sl
+            return sl, hl, kl  # or appropriate third component

You'll need to implement the third component (kl) similar to how k is handled in the version 1 case.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/mapt.py at line 403, the return statement
mistakenly returns 'sl' twice instead of returning three distinct components. To
fix this, compute the third component 'kl' similarly to how 'k' is handled in
the version 1 case, and update the return statement to return 'sl', 'hl', and
'kl' respectively.

@jiongwalai jiongwalai requested a review from njzjz July 17, 2025 06:21
Signed-off-by: jiongwalai <89775787+jiongwalai@users.noreply.github.com>
Signed-off-by: jiongwalai <89775787+jiongwalai@users.noreply.github.com>
Copilot AI review requested due to automatic review settings August 14, 2025 16:19
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
deepmd/main.py (1)

863-869: LGTM! Implementation is correct and consistent.

The new --init-frz-model option is properly implemented with appropriate type, default value, and help text. It follows the existing pattern in the train-nvnmd parser and is consistent with the similar option in the train parser.

Optional suggestion for future improvement:

Consider grouping --init-model, --restart, and --init-frz-model into a mutually exclusive group (similar to the train parser at lines 240-268) to prevent users from accidentally passing conflicting initialization options. This would improve the CLI UX:

+    parser_train_nvnmd_subgroup = parser_train_nvnmd.add_mutually_exclusive_group()
-    parser_train_nvnmd.add_argument(
+    parser_train_nvnmd_subgroup.add_argument(
         "-i",
         "--init-model",
         type=str,
         default=None,
         help="Initialize the model by the provided path prefix of checkpoint files.",
     )
-    parser_train_nvnmd.add_argument(
+    parser_train_nvnmd_subgroup.add_argument(
         "-r",
         "--restart",
         type=str,
         default=None,
         help="Restart the training from the provided prefix of checkpoint files.",
     )
-    parser_train_nvnmd.add_argument(
+    parser_train_nvnmd_subgroup.add_argument(
         "-f",
         "--init-frz-model",
         type=str,
         default=None,
         help="Initialize the training from the frozen model.",
     )

Note: This suggestion applies to the broader train-nvnmd parser design, not just the newly added option.
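The mutually-exclusive-group suggestion above can be demonstrated standalone with `argparse`. This is a sketch mirroring the flag names from the diff, not the actual deepmd parser:

```python
import argparse

def build_parser():
    """Minimal parser where the three init sources exclude each other."""
    parser = argparse.ArgumentParser(prog="dp train-nvnmd")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-i", "--init-model", type=str, default=None,
                       help="Initialize from checkpoint files.")
    group.add_argument("-r", "--restart", type=str, default=None,
                       help="Restart from checkpoint files.")
    group.add_argument("-f", "--init-frz-model", type=str, default=None,
                       help="Initialize from a frozen model.")
    return parser
```

Passing any two of the flags together makes `argparse` exit with a "not allowed with" error instead of silently picking one, which is the UX improvement the comment is after.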

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61e7f63 and 0ffdb5c.

📒 Files selected for processing (1)
  • deepmd/main.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Always run ruff check . and ruff format . before committing changes to Python code

Files:

  • deepmd/main.py

4 participants