
Conversation


@jiongwalai jiongwalai commented Jun 11, 2025

Update NvNMD training code:

  1. support different FPGA devices: vu9p (Bohrium and NvNMD web) and vu13p (coming soon)
  2. set "stripped_type_embedding": true, "smooth_type_embdding": true, "set_davg_zero": false in nvnmd-v1
  3. support setting the seed in the nvnmd dict

Summary by CodeRabbit

  • New Features

    • CLI option to initialize training from a frozen model; device selection support (vu9p/vu13p).
  • New/Changed Behavior

    • Descriptor recovery pipeline updated with an additional descriptor component and propagated recovery mask; mapping mode adjusted.
  • Improvements

    • Deep-copying of configs/data, explicit numeric precision, seed propagation, finer mapping resolution, version-aware data handling and packing.
  • Bug Fixes

    • Updated tests and expected outputs to match revised behavior.
  • Chores

    • Logging, validation and minor cleanups.

Contributor

coderabbitai bot commented Jun 11, 2025

📝 Walkthrough

Adds a train CLI flag to initialize from a frozen model; replaces descriptor quantization handling with a recovery pipeline (introducing a new k component and a recovered_switch); extends MapTable to support k and a higher-resolution s2g mapping; introduces device- and version-aware NVNMD configuration with deep-copy semantics; makes a minor type-embedding flag change; and updates tests to the new expected outputs.

Changes

Cohort / File(s) Summary
CLI / Entrypoints
deepmd/main.py, deepmd/tf/nvnmd/entrypoints/train.py
Added -f/--init-frz-model CLI arg and propagated init_frz_model into train_nvnmd; replaced some dict.copy() with copy.deepcopy() when preparing training/freeze job data.
Descriptor: recovery & filtering
deepmd/tf/nvnmd/descriptor/se_atten.py, deepmd/tf/descriptor/se_atten.py
Removed descrpt2r4; added build_recovered and descrpt2shkr producing recovered descriptor + k/recovered_switch; filter_lower_R42GR signature extended to accept recovered_switch and applies it to two-body terms; se_atten now calls build_recovered under quantize+enable conditions and forwards the switch.
Mapping / MapTable changes
deepmd/tf/nvnmd/entrypoints/mapt.py
Added k and k_grad to u2s and gradient builders, increased s2g resolution (N→8192, N2→32), changed default Gs_Gt_mode to 2 and adjusted shift logic, and enforced GLOBAL_NP_FLOAT_PRECISION for numpy arrays.
Version/device config & deep copies
deepmd/tf/nvnmd/data/data.py, deepmd/tf/nvnmd/utils/config.py, deepmd/tf/nvnmd/entrypoints/wrap.py
Replaced shallow copies with copy.deepcopy(), added/propagated device (vu9p/vu13p), adjusted dscp/ctrl fields (seed, M3, NSTEP, NSTDM, SUB_VERSION, NBIT_NSTEP, set_prefix, descriptor flags), made shape/packing changes for v1 (including k/k_grad), and ensured explicit numeric dtypes.
Type embed minor change
deepmd/tf/utils/type_embed.py
Sets self.use_tebd_bias = True when nvnmd_cfg.enable is true.
Argument definitions
deepmd/utils/argcheck_nvnmd.py
Added device argument to nvnmd_args (str, default "none", doc mentions vu9p/vu13p).
Tests
source/tests/tf/test_nvnmd_entrypoints.py
Updated test setup to concatenate map parts and adjusted numerous expected numeric outputs (mappings, descriptors, energies, byte arrays) to match new behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Inputs as Descriptor Inputs
  participant Build as build_recovered
  participant SE as DescrptSeAtten._pass_filter
  participant Filter as filter_lower_R42GR
  participant Next as Downstream Embedding

  Note over Inputs,Build: When nvnmd_cfg.enable && quantize_descriptor
  Inputs->>Build: descrpt, avg/std, params
  Build-->>SE: recovered_descrpt, recovered_switch
  SE->>Filter: inputs_i, nei_type_vec, recovered_switch
  Filter-->>Next: filtered_embedding
  Note over Filter,Next: recovered_switch masks two-body contributions before filtering

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay attention to descriptor recovery math and shapes in deepmd/tf/nvnmd/descriptor/se_atten.py.
  • Verify MapTable k/gradient integration and s2g resolution changes in deepmd/tf/nvnmd/entrypoints/mapt.py.
  • Ensure deep-copy and device logic consistency across data.py, config.py, and wrap code paths.

Possibly related PRs

  • fix: consistent DPA-1 model #4320 — High overlap: also modifies the descriptor pipeline replacing descrpt2r4 with a recovered-descriptor flow and integrating recovered_switch.

Suggested reviewers

  • njzjz
  • wanghan-iapcm

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name: Title check
Status: ❓ Inconclusive
Explanation: The pull request title "update NvNMD training code" is generic and does not clearly convey the main changes. While it references NvNMD, it lacks specificity about what was updated: the raw summary shows substantial changes across multiple files, including support for different FPGA devices (vu9p/vu13p), descriptor recovery mechanisms, mapping table updates, and new configuration defaults.
Resolution: Consider a more specific title, such as "Add FPGA device support and descriptor recovery to NvNMD training" or "Support device-specific configuration and quantized descriptor recovery in NvNMD", to better communicate the main objectives of the changeset in the commit history.
✅ Passed checks (1 passed)
Check name: Description check
Status: ✅ Passed
Explanation: Check skipped because CodeRabbit's high-level summary is enabled.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🔭 Outside diff range comments (1)
deepmd/main.py (1)

243-248: 🛠️ Refactor suggestion

CLI breaking change: -f short option repurposed

-f used to map to --finetune; it now maps to --init-frz-model, while --finetune is moved to -t.
Existing user scripts will break silently.

Consider either:

  1. Keeping the original mapping and adding a new short flag (e.g. -F) for init-frz-model, or
  2. Providing both aliases (-f & -t) for --finetune and issuing a deprecation warning for -f.

This guards backwards compatibility.
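Option 2 above can be sketched with a custom argparse action that keeps `-f` working but warns; the action and flag wiring here are illustrative, not the actual deepmd-kit code:

```python
import argparse
import warnings


class DeprecatedAliasAction(argparse.Action):
    """Store the value, warning when the deprecated -f spelling is used."""

    def __call__(self, parser, namespace, values, option_string=None):
        if option_string == "-f":
            warnings.warn(
                "-f is deprecated for --finetune; use -t or --finetune",
                FutureWarning,
                stacklevel=2,
            )
        setattr(namespace, self.dest, values)


parser = argparse.ArgumentParser()
# both short spellings map to --finetune; only -f triggers the warning
parser.add_argument(
    "-t", "-f", "--finetune", dest="finetune", action=DeprecatedAliasAction
)
parser.add_argument("--init-frz-model", dest="init_frz_model")

args = parser.parse_args(["-t", "pretrained.pb"])
```

Existing scripts that pass `-f` keep working, and the warning gives users a release cycle to migrate.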

🧹 Nitpick comments (7)
deepmd/main.py (1)

781-787: Duplicate option definition – risk of diverging behaviour

The same --init-frz-model / -f flag is re-defined for the train-nvnmd sub-parser.
Keeping two independent definitions invites accidental drift (e.g. help text, default).

Refactor into a small helper that registers the shared mutually-exclusive group for both sub-parsers, ensuring consistency in one place.
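Such a helper could look like the following; the option names and group contents are hypothetical stand-ins for the real sub-parser setup:

```python
import argparse


def add_init_model_group(parser: argparse.ArgumentParser) -> None:
    """Register the mutually exclusive init options once, shared by sub-parsers."""
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-i", "--init-model", type=str,
                       help="initialize from a checkpoint")
    group.add_argument("-f", "--init-frz-model", type=str,
                       help="initialize from a frozen model")


root = argparse.ArgumentParser()
sub = root.add_subparsers(dest="command")
# both sub-parsers get identical flags from the single helper
for name in ("train", "train-nvnmd"):
    add_init_model_group(sub.add_parser(name))

args = root.parse_args(["train-nvnmd", "-f", "frozen.pb"])
```

Help text, defaults, and the mutual-exclusion rule now live in one place, so the two sub-parsers cannot drift apart.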

deepmd/tf/nvnmd/entrypoints/train.py (1)

70-75: Seed injection is duplicated – centralise to avoid divergence

seed is picked from jdata_nvnmd_ twice. Extract once into a local variable and reuse:

-        "descriptor": {
-            "seed": jdata_nvnmd_.get("seed", 1),
+        seed = jdata_nvnmd_.get("seed", 1)
+        "descriptor": {
+            "seed": seed,
...
-        "fitting_net": {"seed": jdata_nvnmd_.get("seed", 1)},
+        "fitting_net": {"seed": seed},

Reduces chance of the two defaults diverging in future edits.

deepmd/tf/nvnmd/entrypoints/wrap.py (3)

140-141: Consider keeping f-string formatting for better readability.

The change from f-strings to % formatting is a step backward in terms of Python best practices. F-strings are more readable and performant.

-                log.info("%s: %d x % d bit" % (k, h, w * 4))
-                FioTxt().save("nvnmd/wrap/h%s.txt" % (k), d)
+                log.info(f"{k}: {h} x {w * 4} bit")
+                FioTxt().save(f"nvnmd/wrap/h{k}.txt", d)

469-473: Specify explicit dtype for numpy arrays.

For clarity and consistency, specify the dtype explicitly when creating numpy arrays.

-            nrs = np.zeros(nr)
-            ncs = np.zeros(nc)
-            wrs = np.zeros([nr, nc])
-            wcs = np.zeros([nr, nc])
+            nrs = np.zeros(nr, dtype=GLOBAL_NP_FLOAT_PRECISION)
+            ncs = np.zeros(nc, dtype=GLOBAL_NP_FLOAT_PRECISION)
+            wrs = np.zeros([nr, nc], dtype=GLOBAL_NP_FLOAT_PRECISION)
+            wcs = np.zeros([nr, nc], dtype=GLOBAL_NP_FLOAT_PRECISION)

548-549: Remove commented debug code.

Commented debug code should be removed to keep the codebase clean.

-        # k = 2**23
-        # print(dsws[0][42] * k)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

455-455: Fix typo in comment.

-# N+1 ranther than N for calculating defference
+# N+1 rather than N for calculating difference
deepmd/tf/nvnmd/utils/config.py (1)

170-177: Remove redundant logging statements.

There are three consecutive log statements for self.config["dscp"] which seems excessive. Consider keeping only one or making them conditional on debug level.

-log.info(self.config["dscp"])
 dp_in = {"type_map": fioObj.get(jdata, "type_map", [])}
 self.config["dpin"] = fioObj.update(dp_in, self.config["dpin"])
 #
-log.info(self.config["dscp"])
 self.init_net_size()
 self.init_value()
 log.info(self.config["dscp"])
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6042c6 and 0e723bb.

📒 Files selected for processing (11)
  • deepmd/main.py (1 hunks)
  • deepmd/tf/descriptor/se_atten.py (4 hunks)
  • deepmd/tf/nvnmd/data/data.py (13 hunks)
  • deepmd/tf/nvnmd/descriptor/se_atten.py (4 hunks)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (11 hunks)
  • deepmd/tf/nvnmd/entrypoints/train.py (5 hunks)
  • deepmd/tf/nvnmd/entrypoints/wrap.py (12 hunks)
  • deepmd/tf/nvnmd/utils/config.py (18 hunks)
  • deepmd/tf/utils/type_embed.py (1 hunks)
  • deepmd/utils/argcheck_nvnmd.py (2 hunks)
  • source/tests/tf/test_nvnmd_entrypoints.py (9 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
deepmd/tf/descriptor/se_atten.py (1)
deepmd/tf/nvnmd/descriptor/se_atten.py (1)
  • build_recovered (79-123)
deepmd/tf/nvnmd/data/data.py (1)
data/raw/copy_raw.py (1)
  • copy (11-71)
deepmd/tf/nvnmd/entrypoints/mapt.py (4)
deepmd/tf/nvnmd/utils/config.py (1)
  • init_from_jdata (55-81)
deepmd/tf/utils/sess.py (1)
  • run_sess (12-50)
deepmd/tf/infer/deep_eval.py (2)
  • sess (311-314)
  • sess (1223-1226)
deepmd/utils/data.py (1)
  • avg (379-394)
🪛 Ruff (0.11.9)
deepmd/tf/nvnmd/descriptor/se_atten.py

202-202: Local variable M3 is assigned to but never used

Remove assignment to unused variable M3

(F841)

deepmd/tf/nvnmd/utils/config.py

55-55: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🔇 Additional comments (21)
deepmd/tf/nvnmd/entrypoints/train.py (2)

156-160: Deep-copying template dicts – good, but templates should be immutable

Using copy.deepcopy avoids side-effects – nice!
Consider making jdata_cmd_train / jdata_cmd_freeze constants (UPPER_CASE) and never mutating them; deep-copy whenever you need a writable version.

Also applies to: 193-197


147-151:

✅ Verification successful

Passing init_frz_model to generic train() may raise TypeError

jdata_cmd_train now contains init_frz_model, and the dict is forwarded verbatim:

train(**jdata)

If deepmd.tf.entrypoints.train.train() has not been updated to accept this kw-arg, runtime will fail.

Expected: parameter list includes init_frz_model or **kwargs.
If absent, either update the signature or remove the key before the call.


🏁 Script executed:

#!/bin/bash
# Verify `train()` signature contains the new parameter
rg -n --context 2 $'def +train\\(' deepmd/tf/entrypoints | head

Length of output: 269


🏁 Script executed:

#!/bin/bash
# Show the full signature of train() to verify accepted parameters
sed -n '50,100p' deepmd/tf/entrypoints/train.py

Length of output: 1516


No action needed: train() already accepts init_frz_model

The train() signature in deepmd/tf/entrypoints/train.py (lines 55–68) includes both init_frz_model and **kwargs, so forwarding init_frz_model via train(**jdata) will not raise a TypeError.

source/tests/tf/test_nvnmd_entrypoints.py (1)

475-489: Updated golden numbers only – ensure underlying logic is unchanged

Large blocks of expected values were replaced without explanation in the test docstrings.
Verify that they stem from intentional algorithmic changes and not from an unnoticed regression or random seed drift.

Also applies to: 791-805, 576-704

deepmd/tf/nvnmd/entrypoints/wrap.py (2)

519-546: Good implementation of k component handling.

The addition of the k component to the mapping tables is implemented correctly, and the use of copy.deepcopy prevents reference issues when appending to lists. The kkk factor appropriately handles type boundaries.


566-566: Dimension update correctly reflects the addition of k component.

The change from 2 to 3 dimensions properly accounts for the new k component alongside the existing s and h components.

deepmd/tf/descriptor/se_atten.py (1)

719-731: Proper integration of the new descriptor recovery mechanism.

The conditional block correctly checks for NVNMD quantization before calling build_recovered, and the returned recovered_switch is appropriately stored for later use in filtering.

deepmd/tf/nvnmd/descriptor/se_atten.py (2)

79-124: Well-implemented descriptor recovery function.

The build_recovered function properly normalizes the s and h components, applies filtering operations, and returns both the normalized descriptor and the k component as a smoothing switch. The use of tf.ensure_shape provides good runtime validation.


241-252: Verify the formula change is intentional.

The computation has changed from G = Gs * Gt to G = Gs * Gt + Gs. This adds a residual connection which may be intentional, but please confirm this mathematical change is correct for the NVNMD algorithm.

The modulation of two_embd by recovered_switch is correctly implemented for smoothing.
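The two formulas can be contrasted numerically; this is a plain NumPy sketch of the expressions under discussion with arbitrary shapes, not the TensorFlow implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
Gs = rng.random((4, 8))      # two-body embedding
Gt = rng.random((4, 8))      # type-embedding factor
switch = np.zeros((4, 1))    # recovered_switch at the cutoff boundary

# old formula: G = Gs * Gt
G_old = Gs * Gt

# new formula: the two-body term is masked by the switch, and the
# residual Gs term means G falls back to Gs alone as switch -> 0
G_new = Gs * (Gt * switch) + Gs
```

With the switch at zero the type-embedding contribution vanishes smoothly instead of cutting off abruptly, which is the apparent motivation for the residual term.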

deepmd/tf/nvnmd/data/data.py (6)

1-1: Good use of deep copy for configuration data isolation.

The change from shallow copy to deep copy ensures that nested dictionaries are properly isolated, preventing unintended mutations when configurations are modified. This is particularly important for configuration data that contains nested structures.

Also applies to: 125-127, 261-263, 335-336, 404-405


15-15: Seed parameters correctly added for reproducibility.

The addition of "seed": 1 to descriptor and fitting network configurations ensures reproducible training runs, which aligns with the PR objectives.

Also applies to: 45-45, 149-149, 179-179


158-158: Verify the device-specific parameter adjustments.

The following changes were made:

  • Added "M3": 2 parameter to v1 configurations
  • Incremented SUB_VERSION from 1 to 2
  • Reduced NSTDM from 128 to 64 and NSTDM_M2 from 4 to 2 in v1_ni256
  • Added "NBIT_NSTEP": 8

Please confirm these parameter adjustments are correct for the vu9p device configuration and document the rationale for these specific values.

Also applies to: 210-210, 266-268, 273-273, 253-253


294-294: Device field correctly added for FPGA support.

The addition of "device": "vu9p" supports the PR objective of enabling different FPGA devices (vu9p and vu13p).

Also applies to: 363-363


344-346: Descriptor configuration properly updated with granular controls.

The replacement of "tebd_input_mode": "strip" with three boolean flags ("stripped_type_embedding": True, "smooth_type_embdding": True, "set_davg_zero": False) provides more precise control over the embedding behavior, as specified in the PR objectives.


331-331: Training data configuration correctly extended.

The addition of "set_prefix": "set" to the training data configuration is appropriate for dataset organization.

Also applies to: 400-400

deepmd/tf/nvnmd/entrypoints/mapt.py (3)

94-104: New mode 2 implementation for recovered switch handling.

The change from Gs_Gt_mode = 1 to Gs_Gt_mode = 2 introduces a new calculation mode that uses xyz_scatter * two_embd * recovered_switch + xyz_scatter with zero shifts for both Gs and Gt. This appears to integrate with the new k tensor (recovered switch) functionality.


407-414: New k tensor (recovered switch) correctly implemented.

The implementation adds a new mapping component k calculated as k = -kk^3 + 1 where kk = 1 - rmin * s, clipped to [0,1]. This recovered switch function and its gradients are properly integrated into the mapping table structure alongside the existing s and h components.

Also applies to: 139-147
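The switch described above is easy to check in isolation. A NumPy sketch of the stated formula (the function name is ours; rmin stands in for the configured inner-cutoff parameter):

```python
import numpy as np


def recovered_switch(s: np.ndarray, rmin: float) -> np.ndarray:
    """k = -kk^3 + 1 with kk = 1 - rmin * s, clipped to [0, 1]."""
    kk = 1.0 - rmin * s
    return np.clip(-kk**3 + 1.0, 0.0, 1.0)


s = np.linspace(0.0, 2.0, 201)
k = recovered_switch(s, rmin=1.0)
```

The cubic rises monotonically from k = 0 at s = 0 to k = 1 at s = 1/rmin and the clip holds it at 1 beyond that, so the switch turns the two-body term on smoothly.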


550-551: Mapping resolution appropriately increased for finer control.

The s2g mapping resolution has been doubled (N: 4096→8192, N2: 16→32) and the warning threshold increased from 16.0 to 32.0. These changes provide finer granularity in the mapping table, which should improve accuracy for the new device configurations.

Also applies to: 153-153, 554-555

deepmd/tf/nvnmd/utils/config.py (4)

47-47: Device-specific configuration properly implemented.

The code correctly handles device-specific parameters for "vu9p" and "vu13p" devices:

  • NSTEP values are set based on sel thresholds (128, 160, 192, 224, 256)
  • NSTDM parameters differ between devices (vu13p: 32, vu9p: 64)

Please verify these device-specific values are correct for the respective FPGA hardware.

Also applies to: 61-61, 219-247, 286-295


366-381: Sel parameter validation and rounding correctly implemented.

The code properly validates and rounds the sel parameter to hardware-supported values (128, 160, 192, 224, 256) for version 1 configurations. This ensures compatibility with FPGA implementations.
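A minimal sketch of that rounding, assuming round-up-to-nearest-supported semantics (the exact rule in config.py may differ):

```python
SUPPORTED_SEL = (128, 160, 192, 224, 256)


def round_sel(sel: int) -> int:
    """Round sel up to the nearest FPGA-supported neighbor count."""
    for cap in SUPPORTED_SEL:
        if sel <= cap:
            return cap
    raise ValueError(
        f"sel={sel} exceeds the maximum supported value {SUPPORTED_SEL[-1]}"
    )
```

Any requested sel lands on a hardware-valid bucket, and an out-of-range request fails loudly rather than silently truncating.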


2-2: Deep copy usage ensures proper configuration isolation.

The consistent use of copy.deepcopy() for configuration dictionaries prevents unintended mutations when configurations are modified, maintaining data integrity across different configuration variants.

Also applies to: 50-50, 140-144, 149-153, 438-438


355-356: Warning threshold correctly updated to match mapping changes.

The increase from 16.0 to 32.0 is consistent with the expanded mapping resolution in mapt.py and allows for a wider valid range of switch function values.

Comment on lines +190 to +191
if nvnmd_cfg.enable:
self.use_tebd_bias = True
Contributor


🛠️ Refactor suggestion

Implicitly overriding use_tebd_bias may break caller expectations

self.use_tebd_bias is force-set to True whenever nvnmd_cfg.enable is True, regardless of the value passed in the constructor.
This silently discards a user-supplied use_tebd_bias=False, which is surprising and makes the public parameter meaningless as soon as NVNMD is on.

-if nvnmd_cfg.enable:
-    self.use_tebd_bias = True
+if nvnmd_cfg.enable and not self.use_tebd_bias:
+    log.debug("nvnmd enabled → enabling type-embedding bias")
+    self.use_tebd_bias = True

At minimum, emit a warning or honour the original flag unless NVNMD strictly requires the bias.
Document the behaviour in the class docstring as well.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if nvnmd_cfg.enable:
self.use_tebd_bias = True
if nvnmd_cfg.enable and not self.use_tebd_bias:
log.debug("nvnmd enabled → enabling type-embedding bias")
self.use_tebd_bias = True
🤖 Prompt for AI Agents
In deepmd/tf/utils/type_embed.py around lines 190 to 191, the code forcibly sets
self.use_tebd_bias to True when nvnmd_cfg.enable is True, ignoring any
user-provided value for use_tebd_bias. To fix this, modify the logic to either
honor the original use_tebd_bias value unless NVNMD strictly requires it to be
True, or emit a warning when overriding it. Additionally, update the class
docstring to clearly document this behavior so users understand when and why
use_tebd_bias might be changed automatically.

Comment on lines +12 to 13
doc_device = "hardware used by model, vu9p or vu13p"
doc_net_size_file = (
Contributor


🛠️ Refactor suggestion

device argument lacks value validation

A new "device" option is introduced but any string is currently accepted. Down-stream code (e.g. NvnmdConfig) seems to assume "vu9p" / "vu13p" or "none".

-Argument("device", str, optional=False, default="none", doc=doc_device),
+Argument(
+    "device",
+    str,
+    optional=False,
+    default="none",
+    doc=doc_device,
+    choices=["none", "vu9p", "vu13p"],   # enforce valid values
+),

Fail-fast validation here prevents cryptic errors later in the pipeline.

Also applies to: 30-31

🤖 Prompt for AI Agents
In deepmd/utils/argcheck_nvnmd.py around lines 12-13 and 30-31, the "device"
argument accepts any string but downstream code expects only "vu9p", "vu13p", or
"none". Add validation logic to check if the provided "device" value is one of
these allowed strings and raise an error immediately if not. This fail-fast
validation will prevent obscure errors later in the pipeline.
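The fail-fast check can be as small as the following; the constant and function names are hypothetical, not the dargs API:

```python
ALLOWED_DEVICES = ("none", "vu9p", "vu13p")


def validate_device(device: str) -> str:
    """Reject unsupported FPGA device names before they reach the config layer."""
    if device not in ALLOWED_DEVICES:
        raise ValueError(
            f"unsupported device {device!r}; expected one of {ALLOWED_DEVICES}"
        )
    return device
```

Raising at argument-parse time gives the user the allowed values up front instead of a cryptic shape or key error deep in NvnmdConfig.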

NBIT_FLTE = nbit["NBIT_FLTE"]
NBIT_NSTEP = nbit["NBIT_NSTEP"]
NIX = dscp["NIX"]
print(dscp)
Contributor


⚠️ Potential issue

Replace print statement with proper logging.

Debug print statements should use the logging framework for consistency and control.

-            print(dscp)
+            log.debug("Descriptor configuration: %s", dscp)
📝 Committable suggestion


Suggested change
print(dscp)
log.debug("Descriptor configuration: %s", dscp)
🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/wrap.py at line 333, replace the print statement
with a logging call. Import the logging module if not already done, then use an
appropriate logging level such as logging.debug or logging.info to log the
variable dscp instead of printing it directly. This ensures consistent and
configurable output handling.

jiongwalai and others added 2 commits June 12, 2025 07:45
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
deepmd/tf/nvnmd/descriptor/se_atten.py (2)

98-99: Superfluous subtraction

(h - 0) / std_h is equivalent to h / std_h and just adds noise to the expression.

-    h_norm = (h - 0) / std_h
+    h_norm = h / std_h

127-133: Doc-string and dead code drift

The doc-string still says the routine maps to “s and h”, but the code now produces k as well.
Also, ndescrpt is computed but never used.

Please update the documentation and remove the unused variable to keep the intent clear.

🧰 Tools
🪛 Ruff (0.11.9)

132-132: Local variable ndescrpt is assigned to but never used

Remove assignment to unused variable ndescrpt

(F841)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e723bb and e39c09e.

📒 Files selected for processing (2)
  • deepmd/tf/nvnmd/descriptor/se_atten.py (4 hunks)
  • deepmd/tf/nvnmd/entrypoints/wrap.py (12 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/tf/nvnmd/entrypoints/wrap.py
🔇 Additional comments (1)
deepmd/tf/nvnmd/descriptor/se_atten.py (1)

197-204: New parameter not reflected in API contract

filter_lower_R42GR() now expects recovered_switch, yet the function doc-string and upstream callers were not updated.
For clarity and to avoid silent breakage:

  1. Document the new argument in the doc-string.
  2. Audit all call-sites to make sure the extra tensor is always supplied.

Comment on lines +103 to +110
s_norm = op_module.flt_nvnmd(s_norm)
log.debug("#s: %s", s_norm)
s_norm = tf.ensure_shape(s_norm, [None, 1])
with tf.variable_scope("h", reuse=True):
h_norm = op_module.flt_nvnmd(h_norm)
log.debug("#h: %s", h_norm)
h_norm = tf.ensure_shape(h_norm, [None, 1])
# merge into [sji, hji*xji, hji*yji, hji*zji]
Contributor


🛠️ Refactor suggestion

reuse=True may fail on the first graph build

tf.variable_scope(..., reuse=True) assumes the scope already exists.
When this function is executed in a fresh graph the variables/ops are not yet defined and TensorFlow will raise a ValueError.
Use reuse=tf.AUTO_REUSE (TF >= 1.4) or create the scope without reuse on the first pass.

🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/descriptor/se_atten.py around lines 103 to 110, the use of
tf.variable_scope with reuse=True can cause a ValueError if the scope does not
exist yet during the first graph build. To fix this, change the reuse argument
to tf.AUTO_REUSE to allow automatic reuse or create the variable scope without
reuse on the first pass to ensure the scope is created before reuse is applied.

Comment on lines +79 to +81
def build_recovered(
descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth, filter_precision
):
Contributor


⚠️ Potential issue

Remove or use the filter_precision parameter

filter_precision is accepted by build_recovered() but never referenced inside the function, which is misleading and may confuse future maintainers.

-def build_recovered(
-    descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth, filter_precision
-):
+def build_recovered(
+    descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth
+):

Either drop the argument (and update all call-sites) or apply it where you originally intended.

📝 Committable suggestion


Suggested change
def build_recovered(
descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth, filter_precision
):
def build_recovered(
descrpt, t_avg, t_std, atype, Na, ntypes, rcut_r_smth
):
...
🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/descriptor/se_atten.py around lines 79 to 81, the function
build_recovered() accepts a parameter filter_precision that is never used inside
the function. To fix this, either remove the filter_precision parameter from the
function signature and update all places where build_recovered() is called to no
longer pass this argument, or if filter_precision was intended to be used,
incorporate it appropriately within the function logic.


codecov bot commented Jun 15, 2025

Codecov Report

Attention: Patch coverage is 80.39216% with 50 lines in your changes missing coverage. Please review.

Project coverage is 84.71%. Comparing base (2b32af5) to head (9f13087).

Files with missing lines Patch % Lines
deepmd/tf/nvnmd/utils/config.py 56.32% 38 Missing ⚠️
deepmd/tf/nvnmd/entrypoints/train.py 16.66% 5 Missing ⚠️
deepmd/tf/nvnmd/entrypoints/wrap.py 93.33% 3 Missing ⚠️
deepmd/tf/nvnmd/descriptor/se_atten.py 97.01% 2 Missing ⚠️
deepmd/tf/nvnmd/entrypoints/mapt.py 94.28% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4800      +/-   ##
==========================================
- Coverage   84.72%   84.71%   -0.01%     
==========================================
  Files         699      699              
  Lines       68182    68286     +104     
  Branches     3542     3541       -1     
==========================================
+ Hits        57768    57850      +82     
- Misses       9280     9304      +24     
+ Partials     1134     1132       -2     

☔ View full report in Codecov by Sentry.

@wanghan-iapcm wanghan-iapcm requested a review from njzjz June 17, 2025 05:56
@wanghan-iapcm
Collaborator

@jiongwalai please fix the precommit issues.

# change the configuration according to the max_nnei
jdata_config_v0_ni128 = jdata_config_v0.copy()
jdata_config_v0_ni256 = jdata_config_v0.copy()
# change the configuration accordng to the max_nnei
Member


There seems to be some meaningless changes like this

Author


There may be differences between the parameters jdata_config_v0_ni128 and jdata_config_v0_ni256, which are used for different training configurations. I think they should be retained.

Member


Why did you change "according" to "accordng"?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

149-149: Minor formatting issue in block comment.

The static analysis tool flagged excessive leading '#' characters in the block comment.

-        ## u2s
+        # u2s
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8215320 and d6a1fe9.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (12 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
deepmd/tf/nvnmd/entrypoints/mapt.py (1)
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4219
File: deepmd/utils/learning_rate.py:48-53
Timestamp: 2024-10-15T22:22:24.889Z
Learning: Methods in `deepmd/utils/learning_rate.py` that return NumPy scalar types should have return type annotations using the corresponding NumPy types, such as `np.float64`.
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)

🔇 Additional comments (9)
deepmd/tf/nvnmd/entrypoints/mapt.py (9)

10-10: Good addition of explicit precision control.

The import of GLOBAL_NP_FLOAT_PRECISION aligns with the PR objectives to improve numerical precision and consistency throughout the NVNMD pipeline.


95-97: Clear documentation of the new Gs_Gt_mode behavior.

The updated comments clearly explain the new mode 2 behavior with recovered_switch and the corresponding shift parameter values.


140-148: Excellent implementation of 'k' tensor mapping coefficients.

The addition of 'k' tensor mapping coefficients follows the same pattern as 's' and 'h' tensors, maintaining consistency in the codebase. This supports the enhanced mapping capabilities mentioned in the PR objectives.


206-206: Consistent precision handling in numpy array initialization.

The use of GLOBAL_NP_FLOAT_PRECISION ensures consistent numerical precision across the mapping operations.


408-415: Well-implemented 'k' tensor calculation.

The mathematical implementation of the 'k' tensor is correct:

  • kk = 1 - rmin * s
  • k = -kk³ + 1
  • Proper clipping to [0, 1] range

This follows a cubic polynomial form that complements the existing 's' and 'h' tensors.
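The cubic switch described in the comment above can be sketched in NumPy. This is a minimal illustration of the stated formulas, not the actual `mapt.py` code; the parameter name `rmin` and the zeroing of the gradient outside the clipped region are assumptions.

```python
import numpy as np

def compute_k(s, rmin):
    """k tensor per the review: kk = 1 - rmin * s, k = -kk**3 + 1, clipped to [0, 1]."""
    s = np.asarray(s, dtype=np.float64)
    kk = 1.0 - rmin * s
    k = -kk**3 + 1.0
    return np.clip(k, 0.0, 1.0).reshape(-1, 1)

def compute_k_grad(s, rmin):
    """Analytic gradient dk/ds = 3 * rmin * (1 - rmin*s)**2, zeroed where k is clipped."""
    s = np.asarray(s, dtype=np.float64)
    kk = 1.0 - rmin * s
    g = 3.0 * rmin * kk**2
    # Assumption: gradient is zero wherever k was clipped (kk outside [0, 1]).
    return np.where((kk < 0.0) | (kk > 1.0), 0.0, g).reshape(-1, 1)
```

With `rmin = 0.5`, `k` rises smoothly from 0 at `s = 0` to 1 at `s = 2` and stays at 1 beyond, consistent with a recovery/switch function.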


426-435: Consistent gradient calculation for 'k' tensor.

The gradient calculations for the 'k' tensor follow the same pattern as 's' and 'h' tensors, ensuring mathematical consistency in the mapping table generation.


607-627: Good addition of pylint disable comments.

The pylint disable comments for no-explicit-dtype are appropriate since the dtype is implicitly determined by the input tensors in these TensorFlow operations.


555-560: Mapping resolution update validated by test_nvnmd_entrypoints.py

  • In deepmd/tf/nvnmd/entrypoints/mapt.py you’ve doubled
    • N: 4096→8192
    • N2: 16→32
    • warning threshold: (smax–smin) > 16.0→32.0
  • The test source/tests/tf/test_nvnmd_entrypoints.py now uses idx = [..., 4096, 8192] (and even 16384 in one case), confirming the code handles the larger tables correctly.

The change is intentional and covered by tests. Note that doubling N and N2 doubles the mapping‐table size (and related TensorFlow placeholders), which will increase memory usage and computation time.


154-154: Double-check s2g mapping intervals in cfg_s2g
The two segments in deepmd/tf/nvnmd/entrypoints/mapt.py now read:

cfg_s2g = [
    [s[0],   s[256],    s[1]  - s[0],  0,   256],
    [s[0],   s[8192],   s[32] - s[0],  256, 512],
]

Please ensure these index choices and step sizes match the desired higher-resolution mapping ranges.

• File to inspect:

  • deepmd/tf/nvnmd/entrypoints/mapt.py (cfg_s2g definition)
    • Verify in downstream use:
  • descriptor/se_a.py & descriptor/se_atten.py (table_info = cfg_s2g)
  • wrap.py & mapping2 calls in entrypoints
  • test expectations in source/tests/tf/test_nvnmd_entrypoints.py
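A toy lookup illustrating how two-segment mapping configs of this shape could resolve a value to a table row. The field order `[x_start, x_end, step, row_start, row_end]` is my reading of the segment layout and the numeric segments below are invented for the demo; neither is confirmed against `mapt.py`.

```python
def s_to_index(s_val, cfg):
    """Map a scalar s to a table row using piecewise-uniform segments."""
    for x0, x1, dx, i0, i1 in cfg:
        if x0 <= s_val < x1:
            idx = i0 + int((s_val - x0) / dx)
            return min(idx, i1 - 1)  # clamp to the segment's last row
    raise ValueError("s value outside mapping range")

# Hypothetical two-segment table: fine resolution near 0, coarse beyond.
cfg_s2g_demo = [
    [0.0, 1.0, 1.0 / 256, 0, 256],     # 256 rows over [0, 1)
    [1.0, 33.0, 32.0 / 256, 256, 512],  # 256 rows over [1, 33)
]
```

Doubling the upper bound of the second segment (as in the diff under review) doubles the covered range for the same 256 rows, which is why the warning threshold moves from 16.0 to 32.0 in step.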

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6a1fe9 and 8fdbee8.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (13 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (2)
deepmd/tf/nvnmd/utils/config.py (1)
  • init_from_jdata (55-81)
deepmd/tf/utils/sess.py (1)
  • run_sess (12-50)
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)


[error] 615-615: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)


[error] 621-621: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)


[error] 631-631: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)

🪛 Ruff (0.11.9)
deepmd/tf/nvnmd/entrypoints/mapt.py

615-615: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)


621-621: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)


631-631: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)

🔇 Additional comments (14)
deepmd/tf/nvnmd/entrypoints/mapt.py (14)

10-10: LGTM: Import addition for numerical precision consistency.

The import of GLOBAL_NP_FLOAT_PRECISION aligns with the explicit dtype usage throughout the file.


81-81: Minor: Consistent with other method signatures.

Removing the return type annotation is consistent with other methods in the class.


140-148: LGTM: Consistent implementation of k component mapping.

The new k component mapping follows the same pattern as s and h components, maintaining consistency in the codebase.


206-206: LGTM: Explicit dtype usage for numerical consistency.

Using GLOBAL_NP_FLOAT_PRECISION ensures consistent floating-point precision throughout the codebase.


262-262: Minor: Consistent with method signature style.

Removing the return type annotation is consistent with the class's method signature style.


403-403: LGTM: Updated return signature to include k component.

The return signature correctly includes the third component for consistency with the new k mapping tensor.


408-415: LGTM: Well-implemented k component calculation.

The k component calculation follows a logical pattern:

  • Uses the relationship kk = 1 - rmin * s
  • Applies cubic transformation: k = -kk³ + 1
  • Clips values to [0, 1] range
  • Maintains consistent tensor reshaping

The implementation is mathematically sound and consistent with the existing codebase patterns.


426-426: LGTM: Consistent integration of k component gradients.

The k component and its gradients are properly integrated into the gradient calculation pipeline, following the same pattern as s and h components.

Also applies to: 433-435


460-462: LGTM: Improved numerical precision consistency.

Using GLOBAL_NP_FLOAT_PRECISION for array initialization ensures consistent floating-point precision throughout the computation pipeline.

Also applies to: 466-468


480-482: LGTM: Proper initialization of k component values.

The k component values are properly initialized to zero for boundary conditions, consistent with the s and h component handling.

Also applies to: 490-492


494-496: LGTM: Version-specific handling for consistency.

The special handling for version 1 to set s[tt][0] = 0 ensures consistent behavior across different versions.


555-556: LGTM: Consistent parameter scaling.

The parameter changes are well-coordinated:

  • N increased from 4096 to 8192 (2x)
  • N2 increased from 16 to 32 (2x)
  • Warning threshold increased from 16.0 to 32.0 (2x)

This maintains the same precision ratio while supporting higher resolution.

Also applies to: 559-559


568-574: LGTM: Consistent array initialization with explicit precision.

The array initialization uses GLOBAL_NP_FLOAT_PRECISION for consistency, and the logic correctly handles the updated parameter values.

Also applies to: 578-582


154-154: Verify memory and performance impact of doubled s2g intervals

We’ve updated the second interval in cfg_s2g from

[s[0], s[4096], s[16] - s[0], 256, 512]

to

[s[0], s[8192], s[32] - s[0], 256, 512]

which doubles the resolution for the s→g mapping. While tests in
source/tests/tf/test_nvnmd_entrypoints.py (lines 143/592) now expect indices up to 8192, please confirm:

• deepmd/tf/nvnmd/entrypoints/mapt.py (around line 152) – that build_map_coef and mapping2 still perform within acceptable memory and runtime limits with the finer grid
• That generated outputs match the newly updated expected values in the entrypoint tests
• No unintended regressions in other downstream pipelines relying on s→g mappings

type_embedding_nei = tf.tile(
-    tf.reshape(type_embedding, [1, padding_ntypes, -1]),
+    tf.reshape(
+        type_embedding, [1, padding_ntypes, -1], dtype=GLOBAL_TF_FLOAT_PRECISION
Contributor

⚠️ Potential issue

Fix undefined GLOBAL_TF_FLOAT_PRECISION constant.

The code uses GLOBAL_TF_FLOAT_PRECISION but it's not imported, causing static analysis errors. This will result in runtime errors.

Add the missing import at the top of the file:

from deepmd.tf.env import (
    GLOBAL_NP_FLOAT_PRECISION,
+   GLOBAL_TF_FLOAT_PRECISION,
    op_module,
    tf,
)

Alternatively, if this constant doesn't exist in the codebase, replace the usages with the appropriate TensorFlow precision type (e.g., tf.float64 or tf.float32).

Also applies to: 621-621, 631-631

🧰 Tools
🪛 Ruff (0.11.9)

615-615: Undefined name GLOBAL_TF_FLOAT_PRECISION

(F821)

🪛 Flake8 (7.2.0)

[error] 615-615: undefined name 'GLOBAL_TF_FLOAT_PRECISION'

(F821)

🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/mapt.py at lines 615, 621, and 631, the constant
GLOBAL_TF_FLOAT_PRECISION is used but not defined or imported, causing errors.
Fix this by either importing GLOBAL_TF_FLOAT_PRECISION from its module at the
top of the file if it exists, or replace all occurrences of
GLOBAL_TF_FLOAT_PRECISION with the appropriate TensorFlow dtype such as
tf.float32 or tf.float64 to ensure the code runs correctly.

Comment on lines +95 to +97
# 2 : xyz_scatter = xyz_scatter * two_embd * recovered_switch + xyz_scatter;
# Gs + 0, Gt + 0
self.Gs_Gt_mode = 2
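The mode-2 update quoted above is a simple gated residual. A standalone NumPy sketch (function and variable names are illustrative, not the deepmd implementation):

```python
import numpy as np

def apply_two_side_embedding_mode2(xyz_scatter, two_embd, recovered_switch):
    """Mode 2: xyz_scatter = xyz_scatter * two_embd * recovered_switch + xyz_scatter.

    When recovered_switch -> 0 the type-embedding contribution vanishes and
    the plain xyz_scatter passes through; when it -> 1 the full two-sided
    embedding is applied.
    """
    return xyz_scatter * two_embd * recovered_switch + xyz_scatter
```

Because the update reduces to the identity at `recovered_switch = 0`, both `shift_Gs` and `shift_Gt` can be 0 in this mode, matching the comment.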
Contributor

💡 Verification agent

🧩 Analysis chain

Verify the impact of changing Gs_Gt_mode default from 1 to 2.

The default mode has been changed from 1 to 2, which affects the shift calculations (shift_Gs and shift_Gt both become 0). This is a significant behavioral change that should be verified.

Also applies to: 102-104


🏁 Script executed:

#!/bin/bash
# Description: Search for any references to Gs_Gt_mode in tests or documentation
# Expected: Find usage patterns and test expectations for the new default

echo "=== Searching for Gs_Gt_mode usage ==="
rg -A 5 -B 5 "Gs_Gt_mode"

echo "=== Searching for shift_Gs and shift_Gt usage ==="
rg -A 3 -B 3 "shift_G[st]"

Length of output: 3729


Action Required: Document and Test the New Default Gs_Gt_mode=2

The change of the default from mode 1→2 is not exercised by existing tests (they all set mapObj.Gs_Gt_mode = 2 explicitly) nor documented, so downstream users may be unaware or untested against the new behavior. Please:

  • deepmd/tf/nvnmd/entrypoints/mapt.py (around line 97): update any in-code comments or docstrings to reflect that the default is now 2.
  • source/tests/tf/test_nvnmd_entrypoints.py: add a unit test for MapTable.build_map() without overriding Gs_Gt_mode, to verify shift_Gs = shift_Gt = 0 and the expected output.
  • CLI entrypoints or configuration loaders that instantiate MapTable without manually setting Gs_Gt_mode: confirm they still behave correctly under mode 2 or expose a flag to select the mode.
  • Documentation/README: note the behavioral change in the release notes or API reference.
🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/mapt.py around lines 95 to 97, update the in-code
comments and any relevant docstrings to clearly state that the default value of
Gs_Gt_mode is now 2 instead of 1. In source/tests/tf/test_nvnmd_entrypoints.py,
add a unit test for MapTable.build_map() that does not override Gs_Gt_mode,
verifying that shift_Gs and shift_Gt are both 0 and the output matches
expectations under mode 2. Review CLI entrypoints and configuration loaders that
create MapTable instances without setting Gs_Gt_mode explicitly to ensure they
function correctly with the new default or add an option to select the mode.
Finally, update the project documentation or release notes to document this
default behavior change for downstream users.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

95-97: Document the behavioral change from mode 1 to mode 2.

The default Gs_Gt_mode has been changed from 1 to 2, which affects the shift calculations (both shift_Gs and shift_Gt become 0). This is a significant behavioral change that should be documented.

🧹 Nitpick comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

607-609: Improve type casting clarity.

The double casting tf.cast(..., filter_precision) where the inner argument already has dtype=filter_precision is redundant and could be simplified.

-last_type = tf.cast(
-    tf.zeros([1, ebd_type.shape[1]], dtype=filter_precision), filter_precision
-)
+last_type = tf.zeros([1, ebd_type.shape[1]], dtype=filter_precision)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dfb1fa and e20b2d4.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (12 hunks)
🧰 Additional context used
🧠 Learnings
deepmd/tf/nvnmd/entrypoints/mapt.py (1)
Learnt from: HydrogenSulfate
PR: deepmodeling/deepmd-kit#4414
File: deepmd/pd/train/training.py:66-66
Timestamp: 2024-11-29T12:15:22.226Z
Learning: The function `nvprof_context` is defined in `deepmd/pd/utils/utils.py`, so importing it in `deepmd/pd/train/training.py` is correct.
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)

🔇 Additional comments (7)
deepmd/tf/nvnmd/entrypoints/mapt.py (7)

10-10: Good addition for precision consistency.

Adding GLOBAL_NP_FLOAT_PRECISION import ensures consistent floating-point precision throughout the numpy operations.


154-154: Confirm the s2g mapping resolution increase.

The configuration parameter has been updated from s[16] to s[32], doubling the resolution. This aligns with the N2 increase from 16 to 32 in the run_s2g method.


408-415: Review the k tensor formula implementation.

The k tensor implementation uses the formula k = -kk³ + 1 where kk = 1 - rmin * s. This appears to be a recovery/switch function. The clipping to [0.0, 1.0] range is appropriate.


559-560: Document the warning threshold increase.

The warning threshold for the range of s has been increased from 16.0 to 32.0, which aligns with the doubled resolution. This change should be documented in the code comments or docstring.


460-462: Excellent use of GLOBAL_NP_FLOAT_PRECISION for consistency.

The explicit use of GLOBAL_NP_FLOAT_PRECISION in numpy array creation ensures numerical precision consistency throughout the codebase.

Also applies to: 466-468, 571-574, 578-581


555-557: Assess s2g Resolution Performance and Memory Impact

I didn’t find any existing benchmarks or memory-usage tests covering the change from N=4096→8192 and N2=16→32. Doubling the mapping resolution will increase both compute time and memory footprint, so please:

  • Manually verify performance (throughput, latency) on representative workloads.
  • Measure peak memory usage on target hardware.
  • Add or update any performance tests or benchmarks in source/tests/.
  • Document the expected impact on resource requirements in your README or developer guide.

File to review:

  • deepmd/tf/nvnmd/entrypoints/mapt.py (lines 555–557)

140-148: k tensor mapping verified

Downstream components correctly integrate the new k and k_grad tensors using the same patterns as s and h. No further changes needed.

• deepmd/tf/nvnmd/descriptor/se_atten.py – handles k, applies flt_nvnmd(k) and shapes it alongside s and h.
• deepmd/tf/nvnmd/entrypoints/wrap.py – concatenates maps["k"] and maps["k_grad"] when assembling descriptors for version > 0.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

95-97: Document and verify the new default mode behavior.

The default Gs_Gt_mode has been changed from 1 to 2, which is a significant behavioral change. The comments clearly explain the new mode, but this change should be verified for backward compatibility.

🧹 Nitpick comments (1)
deepmd/tf/nvnmd/entrypoints/mapt.py (1)

149-149: Fix comment formatting style issue.

The static analysis tool reports excessive leading '#' characters for the block comment. Consider using a standard comment format.

Apply this fix:

-        ## u2s
+        # u2s
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e20b2d4 and 9e0e6a8.

📒 Files selected for processing (1)
  • deepmd/tf/nvnmd/entrypoints/mapt.py (12 hunks)
🧰 Additional context used
🪛 Flake8 (7.2.0)
deepmd/tf/nvnmd/entrypoints/mapt.py

[error] 149-149: too many leading '#' for block comment

(E266)

🔇 Additional comments (17)
deepmd/tf/nvnmd/entrypoints/mapt.py (17)

10-10: LGTM: Import addition is correct.

The import of GLOBAL_NP_FLOAT_PRECISION is properly added and used consistently throughout the file for numerical precision.


81-81: LGTM: Removed unnecessary return type annotation.

The return type annotation removal from __init__ is appropriate since constructors don't return meaningful values.


102-104: LGTM: Shift values correctly set for mode 2.

The shift values (shift_Gs = 0, shift_Gt = 0) are correctly set to match the documented behavior for mode 2.


140-148: LGTM: 'k' component mapping added consistently.

The addition of the 'k' component mapping follows the same pattern as the existing 's' and 'h' components, maintaining consistency in the codebase.


154-154: LGTM: s2g mapping intervals doubled.

The s2g mapping configuration intervals have been doubled (4096→8192, 16→32), which should provide higher precision mapping.


206-206: LGTM: Improved precision consistency.

Using GLOBAL_NP_FLOAT_PRECISION for dtype ensures numerical precision consistency throughout the codebase.


262-262: LGTM: Removed unnecessary return type annotation.

The return type annotation removal is appropriate since the method has no implementation (just pass).


408-415: LGTM: 'k' component calculation added correctly.

The 'k' component calculation is mathematically sound and follows the same pattern as the existing 's' and 'h' components, including proper reshaping and clipping.


426-426: LGTM: Updated to handle 'k' component.

The function call correctly unpacks three components including the new 'k' component, maintaining consistency with the build_u2s changes.


433-435: LGTM: 'k' gradient calculation added consistently.

The gradient calculation for the 'k' component follows the same pattern as the existing 's' and 'h' gradient calculations.


460-462: LGTM: Improved precision and added clarifying comment.

Using GLOBAL_NP_FLOAT_PRECISION ensures consistency, and the comment clarifies why N+1 is used for calculating differences.


466-470: LGTM: Improved reference calculation with better precision.

Using np.float64 for dtype and updating the comment improves code clarity and maintains numerical precision.


480-496: LGTM: 'k' component initialization added consistently.

The initialization of 'k' component values follows the same pattern as 's' and 'h' components, properly handling edge cases and version-specific logic.


555-559: LGTM: s2g mapping parameters doubled for higher precision.

The mapping parameters have been consistently doubled (N: 4096→8192, N2: 16→32, warning threshold: 16.0→32.0), which should provide improved mapping precision.


568-574: LGTM: Improved precision and formatting.

Using GLOBAL_NP_FLOAT_PRECISION ensures consistency, and the improved formatting enhances code readability.


578-582: LGTM: Reference calculation improved with better precision.

Using np.float64 for dtype and improved formatting enhances both precision and readability.


607-609: LGTM: Explicit dtype specification added.

Adding explicit dtype specification to tf.zeros and casting to filter_precision improves precision control and consistency.

             sl.append(s)
             hl.append(h)
-            return sl, hl
+            return sl, hl, sl
Contributor

⚠️ Potential issue

Fix potential bug: 'sl' returned twice instead of third component.

The return statement return sl, hl, sl returns sl twice, which appears to be a bug. Based on the version 1 implementation that follows, this should return three distinct components.

Consider this fix:

-            return sl, hl, sl
+            return sl, hl, kl  # or appropriate third component

You'll need to implement the third component (kl) similar to how k is handled in the version 1 case.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In deepmd/tf/nvnmd/entrypoints/mapt.py at line 403, the return statement
mistakenly returns 'sl' twice instead of returning three distinct components. To
fix this, compute the third component 'kl' similarly to how 'k' is handled in
the version 1 case, and update the return statement to return 'sl', 'hl', and
'kl' respectively.

@jiongwalai jiongwalai requested a review from njzjz July 17, 2025 06:21
Signed-off-by: jiongwalai <89775787+jiongwalai@users.noreply.github.com>
Signed-off-by: jiongwalai <89775787+jiongwalai@users.noreply.github.com>
Copilot AI review requested due to automatic review settings August 14, 2025 16:19
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
deepmd/main.py (1)

863-869: LGTM! Implementation is correct and consistent.

The new --init-frz-model option is properly implemented with appropriate type, default value, and help text. It follows the existing pattern in the train-nvnmd parser and is consistent with the similar option in the train parser.

Optional suggestion for future improvement:

Consider grouping --init-model, --restart, and --init-frz-model into a mutually exclusive group (similar to the train parser at lines 240-268) to prevent users from accidentally passing conflicting initialization options. This would improve the CLI UX:

+    parser_train_nvnmd_subgroup = parser_train_nvnmd.add_mutually_exclusive_group()
-    parser_train_nvnmd.add_argument(
+    parser_train_nvnmd_subgroup.add_argument(
         "-i",
         "--init-model",
         type=str,
         default=None,
         help="Initialize the model by the provided path prefix of checkpoint files.",
     )
-    parser_train_nvnmd.add_argument(
+    parser_train_nvnmd_subgroup.add_argument(
         "-r",
         "--restart",
         type=str,
         default=None,
         help="Restart the training from the provided prefix of checkpoint files.",
     )
-    parser_train_nvnmd.add_argument(
+    parser_train_nvnmd_subgroup.add_argument(
         "-f",
         "--init-frz-model",
         type=str,
         default=None,
         help="Initialize the training from the frozen model.",
     )

Note: This suggestion applies to the broader train-nvnmd parser design, not just the newly added option.
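The mutually-exclusive-group suggestion above can be demonstrated standalone with `argparse`. This is a sketch mirroring the flag names from the diff, not the actual deepmd parser:

```python
import argparse

def build_parser():
    """Minimal parser where the three init sources exclude each other."""
    parser = argparse.ArgumentParser(prog="dp train-nvnmd")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-i", "--init-model", type=str, default=None,
                       help="Initialize from checkpoint files.")
    group.add_argument("-r", "--restart", type=str, default=None,
                       help="Restart from checkpoint files.")
    group.add_argument("-f", "--init-frz-model", type=str, default=None,
                       help="Initialize from a frozen model.")
    return parser
```

Passing any two of the flags together makes `argparse` exit with a "not allowed with" error instead of silently picking one, which is the UX improvement the comment is after.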

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61e7f63 and 0ffdb5c.

📒 Files selected for processing (1)
  • deepmd/main.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Always run ruff check . and ruff format . before committing changes to Python code

Files:

  • deepmd/main.py

4 participants