fix(pt): optimize graph memory usage #4006

Merged 4 commits into deepmodeling:devel on Jul 24, 2024

@iProzd (Collaborator) commented Jul 23, 2024

  • Remove atomic virial graph.
  • Remove force graph during inference.

After this change, LAMMPS memory usage drops by 50% for dpa1 (attn_layer=0) and by 80% for dpa2 (layer=12).
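
To illustrate the second bullet, here is a minimal sketch of why skipping the derivative graph at inference time saves memory. It assumes PyTorch and uses a hypothetical `ToyEnergyModel` with a made-up quadratic energy; it is not deepmd-kit's model code. The only detail taken from this thread is the `create_graph=self.training` pattern mentioned in the walkthrough.

```python
import torch


class ToyEnergyModel(torch.nn.Module):
    """Hypothetical stand-in: the energy is a smooth function of coordinates."""

    def forward(self, coord: torch.Tensor) -> torch.Tensor:
        coord = coord.detach().requires_grad_(True)
        energy = (coord**2).sum()
        # Build a graph for the derivative only while training, where a
        # force loss must be backpropagated. In eval mode the gradient is
        # returned as a plain tensor and the autograd graph is freed.
        (grad,) = torch.autograd.grad(energy, coord, create_graph=self.training)
        return -grad  # force = -dE/dcoord


model = ToyEnergyModel()
coord = torch.ones(2, 3)

force_train = model(coord)  # modules start in training mode
model.eval()
force_eval = model(coord)

print(force_train.grad_fn is not None)  # graph kept for a later backward pass
print(force_eval.grad_fn is None)       # no graph retained at inference
```

The force values are identical in both modes; only the bookkeeping attached to the result differs, which is where the memory goes for deep descriptors.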

Summary by CodeRabbit

  • New Features

    • Introduced a new inference parameter to key model functions, enhancing flexibility for inference scenarios during model execution.
    • Added functionality to output a mapping array to a CSV file, improving data handling capabilities.
  • Bug Fixes

    • Improved the behavior of the model during inference versus training, potentially impacting downstream processing based on the output.

coderabbitai bot (Contributor) commented Jul 23, 2024

Walkthrough

The recent changes introduce a new inference parameter to several forward_lower and related functions across various model files, so that these functions can handle inference explicitly. The parameter defaults to True in the forward_lower functions and to False in forward_common_lower and the transform_output.py helpers. The existing logic flow is otherwise unchanged.

Changes

| Files | Change Summary |
| --- | --- |
| deepmd/pt/model/model/dipole_model.py, deepmd/pt/model/model/dos_model.py, deepmd/pt/model/model/dp_zbl_model.py, deepmd/pt/model/model/ener_model.py, deepmd/pt/model/model/polar_model.py | Added inference=True parameter to forward_lower functions, enhancing inference capabilities. |
| deepmd/pt/model/model/make_model.py | Introduced inference=False and create_graph=self.training parameters to forward_common_lower, distinguishing inference modes. |
| deepmd/pt/model/model/spin_model.py | Added inference parameters to both forward_lower (default True) and forward_common_lower (default False). |
| deepmd/pt/model/model/transform_output.py | Added inference=False parameter to multiple functions for flexibility in gradient computations. |
| deepmd/pt/entrypoints/main.py | Modified the freeze function to set the model to evaluation mode before scripting with Torch. |
| source/lmp/pair_deepmd.cpp | Enhanced PairDeepMD class with new mapping functionality for atom tags and CSV output. |

@iProzd iProzd requested review from njzjz and wanghan-iapcm July 23, 2024 07:44
@iProzd iProzd requested a review from CaRoLZhangxy July 23, 2024 07:48
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (8)
deepmd/pt/model/model/dp_zbl_model.py (1)

114-115: Ensure all calls to forward_lower include the inference parameter.

The forward_lower function in deepmd/pt/model/model/dp_zbl_model.py has been updated to include the inference parameter. However, many calls to this function in the codebase do not include this parameter, which could lead to runtime errors or unexpected behavior.

  • Files and lines to update:
    • source/tests/pt/model/test_dp_model.py
      • Lines: 0, 1, 2
    • source/tests/pt/model/test_forward_lower.py
      • Lines: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
    • source/tests/pt/model/test_ener_spin_model.py
      • Lines: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
    • source/tests/universal/common/cases/model/utils.py
      • Lines: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
    • source/tests/universal/pt/backend.py
      • Lines: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

Please update these calls to include the inference parameter to ensure consistency and correct functionality.
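
Whether the existing callers listed above actually fail depends on whether the new parameter has a default. A minimal sketch in plain Python, using hypothetical stand-in functions (not deepmd-kit's real signatures) with the defaults reported in the change summary:

```python
import inspect


def forward_common_lower(coord, atype, nlist, do_atomic_virial=False, inference=False):
    # Stand-in body: report which flags reached the lower-level function.
    return {"inference": inference, "do_atomic_virial": do_atomic_virial}


def forward_lower(coord, atype, nlist, do_atomic_virial=False, inference=True):
    # The wrapper forwards its own default (True) explicitly.
    return forward_common_lower(
        coord, atype, nlist, do_atomic_virial=do_atomic_virial, inference=inference
    )


# A caller written before the parameter existed still works and picks up the default.
old_style = forward_lower([0.0], [0], [[0]])
# A caller that needs training-style behavior opts out explicitly.
new_style = forward_lower([0.0], [0], [[0]], inference=False)

default = inspect.signature(forward_lower).parameters["inference"].default
print(old_style["inference"], new_style["inference"], default)
```

Under these assumptions, omitting the argument is not a runtime error; it only means the call uses the default, so the point to check is whether that default is the intended behavior for each caller.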

Analysis chain

Verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to forward_lower match the new signature.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `forward_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'forward_lower'

Length of output: 68


Script:

#!/bin/bash
# Description: Verify all function calls to `forward_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py -A 5 $'forward_lower'

Length of output: 8436


Script:

#!/bin/bash
# Description: Verify the function definition of `forward_lower` in `deepmd/pt/model/model/dp_zbl_model.py` includes the `inference` parameter
# and check if all calls to this function in the codebase include this parameter.

# Test: Search for the function definition of `forward_lower` in the specified file.
ast-grep --lang python --pattern $'def forward_lower($$$):\n    $$$' deepmd/pt/model/model/dp_zbl_model.py

# Test: Search for all calls to `forward_lower` in the codebase and check if they include the `inference` parameter.
rg --type py 'forward_lower' -A 5

Length of output: 11331

deepmd/pt/model/model/transform_output.py (3)

125-126: Update function call to match new signature

The function call to take_deriv in deepmd/pt/model/model/transform_output.py does not include the new inference parameter. Please update the function call to match the new signature.

  • deepmd/pt/model/model/transform_output.py:
    dr, dc = take_deriv(
        vv,
        model_ret[kk_redu],
        vdef,
        coord_ext,
        do_virial=vdef.c_differentiable,
        inference=False  # Add this line
    )

Analysis chain

Verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to take_deriv match the new signature.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `take_deriv` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'take_deriv'

Length of output: 65


Script:

#!/bin/bash
# Description: Verify all function calls to `take_deriv` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py -A 5 $'take_deriv'

Length of output: 896


167-168: Update function calls to match new signature

The function fit_output_to_model_output has a new parameter inference set to False by default. However, the following function calls do not include this new parameter and need to be updated:

  • deepmd/pt/model/model/make_model.py
  • deepmd/dpmodel/model/make_model.py

Please update these function calls to include the inference parameter to match the new signature.

Analysis chain

Verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to fit_output_to_model_output match the new signature.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `fit_output_to_model_output` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'fit_output_to_model_output'

Length of output: 81


Script:

#!/bin/bash
# Description: Verify all function calls to `fit_output_to_model_output` match the new signature.

# Search for the function definition to understand its context.
ast-grep --lang python --pattern $'def fit_output_to_model_output($_, $_, $_, $_, $_, $_, $_, $_, $_, inference=False):\n    $$$'

# Search for the function usage in the codebase.
rg 'fit_output_to_model_output'

Length of output: 626


Script:

#!/bin/bash
# Description: Verify all function calls to `fit_output_to_model_output` match the new signature.

# Check the function calls in the identified files.
rg 'fit_output_to_model_output' deepmd/pt/model/model/make_model.py -A 3
rg 'fit_output_to_model_output' deepmd/dpmodel/model/make_model.py -A 3

Length of output: 590


75-76: Update all calls to task_deriv_one to match the new signature.

The task_deriv_one function has been updated to include a new inference parameter, but not all calls to this function reflect this change. Ensure that all calls to task_deriv_one include the inference parameter to maintain consistency and avoid potential issues.

  • deepmd/pt/model/model/transform_output.py: Update the call to task_deriv_one to include the inference parameter.

Analysis chain

Verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to task_deriv_one match the new signature.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `task_deriv_one` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'task_deriv_one'

Length of output: 69


Script:

#!/bin/bash
# Description: Verify all function calls to `task_deriv_one` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg 'task_deriv_one' --glob '*.py' -A 5

Length of output: 895


Script:

#!/bin/bash
# Description: Verify all function calls and definitions of `task_deriv_one` match the new signature.

# Test: Search for the function definition and usage. Expect: Only occurrences of the new signature.
rg 'def task_deriv_one' --glob '*.py' -A 5
rg 'task_deriv_one(' --glob '*.py' -A 5

Length of output: 594


Script:

#!/bin/bash
# Description: Verify all function calls to `task_deriv_one` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg 'task_deriv_one\(' --glob '*.py' -A 5

Length of output: 899

deepmd/pt/model/model/make_model.py (2)

243-244: Update the docstring for clarity.

The docstring should be updated to match the new parameter name and its purpose.

- inference
-     Whether only perform inference rather than undergoing training.
+ inference
+     Whether to perform inference only, rather than undergoing training.

218-218: Ensure all calls to forward_common_lower include the new inference parameter.

The following function calls to forward_common_lower need to be updated to match the new signature:

  • source/tests/pt/model/test_ener_spin_model.py
  • source/tests/pt/model/test_dp_model.py
  • deepmd/pt/model/model/spin_model.py
  • deepmd/pt/model/model/polar_model.py
  • deepmd/pt/model/model/make_model.py
  • deepmd/pt/model/model/ener_model.py
  • deepmd/pt/model/model/dp_zbl_model.py
  • deepmd/pt/model/model/dos_model.py
  • deepmd/pt/model/model/dipole_model.py

Analysis chain

LGTM! But verify the function usage in the codebase.

The addition of the inference parameter is approved.

However, ensure that all function calls to forward_common_lower match the new signature.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `forward_common_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'forward_common_lower'

Length of output: 75


Script:

#!/bin/bash
# Description: Verify all function calls to `forward_common_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg -A 5 'forward_common_lower'

Length of output: 8451

deepmd/pt/model/model/spin_model.py (2)

608-608: Verify and update forward_lower function calls to match the new signature.

The addition of the inference parameter to forward_lower is approved. However, some function calls do not include this parameter. Please update the following instances to ensure they match the new signature:

  • source/tests/universal/common/cases/model/utils.py
    • Line: ret_lower.append(module.forward_lower(**input_dict_lower))
  • source/tests/pt/model/test_dp_model.py
    • Line: ret2 = md0.forward_lower(*args, do_atomic_virial=True)
    • Line: ret0 = md0.forward_lower(*args)
    • Line: ret1 = md1.forward_lower(*args)
    • Line: ret0 = md0.forward_lower(*args, do_atomic_virial=True)
    • Line: ret1 = md1.forward_lower(*args, do_atomic_virial=True)
  • source/api_cc/src/DeepPotPT.cc
    • Line: .run_method("forward_lower", coord_wrapped_Tensor, atype_Tensor, firstneigh_tensor, optional_tensor, fparam_tensor, aparam_tensor, do_atom_virial_tensor, comm_dict)
    • Line: .run_method("forward_lower", coord_wrapped_Tensor, atype_Tensor, firstneigh_tensor, optional_tensor, fparam_tensor, aparam_tensor, do_atom_virial_tensor)

Please ensure that all calls to forward_lower include the inference parameter set to True.

Analysis chain

LGTM! But verify the function usage in the codebase.

The addition of the inference parameter is approved.

However, ensure that all function calls to forward_lower match the new signature.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `forward_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'forward_lower'

Length of output: 68


Script:

#!/bin/bash
# Description: Verify all function calls to `forward_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg -A 5 'forward_lower'

Length of output: 9418


462-462: Ensure all calls to forward_common_lower include the inference parameter.

The inference parameter is not found in the function calls to forward_common_lower in the following files:

  • source/tests/pt/model/test_ener_spin_model.py
  • source/tests/pt/model/test_dp_model.py
  • deepmd/pt/model/model/spin_model.py
  • deepmd/pt/model/model/polar_model.py
  • deepmd/pt/model/model/make_model.py
  • deepmd/pt/model/model/ener_model.py
  • deepmd/pt/model/model/dp_zbl_model.py
  • deepmd/pt/model/model/dos_model.py
  • deepmd/pt/model/model/dipole_model.py

Please update these calls to match the new function signature.

Analysis chain

LGTM! But verify the function usage in the codebase.

The addition of the inference parameter is approved.

However, ensure that all function calls to forward_common_lower match the new signature.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `forward_common_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'forward_common_lower'

Length of output: 75


Script:

#!/bin/bash
# Description: Verify all function calls to `forward_common_lower` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py -A 5 $'forward_common_lower'

Length of output: 8461


Script:

#!/bin/bash
# Description: Check if the `inference` parameter is included in the function calls to `forward_common_lower`.

# Test: Search for the function calls and check for the `inference` parameter.
rg --type py 'forward_common_lower' -A 5 | grep -B 5 'inference'

Length of output: 62
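
The last pipeline above matches `inference` anywhere within five lines of a hit, so it cannot tell a call that passes the keyword from a nearby definition or comment. A stricter check could parse the source instead. The following is a sketch using Python's standard `ast` module; the helper name `calls_missing_keyword` and the sample snippet are hypothetical, modeled on the call sites quoted in this review.

```python
import ast


def calls_missing_keyword(source: str, func_name: str, keyword: str) -> list[int]:
    """Return line numbers of calls to `func_name` that do not pass `keyword`."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        target = node.func
        # Handle both `obj.func_name(...)` and plain `func_name(...)`.
        if isinstance(target, ast.Attribute):
            name = target.attr
        else:
            name = getattr(target, "id", None)
        if name != func_name:
            continue
        passed = {kw.arg for kw in node.keywords}  # kw.arg is None for **kwargs
        # A **kwargs call may or may not carry the keyword, so it is not flagged.
        if keyword not in passed and None not in passed:
            missing.append(node.lineno)
    return sorted(missing)


sample = '''
ret0 = md0.forward_lower(*args)
ret1 = md1.forward_lower(*args, do_atomic_virial=True)
ret2 = md0.forward_lower(*args, inference=True)
ret3 = module.forward_lower(**input_dict_lower)
'''

print(calls_missing_keyword(sample, "forward_lower", "inference"))  # [2, 3]
```

A call flagged here is not necessarily wrong: if the parameter has a default, the call is valid and simply uses it.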

codecov bot commented Jul 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.85%. Comparing base (a6ea2c1) to head (0f44a60).
Report is 106 commits behind head on devel.

Additional details and impacted files
@@           Coverage Diff           @@
##            devel    #4006   +/-   ##
=======================================
  Coverage   82.84%   82.85%           
=======================================
  Files         522      522           
  Lines       50920    50922    +2     
  Branches     3015     3015           
=======================================
+ Hits        42186    42189    +3     
  Misses       7796     7796           
+ Partials      938      937    -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@iProzd iProzd requested a review from njzjz July 23, 2024 09:32
@iProzd iProzd added this pull request to the merge queue Jul 24, 2024
Merged via the queue into deepmodeling:devel with commit 7f9300d Jul 24, 2024
60 checks passed
@iProzd iProzd deleted the fix_create_graph branch July 24, 2024 08:51
iProzd added a commit to iProzd/deepmd-kit that referenced this pull request Jul 25, 2024
mtaillefumier pushed a commit to mtaillefumier/deepmd-kit that referenced this pull request Sep 18, 2024
Successfully merging this pull request may close these issues.

[BUG] CUDA out of memory, when only 1600 atoms, using the pytorch model with spin
3 participants