breaking(pt/tf/dp): disable bias in type embedding #3958
Conversation
**Walkthrough**: The updates introduce a new `use_tebd_bias` parameter (and a `bias` parameter in the network utilities) to control whether bias is used in type embeddings; serialization and deserialization are updated to carry the new parameters.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant User
    participant DescrptDPA1
    participant DescrptDPA2
    participant SEAttenV2
    participant T_Network
    participant TypeEmbedNetConsistent
    User ->> DescrptDPA1: Create instance (use_tebd_bias)
    DescrptDPA1 ->> DescrptDPA1: Initialize with use_tebd_bias
    DescrptDPA1 ->> DescrptDPA1: Serialize with use_tebd_bias
    User ->> DescrptDPA2: Create instance (use_tebd_bias)
    DescrptDPA2 ->> DescrptDPA2: Initialize with use_tebd_bias
    DescrptDPA2 ->> DescrptDPA2: Serialize with use_tebd_bias
    User ->> SEAttenV2: Create instance (use_tebd_bias)
    SEAttenV2 ->> SEAttenV2: Initialize with use_tebd_bias
    SEAttenV2 ->> SEAttenV2: Serialize with use_tebd_bias
    User ->> T_Network: Create instance (bias)
    T_Network ->> T_Network: Initialize with bias
    T_Network ->> T_Network: Serialize with bias
    User ->> TypeEmbedNetConsistent: Create instance (use_tebd_bias)
    TypeEmbedNetConsistent ->> TypeEmbedNetConsistent: Initialize with use_tebd_bias
    TypeEmbedNetConsistent ->> TypeEmbedNetConsistent: Serialize with use_tebd_bias
```
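As a plain-Python sketch of the pattern the diagram shows (the class name, method names, and serialized layout below are illustrative assumptions, not the actual deepmd-kit API): the flag is stored at construction time and round-tripped through serialization.

```python
# Illustrative sketch of the construct -> initialize -> serialize flow above.
# Names and the serialized layout are assumptions, not deepmd-kit's API.
class TypeEmbedSketch:
    def __init__(self, ntypes: int, tebd_dim: int, use_tebd_bias: bool = False):
        self.ntypes = ntypes
        self.tebd_dim = tebd_dim
        self.use_tebd_bias = use_tebd_bias  # the new flag; False disables bias

    def serialize(self) -> dict:
        # the flag is written out so it survives a save/load round trip
        return {
            "ntypes": self.ntypes,
            "tebd_dim": self.tebd_dim,
            "use_tebd_bias": self.use_tebd_bias,
        }

    @classmethod
    def deserialize(cls, data: dict) -> "TypeEmbedSketch":
        return cls(data["ntypes"], data["tebd_dim"], data.get("use_tebd_bias", False))


# round trip
net = TypeEmbedSketch(ntypes=4, tebd_dim=256, use_tebd_bias=False)
restored = TypeEmbedSketch.deserialize(net.serialize())
assert restored.use_tebd_bias is False
```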
```diff
@@ -275,6 +281,7 @@
     concat_output_tebd,
     precision,
     use_econf_tebd,
+    use_tebd_bias,
```

**Check notice** (Code scanning / CodeQL): Unused local variable (Note, test)

```diff
@@ -178,6 +181,7 @@
     precision,
     add_tebd_to_repinit_out,
     use_econf_tebd,
+    use_tebd_bias,
```

**Check notice** (Code scanning / CodeQL): Unused local variable (Note, test)

```diff
@@ -206,6 +210,7 @@
     precision,
     add_tebd_to_repinit_out,
     use_econf_tebd,
+    use_tebd_bias,
```

**Check notice** (Code scanning / CodeQL): Unused local variable (Note, test)

```diff
@@ -234,6 +239,7 @@
     precision,
     add_tebd_to_repinit_out,
     use_econf_tebd,
+    use_tebd_bias,
```

**Check notice** (Code scanning / CodeQL): Unused local variable (Note, test)

```diff
@@ -298,6 +304,7 @@
     precision,
     add_tebd_to_repinit_out,
     use_econf_tebd,
+    use_tebd_bias,
```

**Check notice** (Code scanning / CodeQL): Unused local variable (Note, test)
Actionable comments posted: 2

**Outside diff range and nitpick comments (7)**

**deepmd/dpmodel/utils/network.py (1)**

- Line range hint 278-279: **Document the new `bias` parameter.** The docstring should be updated to include a description of the new `bias` parameter.

```diff
 mixed_prec
     The input dict which stores the mixed precision setting for the embedding net
+bias : bool, Optional
+    Whether to use bias in the embedding layer.
```

**deepmd/dpmodel/descriptor/dpa1.py (2)**

- 204-205: **Document the new parameter.** The new parameter `use_tebd_bias` should be documented in the docstring to explain its purpose and usage.
- 499-499: **Update versioning information.** The version number should be updated to reflect the addition of the new `use_tebd_bias` parameter.

**deepmd/pt/model/network/network.py (2)**

- 576-576: **Add a docstring for the new parameter.** The `use_tebd_bias` parameter is missing from the docstring. Adding it will improve the documentation.

```diff
 """
-Construct a type embedding net.
+Construct a type embedding net.
+
+Args:
+    use_tebd_bias (bool): Whether to use bias in the type embedding layer.
 """
```

- 661-662: **Add a docstring for the new parameter.** The `use_tebd_bias` parameter is missing from the docstring. Suggested docstring text: `Whether to use bias in the type embedding layer.`

**deepmd/tf/descriptor/se_atten.py (2)**

- 2082-2083: **Document the new `use_tebd_bias` parameter.** The docstring should include a description for the `use_tebd_bias` parameter:

```
use_tebd_bias : bool, Optional
    Whether to use bias in the type embedding layer.
```

- 2121-2121: **Document the new `use_tebd_bias` parameter.** The docstring should include a description for the `use_tebd_bias` parameter:

```
use_tebd_bias : bool, Optional
    Whether to use bias in the type embedding layer.
```
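The version bump requested above typically follows a pattern like this hedged sketch; the version numbers, key names, and function name are illustrative assumptions, not the actual deepmd-kit serialization format.

```python
# Hedged sketch of the versioning pattern the review asks for: adding a field
# like use_tebd_bias to the serialized dict warrants a format-version bump,
# and deserialization supplies a default for data written before the bump.
def deserialize_with_compat(data: dict, current_version: int = 2) -> dict:
    version = data.get("@version", 1)
    if version > current_version:
        raise ValueError(f"cannot read format version {version}")
    if version < 2:
        # data predates the flag; fall back to a default (False is assumed here)
        data.setdefault("use_tebd_bias", False)
    return data
```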
**Codecov Report**

Attention: Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##            devel    #3958      +/-   ##
==========================================
- Coverage   82.83%   82.82%   -0.01%
==========================================
  Files         520      520
  Lines       50832    50869      +37
  Branches     3015     3015
==========================================
+ Hits        42106    42133      +27
- Misses       7790     7800      +10
  Partials      936      936
```

☔ View full report in Codecov by Sentry.
I agree with the comments from @njzjz, and it will LGTM when the comments are addressed.
Actionable comments posted: 0

**Outside diff range, codebase verification and nitpick comments (3)**

**deepmd/dpmodel/utils/type_embed.py (1)**

- 48-49: **Document the new parameter `use_tebd_bias`.** The new parameter `use_tebd_bias` should be documented in the class docstring for better clarity.

```diff
 use_econf_tebd: bool, Optional
     Whether to use electronic configuration type embedding.
+use_tebd_bias : bool, Optional
+    Whether to use bias in the type embedding layer.
 type_map: List[str], Optional
     A list of strings. Give the name to each type of atoms.
```

**deepmd/tf/utils/type_embed.py (1)**

- 103-104: **Document the new parameter `use_tebd_bias`.** The same docstring addition as above applies here.

**deepmd/pt/model/descriptor/dpa2.py (1)**

- 125-126: **Document the new parameter `use_tebd_bias`.** The same docstring addition as above applies here.
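For context on what the documented flag controls, here is a minimal PyTorch sketch of a bias-gated type-embedding layer; the class and argument names are illustrative assumptions, not the actual deepmd-kit classes.

```python
# Illustrative only: how a use_tebd_bias flag typically gates the bias of the
# linear layer inside a type-embedding net.
import torch


class TypeEmbedLayer(torch.nn.Module):
    def __init__(self, ntypes: int, tebd_dim: int, use_tebd_bias: bool = False):
        super().__init__()
        # bias=False removes the constant offset that all types would share
        self.linear = torch.nn.Linear(ntypes, tebd_dim, bias=use_tebd_bias)

    def forward(self, one_hot_types: torch.Tensor) -> torch.Tensor:
        return self.linear(one_hot_types)


emb = TypeEmbedLayer(ntypes=4, tebd_dim=256, use_tebd_bias=False)
print(emb(torch.eye(4)).shape)  # torch.Size([4, 256])
```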
This PR addresses an issue observed during training with DPA2 on complex datasets, such as `mptraj`. Specifically, the **learning curves of energy** from the **2024Q1-based branch** and the **devel branch** show significant differences at the very beginning when setting `tebd_dim` = 256 (and thus descriptor `dim_out` = 128 + 256). The issue is illustrated in the following image:

<img src="https://github.com/deepmodeling/deepmd-kit/assets/50307526/701835a4-126f-4a93-91c7-f9e685c4dc9d" alt="Example Image" width="500">

After removing the bias in the type embedding, which affects the standard deviation of the descriptor when `tebd_dim` is very large, the learning curve improves significantly:

<img src="https://github.com/deepmodeling/deepmd-kit/assets/50307526/8915e7dd-1813-42bc-8617-fe8209bc6da1" alt="Example Image" width="500">

Notably, this behavior is not prominent when using a `tebd_dim` that is relatively smaller than the descriptor itself, such as when using DPA2 with `tebd_dim` = 8 or using DPA1.

The same issue exists in the econf type embedding, which will be addressed in a separate PR.

**NOTE**

**This PR disables bias in type embedding in all backends, which is a breaking change.**

## Summary by CodeRabbit

- **New Features**
  - Introduced `use_tebd_bias` and `bias` parameters across various components to control the use of bias in type embeddings and networks.
- **Updates**
  - Updated serialization and deserialization methods to include the new parameters and ensure version compatibility.
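To make the mechanism concrete, here is a minimal numpy illustration of the effect described above: with a bias term, every type embedding shares a constant offset, which dominates the statistics of the concatenated descriptor when `tebd_dim` is much larger than the structural part. The sizes, random initialization, and variable names are assumptions for demonstration, not the deepmd-kit implementation.

```python
# Toy demonstration: a shared bias inflates the standard deviation of the
# concatenated descriptor when the type-embedding part (tebd_dim) dominates.
import numpy as np

rng = np.random.default_rng(0)
ntypes, tebd_dim, struct_dim = 4, 256, 128

one_hot = np.eye(ntypes)                                   # one-hot type inputs
W = rng.normal(0.0, 1.0 / np.sqrt(ntypes), (ntypes, tebd_dim))
b = rng.normal(0.0, 1.0, tebd_dim)                         # bias shared by all types

struct_part = rng.normal(0.0, 0.1, (ntypes, struct_dim))   # stand-in structural descriptor

for use_bias in (True, False):
    tebd = one_hot @ W + (b if use_bias else 0.0)
    desc = np.concatenate([struct_part, tebd], axis=-1)    # dim_out = 128 + 256
    print(f"use_bias={use_bias}: descriptor std = {desc.std():.3f}")
```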