-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SMACT Metallicity Handling Enhancements #367
Conversation
WalkthroughThe pull request introduces enhancements to the SMACT package's capabilities for metallicity analysis. The Changes
Sequence DiagramsequenceDiagram
participant User
participant SmactValidity
participant MetallicityModule
User->>SmactValidity: Call with composition
alt Metallicity Check Enabled
SmactValidity->>MetallicityModule: Calculate metallicity score
MetallicityModule-->>SmactValidity: Return score
SmactValidity->>SmactValidity: Compare score to threshold
end
SmactValidity->>User: Return validity result
Possibly related PRs
Suggested labels
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (7)
smact/intermetallics.py (2)
68-68
: Consider handling missing electronegativity values appropriatelyCurrently, if any electronegativity values are
None
, the function returns a mismatch score of0.0
, which may inaccurately imply a perfect match. It might be more appropriate to raise a warning or return a distinct value indicating that the score could not be computed.Apply this diff to handle missing values:
if None in electronegativities: - return 0.0 + # Return a distinct value or raise an error + return float('nan') # Alternatively, raise a ValueError
23-24
: Reduce code duplication in similar functionsThe functions
get_metal_fraction
andget_d_electron_fraction
share similar structures. Consider refactoring to eliminate duplication and improve maintainability.You could abstract the common logic into a helper function:
def get_element_fraction(composition: Composition, element_set: set[str]) -> float: total_amt = sum(composition.values()) target_amt = sum( amt for el, amt in composition.items() if el.symbol in element_set ) return target_amt / total_amt def get_metal_fraction(composition: Composition) -> float: return get_element_fraction(composition, smact.metals) def get_d_electron_fraction(composition: Composition) -> float: return get_element_fraction(composition, smact.d_block)Also applies to: 37-38
new_intermetallics_mod_usage_example.py (1)
35-38
: Enhance output formatting for clarityConsider improving the output formatting to make the results more readable and informative.
Apply this diff to enhance the print statements:
print(f"\nCompound: {compound}") -print(f"Standard validity: {is_valid_standard}") -print(f"With intermetallic detection: {is_valid_intermetallic}") -print(f"Intermetallic score: {score:.2f}") +print(f"Standard Validity: {is_valid_standard}") +print(f"Intermetallic Detection Validity: {is_valid_intermetallic}") +print(f"Intermetallic Score: {score:.2f}")smact/tests/test_intermetallics.py (2)
43-46
: UseassertAlmostEqual
for floating-point comparisonsWhen comparing floating-point numbers, especially results of calculations, it's safer to use
assertAlmostEqual
to account for potential precision errors.Apply this diff to update the assertions:
- self.assertEqual(get_metal_fraction(Composition("Fe3Al")), 1.0) + self.assertAlmostEqual(get_metal_fraction(Composition("Fe3Al")), 1.0, places=6) - self.assertEqual(get_metal_fraction(Composition("SiO2")), 0.0) + self.assertAlmostEqual(get_metal_fraction(Composition("SiO2")), 0.0, places=6)
79-86
: Provide informative assertion messagesIncluding informative messages in your assertions can help identify failures more easily during testing.
Apply this diff to add messages to assertions:
for formula in self.intermetallics: score = intermetallic_score(formula) - self.assertTrue(score > 0.7, f"Expected high score (>0.7) for {formula}, got {score}") + self.assertTrue( + score > 0.7, + msg=f"Expected high intermetallic score (>0.7) for {formula}, but got {score:.2f}", + ) for formula in self.non_intermetallics: score = intermetallic_score(formula) - self.assertTrue(score < 0.5, f"Expected low score (<0.5) for {formula}, got {score}") + self.assertTrue( + score < 0.5, + msg=f"Expected low intermetallic score (<0.5) for {formula}, but got {score:.2f}", + )docs/intermetallics_readme.md (2)
89-90
: Add explanation for example scores.The example scores would be more helpful if accompanied by brief explanations of why these compounds receive their respective scores. For instance, explain why Fe3Al scores ~0.85 in terms of its metallic fraction, d-electron contribution, etc.
Also applies to: 93-94
178-178
: Fix grammatical issues.Please apply these corrections:
- Line 178: Add a period after "in these cases"
- Line 231: Add a comma after "here" in "parameters coded here while conceptually similar"
Also applies to: 231-231
🧰 Tools
🪛 LanguageTool
[uncategorized] ~178-~178: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
docs/intermetallics_readme.md
(1 hunks)new_intermetallics_mod_usage_example.py
(1 hunks)smact/intermetallics.py
(1 hunks)smact/screening.py
(4 hunks)smact/tests/test_intermetallics.py
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md
[uncategorized] ~80-~80: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition)
- Main scoring function combining multiple met...
(AI_EN_LECTOR_MISSING_DETERMINER)
[uncategorized] ~178-~178: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...
(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)
[uncategorized] ~187-~187: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...
(AI_EN_LECTOR_MISSING_DETERMINER)
[uncategorized] ~231-~231: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...
(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)
🔇 Additional comments (5)
smact/intermetallics.py (1)
119-119
: Re-evaluate the defaultvec_factor
valueWhen the valence electron count cannot be calculated, the
vec_factor
defaults to0.5
. Consider whether this default appropriately reflects the lack of information, or if a different value or approach would be more suitable.Would you like to revisit the default value for
vec_factor
to ensure it aligns with the scoring intentions?smact/screening.py (4)
21-21
: LGTM!The import statement follows Python best practices by importing only the required function.
441-441
: LGTM!The new parameter is well-documented and follows consistent naming conventions. The default threshold value of 0.7 aligns with the documentation's recommendations for intermetallic classification.
Also applies to: 447-447, 463-464
477-477
: LGTM!The added empty line improves code readability by separating logical blocks.
479-481
: LGTM!The implementation cleanly integrates the new intermetallic scoring system, providing a more sophisticated approach to classifying intermetallic compounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (5)
smact/intermetallics.py (2)
1-11
: Enhance module documentation to better describe its purpose and functionality.The current docstring is quite minimal. Consider expanding it to include:
- Overview of what intermetallic compounds are
- Key features and functions provided by the module
- Example usage
- References to relevant literature or methodologies
-"""Utility functions for handling intermetallic compounds in SMACT.""" +"""Functions for analysing and scoring intermetallic compounds in SMACT. + +This module provides a comprehensive set of utilities for identifying and +characterising intermetallic compounds, moving beyond simple binary classification +to a more nuanced scoring system. Key features include: + +- Metal fraction analysis +- D-electron contribution assessment +- Pauling electronegativity ordering evaluation +- Composite intermetallic scoring + +Example: + >>> from smact.intermetallics import intermetallic_score + >>> score = intermetallic_score("Fe3Al") + >>> print(f"Intermetallic character score: {score:.2f}") + +References: + 1. Pauling, L. (1947). J. Am. Chem. Soc., 69(3), 542-553. + 2. Villars, P. (1983). J. Less Common Met., 92(2), 215-238. +"""
102-154
: Make scoring configuration more flexible and enhance documentation.Consider making the weights and thresholds configurable and documenting the scoring components in more detail.
+# Default weights for scoring components +DEFAULT_WEIGHTS = { + "metal_fraction": 0.3, + "d_electron": 0.2, + "n_metals": 0.2, + "vec": 0.15, + "pauling": 0.15, +} + -def intermetallic_score(composition: str | Composition) -> float: +def intermetallic_score( + composition: str | Composition, + weights: dict[str, float] | None = None, + max_metals: int = 3, + target_vec: float = 8.0, +) -> float: """Calculate a score indicating how likely a composition is to be an intermetallic compound. The score is based on several heuristics: - 1. Fraction of metallic elements - 2. Number of distinct metals - 3. Presence of d-block elements - 4. Electronegativity differences - 5. Valence electron count + 1. Fraction of metallic elements (weight: 0.3) + Higher fraction indicates stronger intermetallic character + 2. Number of distinct metals (weight: 0.2) + Normalised to max_metals (default: 3) + 3. Presence of d-block elements (weight: 0.2) + Higher d-electron fraction indicates stronger intermetallic character + 4. Electronegativity differences (weight: 0.15) + Lower Pauling mismatch indicates stronger intermetallic character + 5. Valence electron count (weight: 0.15) + Normalised around target_vec (default: 8.0) Args: composition: Chemical formula as string or pymatgen Composition + weights: Optional custom weights for scoring components + max_metals: Maximum number of metals for normalisation + target_vec: Target valence electron count for normalisation Returns: float: Score between 0 and 1, where higher values indicate more intermetallic character """ if isinstance(composition, str): composition = Composition(composition) + # Use default weights if none provided + weights = weights or DEFAULT_WEIGHTS + # 1. Basic metrics metal_fraction = get_metal_fraction(composition) d_electron_fraction = get_d_electron_fraction(composition) n_metals = get_distinct_metal_count(composition) # 2. Electronic structure indicators try: vec = valence_electron_count(composition.reduced_formula) - vec_factor = 1.0 - (abs(vec - 8.0) / 8.0) # Normalized around VEC=8 + vec_factor = 1.0 - (abs(vec - target_vec) / target_vec) except ValueError: vec_factor = 0.5 # Default if we can't calculate VEC # 3. Bonding character pauling_mismatch = get_pauling_test_mismatch(composition) - # 4. Calculate weighted score - # These weights can be tuned based on empirical testing - weights = {"metal_fraction": 0.3, "d_electron": 0.2, "n_metals": 0.2, "vec": 0.15, "pauling": 0.15} - score = ( weights["metal_fraction"] * metal_fraction + weights["d_electron"] * d_electron_fraction - + weights["n_metals"] * min(1.0, n_metals / 3.0) # Normalized to max of 3 metals + + weights["n_metals"] * min(1.0, n_metals / max_metals) + weights["vec"] * vec_factor + weights["pauling"] * (1.0 - pauling_mismatch) # Invert mismatch score )smact/tests/test_intermetallics.py (3)
5-41
: Standardise testing framework usage.The test file mixes
unittest
andpytest
frameworks. Choose one framework consistently:
- If using
unittest
, replacepytest.raises
withself.assertRaises
- If using
pytest
, removeunittest.TestCase
inheritance and use pytest fixturesThe test data setup is well-organised and comprehensive.
42-67
: Consider parametrizing test cases for better maintainability.The test cases for
get_element_fraction
could be parametrized to make them more maintainable and easier to extend.+ @pytest.mark.parametrize( + "composition,element_set,expected,message", + [ + ("Fe3Al", smact.metals, 1.0, "Expected all elements in Fe3Al to be metals"), + ("Fe2Nb", smact.d_block, 1.0, "Expected all elements in Fe2Nb to be d-block"), + ("Fe3Al", set(), 0.0, "Expected zero fraction for empty element set"), + ], + ) + def test_get_element_fraction(self, composition, element_set, expected, message): + """Test the helper function for element fraction calculations.""" + self.assertAlmostEqual( + get_element_fraction(composition, element_set), + expected, + places=6, + msg=message, + )
160-175
: Expand edge case coverage and standardise exception testing.
- Add more edge cases
- Standardise exception testing with the chosen framework
def test_edge_cases(self): """Test edge cases and error handling.""" # Single element score = intermetallic_score("Fe") self.assertTrue( 0.0 <= score <= 1.0, msg=f"Expected score between 0 and 1 for Fe, got {score:.2f}", ) + # Test with very large composition + large_comp = "Fe" * 100 + "Al" * 100 + score = intermetallic_score(large_comp) + self.assertTrue( + 0.0 <= score <= 1.0, + msg=f"Expected score between 0 and 1 for large composition, got {score:.2f}", + ) + # Empty composition should not crash - with pytest.raises(ValueError, match="Empty composition"): + with self.assertRaisesRegex(ValueError, "Empty composition"): intermetallic_score("") # Invalid formula should raise error - with pytest.raises(ValueError, match="Invalid formula"): + with self.assertRaisesRegex(ValueError, "Invalid formula"): intermetallic_score("NotAnElement") + + # Test with special characters + with self.assertRaisesRegex(ValueError, "Invalid formula"): + intermetallic_score("Fe@Al")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
new_intermetallics_mod_usage_example.py
(1 hunks)smact/intermetallics.py
(1 hunks)smact/tests/test_intermetallics.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- new_intermetallics_mod_usage_example.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test (3.12, windows-latest)
- GitHub Check: test (3.12, macos-latest)
- GitHub Check: test (3.12, ubuntu-latest)
- GitHub Check: test (3.11, windows-latest)
- GitHub Check: test (3.11, macos-latest)
- GitHub Check: test (3.11, ubuntu-latest)
- GitHub Check: test (3.10, windows-latest)
- GitHub Check: test (3.10, macos-latest)
- GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (3)
smact/intermetallics.py (2)
71-100
:⚠️ Potential issueAddress Python version compatibility and clarify logic.
- Remove Python 3.10+ specific
strict=False
parameter- Clarify the metal-nonmetal pair logic with better variable names and comments
- Consider using absolute values for consistent comparison
# Calculate pairwise differences mismatches = [] - for i, (el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)): - for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False): + for i, (el1, eneg1) in enumerate(zip(elements, electronegativities)): + for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]): # For metal pairs, we expect small electronegativity differences if el1.symbol in smact.metals and el2.symbol in smact.metals: - mismatches.append(abs(eneg1 - eneg2)) + metal_pair_mismatch = abs(eneg1 - eneg2) + mismatches.append(metal_pair_mismatch) # For metal-nonmetal pairs, we expect larger differences elif (el1.symbol in smact.metals) != (el2.symbol in smact.metals): - mismatches.append(1.0 - abs(eneg1 - eneg2)) + # Invert the difference for metal-nonmetal pairs as we expect + # larger differences (closer to 1.0) to indicate ionic rather + # than intermetallic character + ionic_character = abs(eneg1 - eneg2) + intermetallic_character = 1.0 - ionic_character + mismatches.append(intermetallic_character)Likely invalid or redundant comment.
32-57
: 🛠️ Refactor suggestionMaintain consistency with input handling across functions.
Apply the same input flexibility pattern to these functions for consistency with
get_element_fraction
.-def get_metal_fraction(composition: Composition) -> float: +def get_metal_fraction(composition: str | Composition) -> float: """Calculate the fraction of metallic elements in a composition. Implemented using get_element_fraction helper with smact.metals set. Args: - composition: A pymatgen Composition object + composition: Chemical formula as string or pymatgen Composition object Returns: float: Fraction of the composition that consists of metallic elements (0-1) + + Raises: + ValueError: If the composition is invalid or empty """ return get_element_fraction(composition, smact.metals) -def get_d_electron_fraction(composition: Composition) -> float: +def get_d_electron_fraction(composition: str | Composition) -> float: """Calculate the fraction of d-block elements in a composition. Implemented using get_element_fraction helper with smact.d_block set. Args: - composition: A pymatgen Composition object + composition: Chemical formula as string or pymatgen Composition object Returns: float: Fraction of the composition that consists of d-block elements (0-1) + + Raises: + ValueError: If the composition is invalid or empty """ return get_element_fraction(composition, smact.d_block)Likely invalid or redundant comment.
smact/tests/test_intermetallics.py (1)
129-141
: 🛠️ Refactor suggestionEnhance Pauling test assertions and documentation.
The test could be more explicit about the expected behaviour and use absolute values as suggested in past reviews.
def test_get_pauling_test_mismatch(self): - """Test Pauling electronegativity mismatch calculation.""" + """Test Pauling electronegativity mismatch calculation. + + The mismatch score should be: + - Lower for intermetallic compounds (similar electronegativities) + - Higher for ionic compounds (large electronegativity differences) + """ # Ionic compounds should have high mismatch nacl_mismatch = get_pauling_test_mismatch(Composition("NaCl")) # Intermetallics should have lower mismatch fe3al_mismatch = get_pauling_test_mismatch(Composition("Fe3Al")) self.assertTrue( - fe3al_mismatch < nacl_mismatch, + abs(fe3al_mismatch) < abs(nacl_mismatch), msg=f"Expected lower Pauling mismatch for Fe3Al ({fe3al_mismatch:.2f}) than NaCl ({nacl_mismatch:.2f})", )Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
smact/intermetallics.py (1)
107-108
:⚠️ Potential issueRemove Python 3.10+ specific parameter.
The
strict
parameter inzip
is only available in Python 3.10 and above.Apply this diff to maintain compatibility with earlier Python versions:
-for i, (el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)): +for i, (el1, eneg1) in enumerate(zip(elements, electronegativities)): - for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False): + for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]):
🧹 Nitpick comments (1)
smact/intermetallics.py (1)
119-170
: Consider making weights configurable.The scoring weights are hardcoded and might need adjustment based on empirical testing. Consider making them configurable through function parameters or a configuration file.
Apply this diff to make weights configurable:
-def intermetallic_score(composition: str | Composition) -> float: +def intermetallic_score( + composition: str | Composition, + weights: dict[str, float] | None = None, +) -> float: """Calculate a score indicating how likely a composition is to be an intermetallic compound. Args: composition: Chemical formula as string or pymatgen Composition + weights: Optional dictionary of weights for each factor. + Default weights are used if not provided. """ comp = _ensure_composition(composition) # 1. Basic metrics metal_fraction = get_metal_fraction(comp) d_electron_fraction = get_d_electron_fraction(comp) n_metals = get_distinct_metal_count(comp) # 2. Electronic structure indicators try: vec = valence_electron_count(comp.reduced_formula) vec_factor = 1.0 - (abs(vec - 8.0) / 8.0) # Normalized around VEC=8 except ValueError: vec_factor = 0.5 # Default if we can't calculate VEC # 3. Bonding character pauling_mismatch = get_pauling_test_mismatch(comp) # 4. Calculate weighted score - # These weights can be tuned based on empirical testing - weights = {"metal_fraction": 0.3, "d_electron": 0.2, "n_metals": 0.2, "vec": 0.15, "pauling": 0.15} + default_weights = { + "metal_fraction": 0.3, + "d_electron": 0.2, + "n_metals": 0.2, + "vec": 0.15, + "pauling": 0.15 + } + weights = weights or default_weights
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
new_intermetallics_mod_usage_example.py
(1 hunks)smact/intermetallics.py
(1 hunks)smact/screening.py
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- new_intermetallics_mod_usage_example.py
🔇 Additional comments (5)
smact/intermetallics.py (4)
13-24
: LGTM! Well-structured helper function.The function is well-documented, properly typed, and follows the single responsibility principle.
47-57
: LGTM! Clean implementation.Good use of the helper function to avoid code duplication.
61-71
: LGTM! Clean implementation.Good use of the helper function to avoid code duplication.
74-84
: LGTM! Efficient implementation.The function effectively uses a generator expression for counting unique metals.
smact/screening.py (1)
441-442
: LGTM! Well-implemented intermetallic validation.The implementation follows the suggested approach of keeping intermetallic validation separate from the existing alloy validation. The new parameters are well-documented and the logic is clear.
Also applies to: 481-485
smact/intermetallics.py
Outdated
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float: | ||
"""Calculate the fraction of elements from a given set in a composition. | ||
This helper function is used to avoid code duplication in functions that | ||
calculate fractions of specific element types (e.g., metals, d-block elements). | ||
|
||
Args: | ||
composition: Chemical formula as string or pymatgen Composition | ||
element_set: Set of element symbols to check for | ||
|
||
Returns: | ||
float: Fraction of the composition that consists of elements from the set (0-1) | ||
""" | ||
comp = _ensure_composition(composition) | ||
total_amt = sum(comp.values()) | ||
target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set) | ||
return target_amt / total_amt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for edge cases.
The function should handle empty compositions and potential division by zero errors.
Apply this diff to add error handling:
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
comp = _ensure_composition(composition)
total_amt = sum(comp.values())
+ if not total_amt:
+ raise ValueError("Empty composition")
target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
return target_amt / total_amt
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float: | |
"""Calculate the fraction of elements from a given set in a composition. | |
This helper function is used to avoid code duplication in functions that | |
calculate fractions of specific element types (e.g., metals, d-block elements). | |
Args: | |
composition: Chemical formula as string or pymatgen Composition | |
element_set: Set of element symbols to check for | |
Returns: | |
float: Fraction of the composition that consists of elements from the set (0-1) | |
""" | |
comp = _ensure_composition(composition) | |
total_amt = sum(comp.values()) | |
target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set) | |
return target_amt / total_amt | |
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float: | |
"""Calculate the fraction of elements from a given set in a composition. | |
This helper function is used to avoid code duplication in functions that | |
calculate fractions of specific element types (e.g., metals, d-block elements). | |
Args: | |
composition: Chemical formula as string or pymatgen Composition | |
element_set: Set of element symbols to check for | |
Returns: | |
float: Fraction of the composition that consists of elements from the set (0-1) | |
""" | |
comp = _ensure_composition(composition) | |
total_amt = sum(comp.values()) | |
if not total_amt: | |
raise ValueError("Empty composition") | |
target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set) | |
return target_amt / total_amt |
validated the approximate values in the docs with the new intermetallic scoring function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
docs/intermetallics_readme.md (1)
21-23
: 🛠️ Refactor suggestionInconsistent API usage examples.
The example shows both string and Composition input methods, but based on previous feedback, the API should be simplified to accept string formulas directly.
Update the example to show the preferred usage pattern:
-print(get_metal_fraction("Fe3Al")) # Works with string formula -print(get_metal_fraction(Composition("Fe3Al"))) # Works with Composition +print(get_metal_fraction("Fe3Al")) # Returns 1.0
🧹 Nitpick comments (2)
docs/intermetallics_readme.md (2)
179-179
: Improve error handling display.Using string 'nan' for NaN values is not a consistent way to handle numerical formatting.
Consider using a more Pythonic approach:
-print(f"Pauling mismatch: {'nan' if math.isnan(pauling) else f'{pauling:.2f}'}") +print(f"Pauling mismatch: {pauling:.2f if not math.isnan(pauling) else 'N/A'}")
307-316
: Enhance reference formatting.The references section could be improved for better readability and accessibility.
Consider using a consistent format for all references:
1. Original SMACT paper: - Title: "SMACT: Semiconducting Materials by Analogy and Chemical Theory" - Journal: Journal of Open Source Software - DOI: 10.21105/joss.01361 2. Intermetallics Theory: - Author: D.G. Pettifor - Title: "A chemical scale for crystal-structure maps" - Journal: Solid State Communications, 51 (1984) 31-34 - DOI: 10.1016/0038-1098(84)90765-8 3. Miedema's Framework: - Author: A.R. Miedema - Title: "Cohesion in alloys - fundamentals of a semi-empirical model" - DOI: 10.1016/0378-4363(80)90054-6
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/intermetallics_readme.md
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test (3.12, windows-latest)
- GitHub Check: test (3.12, macos-latest)
- GitHub Check: test (3.12, ubuntu-latest)
- GitHub Check: test (3.11, windows-latest)
- GitHub Check: test (3.11, macos-latest)
- GitHub Check: test (3.11, ubuntu-latest)
- GitHub Check: test (3.10, windows-latest)
- GitHub Check: test (3.10, macos-latest)
- GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (2)
docs/intermetallics_readme.md (2)
92-98
: Verify numerical values in examples.The example scores appear to be inconsistent with actual implementation results.
Let's verify these values by running the actual implementation:
#!/bin/bash # Search for test files containing these specific compounds and their scores rg -A 5 "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2.*score.*=.*0\.[0-9]+" --type py
133-133
: Clarify threshold value selection.The examples use 0.7 as the threshold, but there's no explanation of why this value was chosen or how it relates to the classification performance.
Add a note explaining:
- How the 0.7 threshold was determined
- What classification performance it achieves
- When users might want to adjust it
Also applies to: 142-142
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
docs/intermetallics_readme.md (2)
150-151
: 🛠️ Refactor suggestionReplace wildcard import with explicit imports.
Using wildcard imports (
from smact.intermetallics import *
) is not a recommended practice as it can lead to namespace pollution.Apply this diff:
-from smact.intermetallics import * +from smact.intermetallics import ( + get_metal_fraction, + get_d_electron_fraction, + get_distinct_metal_count, + get_pauling_test_mismatch, + intermetallic_score +)
242-242
: 🛠️ Refactor suggestionReplace wildcard import in the advanced usage example.
Another instance of wildcard import that should be replaced with explicit imports.
Apply this diff:
-from smact.intermetallics import * +from smact.intermetallics import ( + get_metal_fraction, + get_d_electron_fraction, + get_distinct_metal_count, + get_pauling_test_mismatch, + intermetallic_score +)
🧹 Nitpick comments (2)
docs/intermetallics_readme.md (2)
309-311
: Add inline citation for the Pettifor number.While the reference is provided, it would be helpful to add an inline citation to the specific page or section where the Pettifor number concept is introduced.
Apply this diff:
- - D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes. The new intermetallicscore is a step in that direction but customized to SMACT's internal data structures. + - D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes [1, p. 31]. The new intermetallicscore is a step in that direction but customized to SMACT's internal data structures.
313-314
: Fix grammatical issues in the Miedema reference.The sentence about Miedema's framework needs grammatical corrections.
Apply this diff:
- - Also, The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure, the parameters coded here, while conceptually similar have not implemented Miedema directly. + - The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure. The parameters coded here, while conceptually similar, have not implemented Miedema's model directly.🧰 Tools
🪛 LanguageTool
[uncategorized] ~313-~313: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/intermetallics_readme.md
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md
[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition)
- Main scoring function combining multiple met...
(AI_EN_LECTOR_MISSING_DETERMINER)
[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...
(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)
[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...
(AI_EN_LECTOR_MISSING_DETERMINER)
[uncategorized] ~313-~313: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...
(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)
🔇 Additional comments (3)
docs/intermetallics_readme.md (3)
91-97
: Verify the intermetallic scores in the examples.The example shows specific scores (e.g., ~0.83 for Fe3Al, ~0.87 for Ni3Ti) that need verification based on previous feedback about numerical discrepancies.
Run this script to verify the actual outputs:
✅ Verification successful
The intermetallic scores in the examples are accurate and consistent with the implementation.
The scores align with the implementation's scoring heuristics and test cases. The codebase confirms:
- Fe3Al (~0.83) and Ni3Ti (~0.87) are classified as known intermetallics with expected high scores
- NaCl (0.63), Fe2O3 (0.64), and SiO2 (0.25) are correctly classified as non-intermetallics with appropriately lower scores
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Create a minimal test script echo 'from smact.intermetallics import intermetallic_score compounds = ["Fe3Al", "Ni3Ti", "NaCl", "Fe2O3", "SiO2"] for compound in compounds: print(f"{compound}: {intermetallic_score(compound)}")' > test_scores.py # Execute the test script python3 test_scores.pyLength of output: 325
Script:
#!/bin/bash # Find the implementation of intermetallic_score ast-grep --pattern 'def intermetallic_score($$$):' # Search for test files containing these compounds rg -l "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2" --type pyLength of output: 312
Script:
#!/bin/bash # Check the implementation echo "=== Implementation in intermetallics.py ===" rg -A 20 "def intermetallic_score" smact/intermetallics.py echo -e "\n=== Test cases in test_intermetallics.py ===" rg -A 5 -B 5 "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2" smact/tests/test_intermetallics.pyLength of output: 5039
269-271
: Clarify the default threshold value.The documentation mentions a default threshold of 0.7, but there's inconsistency in the codebase regarding this value. Please verify and document:
- The current default value
- The rationale for this specific value
- Guidelines for when to adjust it
Run this script to verify the default threshold:
✅ Verification successful
The default threshold value of 0.7 is correctly documented and consistently implemented.
The threshold value:
- Is defined as 0.7 in the source code (smact/screening.py)
- Is validated by test cases expecting scores above 0.7 for known intermetallics
- Aligns with the statistical distribution of ion pair similarities in the codebase
Guidelines for adjustment:
- Consider lowering for more inclusive screening of potential intermetallics
- Consider raising for stricter filtering in high-precision applications
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for the default threshold value in the codebase ast-grep --pattern 'intermetallic_threshold = $_' rg -i "default.*threshold.*=.*0\.[0-9]+"Length of output: 94
Script:
#!/bin/bash # Search with broader patterns rg "0\.7" --type py --type yaml --type json -A 2 -B 2 rg "threshold" --type py --type yaml --type json -A 2 -B 2 # Look for configuration files fd -e yaml -e json -e toml -e ini -e cfg --exec cat {} # Search for threshold-related constants ast-grep --pattern 'THRESHOLD = $_' ast-grep --pattern 'threshold = $_'Length of output: 71635
🧰 Tools
🪛 LanguageTool
[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...(AI_EN_LECTOR_MISSING_DETERMINER)
21-25
: Verify the numerical output in the example.The example shows
get_metal_fraction("Fe2O3")
returning 0.4, but this value needs verification as previous reviewers reported discrepancies in the numerical outputs.Run this script to verify the actual output:
✅ Verification successful
Documentation example is accurate
The example showing
get_metal_fraction("Fe2O3")
returning 0.4 is mathematically correct. The test suite confirms that Fe2O3 should return a fractional value between 0 and 1, and the calculation (2 metal atoms / 5 total atoms = 0.4) is accurate.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Create a minimal test script echo 'from smact.intermetallics import get_metal_fraction print(get_metal_fraction("Fe2O3"))' > test_metal_fraction.py # Execute the test script python3 test_metal_fraction.pyLength of output: 226
Script:
#!/bin/bash # Find the implementation of get_metal_fraction ast-grep --pattern 'def get_metal_fraction($_): $$$' # Also search with ripgrep for more context rg "def get_metal_fraction" -A 10Length of output: 813
Script:
#!/bin/bash # Find get_element_fraction implementation ast-grep --pattern 'def get_element_fraction($_): $$$' # Look for metals set definition rg "metals = " -A 5 # Also check if there's any direct usage of Fe2O3 in tests rg "Fe2O3" --type pyLength of output: 1647
Script:
#!/bin/bash # Get the full test case for Fe2O3 rg "Fe2O3" smact/tests/test_intermetallics.py -B 2 -A 2 # Get the get_element_fraction implementation ast-grep --pattern 'def get_element_fraction($params): $$$'Length of output: 522
intermetallics_readme
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
docs/intermetallics_readme.md (4)
132-133
: Document the default threshold value.The example shows a threshold of 0.7, but it would be helpful to explicitly mention this is the default value in the documentation.
244-252
: Show both string and Composition usage in advanced example.Consider showing both string and Composition usage to demonstrate the API's flexibility:
# Detailed analysis of a compound -comp = Composition("Fe3Al") +# Using string formula +metrics_str = { + "metal_fraction": get_metal_fraction("Fe3Al"), + "d_electron_fraction": get_d_electron_fraction("Fe3Al"), + "distinct_metals": get_distinct_metal_count("Fe3Al"), + "pauling_mismatch": get_pauling_test_mismatch("Fe3Al"), + "overall_score": intermetallic_score("Fe3Al"), +} + +# Using Composition object +comp = Composition("Fe3Al") metrics = { "metal_fraction": get_metal_fraction(comp), "d_electron_fraction": get_d_electron_fraction(comp),
260-260
: Add missing period.Add a period at the end of the sentence.
- - Falls back to default behavior in these cases + - Falls back to default behavior in these cases.🧰 Tools
🪛 LanguageTool
[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)
257-261
: Add version compatibility information.Consider adding information about minimum required versions of dependencies (pymatgen, etc.) in the limitations section.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/intermetallics_readme.md
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md
[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition)
- Main scoring function combining multiple met...
(AI_EN_LECTOR_MISSING_DETERMINER)
[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...
(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)
[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...
(AI_EN_LECTOR_MISSING_DETERMINER)
🔇 Additional comments (2)
docs/intermetallics_readme.md (2)
150-151
: Replace wildcard import with explicit imports.Using wildcard imports is not recommended as it can lead to namespace pollution and make it unclear which functions are being used.
-from smact.intermetallics import * +from smact.intermetallics import ( + get_metal_fraction, + get_d_electron_fraction, + get_distinct_metal_count, + get_pauling_test_mismatch, + intermetallic_score +)
242-242
: Replace wildcard import in advanced usage example.-from smact.intermetallics import * +from smact.intermetallics import ( + get_metal_fraction, + get_d_electron_fraction, + get_distinct_metal_count, + get_pauling_test_mismatch, + intermetallic_score +)
test_edge_cases, test_get_pauling_test_mismatch, test_intermetallic_score
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
docs/intermetallics_readme.md (2)
150-151
: 🛠️ Refactor suggestionReplace wildcard import with explicit imports.
Using wildcard imports can lead to namespace pollution and make it unclear which functions are being used.
-from smact.intermetallics import * +from smact.intermetallics import ( + get_metal_fraction, + get_d_electron_fraction, + get_distinct_metal_count, + get_pauling_test_mismatch, + intermetallic_score +)
242-242
: 🛠️ Refactor suggestionReplace wildcard import in advanced usage example.
Another instance of wildcard import that should be replaced with explicit imports.
-from smact.intermetallics import * +from smact.intermetallics import ( + get_metal_fraction, + get_d_electron_fraction, + get_distinct_metal_count, + get_pauling_test_mismatch, + intermetallic_score +)
🧹 Nitpick comments (4)
docs/intermetallics_readme.md (1)
22-25
: Clarify example outputs and input types.The examples should:
- Clarify that numerical outputs are approximate values that may vary.
- Show both string and Composition object usage examples since both are supported.
-print(get_metal_fraction("Fe3Al")) # Works with string formula (& Composition object) +# Using string input +print(get_metal_fraction("Fe3Al")) # Returns 1.0 + +# Using Composition object +from pymatgen.core import Composition +print(get_metal_fraction(Composition("Fe3Al"))) # Returns 1.0 -print(get_metal_fraction("Fe2O3")) # 0.4 +print(get_metal_fraction("Fe2O3")) # Returns ~0.4 (exact value may vary)smact/tests/test_intermetallics.py (3)
42-67
: Consider adding negative test caseThe test coverage is thorough for valid cases. Consider adding a test case with an invalid composition to verify error handling.
142-159
: Define score thresholds as class constantsConsider defining the score thresholds (0.7 and 0.5) as class constants to improve maintainability and make it easier to adjust these values if needed.
Example:
class TestIntermetallics(unittest.TestCase): HIGH_SCORE_THRESHOLD = 0.7 LOW_SCORE_THRESHOLD = 0.5
1-2
: Add docstring coverage verificationConsider adding a test to verify that all public functions in the intermetallics module have proper docstrings.
Example:
def test_docstring_coverage(self): """Verify all public functions have docstrings.""" for func in [get_metal_fraction, get_d_electron_fraction, get_distinct_metal_count, get_pauling_test_mismatch, intermetallic_score]: self.assertIsNotNone(func.__doc__, f"{func.__name__} missing docstring")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/intermetallics_readme.md
(1 hunks)smact/intermetallics.py
(1 hunks)smact/tests/test_intermetallics.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- smact/intermetallics.py
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md
[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition)
- Main scoring function combining multiple met...
(AI_EN_LECTOR_MISSING_DETERMINER)
[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...
(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)
[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...
(AI_EN_LECTOR_MISSING_DETERMINER)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test (3.12, windows-latest)
- GitHub Check: test (3.12, macos-latest)
- GitHub Check: test (3.12, ubuntu-latest)
- GitHub Check: test (3.11, windows-latest)
- GitHub Check: test (3.11, macos-latest)
- GitHub Check: test (3.11, ubuntu-latest)
- GitHub Check: test (3.10, windows-latest)
- GitHub Check: test (3.10, macos-latest)
- GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (6)
docs/intermetallics_readme.md (1)
110-143
: Well-structured examples demonstrating all validation paths!The examples effectively showcase different validation methods with clear comments and consistent formatting.
smact/tests/test_intermetallics.py (5)
5-8
: Standardise the testing frameworkThe test suite is mixing
unittest
andpytest
frameworks. Since the class inherits fromunittest.TestCase
, we should maintain consistency by usingunittest
throughout.
24-40
: Well-structured test data setupExcellent selection of test cases covering various types of intermetallics and non-intermetallics. The setup provides a robust foundation for comprehensive testing.
68-92
: Excellent boundary testingComprehensive test coverage with clear boundary cases (0%, 100%, and intermediate metal fractions) and descriptive error messages.
167-172
: 🛠️ Refactor suggestionUse unittest assertions consistently
Replace
pytest.raises
withunittest
'sassertRaises
to maintain consistency with the chosen testing framework.Apply this diff:
- with pytest.raises(ValueError, match="Empty composition"): + with self.assertRaises(ValueError) as cm: intermetallic_score("") + self.assertEqual(str(cm.exception), "Empty composition") - with pytest.raises(ValueError, match="Invalid formula"): + with self.assertRaises(ValueError) as cm: intermetallic_score("NotAnElement") + self.assertEqual(str(cm.exception), "Invalid formula")Likely invalid or redundant comment.
129-141
: 🛠️ Refactor suggestionUpdate Pauling test comparison
The comparison should use absolute values to properly measure the deviation from zero, as per previous feedback. Additionally, consider adding edge cases with known electronegativity values.
Apply this diff:
- fe3al_mismatch < nacl_mismatch, + abs(fe3al_mismatch) < abs(nacl_mismatch),Likely invalid or redundant comment.
docs/intermetallics_readme.md
Outdated
print(get_pauling_test_mismatch("Fe3Al")) # 0.22 | ||
|
||
# Ionic compound - high mismatch | ||
print(get_pauling_test_mismatch("NaCl")) # -1.23 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add disclaimers for numerical values.
The numerical values shown in the examples should be clearly marked as approximate values that may vary based on the implementation. Additionally, emphasise the relative relationships rather than specific values.
-print(get_pauling_test_mismatch("Fe3Al")) # 0.22
-print(get_pauling_test_mismatch("NaCl")) # -1.23
+print(get_pauling_test_mismatch("Fe3Al")) # Returns ~0.22 (intermetallics have lower mismatch)
+print(get_pauling_test_mismatch("NaCl")) # Returns ~-1.23 (ionic compounds have higher mismatch)
-print(intermetallic_score("Fe3Al")) # ~0.83
-print(intermetallic_score("Ni3Ti")) # ~0.87
+print(intermetallic_score("Fe3Al")) # Returns ~0.83 (high score indicates strong intermetallic character)
+print(intermetallic_score("Ni3Ti")) # Returns ~0.87 (high score indicates strong intermetallic character)
-print(intermetallic_score("NaCl")) # 0.63
-print(intermetallic_score("Fe2O3")) # 0.64
-print(intermetallic_score("SiO2")) # 0.25
+print(intermetallic_score("NaCl")) # Returns ~0.63 (ionic compounds have lower scores)
+print(intermetallic_score("Fe2O3")) # Returns ~0.64 (metal oxides have lower scores)
+print(intermetallic_score("SiO2")) # Returns ~0.25 (non-metallic compounds have lowest scores)
Also applies to: 91-97
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
smact/intermetallics.py (3)
13-35
: Enhance error messages for better debuggingThe error messages could be more specific to help users identify the exact issue with their input.
Apply this diff to improve error messages:
- raise ValueError("Invalid formula") from exc + raise ValueError(f"Invalid formula '{composition}': {str(exc)}") from exc
56-70
: Enhance function documentationConsider adding more detailed docstrings including:
- Return value ranges
- Example usage
- Edge cases
Example enhancement for
get_metal_fraction
:def get_metal_fraction(composition: str | Composition) -> float: """Calculate the fraction of metallic elements in a composition. Args: composition: Chemical formula as string or pymatgen Composition Returns: float: Fraction between 0 and 1, where 1 indicates all elements are metals Example: >>> get_metal_fraction("Fe2O3") 0.4 # 2/(2+3) as only Fe is metallic """
107-157
: Make scoring parameters configurableConsider extracting magic numbers into configurable parameters to allow for different scoring strategies:
- The scale factor of 3.0 for pauling mismatch
- The target VEC value of 8.0
- The weights dictionary
Example implementation:
def intermetallic_score( composition: str | Composition, *, pauling_scale: float = 3.0, target_vec: float = 8.0, weights: dict[str, float] | None = None, ) -> float: """Calculate a score (0-1) indicating how intermetallic a composition is. Args: composition: Chemical formula or pymatgen Composition pauling_scale: Scale factor for pauling mismatch penalty (default: 3.0) target_vec: Target valence electron count (default: 8.0) weights: Custom weights for each component (default: None, uses standard weights) """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
smact/intermetallics.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test (3.12, windows-latest)
- GitHub Check: test (3.12, macos-latest)
- GitHub Check: test (3.12, ubuntu-latest)
- GitHub Check: test (3.11, windows-latest)
- GitHub Check: test (3.11, macos-latest)
- GitHub Check: test (3.11, ubuntu-latest)
- GitHub Check: test (3.10, windows-latest)
- GitHub Check: test (3.10, macos-latest)
- GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (3)
smact/intermetallics.py (3)
1-11
: Well-structured module setup!The imports are properly organised, and the module's purpose is clearly documented.
38-54
: Add error handling for edge casesThe function should handle empty compositions and potential division by zero errors.
97-98
: Ensure compatibility with Python versions earlier than 3.10The use of the
strict
parameter in thezip
function is only available in Python 3.10 and above.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #367 +/- ##
===========================================
+ Coverage 75.47% 76.17% +0.70%
===========================================
Files 31 33 +2
Lines 2642 2749 +107
===========================================
+ Hits 1994 2094 +100
- Misses 648 655 +7 ☔ View full report in Codecov by Sentry. |
fix has been made, dismissing to request fresh review
Hey Anthony, thanks for all the reviews and requested changes, hopefully resolved by the changes made today. Tests are passing which is great, only test not working is pre-commit because there's a new Pyright version out for the pre-commit hook. Other than that, I hope this is a step in the right direction for this chemical filter and I look forward to future discussions and contributions towards its enhancement. Have a lovely weekend and a great start at your new job next week! All the very best <3 |
Hey Anthony, latest commit just moved the Jupyter notebook example into the examples docs folder, there are no inherent code changes, though the tests are failing cause of something to do with mp_api being none which I think is a separate issue you're working on so don't worry. |
metallicity - allowing us to be more general with the features added opening up the possibility of adding more features for enhanced metal detection in the future Includes addition of tests, docs, and tutorials changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (8)
smact/metallicity.py (3)
38-53
: Consider caching compositions for repeated calculations.
Repeatedly calling_ensure_composition
converts the string input to aComposition
object each time. If performance becomes a concern, consider memoising or reusing the converted object in long workflows.
72-76
: Clarify usage of smact-specific metals list.
Thesmact.metals
set might differ from other sources. Provide a clear explanation in your documentation to ensure users understand which elements are considered metals by SMACT.
142-157
: Expose weighting as an optional parameter.
The weighting dict inmetallicity_score
is hard-coded, which limits user customisation. Consider exposing this as a function parameter or a configuration for better flexibility.smact/tests/test_metallicity.py (1)
7-8
: Be cautious mixing unittest and pytest.
Combiningunittest.TestCase
withpytest
features can work, but might complicate debugging or test discovery. Ensure your CI pipeline consistently handles both frameworks.docs/metallicity_readme.md (2)
201-201
: Simplify the wording.Replace "in conjunction with" with "alongside" or "with" for better readability.
🧰 Tools
🪛 LanguageTool
[style] ~201-~201: ‘in conjunction with’ might be wordy. Consider a shorter alternative.
Context: ...n The metallicity module has been used in conjunction with machine learning approaches to predict ...(EN_WORDINESS_PREMIUM_IN_CONJUNCTION_WITH)
80-98
: Add error handling examples.The examples for
metallicity_score
should include cases demonstrating how the function handles invalid inputs (e.g., unknown elements, invalid formulas).Add examples like:
# Error handling examples try: print(metallicity_score("Xx2O3")) # Invalid element except ValueError as e: print(f"Error: {e}") try: print(metallicity_score("Fe2O3]")) # Invalid formula except ValueError as e: print(f"Error: {e}")smact/screening.py (2)
481-486
: Consider reordering the validation checks for better performance.The metallicity check, which involves more complex calculations, is performed before the simpler metal alloys check. Consider reordering these checks to perform the less computationally intensive checks first:
- # Check for intermetallic compounds if enabled - if check_metallicity: - score = metallicity_score(composition) - if score >= metallicity_threshold: - return True - # Check for simple metal alloys if enabled if include_alloys: is_metal_list = [elem_s in smact.metals for elem_s in elem_symbols] if all(is_metal_list): return True + + # Check for intermetallic compounds if enabled + if check_metallicity: + score = metallicity_score(composition) + if score >= metallicity_threshold: + return True
464-466
: Enhance the docstring for the new parameters.The descriptions for
check_metallicity
andmetallicity_threshold
could be more detailed. Consider adding:
- Example values for
metallicity_threshold
- Explanation of how these parameters interact with
include_alloys
- Reference to the metallicity scoring system documentation
- check_metallicity (bool): If True, uses the metallicity scoring system to validate potential metallic/alloy compounds. - metallicity_threshold (float): Score threshold (0-1) above which a compound is considered a valid metallic/alloy. - Only used if check_metallicity is True. + check_metallicity (bool): If True, uses the metallicity scoring system to validate potential + metallic/alloy compounds. This provides a more nuanced approach compared to include_alloys. + See smact.metallicity module documentation for details on the scoring system. + metallicity_threshold (float): Score threshold (0-1) above which a compound is considered + a valid metallic/alloy. Only used if check_metallicity is True. Default is 0.7. + Example: 0.8 for stricter metallic character, 0.6 for more lenient validation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
docs/metallicity_readme.md
(1 hunks)docs/smact.metallicity.rst
(1 hunks)docs/smact.rst
(1 hunks)docs/tutorials.rst
(1 hunks)smact/metallicity.py
(1 hunks)smact/screening.py
(4 hunks)smact/tests/test_metallicity.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/smact.metallicity.rst
🧰 Additional context used
🪛 LanguageTool
docs/metallicity_readme.md
[style] ~201-~201: ‘in conjunction with’ might be wordy. Consider a shorter alternative.
Context: ...n The metallicity module has been used in conjunction with machine learning approaches to predict ...
(EN_WORDINESS_PREMIUM_IN_CONJUNCTION_WITH)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test (3.12, windows-latest)
- GitHub Check: test (3.12, macos-latest)
- GitHub Check: test (3.12, ubuntu-latest)
- GitHub Check: test (3.11, windows-latest)
- GitHub Check: test (3.11, macos-latest)
- GitHub Check: test (3.11, ubuntu-latest)
- GitHub Check: test (3.10, windows-latest)
- GitHub Check: test (3.10, macos-latest)
- GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (9)
smact/metallicity.py (2)
107-141
: Check for potential confusion between smact.Element and pymatgen elements.
By re-instantiating elements viaElement(el.symbol)
, you rely on smact’s element definitions rather than the existing pymatgen ones. Confirm that any differences in property data or naming do not cause unintended behaviour.
13-35
: Ensure Python version compatibility.
The usage ofzip(..., strict=False)
(lines 97–98) requires Python 3.10 or later. If support for earlier versions is desired, consider removingstrict=False
or implementing an alternative strategy.smact/tests/test_metallicity.py (3)
24-32
: Commendable range of metallic examples.
Your selection of intermetallic compounds (Fe3Al, Ni3Ti, Cu3Au, etc.) provides strong coverage for real-world metallic systems. This makes the tests more representative and beneficial.
42-66
: Excellent use of boundary testing.
Verifying that the fraction is 1.0, fractional, or 0.0 for different compositions ensures comprehensive coverage of the calculation logic.
160-172
: Robust edge case handling.
Testing empty strings and invalid formula inputs is a good practice. This secures your code against unexpected user inputs.docs/tutorials.rst (1)
13-13
: Great addition of the metallicity classification tutorial.
Including the new tutorial reference ensures that users can learn how to apply the metallicity scoring functions in practice.docs/smact.rst (1)
32-32
: LGTM! Documentation updated correctly.The new
smact.metallicity
module has been properly added to the list of submodules in alphabetical order.docs/metallicity_readme.md (1)
95-97
: Review the scoring thresholds for non-metallic compounds.The example scores for non-metallic compounds (NaCl: 0.63, Fe2O3: 0.64) seem quite high, considering they are ionic/covalent compounds. This might lead to false positives when using the default threshold of 0.7. Consider adjusting the scoring weights or providing guidance on appropriate thresholds for different compound types.
smact/screening.py (1)
441-442
: Implementation aligns with previous feedback.The separation of metallicity functionality from
include_alloys
usingcheck_metallicity
parameter aligns well with the previous review feedback about maintaining backwards compatibility.Also applies to: 481-486
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
requirements.txt (1)
177-178
: Consider making CUDA support optional.The NVIDIA NCCL dependency is currently required for Linux systems. Consider making this an optional dependency to avoid unnecessary installations for users not requiring GPU acceleration.
pyproject.toml (2)
94-94
: Consider adding version constraints for ML dependencies.The new ML dependencies (xgboost, shap) should have version constraints to ensure reproducible builds and prevent compatibility issues.
-ml = ["xgboost","shap","smact[featurisers]"] +ml = ["xgboost>=2.1.3,<3.0.0", "shap>=0.46.0,<0.47.0", "smact[featurisers]"]
108-108
: Document the ML feature requirements.The addition of
smact[ml]
to optional dependencies introduces new machine learning capabilities. Consider adding a note in the README or documentation about these optional features and their requirements.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pyproject.toml
(2 hunks)requirements.txt
(12 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test (3.12, windows-latest)
- GitHub Check: test (3.12, macos-latest)
- GitHub Check: test (3.12, ubuntu-latest)
- GitHub Check: test (3.11, windows-latest)
- GitHub Check: test (3.11, macos-latest)
- GitHub Check: test (3.11, ubuntu-latest)
- GitHub Check: test (3.10, windows-latest)
- GitHub Check: test (3.10, macos-latest)
- GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (1)
requirements.txt (1)
330-331
: Verify SHAP and its dependencies compatibility.The addition of SHAP and its dependencies (slicer) introduces new machine learning capabilities. Please ensure these versions are compatible with the minimum Python version requirement (3.10) specified in pyproject.toml.
Also applies to: 338-339
✅ Verification successful
SHAP 0.46.0 and its dependencies are compatible with Python 3.10
Both SHAP 0.46.0 (requires Python >=3.9) and its dependency slicer 0.0.8 (requires Python >=3.6) are fully compatible with the project's minimum Python requirement of 3.10.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify package compatibility with Python 3.10 pip debug --verbose | grep -A 5 "Compatible tags" pip index versions shap==0.46.0 | grep "Requires-Python" pip index versions slicer==0.0.8 | grep "Requires-Python"Length of output: 924
Script:
#!/bin/bash # Fetch package metadata from PyPI curl -s https://pypi.org/pypi/shap/0.46.0/json | jq -r '.info | {requires_python, requires_dist}' curl -s https://pypi.org/pypi/slicer/0.0.8/json | jq -r '.info | {requires_python, requires_dist}'Length of output: 2134
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
docs/examples/metallicity.ipynb (2)
355-384
: Enhance theanalyze_compound
function with type hints and comprehensive docstring.The function would benefit from better documentation and type safety.
-def analyze_compound(formula): - """Perform comprehensive analysis of a compound.""" +def analyze_compound(formula: str) -> None: + """Perform comprehensive analysis of a compound. + + Args: + formula: Chemical formula of the compound (e.g., "Fe3Al") + + Raises: + ValueError: If the formula is invalid + """Also, consider handling potential NaN values more gracefully throughout the function.
7-11
: Enhance the introduction with version and performance information.Consider adding:
- Minimum version requirements for SMACT and its dependencies
- Performance characteristics for large-scale screening
- Memory usage guidelines for different compound sizes
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/examples.rst
(1 hunks)docs/examples/metallicity.ipynb
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test (3.12, windows-latest)
- GitHub Check: test (3.12, macos-latest)
- GitHub Check: test (3.12, ubuntu-latest)
- GitHub Check: test (3.11, windows-latest)
- GitHub Check: test (3.11, macos-latest)
- GitHub Check: test (3.11, ubuntu-latest)
- GitHub Check: test (3.10, windows-latest)
- GitHub Check: test (3.10, macos-latest)
- GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (2)
docs/examples.rst (2)
23-23
: Logical placement in the documentation structure.The addition of the metallicity example to the toctree is well-structured and aligns with the PR's objectives of enhancing metallicity handling capabilities.
23-23
: Verify the existence of the metallicity example file.Please ensure that the referenced file
examples/metallicity
exists and contains appropriate documentation for the new metallicity features.✅ Verification successful
Toctree entry correctly references the metallicity example file
The referenced file exists at
docs/examples/metallicity.ipynb
and contains comprehensive documentation for the new metallicity features.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the existence and content of the metallicity example file # Check if the file exists fd -t f "metallicity" docs/examples/ # If found, display its content if [ $? -eq 0 ]; then echo "File found. Displaying content:" cat $(fd -t f "metallicity" docs/examples/) else echo "Warning: Example file not found" fiLength of output: 17723
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/examples/metallicity.ipynb (2)
242-277
: Enhance validation examples with diverse test cases.The validation examples effectively demonstrate different paths but could be improved by:
- Adding edge cases (e.g., compounds with mixed metallic character)
- Including comments explaining why certain compounds pass/fail specific validation paths
- Demonstrating the impact of different metallicity thresholds
from smact.screening import smact_validity -# Test with different validation methods -compound = "Fe3Al" +# Test cases demonstrating different validation scenarios +test_cases = [ + "Fe3Al", # Pure metallic - passes alloy and metallicity checks + "Fe2O3", # Metal oxide - passes standard check + "Cu3Au", # Ordered intermetallic - behaviour varies with threshold +] -# 1. Standard validation (no alloy/metallicity check) -is_valid_standard = smact_validity( - compound, use_pauling_test=True, include_alloys=False, check_metallicity=False -) -print(f"Is {compound} valid without an alloy check? {is_valid_standard}") +for compound in test_cases: + print(f"\nValidation results for {compound}:") + + # 1. Standard validation (no alloy/metallicity check) + is_valid_standard = smact_validity( + compound, use_pauling_test=True, include_alloys=False, check_metallicity=False + ) + print(f"Standard validation: {is_valid_standard}") + + # 2. With alloy detection + is_valid_alloy = smact_validity( + compound, use_pauling_test=True, include_alloys=True, check_metallicity=False + ) + print(f"Alloy validation: {is_valid_alloy}") + + # 3. Test different metallicity thresholds + for threshold in [0.5, 0.7, 0.9]: + is_valid_metallic = smact_validity( + compound, + use_pauling_test=True, + include_alloys=False, + check_metallicity=True, + metallicity_threshold=threshold, + ) + print(f"Metallicity validation (threshold={threshold}): {is_valid_metallic}")
355-374
: Enhance function documentation and type safety.The
analyze_compound
function would benefit from:
- Type hints for parameters and return values
- A more detailed docstring following NumPy or Google style
- Return values instead of just printing results
-def analyze_compound(formula): - """Perform comprehensive analysis of a compound.""" +from typing import Dict, Union + +def analyze_compound(formula: str) -> Dict[str, Union[float, bool]]: + """Perform comprehensive analysis of a compound's metallic properties. + + Args: + formula: Chemical formula of the compound (e.g., "Fe3Al") + + Returns: + Dict containing analysis results with the following keys: + - metal_fraction: Fraction of metallic elements + - d_electron_fraction: Fraction of d-block elements + - distinct_metals: Count of unique metallic elements + - pauling_mismatch: Deviation from ideal electronegativity ordering + - metallicity_score: Overall metallic character score + - valid_standard: Result of standard validation + - valid_alloy: Result of alloy validation + - valid_metallic: Result of metallicity validation + + Raises: + ValueError: If the formula is invalid + """ # Basic metrics metal_frac = get_metal_fraction(formula) d_frac = get_d_electron_fraction(formula) n_metals = get_distinct_metal_count(formula) pauling = get_pauling_test_mismatch(formula) score = metallicity_score(formula) # Validity checks valid_standard = smact_validity( formula, use_pauling_test=True, include_alloys=False, check_metallicity=False ) valid_alloy = smact_validity( formula, use_pauling_test=True, include_alloys=True, check_metallicity=False ) valid_metallic = smact_validity( formula, use_pauling_test=True, include_alloys=False, check_metallicity=True ) + results = { + "metal_fraction": metal_frac, + "d_electron_fraction": d_frac, + "distinct_metals": n_metals, + "pauling_mismatch": pauling if not math.isnan(pauling) else None, + "metallicity_score": score, + "valid_standard": valid_standard, + "valid_alloy": valid_alloy, + "valid_metallic": valid_metallic + } + + # Print results for backwards compatibility print(f"Analysis of {formula}:") - print(f"Metal fraction: {metal_frac:.2f}") - print(f"d-electron fraction: {d_frac:.2f}") - print(f"Distinct metals: {n_metals}") - print(f"Pauling mismatch: {'nan' if math.isnan(pauling) else f'{pauling:.2f'}") - print(f"Metallicity score: {score:.2f}") - print(f"Valid (standard): {valid_standard}") - print(f"Valid (alloy): {valid_alloy}") - print(f"Valid (metallic): {valid_metallic}") + for key, value in results.items(): + if isinstance(value, float): + print(f"{key}: {value:.2f}") + else: + print(f"{key}: {value}") + + return results
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/examples/metallicity.ipynb
(1 hunks)
🔇 Additional comments (2)
docs/examples/metallicity.ipynb (2)
46-54
: Add comprehensive error handling and input validation to example code.The example code cells would benefit from demonstrating error handling for invalid chemical formulae and showing input validation for edge cases. Additionally, include examples with both string and Composition object inputs.
Here's an example enhancement for the
metallicity_score
demonstration:from smact.metallicity import metallicity_score -# Metallic compounds - high scores -print(f"The metallicity score of Fe3Al is {metallicity_score('Fe3Al'):.2f}") # ~0.85 -print(f"The metallicity score of Ni3Ti is {metallicity_score('Ni3Ti'):.2f}") # ~0.91 +# Example with string input and error handling +try: + # Metallic compounds - high scores + print(f"The metallicity score of Fe3Al is {metallicity_score('Fe3Al'):.2f}") # ~0.85 + print(f"The metallicity score of Ni3Ti is {metallicity_score('Ni3Ti'):.2f}") # ~0.91 + + # Invalid formula + print(metallicity_score("NotACompound")) +except ValueError as e: + print(f"Error: {e}") -# Non-metallic compounds - low scores -print(f"The metallicity score of NaCl is {metallicity_score('NaCl'):.2f}") # 0.33 -print(f"The metallicity score of Fe2O3 is {metallicity_score('Fe2O3'):.2f}") # 0.46 -print(f"The metallicity score of SiO2 is {metallicity_score('SiO2'):.2f}") # 0.17 +# Example with Composition object +from pymatgen.core import Composition + +# Non-metallic compounds - low scores +compounds = ["NaCl", "Fe2O3", "SiO2"] +for formula in compounds: + comp = Composition(formula) + print(f"The metallicity score of {comp.reduced_formula} is {metallicity_score(comp):.2f}")Also applies to: 80-88, 117-128, 157-165, 197-208
351-351
: Replace wildcard import with explicit imports.Using
from smact.metallicity import *
is not recommended as it can lead to namespace pollution and make it unclear which functions are being used.
smact/metallicity.py
Outdated
return get_element_fraction(composition, smact.metals) | ||
|
||
|
||
def get_d_electron_fraction(composition: str | Composition) -> float: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest renaming this to get_d_block_element_fraction
or something similar. electron
in the function name would imply that we're trying to count the fraction of d electrons available in the compound over all the electrons which is quite different from the intended functionality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! I'll fix this in my next commit
from smact.properties import valence_electron_count | ||
|
||
|
||
def _ensure_composition(composition: str | Composition) -> Composition: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this function is needed. Could you explain if it is?
I believe the Composition
class is already robust enough with its error handling.
In each function call in the rest of module, you could probably have comp=Composition(composition)
even if composition is an instance of the Composition class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this might be a redundancy on my end, but I originally wrote this to ensure the composition was a pymatgen composition object even in contexts where they may enter a composition that's otherwise - ill do some testing on my end and possibly get rid of the function should this be the case and incorporate your suggested changes.
get_d_block_element_fraction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
docs/examples/metallicity.ipynb (2)
46-54
: 🛠️ Refactor suggestionAdd error handling and clarify input types.
The example would benefit from demonstrating error handling and being more explicit about input types.
from smact.metallicity import get_metal_fraction +from pymatgen.core import Composition -# Pure metallic compound - returns 1.0 -print(f'The metallic fraction of Fe3Al is {get_metal_fraction("Fe3Al")}') # Works with string formula (& Composition object) +try: + # Example with string input + print(f'The metallic fraction of Fe3Al is {get_metal_fraction("Fe3Al")}') -# Mixed compound - returns fraction of how much of the compound is metallic -print(f'The metallic fraction of Fe2O3 {get_metal_fraction("Fe2O3")}') # 0.4 + # Example with Composition object + comp = Composition("Fe2O3") + print(f'The metallic fraction of {comp.reduced_formula} is {get_metal_fraction(comp)}') + + # Example with invalid input + print(get_metal_fraction("NotACompound")) +except ValueError as e: + print(f"Error: {e}")
351-351
: 🛠️ Refactor suggestionReplace wildcard import with explicit imports.
Using wildcard imports can lead to namespace pollution and make it unclear which functions are being used.
-from smact.metallicity import * +from smact.metallicity import ( + get_metal_fraction, + get_d_block_element_fraction, + get_distinct_metal_count, + get_pauling_test_mismatch, + metallicity_score +)
🧹 Nitpick comments (8)
smact/metallicity.py (4)
56-61
: Enhance the docstring with more details.While the implementation is correct, the docstring could be more descriptive by including:
- Input/output types and ranges
- Example usage
- Return value interpretation
def get_metal_fraction(composition: str | Composition) -> float: - """Calculate the fraction of metallic elements in a composition. - - Implemented using get_element_fraction helper with smact.metals set. - """ + """Calculate the fraction of metallic elements in a composition. + + Args: + composition: Chemical formula as string or pymatgen Composition + + Returns: + float: Fraction between 0-1 representing the proportion of metallic elements + + Example: + >>> get_metal_fraction("Fe3Al") # Returns 1.0 (all elements are metals) + >>> get_metal_fraction("Fe2O3") # Returns 0.4 (2/5 elements are metals) + """
97-98
: Remove redundantstrict=False
parameter in zip.The
strict
parameter is redundant here asFalse
is the default value.- for i, (_el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)): - for _el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False): + for i, (_el1, eneg1) in enumerate(zip(elements, electronegativities)): + for _el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]):
138-140
: Document the rationale for the scale factor.The scale factor of 3.0 for the Pauling mismatch penalty appears to be a magic number. Consider adding a comment explaining its significance.
- scale = 3.0 + # Scale factor of 3.0 chosen as typical Pauling electronegativity + # differences >3.0 are rare and indicate extreme ionic character + scale = 3.0
107-118
: Enhance the docstring with comprehensive details.The docstring should document the scoring components, weights, and interpretation of the score.
def metallicity_score(composition: str | Composition) -> float: - """Calculate a score (0-1) indicating the degree of a compound's metallic/alloy nature. - - 1. Fraction of metallic elements - 2. Number of distinct metals - 3. d-block element fraction - 4. Electronegativity difference (Pauling mismatch) - 5. Valence electron count proximity to 8 - - Args: - composition: Chemical formula or pymatgen Composition - """ + """Calculate a score (0-1) indicating the degree of a compound's metallic/alloy nature. + + The score is a weighted sum of the following components: + 1. Metal fraction (30%): Proportion of metallic elements + 2. d-block element fraction (20%): Proportion of d-block elements + 3. Distinct metals count (20%): Number of unique metals (capped at 3) + 4. Valence electron count (15%): Proximity to 8 electrons + 5. Pauling mismatch (15%): Electronegativity ordering deviation + + Args: + composition: Chemical formula as string or pymatgen Composition + + Returns: + float: Score between 0-1 where: + - Higher scores (>0.7) suggest metallic/alloy character + - Lower scores (<0.5) suggest non-metallic character + + Example: + >>> metallicity_score("Fe3Al") # Returns ~0.85 (metallic) + >>> metallicity_score("NaCl") # Returns ~0.33 (ionic) + """smact/tests/test_metallicity.py (4)
93-109
: Add test case for mixed d-block/non-d-block compounds.The test method covers pure d-block and non-d-block compounds but would benefit from testing mixed compounds.
def test_get_d_block_element_fraction(self): """Test d-block element fraction calculation.""" # Should be 1.0 for pure transition metal compounds self.assertAlmostEqual( get_d_block_element_fraction(Composition("Fe2Nb")), 1.0, places=6, msg="Expected all d-block elements in Fe2Nb", ) # Should be 0.0 for compounds with no d-block elements self.assertAlmostEqual( get_d_block_element_fraction(Composition("NaCl")), 0.0, places=6, msg="Expected no d-block elements in NaCl", ) + + # Should be fractional for mixed compounds + self.assertAlmostEqual( + get_d_block_element_fraction(Composition("Fe3Al")), + 0.75, + places=6, + msg="Expected 75% d-block elements in Fe3Al", + )
111-127
: Add test case for complex multi-metal compounds.The test method would benefit from testing compounds with more than two metals.
def test_get_distinct_metal_count(self): """Test counting of distinct metals.""" self.assertEqual( get_distinct_metal_count(Composition("Fe3Al")), 2, msg="Expected 2 distinct metals in Fe3Al", ) self.assertEqual( get_distinct_metal_count(Composition("NaCl")), 1, msg="Expected 1 distinct metal in NaCl", ) self.assertEqual( get_distinct_metal_count(Composition("SiO2")), 0, msg="Expected no metals in SiO2", ) + # Test complex multi-metal compound + self.assertEqual( + get_distinct_metal_count(Composition("NbTiAlCr")), + 4, + msg="Expected 4 distinct metals in NbTiAlCr", + )
129-140
: Add test case for compounds with missing electronegativity data.The test method should verify that NaN is returned when electronegativity data is missing.
def test_get_pauling_test_mismatch(self): """Test Pauling electronegativity mismatch calculation.""" # Ionic compounds should have high mismatch nacl_mismatch = get_pauling_test_mismatch(Composition("NaCl")) # Metallic compounds should have lower mismatch fe3al_mismatch = get_pauling_test_mismatch(Composition("Fe3Al")) self.assertTrue( fe3al_mismatch < nacl_mismatch, msg=f"Expected lower Pauling mismatch for Fe3Al ({fe3al_mismatch:.2f}) than NaCl ({nacl_mismatch:.2f})", ) + + # Test compound with missing electronegativity data + import math + mismatch = get_pauling_test_mismatch(Composition("Og2O")) # Oganesson has no known electronegativity + self.assertTrue( + math.isnan(mismatch), + msg="Expected NaN for compound with missing electronegativity data", + )
142-172
: Add test cases for boundary conditions.The test methods would benefit from testing boundary conditions of the metallicity score.
def test_metallicity_score(self): """Test the overall metallicity scoring function.""" # Known metallic compounds should score high for formula in self.metallic_compounds: score = metallicity_score(formula) self.assertTrue( score > 0.7, msg=f"Expected high metallicity score (>0.7) for {formula}, but got {score:.2f}", ) # Non-metallic compounds should score low for formula in self.non_metallic_compounds: score = metallicity_score(formula) self.assertTrue( score < 0.5, msg=f"Expected low metallicity score (<0.5) for {formula}, but got {score:.2f}", ) + + # Test boundary cases + # Compound with maximum possible score + max_score = metallicity_score("Fe2Nb") # All metrics should be optimal + self.assertAlmostEqual( + max_score, + 1.0, + places=6, + msg="Expected maximum score for optimal compound", + ) + + # Compound with minimum possible score + min_score = metallicity_score("F2") # All metrics should be minimal + self.assertAlmostEqual( + min_score, + 0.0, + places=6, + msg="Expected minimum score for non-metallic compound", + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/examples/metallicity.ipynb
(1 hunks)smact/metallicity.py
(1 hunks)smact/tests/test_metallicity.py
(1 hunks)
🔇 Additional comments (8)
smact/metallicity.py (5)
13-35
: LGTM! The error handling is comprehensive.The function provides consistent error handling and user-friendly error messages. The author's explanation about its use case in the past review comments justifies its existence.
38-53
: Well-designed helper function that promotes code reuse.The implementation is efficient and the docstring clearly explains the function's purpose and usage.
64-69
: Enhance the docstring with more details.The implementation is correct and the name has been updated as suggested. However, like
get_metal_fraction
, the docstring could be more descriptive.def get_d_block_element_fraction(composition: str | Composition) -> float: - """Calculate the fraction of d-block elements in a composition. - - Implemented using get_element_fraction helper with smact.d_block set. - """ + """Calculate the fraction of d-block elements in a composition. + + Args: + composition: Chemical formula as string or pymatgen Composition + + Returns: + float: Fraction between 0-1 representing the proportion of d-block elements + + Example: + >>> get_d_block_element_fraction("Fe2Nb") # Returns 1.0 (all d-block) + >>> get_d_block_element_fraction("Fe3Al") # Returns 0.75 (Fe is d-block) + """
72-75
: Enhance the docstring with more details.The implementation is efficient but the docstring could be more descriptive.
def get_distinct_metal_count(composition: str | Composition) -> int: - """Count the number of distinct metallic elements in a composition.""" + """Count the number of distinct metallic elements in a composition. + + Args: + composition: Chemical formula as string or pymatgen Composition + + Returns: + int: Number of unique metallic elements in the composition + + Example: + >>> get_distinct_metal_count("Fe3Al") # Returns 2 + >>> get_distinct_metal_count("NbTiAlCr") # Returns 4 + >>> get_distinct_metal_count("SiO2") # Returns 0 + """
78-87
: Enhance the docstring with more details.The docstring could be more descriptive by including input/output types and examples.
def get_pauling_test_mismatch(composition: str | Composition) -> float: - """Calculate a score for how much the composition deviates from ideal - Pauling electronegativity ordering. - - Higher mismatch => more difference (ionic, e.g. NaCl). - Lower mismatch => metal-metal bonds (e.g. Fe-Al). - - Returns: - float: Mismatch score (0=perfect match, higher=more deviation, NaN=missing data) - """ + """Calculate a score for how much the composition deviates from ideal + Pauling electronegativity ordering. + + Higher mismatch indicates more ionic character (e.g., NaCl), + while lower mismatch suggests metal-metal bonds (e.g., Fe-Al). + + Args: + composition: Chemical formula as string or pymatgen Composition + + Returns: + float: Mismatch score where: + - 0.0 indicates perfect match (metallic-like bonding) + - Higher values indicate more ionic character + - NaN if any element lacks known electronegativity + + Example: + >>> get_pauling_test_mismatch("Fe3Al") # Returns ~0.22 (metallic) + >>> get_pauling_test_mismatch("NaCl") # Returns ~2.23 (ionic) + """smact/tests/test_metallicity.py (3)
24-40
: LGTM! Well-organized test data setup.The test compounds are well-chosen with clear comments explaining their significance.
42-66
: LGTM! Comprehensive test coverage.The test method covers all important cases with appropriate assertions and descriptive error messages.
68-91
: LGTM! Tests cover edge cases effectively.The test method effectively validates the function's behaviour with pure metallic, non-metallic, and mixed compounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
SMACT Metallicity/Alloy Handling Enhancement
This document describes the enhancements made to SMACT to better handle metallic compounds and their classification. The changes introduce a scoring system for identifying and validating metallic compounds, moving beyond simple binary classification to provide a more nuanced understanding of a compound's metallic character.
Key Changes
1. Module:
smact.metallicity
A dedicated module for handling metallic compounds with several specialized functions:
get_metal_fraction(composition)
get_d_electron_fraction(composition)
get_distinct_metal_count(composition)
get_pauling_test_mismatch(composition)
metallicity_score(composition)
2. Enhanced
smact_validity
The existing
smact_validity
function insmact.screening
has been enhanced with three validation paths:Example usage showing all validation paths:
3. Comprehensive Analysis Example
Here's how to perform a detailed analysis of a compound:
Future Development Directions
Specialized Submodules
Development of specific modules for different types of metallic systems:
Integration with structure prediction
Incorporate atomic size factors
Add structure prediction capabilities
Include formation energy estimates - linking back to Miedema's model DOI Link
Validation and Refinement
Extended Functionality
Contributing
Contributions to improve the metallicity functionality are welcome! Areas particularly in need of development:
References
Original SMACT paper: SMACT: Semiconducting Materials by Analogy and Chemical Theory
Intermetallics theory and classification literature sources:
D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes [1, p. 31]. The new intermetallicscore is a step in that direction but customized to SMACT's internal data structures.
The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure. The parameters coded here, while conceptually similar, have not implemented Miedema's model directly.
Reference: A.R. Miedema, Cohesion in alloys - fundamentals of a semi-empirical model.DOI Link
Electronegativity scales and their role in predicting bonding character (Pauling electronegativity)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Test A - Unit Tests (smact/tests/test_intermetallics.py):
Test B: Machine Learning Classification
The metallicity module has been used in conjunction with machine learning approaches to predict the metallic nature of compounds. The
metallicity_classification.ipynb
notebook demonstrates how these chemical descriptors can be used to train models for predicting whether a compound will exhibit metallic behavior.The classification task uses features derived from the metallicity module including:
These features are used to train models that can predict the likelihood of metallic behavior in compounds, which is more general than specifically predicting intermetallic behavior. This approach allows for a broader understanding of metallic character in materials, encompassing various types of metallic systems including:
Classification Workflow Details:
Classification results:
Reproduction Instructions:
Key Results
Unit Test Results:
Test Configuration:
Checklist
Summary by CodeRabbit
Release Notes
New Features
Documentation
Dependencies
The update provides more sophisticated tools for analysing and validating metallic compounds, with improved scoring mechanisms and comprehensive documentation.