Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SMACT Metallicity Handling Enhancements #367

Merged
merged 26 commits into from
Jan 30, 2025

Conversation

ryannduma
Copy link
Collaborator

@ryannduma ryannduma commented Jan 16, 2025

SMACT Metallicity/Alloy Handling Enhancement

This document describes the enhancements made to SMACT to better handle metallic compounds and their classification. The changes introduce a scoring system for identifying and validating metallic compounds, moving beyond simple binary classification to provide a more nuanced understanding of a compound's metallic character.

Key Changes

1. Module: smact.metallicity

A dedicated module for handling metallic compounds with several specialized functions:

get_metal_fraction(composition)

  • Calculates the fraction of metallic elements in a composition
  • Input: Chemical formula (str) or pymatgen Composition object
  • Output: Float between 0-1
  • Example:
from smact.metallicity import get_metal_fraction

# Pure metallic compound - returns 1.0
print(get_metal_fraction("Fe3Al"))  # Works with string formula (& Composition object)

# Mixed compound - returns fraction of how much of the compound is metallic
print(get_metal_fraction("Fe2O3"))  # 0.4

get_d_electron_fraction(composition)

  • Calculates the fraction of d-block elements
  • Important for transition metal compounds
  • Example:
from smact.metallicity import get_d_electron_fraction

# Pure transition metal compound - returns 1.0
print(get_d_electron_fraction("Fe2Nb"))

# Mixed compound - returns fraction
print(get_d_electron_fraction("Fe3Al"))  # 0.75 (Fe is d-block, Al is not)

get_distinct_metal_count(composition)

  • Counts unique metallic elements
  • Useful for identifying complex metallic systems
  • Example:
from smact.metallicity import get_distinct_metal_count

# Binary metallic compound
print(get_distinct_metal_count("Fe3Al"))  # Returns 2

# Complex multi-component composition
print(get_distinct_metal_count("NbTiAlCr"))  # Returns 4

# Non-metallic compound
print(get_distinct_metal_count("SiO2"))  # Returns 0

get_pauling_test_mismatch(composition)

  • Calculates deviation from ideal electronegativity ordering
  • Handles both metal-metal and metal-nonmetal pairs
  • Lower scores indicate more metallic-like bonding
  • Example:
from smact.metallicity import get_pauling_test_mismatch

# Metallic compound - low mismatch
print(get_pauling_test_mismatch("Fe3Al"))  # 0.22

# Ionic compound - high mismatch
print(get_pauling_test_mismatch("NaCl"))  # -1.23

metallicity_score(composition)

  • Main scoring function combining multiple metrics
  • Returns a score between 0-1
  • Higher scores indicate more metallic character
  • Example:
from smact.metallicity import metallicity_score

# Metallic compounds - high scores
print(metallicity_score("Fe3Al"))  # ~0.83
print(metallicity_score("Ni3Ti"))  # ~0.87

# Non-metallic compounds - low scores
print(metallicity_score("NaCl"))  # 0.63
print(metallicity_score("Fe2O3"))  # 0.64
print(metallicity_score("SiO2"))  # 0.25

2. Enhanced smact_validity

The existing smact_validity function in smact.screening has been enhanced with three validation paths:

  1. Standard validation (charge neutrality and electronegativity)
  2. Simple metal validation
  3. Metallicity scoring validation

Example usage showing all validation paths:

from smact.screening import smact_validity

# Test with different validation methods
compound = "Fe3Al"

# 1. Standard validation (no alloy/metallicity check)
is_valid_standard = smact_validity(
    compound, use_pauling_test=True, include_alloys=False, check_metallicity=False
)

# 2. With alloy detection
is_valid_alloy = smact_validity(
    compound, use_pauling_test=True, include_alloys=True, check_metallicity=False
)

# 3. With metallicity detection
is_valid_metallic = smact_validity(
    compound,
    use_pauling_test=True,
    include_alloys=False,
    check_metallicity=True,
    metallicity_threshold=0.7,
)

# Or combine methods
is_valid = smact_validity(
    compound,
    use_pauling_test=True,
    include_alloys=True,
    check_metallicity=True,
    metallicity_threshold=0.7,
)

3. Comprehensive Analysis Example

Here's how to perform a detailed analysis of a compound:

from smact.metallicity import *
from smact.screening import smact_validity


def analyze_compound(formula):
    """Perform comprehensive analysis of a compound."""
    # Basic metrics
    metal_frac = get_metal_fraction(formula)
    d_frac = get_d_electron_fraction(formula)
    n_metals = get_distinct_metal_count(formula)
    pauling = get_pauling_test_mismatch(formula)
    score = metallicity_score(formula)

    # Validity checks
    valid_standard = smact_validity(
        formula, use_pauling_test=True, include_alloys=False, check_metallicity=False
    )
    valid_alloy = smact_validity(
        formula, use_pauling_test=True, include_alloys=True, check_metallicity=False
    )
    valid_metallic = smact_validity(
        formula, use_pauling_test=True, include_alloys=False, check_metallicity=True
    )

    print(f"Analysis of {formula}:")
    print(f"Metal fraction: {metal_frac:.2f}")
    print(f"d-electron fraction: {d_frac:.2f}")
    print(f"Distinct metals: {n_metals}")
    print(f"Pauling mismatch: {'nan' if math.isnan(pauling) else f'{pauling:.2f}'}")
    print(f"Metallicity score: {score:.2f}")
    print(f"Valid (standard): {valid_standard}")
    print(f"Valid (alloy): {valid_alloy}")
    print(f"Valid (metallic): {valid_metallic}")


# Example usage
compounds = [
    "Fe3Al",  # Classic intermetallic
    "NaCl",  # Ionic
    "Fe2O3",  # Metal oxide
    "Cu2MgSn",  # Heusler alloy
    "NbTiAlCr",  # High-entropy alloy
]

for compound in compounds:
    analyze_compound(compound)
    print("-" * 50)

Future Development Directions

  1. Specialized Submodules

    • Development of specific modules for different types of metallic systems:

      • Intermetallic compound prediction
      • Solid solution formation
      • Amorphous metal formation
    • Integration with structure prediction

    • Incorporate atomic size factors

    • Add structure prediction capabilities

    • Include formation energy estimates - linking back to Miedema's model DOI Link

  2. Validation and Refinement

    • Benchmark against experimental databases
    • Refine scoring weights with more data
    • Add support for mixed-valence compounds
  3. Extended Functionality

    • Add support for partial occupancies
    • Include temperature-dependent properties
    • Integrate with phase diagram tools

Contributing

Contributions to improve the metallicity functionality are welcome! Areas particularly in need of development:

  1. Additional test cases and validation
  2. Refinement of scoring weights
  3. Integration with external databases
  4. Development of specialized submodules for specific metallic systems
  5. Performance optimization for large-scale screening

References

  1. Original SMACT paper: SMACT: Semiconducting Materials by Analogy and Chemical Theory

  2. Intermetallics theory and classification literature sources:

    • D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes [1, p. 31]. The new intermetallicscore is a step in that direction but customized to SMACT's internal data structures.

      • Reference: D.G. Pettifor, "A chemical scale for crystal-structure maps," Solid State Communications. 51 (1984) 31–34. DOI Link
    • The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure. The parameters coded here, while conceptually similar, have not implemented Miedema's model directly.

    • Reference: A.R. Miedema, Cohesion in alloys - fundamentals of a semi-empirical model.DOI Link

  3. Electronegativity scales and their role in predicting bonding character (Pauling electronegativity)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A - Unit Tests
  • Test B - XGBoost Classification exercise

Test A - Unit Tests (smact/tests/test_intermetallics.py):

  • Tests individual feature functions
  • Validates edge cases
  • Key test cases include
  # Known intermetallics
  self.intermetallics = [
      "Fe3Al",  # Classic intermetallic
      "Ni3Ti",  # Superalloy component
      "Cu3Au",  # Ordered alloy
      "Fe2Nb",  # Laves phase
  ]

  # Known non-intermetallics
  self.non_intermetallics = [
      "NaCl",   # Ionic
      "SiO2",   # Covalent
      "Fe2O3",  # Metal oxide
      "CuSO4",  # Complex ionic
  ]

Test B: Machine Learning Classification

The metallicity module has been used in conjunction with machine learning approaches to predict the metallic nature of compounds. The metallicity_classification.ipynb notebook demonstrates how these chemical descriptors can be used to train models for predicting whether a compound will exhibit metallic behavior.

The classification task uses features derived from the metallicity module including:

  • Metal fraction
  • d-electron fraction
  • Number of distinct metals
  • Pauling electronegativity mismatch
  • Overall metallicity score

These features are used to train models that can predict the likelihood of metallic behavior in compounds, which is more general than specifically predicting intermetallic behavior. This approach allows for a broader understanding of metallic character in materials, encompassing various types of metallic systems including:

  • Traditional metals and alloys
  • Intermetallic compounds
  • Solid solutions
  • Metallic glasses
  • Mixed-character compounds
  • Uses matbench_expt_is_metal dataset
  • Implements XGBoost classifier
  • Performs feature importance analysis

Classification Workflow Details:

  1. Data Loading & Preparation:
from matminer.datasets import load_dataset
df = load_dataset("matbench_expt_is_metal")  # 4921 entries
  1. Feature Extraction:
  • Metal fraction
  • d-electron fraction
  • Distinct metal count
  • Pauling mismatch
  • Overall intermetallic score
  1. Model Training:
  • XGBoost classifier with hyperparameter tuning
  • Cross-validation with threshold optimization
  • Feature importance analysis using SHAP

Classification results:

Test Accuracy: 0.858 with threshold=0.5

Classification Report:
              precision    recall  f1-score   support

       False       0.84      0.87      0.85       494
        True       0.86      0.83      0.85       491

    accuracy                           0.85       985
   macro avg       0.85      0.85      0.85       985
weighted avg       0.85      0.85      0.85       985

confusion matrix

Reproduction Instructions:

  1. Environment Setup:
# Create conda environment
conda create -n smact-env python=3.11
conda activate smact-env

# Install dependencies
pip install smact matminer xgboost shap scikit-learn pandas numpy matplotlib seaborn
  1. Run Unit Tests
python -m pytest smact/tests/test_intermetallics.py -v
  1. Run Classification Notebook:
jupyter notebook metallicity_classification.ipynb

Key Results

Unit Test Results:

  • All feature functions pass edge cases
  • Intermetallic scoring correctly differentiates between known compounds
  • Error handling works as expected
  1. Classification Performance:
  • High accuracy in metal vs. non-metal classification
  • Feature importance ranking shows:
    1. Metal fraction
    2. d-electron fraction
    3. Intermetallic score
    4. Pauling mismatch
    5. Distinct metal count

Test Configuration:

  • Python version: 3.11
  • Operating System: Linux
  • Key package versions:
    • smact: latest
    • pymatgen: latest
    • xgboost: latest
    • scikit-learn: latest
    • pandas: latest
    • numpy: latest
    • shap: latest

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a metallicity scoring system for chemical compounds.
    • Added advanced validation for metallic compounds.
    • Enhanced screening functionality with metallicity checks.
    • New utility functions for analysing metallic characteristics.
    • Comprehensive tutorial on metallicity classification added.
  • Documentation

    • Expanded documentation for the metallicity module.
    • Included new examples related to metallicity in the documentation.
    • Updated tutorials to include metallicity classification.
  • Dependencies

    • Introduced machine learning-related optional dependencies.
    • Updated project requirements to support new functionality.

The update provides more sophisticated tools for analysing and validating metallic compounds, with improved scoring mechanisms and comprehensive documentation.

@ryannduma ryannduma requested a review from AntObi January 16, 2025 17:35
@ryannduma ryannduma self-assigned this Jan 16, 2025
Copy link
Contributor

coderabbitai bot commented Jan 16, 2025

Walkthrough

The pull request introduces enhancements to the SMACT package's capabilities for metallicity analysis. The smact/screening.py file's smact_validity function is extended to support validation of metallic compounds, incorporating new parameters for metallicity checks. A new smact.metallicity module is added, which includes utility functions for calculating metallicity scores. Additionally, comprehensive documentation, tests, and examples are created to support the new functionality, ensuring clarity and usability for users.

Changes

File Change Summary
smact/screening.py Updated smact_validity function with new parameters check_metallicity and metallicity_threshold
smact/metallicity.py New module with utility functions for metallicity analysis, including metallicity_score, get_metal_fraction, etc.
smact/tests/test_metallicity.py Comprehensive test suite for metallicity module functions
docs/* Added documentation for metallicity module, tutorials, and examples
pyproject.toml Added new optional ML dependency group
requirements.txt Added new dependencies like shap, xgboost
docs/examples.rst New entry for metallicity examples
docs/examples/metallicity.ipynb Updated Jupyter notebook detailing metallicity functions and validation

Sequence Diagram

sequenceDiagram
    participant User
    participant SmactValidity
    participant MetallicityModule
    
    User->>SmactValidity: Call with composition
    alt Metallicity Check Enabled
        SmactValidity->>MetallicityModule: Calculate metallicity score
        MetallicityModule-->>SmactValidity: Return score
        SmactValidity->>SmactValidity: Compare score to threshold
    end
    SmactValidity->>User: Return validity result
Loading

Possibly related PRs

  • Add oxidation states option to SMACT validity function #282: This PR adds an oxidation states option to the smact_validity function, which is directly related to the changes in the main PR that enhance the smact_validity function to support validation of metallic compounds with new parameters.
  • Change default oxidation states from SMACT14 to ICSD24 #346: This PR changes the default oxidation states from SMACT14 to ICSD24 in the smact_filter and smact_validity functions, which aligns with the main PR's focus on updating the smact_validity function to include checks for metallic compounds.
  • Structure prediction fixes #336: This PR introduces a new tutorial for the Structure Prediction module, which may include examples of using the smact_validity function, thus relating to the enhancements made in the main PR.

Suggested labels

docs, bug

Poem

🐰 A Rabbit's Ode to Metallicity 🔬
In circuits of atoms, where electrons dance,
We measure their metallic, shimmering stance.
With scores that reveal their silvery might,
A scientific ballet of elemental delight!

— CodeRabbit's Metallurgical Verse 🌟

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (7)
smact/intermetallics.py (2)

68-68: Consider handling missing electronegativity values appropriately

Currently, if any electronegativity values are None, the function returns a mismatch score of 0.0, which may inaccurately imply a perfect match. It might be more appropriate to raise a warning or return a distinct value indicating that the score could not be computed.

Apply this diff to handle missing values:

 if None in electronegativities:
-    return 0.0
+    # Return a distinct value or raise an error
+    return float('nan')  # Alternatively, raise a ValueError

23-24: Reduce code duplication in similar functions

The functions get_metal_fraction and get_d_electron_fraction share similar structures. Consider refactoring to eliminate duplication and improve maintainability.

You could abstract the common logic into a helper function:

def get_element_fraction(composition: Composition, element_set: set[str]) -> float:
    total_amt = sum(composition.values())
    target_amt = sum(
        amt for el, amt in composition.items() if el.symbol in element_set
    )
    return target_amt / total_amt

def get_metal_fraction(composition: Composition) -> float:
    return get_element_fraction(composition, smact.metals)

def get_d_electron_fraction(composition: Composition) -> float:
    return get_element_fraction(composition, smact.d_block)

Also applies to: 37-38

new_intermetallics_mod_usage_example.py (1)

35-38: Enhance output formatting for clarity

Consider improving the output formatting to make the results more readable and informative.

Apply this diff to enhance the print statements:

 print(f"\nCompound: {compound}")
-print(f"Standard validity: {is_valid_standard}")
-print(f"With intermetallic detection: {is_valid_intermetallic}")
-print(f"Intermetallic score: {score:.2f}")
+print(f"Standard Validity: {is_valid_standard}")
+print(f"Intermetallic Detection Validity: {is_valid_intermetallic}")
+print(f"Intermetallic Score: {score:.2f}")
smact/tests/test_intermetallics.py (2)

43-46: Use assertAlmostEqual for floating-point comparisons

When comparing floating-point numbers, especially results of calculations, it's safer to use assertAlmostEqual to account for potential precision errors.

Apply this diff to update the assertions:

-        self.assertEqual(get_metal_fraction(Composition("Fe3Al")), 1.0)
+        self.assertAlmostEqual(get_metal_fraction(Composition("Fe3Al")), 1.0, places=6)

-        self.assertEqual(get_metal_fraction(Composition("SiO2")), 0.0)
+        self.assertAlmostEqual(get_metal_fraction(Composition("SiO2")), 0.0, places=6)

79-86: Provide informative assertion messages

Including informative messages in your assertions can help identify failures more easily during testing.

Apply this diff to add messages to assertions:

         for formula in self.intermetallics:
             score = intermetallic_score(formula)
-            self.assertTrue(score > 0.7, f"Expected high score (>0.7) for {formula}, got {score}")

+            self.assertTrue(
+                score > 0.7,
+                msg=f"Expected high intermetallic score (>0.7) for {formula}, but got {score:.2f}",
+            )
 
         for formula in self.non_intermetallics:
             score = intermetallic_score(formula)
-            self.assertTrue(score < 0.5, f"Expected low score (<0.5) for {formula}, got {score}")

+            self.assertTrue(
+                score < 0.5,
+                msg=f"Expected low intermetallic score (<0.5) for {formula}, but got {score:.2f}",
+            )
docs/intermetallics_readme.md (2)

89-90: Add explanation for example scores.

The example scores would be more helpful if accompanied by brief explanations of why these compounds receive their respective scores. For instance, explain why Fe3Al scores ~0.85 in terms of its metallic fraction, d-electron contribution, etc.

Also applies to: 93-94


178-178: Fix grammatical issues.

Please apply these corrections:

  • Line 178: Add a period after "in these cases"
  • Line 231: Add a comma after "here" in "parameters coded here while conceptually similar"

Also applies to: 231-231

🧰 Tools
🪛 LanguageTool

[uncategorized] ~178-~178: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a92e123 and e4f4d84.

📒 Files selected for processing (5)
  • docs/intermetallics_readme.md (1 hunks)
  • new_intermetallics_mod_usage_example.py (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/screening.py (4 hunks)
  • smact/tests/test_intermetallics.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md

[uncategorized] ~80-~80: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition) - Main scoring function combining multiple met...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~178-~178: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


[uncategorized] ~187-~187: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~231-~231: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)

🔇 Additional comments (5)
smact/intermetallics.py (1)

119-119: Re-evaluate the default vec_factor value

When the valence electron count cannot be calculated, the vec_factor defaults to 0.5. Consider whether this default appropriately reflects the lack of information, or if a different value or approach would be more suitable.

Would you like to revisit the default value for vec_factor to ensure it aligns with the scoring intentions?

smact/screening.py (4)

21-21: LGTM!

The import statement follows Python best practices by importing only the required function.


441-441: LGTM!

The new parameter is well-documented and follows consistent naming conventions. The default threshold value of 0.7 aligns with the documentation's recommendations for intermetallic classification.

Also applies to: 447-447, 463-464


477-477: LGTM!

The added empty line improves code readability by separating logical blocks.


479-481: LGTM!

The implementation cleanly integrates the new intermetallic scoring system, providing a more sophisticated approach to classifying intermetallic compounds.

AntObi
AntObi previously requested changes Jan 16, 2025
@AntObi AntObi added enhancement python Pull requests that update Python code feature labels Jan 16, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
smact/intermetallics.py (2)

1-11: Enhance module documentation to better describe its purpose and functionality.

The current docstring is quite minimal. Consider expanding it to include:

  • Overview of what intermetallic compounds are
  • Key features and functions provided by the module
  • Example usage
  • References to relevant literature or methodologies
-"""Utility functions for handling intermetallic compounds in SMACT."""
+"""Functions for analysing and scoring intermetallic compounds in SMACT.
+
+This module provides a comprehensive set of utilities for identifying and
+characterising intermetallic compounds, moving beyond simple binary classification
+to a more nuanced scoring system. Key features include:
+
+- Metal fraction analysis
+- D-electron contribution assessment
+- Pauling electronegativity ordering evaluation
+- Composite intermetallic scoring
+
+Example:
+    >>> from smact.intermetallics import intermetallic_score
+    >>> score = intermetallic_score("Fe3Al")
+    >>> print(f"Intermetallic character score: {score:.2f}")
+
+References:
+    1. Pauling, L. (1947). J. Am. Chem. Soc., 69(3), 542-553.
+    2. Villars, P. (1983). J. Less Common Met., 92(2), 215-238.
+"""

102-154: Make scoring configuration more flexible and enhance documentation.

Consider making the weights and thresholds configurable and documenting the scoring components in more detail.

+# Default weights for scoring components
+DEFAULT_WEIGHTS = {
+    "metal_fraction": 0.3,
+    "d_electron": 0.2,
+    "n_metals": 0.2,
+    "vec": 0.15,
+    "pauling": 0.15,
+}
+
-def intermetallic_score(composition: str | Composition) -> float:
+def intermetallic_score(
+    composition: str | Composition,
+    weights: dict[str, float] | None = None,
+    max_metals: int = 3,
+    target_vec: float = 8.0,
+) -> float:
     """Calculate a score indicating how likely a composition is to be an intermetallic compound.

     The score is based on several heuristics:
-    1. Fraction of metallic elements
-    2. Number of distinct metals
-    3. Presence of d-block elements
-    4. Electronegativity differences
-    5. Valence electron count
+    1. Fraction of metallic elements (weight: 0.3)
+       Higher fraction indicates stronger intermetallic character
+    2. Number of distinct metals (weight: 0.2)
+       Normalised to max_metals (default: 3)
+    3. Presence of d-block elements (weight: 0.2)
+       Higher d-electron fraction indicates stronger intermetallic character
+    4. Electronegativity differences (weight: 0.15)
+       Lower Pauling mismatch indicates stronger intermetallic character
+    5. Valence electron count (weight: 0.15)
+       Normalised around target_vec (default: 8.0)

     Args:
         composition: Chemical formula as string or pymatgen Composition
+        weights: Optional custom weights for scoring components
+        max_metals: Maximum number of metals for normalisation
+        target_vec: Target valence electron count for normalisation

     Returns:
         float: Score between 0 and 1, where higher values indicate more intermetallic character
     """
     if isinstance(composition, str):
         composition = Composition(composition)

+    # Use default weights if none provided
+    weights = weights or DEFAULT_WEIGHTS
+
     # 1. Basic metrics
     metal_fraction = get_metal_fraction(composition)
     d_electron_fraction = get_d_electron_fraction(composition)
     n_metals = get_distinct_metal_count(composition)

     # 2. Electronic structure indicators
     try:
         vec = valence_electron_count(composition.reduced_formula)
-        vec_factor = 1.0 - (abs(vec - 8.0) / 8.0)  # Normalized around VEC=8
+        vec_factor = 1.0 - (abs(vec - target_vec) / target_vec)
     except ValueError:
         vec_factor = 0.5  # Default if we can't calculate VEC

     # 3. Bonding character
     pauling_mismatch = get_pauling_test_mismatch(composition)

-    # 4. Calculate weighted score
-    # These weights can be tuned based on empirical testing
-    weights = {"metal_fraction": 0.3, "d_electron": 0.2, "n_metals": 0.2, "vec": 0.15, "pauling": 0.15}
-
     score = (
         weights["metal_fraction"] * metal_fraction
         + weights["d_electron"] * d_electron_fraction
-        + weights["n_metals"] * min(1.0, n_metals / 3.0)  # Normalized to max of 3 metals
+        + weights["n_metals"] * min(1.0, n_metals / max_metals)
         + weights["vec"] * vec_factor
         + weights["pauling"] * (1.0 - pauling_mismatch)  # Invert mismatch score
     )
smact/tests/test_intermetallics.py (3)

5-41: Standardise testing framework usage.

The test file mixes unittest and pytest frameworks. Choose one framework consistently:

  1. If using unittest, replace pytest.raises with self.assertRaises
  2. If using pytest, remove unittest.TestCase inheritance and use pytest fixtures

The test data setup is well-organised and comprehensive.


42-67: Consider parametrizing test cases for better maintainability.

The test cases for get_element_fraction could be parametrized to make them more maintainable and easier to extend.

+    @pytest.mark.parametrize(
+        "composition,element_set,expected,message",
+        [
+            ("Fe3Al", smact.metals, 1.0, "Expected all elements in Fe3Al to be metals"),
+            ("Fe2Nb", smact.d_block, 1.0, "Expected all elements in Fe2Nb to be d-block"),
+            ("Fe3Al", set(), 0.0, "Expected zero fraction for empty element set"),
+        ],
+    )
+    def test_get_element_fraction(self, composition, element_set, expected, message):
+        """Test the helper function for element fraction calculations."""
+        self.assertAlmostEqual(
+            get_element_fraction(composition, element_set),
+            expected,
+            places=6,
+            msg=message,
+        )

160-175: Expand edge case coverage and standardise exception testing.

  1. Add more edge cases
  2. Standardise exception testing with the chosen framework
     def test_edge_cases(self):
         """Test edge cases and error handling."""
         # Single element
         score = intermetallic_score("Fe")
         self.assertTrue(
             0.0 <= score <= 1.0,
             msg=f"Expected score between 0 and 1 for Fe, got {score:.2f}",
         )

+        # Test with very large composition
+        large_comp = "Fe" * 100 + "Al" * 100
+        score = intermetallic_score(large_comp)
+        self.assertTrue(
+            0.0 <= score <= 1.0,
+            msg=f"Expected score between 0 and 1 for large composition, got {score:.2f}",
+        )
+
         # Empty composition should not crash
-        with pytest.raises(ValueError, match="Empty composition"):
+        with self.assertRaisesRegex(ValueError, "Empty composition"):
             intermetallic_score("")

         # Invalid formula should raise error
-        with pytest.raises(ValueError, match="Invalid formula"):
+        with self.assertRaisesRegex(ValueError, "Invalid formula"):
             intermetallic_score("NotAnElement")
+
+        # Test with special characters
+        with self.assertRaisesRegex(ValueError, "Invalid formula"):
+            intermetallic_score("Fe@Al")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4f4d84 and 1ff91b8.

📒 Files selected for processing (3)
  • new_intermetallics_mod_usage_example.py (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/tests/test_intermetallics.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • new_intermetallics_mod_usage_example.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (3)
smact/intermetallics.py (2)

71-100: ⚠️ Potential issue

Address Python version compatibility and clarify logic.

  1. Remove Python 3.10+ specific strict=False parameter
  2. Clarify the metal-nonmetal pair logic with better variable names and comments
  3. Consider using absolute values for consistent comparison
     # Calculate pairwise differences
     mismatches = []
-    for i, (el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)):
-        for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False):
+    for i, (el1, eneg1) in enumerate(zip(elements, electronegativities)):
+        for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]):
             # For metal pairs, we expect small electronegativity differences
             if el1.symbol in smact.metals and el2.symbol in smact.metals:
-                mismatches.append(abs(eneg1 - eneg2))
+                metal_pair_mismatch = abs(eneg1 - eneg2)
+                mismatches.append(metal_pair_mismatch)
             # For metal-nonmetal pairs, we expect larger differences
             elif (el1.symbol in smact.metals) != (el2.symbol in smact.metals):
-                mismatches.append(1.0 - abs(eneg1 - eneg2))
+                # Invert the difference for metal-nonmetal pairs as we expect
+                # larger differences (closer to 1.0) to indicate ionic rather
+                # than intermetallic character
+                ionic_character = abs(eneg1 - eneg2)
+                intermetallic_character = 1.0 - ionic_character
+                mismatches.append(intermetallic_character)

Likely invalid or redundant comment.


32-57: 🛠️ Refactor suggestion

Maintain consistency with input handling across functions.

Apply the same input flexibility pattern to these functions for consistency with get_element_fraction.

-def get_metal_fraction(composition: Composition) -> float:
+def get_metal_fraction(composition: str | Composition) -> float:
     """Calculate the fraction of metallic elements in a composition.
     Implemented using get_element_fraction helper with smact.metals set.

     Args:
-        composition: A pymatgen Composition object
+        composition: Chemical formula as string or pymatgen Composition object

     Returns:
         float: Fraction of the composition that consists of metallic elements (0-1)
+
+    Raises:
+        ValueError: If the composition is invalid or empty
     """
     return get_element_fraction(composition, smact.metals)

-def get_d_electron_fraction(composition: Composition) -> float:
+def get_d_electron_fraction(composition: str | Composition) -> float:
     """Calculate the fraction of d-block elements in a composition.
     Implemented using get_element_fraction helper with smact.d_block set.

     Args:
-        composition: A pymatgen Composition object
+        composition: Chemical formula as string or pymatgen Composition object

     Returns:
         float: Fraction of the composition that consists of d-block elements (0-1)
+
+    Raises:
+        ValueError: If the composition is invalid or empty
     """
     return get_element_fraction(composition, smact.d_block)

Likely invalid or redundant comment.

smact/tests/test_intermetallics.py (1)

129-141: 🛠️ Refactor suggestion

Enhance Pauling test assertions and documentation.

The test could be more explicit about the expected behaviour and use absolute values as suggested in past reviews.

     def test_get_pauling_test_mismatch(self):
-        """Test Pauling electronegativity mismatch calculation."""
+        """Test Pauling electronegativity mismatch calculation.
+
+        The mismatch score should be:
+        - Lower for intermetallic compounds (similar electronegativities)
+        - Higher for ionic compounds (large electronegativity differences)
+        """
         # Ionic compounds should have high mismatch
         nacl_mismatch = get_pauling_test_mismatch(Composition("NaCl"))

         # Intermetallics should have lower mismatch
         fe3al_mismatch = get_pauling_test_mismatch(Composition("Fe3Al"))

         self.assertTrue(
-            fe3al_mismatch < nacl_mismatch,
+            abs(fe3al_mismatch) < abs(nacl_mismatch),
             msg=f"Expected lower Pauling mismatch for Fe3Al ({fe3al_mismatch:.2f}) than NaCl ({nacl_mismatch:.2f})",
         )

Likely invalid or redundant comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
smact/intermetallics.py (1)

107-108: ⚠️ Potential issue

Remove Python 3.10+ specific parameter.

The strict parameter in zip is only available in Python 3.10 and above.

Apply this diff to maintain compatibility with earlier Python versions:

-for i, (el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)):
+for i, (el1, eneg1) in enumerate(zip(elements, electronegativities)):
-    for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False):
+    for el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]):
🧹 Nitpick comments (1)
smact/intermetallics.py (1)

119-170: Consider making weights configurable.

The scoring weights are hardcoded and might need adjustment based on empirical testing. Consider making them configurable through function parameters or a configuration file.

Apply this diff to make weights configurable:

-def intermetallic_score(composition: str | Composition) -> float:
+def intermetallic_score(
+    composition: str | Composition,
+    weights: dict[str, float] | None = None,
+) -> float:
     """Calculate a score indicating how likely a composition is to be an intermetallic compound.
     
     Args:
         composition: Chemical formula as string or pymatgen Composition
+        weights: Optional dictionary of weights for each factor.
+                Default weights are used if not provided.
     """
     comp = _ensure_composition(composition)
 
     # 1. Basic metrics
     metal_fraction = get_metal_fraction(comp)
     d_electron_fraction = get_d_electron_fraction(comp)
     n_metals = get_distinct_metal_count(comp)
 
     # 2. Electronic structure indicators
     try:
         vec = valence_electron_count(comp.reduced_formula)
         vec_factor = 1.0 - (abs(vec - 8.0) / 8.0)  # Normalized around VEC=8
     except ValueError:
         vec_factor = 0.5  # Default if we can't calculate VEC
 
     # 3. Bonding character
     pauling_mismatch = get_pauling_test_mismatch(comp)
 
     # 4. Calculate weighted score
-    # These weights can be tuned based on empirical testing
-    weights = {"metal_fraction": 0.3, "d_electron": 0.2, "n_metals": 0.2, "vec": 0.15, "pauling": 0.15}
+    default_weights = {
+        "metal_fraction": 0.3,
+        "d_electron": 0.2,
+        "n_metals": 0.2,
+        "vec": 0.15,
+        "pauling": 0.15
+    }
+    weights = weights or default_weights
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ff91b8 and 56246d3.

📒 Files selected for processing (3)
  • new_intermetallics_mod_usage_example.py (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/screening.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • new_intermetallics_mod_usage_example.py
🔇 Additional comments (5)
smact/intermetallics.py (4)

13-24: LGTM! Well-structured helper function.

The function is well-documented, properly typed, and follows the single responsibility principle.


47-57: LGTM! Clean implementation.

Good use of the helper function to avoid code duplication.


61-71: LGTM! Clean implementation.

Good use of the helper function to avoid code duplication.


74-84: LGTM! Efficient implementation.

The function effectively uses a generator expression for counting unique metals.

smact/screening.py (1)

441-442: LGTM! Well-implemented intermetallic validation.

The implementation follows the suggested approach of keeping intermetallic validation separate from the existing alloy validation. The new parameters are well-documented and the logic is clear.

Also applies to: 481-485

Comment on lines 28 to 43
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
"""Calculate the fraction of elements from a given set in a composition.
This helper function is used to avoid code duplication in functions that
calculate fractions of specific element types (e.g., metals, d-block elements).

Args:
composition: Chemical formula as string or pymatgen Composition
element_set: Set of element symbols to check for

Returns:
float: Fraction of the composition that consists of elements from the set (0-1)
"""
comp = _ensure_composition(composition)
total_amt = sum(comp.values())
target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
return target_amt / total_amt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for edge cases.

The function should handle empty compositions and potential division by zero errors.

Apply this diff to add error handling:

 def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
     comp = _ensure_composition(composition)
     total_amt = sum(comp.values())
+    if not total_amt:
+        raise ValueError("Empty composition")
     target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
     return target_amt / total_amt
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
"""Calculate the fraction of elements from a given set in a composition.
This helper function is used to avoid code duplication in functions that
calculate fractions of specific element types (e.g., metals, d-block elements).
Args:
composition: Chemical formula as string or pymatgen Composition
element_set: Set of element symbols to check for
Returns:
float: Fraction of the composition that consists of elements from the set (0-1)
"""
comp = _ensure_composition(composition)
total_amt = sum(comp.values())
target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
return target_amt / total_amt
def get_element_fraction(composition: str | Composition, element_set: set[str]) -> float:
"""Calculate the fraction of elements from a given set in a composition.
This helper function is used to avoid code duplication in functions that
calculate fractions of specific element types (e.g., metals, d-block elements).
Args:
composition: Chemical formula as string or pymatgen Composition
element_set: Set of element symbols to check for
Returns:
float: Fraction of the composition that consists of elements from the set (0-1)
"""
comp = _ensure_composition(composition)
total_amt = sum(comp.values())
if not total_amt:
raise ValueError("Empty composition")
target_amt = sum(amt for el, amt in comp.items() if el.symbol in element_set)
return target_amt / total_amt

validated the approximate values in the docs with the new intermetallic scoring function
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
docs/intermetallics_readme.md (1)

21-23: 🛠️ Refactor suggestion

Inconsistent API usage examples.

The example shows both string and Composition input methods, but based on previous feedback, the API should be simplified to accept string formulas directly.

Update the example to show the preferred usage pattern:

-print(get_metal_fraction("Fe3Al"))  # Works with string formula
-print(get_metal_fraction(Composition("Fe3Al")))  # Works with Composition
+print(get_metal_fraction("Fe3Al"))  # Returns 1.0
🧹 Nitpick comments (2)
docs/intermetallics_readme.md (2)

179-179: Improve error handling display.

Using string 'nan' for NaN values is not a consistent way to handle numerical formatting.

Consider using a more Pythonic approach:

-print(f"Pauling mismatch: {'nan' if math.isnan(pauling) else f'{pauling:.2f}'}")
+print(f"Pauling mismatch: {pauling:.2f if not math.isnan(pauling) else 'N/A'}")

307-316: Enhance reference formatting.

The references section could be improved for better readability and accessibility.

Consider using a consistent format for all references:

1. Original SMACT paper:
   - Title: "SMACT: Semiconducting Materials by Analogy and Chemical Theory"
   - Journal: Journal of Open Source Software
   - DOI: 10.21105/joss.01361

2. Intermetallics Theory:
   - Author: D.G. Pettifor
   - Title: "A chemical scale for crystal-structure maps"
   - Journal: Solid State Communications, 51 (1984) 31-34
   - DOI: 10.1016/0038-1098(84)90765-8

3. Miedema's Framework:
   - Author: A.R. Miedema
   - Title: "Cohesion in alloys - fundamentals of a semi-empirical model"
   - DOI: 10.1016/0378-4363(80)90054-6
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56246d3 and f5ec89a.

📒 Files selected for processing (1)
  • docs/intermetallics_readme.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (2)
docs/intermetallics_readme.md (2)

92-98: Verify numerical values in examples.

The example scores appear to be inconsistent with actual implementation results.

Let's verify these values by running the actual implementation:

#!/bin/bash
# Search for test files containing these specific compounds and their scores
rg -A 5 "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2.*score.*=.*0\.[0-9]+" --type py

133-133: Clarify threshold value selection.

The examples use 0.7 as the threshold, but there's no explanation of why this value was chosen or how it relates to the classification performance.

Add a note explaining:

  • How the 0.7 threshold was determined
  • What classification performance it achieves
  • When users might want to adjust it

Also applies to: 142-142

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
docs/intermetallics_readme.md (2)

150-151: 🛠️ Refactor suggestion

Replace wildcard import with explicit imports.

Using wildcard imports (from smact.intermetallics import *) is not a recommended practice as it can lead to namespace pollution.

Apply this diff:

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

242-242: 🛠️ Refactor suggestion

Replace wildcard import in the advanced usage example.

Another instance of wildcard import that should be replaced with explicit imports.

Apply this diff:

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)
🧹 Nitpick comments (2)
docs/intermetallics_readme.md (2)

309-311: Add inline citation for the Pettifor number.

While the reference is provided, it would be helpful to add an inline citation to the specific page or section where the Pettifor number concept is introduced.

Apply this diff:

-   - D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes. The new intermetallicscore is a step in that direction but customized to SMACT's internal data structures.
+   - D.G. Pettifor introduced the concept of a single "chemical scale" or "structure map" coordinate (Pettifor number) to systematically separate compound classes [1, p. 31]. The new intermetallicscore is a step in that direction but customized to SMACT's internal data structures.

313-314: Fix grammatical issues in the Miedema reference.

The sentence about Miedema's framework needs grammatical corrections.

Apply this diff:

-   - Also, The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure, the parameters coded here, while conceptually similar have not implemented Miedema directly.
+   - The role of charge transfer and atomic size mismatch is pivotal in stabilizing intermetallic phases. Miedema's framework quantifies these effects, making it useful for predicting alloying behaviors and crystal structure. The parameters coded here, while conceptually similar, have not implemented Miedema's model directly.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~313-~313: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5ec89a and 92aeb6c.

📒 Files selected for processing (1)
  • docs/intermetallics_readme.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md

[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition) - Main scoring function combining multiple met...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~313-~313: A comma might be missing here.
Context: ...rameters coded here, while conceptually similar have not implemented Miedema directly. ...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)

🔇 Additional comments (3)
docs/intermetallics_readme.md (3)

91-97: Verify the intermetallic scores in the examples.

The example shows specific scores (e.g., ~0.83 for Fe3Al, ~0.87 for Ni3Ti) that need verification based on previous feedback about numerical discrepancies.

Run this script to verify the actual outputs:

✅ Verification successful

The intermetallic scores in the examples are accurate and consistent with the implementation.

The scores align with the implementation's scoring heuristics and test cases. The codebase confirms:

  • Fe3Al (~0.83) and Ni3Ti (~0.87) are classified as known intermetallics with expected high scores
  • NaCl (0.63), Fe2O3 (0.64), and SiO2 (0.25) are correctly classified as non-intermetallics with appropriately lower scores
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Create a minimal test script
echo 'from smact.intermetallics import intermetallic_score
compounds = ["Fe3Al", "Ni3Ti", "NaCl", "Fe2O3", "SiO2"]
for compound in compounds:
    print(f"{compound}: {intermetallic_score(compound)}")' > test_scores.py

# Execute the test script
python3 test_scores.py

Length of output: 325


Script:

#!/bin/bash
# Find the implementation of intermetallic_score
ast-grep --pattern 'def intermetallic_score($$$):'

# Search for test files containing these compounds
rg -l "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2" --type py

Length of output: 312


Script:

#!/bin/bash
# Check the implementation
echo "=== Implementation in intermetallics.py ==="
rg -A 20 "def intermetallic_score" smact/intermetallics.py

echo -e "\n=== Test cases in test_intermetallics.py ==="
rg -A 5 -B 5 "Fe3Al|Ni3Ti|NaCl|Fe2O3|SiO2" smact/tests/test_intermetallics.py

Length of output: 5039


269-271: Clarify the default threshold value.

The documentation mentions a default threshold of 0.7, but there's inconsistency in the codebase regarding this value. Please verify and document:

  1. The current default value
  2. The rationale for this specific value
  3. Guidelines for when to adjust it

Run this script to verify the default threshold:

✅ Verification successful

The default threshold value of 0.7 is correctly documented and consistently implemented.

The threshold value:

  • Is defined as 0.7 in the source code (smact/screening.py)
  • Is validated by test cases expecting scores above 0.7 for known intermetallics
  • Aligns with the statistical distribution of ion pair similarities in the codebase

Guidelines for adjustment:

  • Consider lowering for more inclusive screening of potential intermetallics
  • Consider raising for stricter filtering in high-precision applications
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for the default threshold value in the codebase
ast-grep --pattern 'intermetallic_threshold = $_'
rg -i "default.*threshold.*=.*0\.[0-9]+"

Length of output: 94


Script:

#!/bin/bash
# Search with broader patterns
rg "0\.7" --type py --type yaml --type json -A 2 -B 2
rg "threshold" --type py --type yaml --type json -A 2 -B 2

# Look for configuration files
fd -e yaml -e json -e toml -e ini -e cfg --exec cat {}

# Search for threshold-related constants
ast-grep --pattern 'THRESHOLD = $_'
ast-grep --pattern 'threshold = $_'

Length of output: 71635

🧰 Tools
🪛 LanguageTool

[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)


21-25: Verify the numerical output in the example.

The example shows get_metal_fraction("Fe2O3") returning 0.4, but this value needs verification as previous reviewers reported discrepancies in the numerical outputs.

Run this script to verify the actual output:

✅ Verification successful

Documentation example is accurate

The example showing get_metal_fraction("Fe2O3") returning 0.4 is mathematically correct. The test suite confirms that Fe2O3 should return a fractional value between 0 and 1, and the calculation (2 metal atoms / 5 total atoms = 0.4) is accurate.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Create a minimal test script
echo 'from smact.intermetallics import get_metal_fraction
print(get_metal_fraction("Fe2O3"))' > test_metal_fraction.py

# Execute the test script
python3 test_metal_fraction.py

Length of output: 226


Script:

#!/bin/bash
# Find the implementation of get_metal_fraction
ast-grep --pattern 'def get_metal_fraction($_):
    $$$'

# Also search with ripgrep for more context
rg "def get_metal_fraction" -A 10

Length of output: 813


Script:

#!/bin/bash
# Find get_element_fraction implementation
ast-grep --pattern 'def get_element_fraction($_):
    $$$'

# Look for metals set definition
rg "metals = " -A 5

# Also check if there's any direct usage of Fe2O3 in tests
rg "Fe2O3" --type py

Length of output: 1647


Script:

#!/bin/bash
# Get the full test case for Fe2O3
rg "Fe2O3" smact/tests/test_intermetallics.py -B 2 -A 2

# Get the get_element_fraction implementation
ast-grep --pattern 'def get_element_fraction($params):
    $$$'

Length of output: 522

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/intermetallics_readme.md (4)

132-133: Document the default threshold value.

The example shows a threshold of 0.7, but it would be helpful to explicitly mention this is the default value in the documentation.


244-252: Show both string and Composition usage in advanced example.

Consider showing both string and Composition usage to demonstrate the API's flexibility:

 # Detailed analysis of a compound
-comp = Composition("Fe3Al")
+# Using string formula
+metrics_str = {
+    "metal_fraction": get_metal_fraction("Fe3Al"),
+    "d_electron_fraction": get_d_electron_fraction("Fe3Al"),
+    "distinct_metals": get_distinct_metal_count("Fe3Al"),
+    "pauling_mismatch": get_pauling_test_mismatch("Fe3Al"),
+    "overall_score": intermetallic_score("Fe3Al"),
+}
+
+# Using Composition object
+comp = Composition("Fe3Al")
 metrics = {
     "metal_fraction": get_metal_fraction(comp),
     "d_electron_fraction": get_d_electron_fraction(comp),

260-260: Add missing period.

Add a period at the end of the sentence.

-   - Falls back to default behavior in these cases
+   - Falls back to default behavior in these cases.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


257-261: Add version compatibility information.

Consider adding information about minimum required versions of dependencies (pymatgen, etc.) in the limitations section.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92aeb6c and db78427.

📒 Files selected for processing (1)
  • docs/intermetallics_readme.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md

[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition) - Main scoring function combining multiple met...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)

🔇 Additional comments (2)
docs/intermetallics_readme.md (2)

150-151: Replace wildcard import with explicit imports.

Using wildcard imports is not recommended as it can lead to namespace pollution and make it unclear which functions are being used.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

242-242: Replace wildcard import in advanced usage example.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

test_edge_cases, test_get_pauling_test_mismatch, test_intermetallic_score
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
docs/intermetallics_readme.md (2)

150-151: 🛠️ Refactor suggestion

Replace wildcard import with explicit imports.

Using wildcard imports can lead to namespace pollution and make it unclear which functions are being used.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)

242-242: 🛠️ Refactor suggestion

Replace wildcard import in advanced usage example.

Another instance of wildcard import that should be replaced with explicit imports.

-from smact.intermetallics import *
+from smact.intermetallics import (
+    get_metal_fraction,
+    get_d_electron_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    intermetallic_score
+)
🧹 Nitpick comments (4)
docs/intermetallics_readme.md (1)

22-25: Clarify example outputs and input types.

The examples should:

  1. Clarify that numerical outputs are approximate values that may vary.
  2. Show both string and Composition object usage examples since both are supported.
-print(get_metal_fraction("Fe3Al"))  # Works with string formula (& Composition object)
+# Using string input
+print(get_metal_fraction("Fe3Al"))  # Returns 1.0
+
+# Using Composition object
+from pymatgen.core import Composition
+print(get_metal_fraction(Composition("Fe3Al")))  # Returns 1.0

-print(get_metal_fraction("Fe2O3"))  # 0.4
+print(get_metal_fraction("Fe2O3"))  # Returns ~0.4 (exact value may vary)
smact/tests/test_intermetallics.py (3)

42-67: Consider adding negative test case

The test coverage is thorough for valid cases. Consider adding a test case with an invalid composition to verify error handling.


142-159: Define score thresholds as class constants

Consider defining the score thresholds (0.7 and 0.5) as class constants to improve maintainability and make it easier to adjust these values if needed.

Example:

class TestIntermetallics(unittest.TestCase):
    HIGH_SCORE_THRESHOLD = 0.7
    LOW_SCORE_THRESHOLD = 0.5

1-2: Add docstring coverage verification

Consider adding a test to verify that all public functions in the intermetallics module have proper docstrings.

Example:

def test_docstring_coverage(self):
    """Verify all public functions have docstrings."""
    for func in [get_metal_fraction, get_d_electron_fraction,
                 get_distinct_metal_count, get_pauling_test_mismatch,
                 intermetallic_score]:
        self.assertIsNotNone(func.__doc__, 
            f"{func.__name__} missing docstring")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db78427 and 4727c9a.

📒 Files selected for processing (3)
  • docs/intermetallics_readme.md (1 hunks)
  • smact/intermetallics.py (1 hunks)
  • smact/tests/test_intermetallics.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • smact/intermetallics.py
🧰 Additional context used
🪛 LanguageTool
docs/intermetallics_readme.md

[uncategorized] ~82-~82: A determiner appears to be missing. Consider inserting it.
Context: ...# intermetallic_score(composition) - Main scoring function combining multiple met...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~260-~260: A period might be missing here.
Context: ...Falls back to default behavior in these cases 2. VEC Calculation - Assumes s...

(AI_EN_LECTOR_MISSING_PUNCTUATION_PERIOD)


[uncategorized] ~269-~269: A determiner appears to be missing. Consider inserting it.
Context: ...tures 3. Threshold Selection - Default threshold (0.7) may need adjustment for...

(AI_EN_LECTOR_MISSING_DETERMINER)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (6)
docs/intermetallics_readme.md (1)

110-143: Well-structured examples demonstrating all validation paths!

The examples effectively showcase different validation methods with clear comments and consistent formatting.

smact/tests/test_intermetallics.py (5)

5-8: Standardise the testing framework

The test suite is mixing unittest and pytest frameworks. Since the class inherits from unittest.TestCase, we should maintain consistency by using unittest throughout.


24-40: Well-structured test data setup

Excellent selection of test cases covering various types of intermetallics and non-intermetallics. The setup provides a robust foundation for comprehensive testing.


68-92: Excellent boundary testing

Comprehensive test coverage with clear boundary cases (0%, 100%, and intermediate metal fractions) and descriptive error messages.


167-172: 🛠️ Refactor suggestion

Use unittest assertions consistently

Replace pytest.raises with unittest's assertRaises to maintain consistency with the chosen testing framework.

Apply this diff:

-        with pytest.raises(ValueError, match="Empty composition"):
+        with self.assertRaises(ValueError) as cm:
             intermetallic_score("")
+        self.assertEqual(str(cm.exception), "Empty composition")

-        with pytest.raises(ValueError, match="Invalid formula"):
+        with self.assertRaises(ValueError) as cm:
             intermetallic_score("NotAnElement")
+        self.assertEqual(str(cm.exception), "Invalid formula")

Likely invalid or redundant comment.


129-141: 🛠️ Refactor suggestion

Update Pauling test comparison

The comparison should use absolute values to properly measure the deviation from zero, as per previous feedback. Additionally, consider adding edge cases with known electronegativity values.

Apply this diff:

-            fe3al_mismatch < nacl_mismatch,
+            abs(fe3al_mismatch) < abs(nacl_mismatch),

Likely invalid or redundant comment.

Comment on lines 74 to 77
print(get_pauling_test_mismatch("Fe3Al")) # 0.22

# Ionic compound - high mismatch
print(get_pauling_test_mismatch("NaCl")) # -1.23
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add disclaimers for numerical values.

The numerical values shown in the examples should be clearly marked as approximate values that may vary based on the implementation. Additionally, emphasise the relative relationships rather than specific values.

-print(get_pauling_test_mismatch("Fe3Al"))  # 0.22
-print(get_pauling_test_mismatch("NaCl"))  # -1.23
+print(get_pauling_test_mismatch("Fe3Al"))  # Returns ~0.22 (intermetallics have lower mismatch)
+print(get_pauling_test_mismatch("NaCl"))  # Returns ~-1.23 (ionic compounds have higher mismatch)

-print(intermetallic_score("Fe3Al"))  # ~0.83
-print(intermetallic_score("Ni3Ti"))  # ~0.87
+print(intermetallic_score("Fe3Al"))  # Returns ~0.83 (high score indicates strong intermetallic character)
+print(intermetallic_score("Ni3Ti"))  # Returns ~0.87 (high score indicates strong intermetallic character)

-print(intermetallic_score("NaCl"))  # 0.63
-print(intermetallic_score("Fe2O3"))  # 0.64
-print(intermetallic_score("SiO2"))  # 0.25
+print(intermetallic_score("NaCl"))  # Returns ~0.63 (ionic compounds have lower scores)
+print(intermetallic_score("Fe2O3"))  # Returns ~0.64 (metal oxides have lower scores)
+print(intermetallic_score("SiO2"))  # Returns ~0.25 (non-metallic compounds have lowest scores)

Also applies to: 91-97

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
smact/intermetallics.py (3)

13-35: Enhance error messages for better debugging

The error messages could be more specific to help users identify the exact issue with their input.

Apply this diff to improve error messages:

-            raise ValueError("Invalid formula") from exc
+            raise ValueError(f"Invalid formula '{composition}': {str(exc)}") from exc

56-70: Enhance function documentation

Consider adding more detailed docstrings including:

  • Return value ranges
  • Example usage
  • Edge cases

Example enhancement for get_metal_fraction:

def get_metal_fraction(composition: str | Composition) -> float:
    """Calculate the fraction of metallic elements in a composition.

    Args:
        composition: Chemical formula as string or pymatgen Composition

    Returns:
        float: Fraction between 0 and 1, where 1 indicates all elements are metals

    Example:
        >>> get_metal_fraction("Fe2O3")
        0.4  # 2/(2+3) as only Fe is metallic
    """

107-157: Make scoring parameters configurable

Consider extracting magic numbers into configurable parameters to allow for different scoring strategies:

  • The scale factor of 3.0 for pauling mismatch
  • The target VEC value of 8.0
  • The weights dictionary

Example implementation:

def intermetallic_score(
    composition: str | Composition,
    *,
    pauling_scale: float = 3.0,
    target_vec: float = 8.0,
    weights: dict[str, float] | None = None,
) -> float:
    """Calculate a score (0-1) indicating how intermetallic a composition is.

    Args:
        composition: Chemical formula or pymatgen Composition
        pauling_scale: Scale factor for pauling mismatch penalty (default: 3.0)
        target_vec: Target valence electron count (default: 8.0)
        weights: Custom weights for each component (default: None, uses standard weights)
    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4727c9a and 96b4f30.

📒 Files selected for processing (1)
  • smact/intermetallics.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (3)
smact/intermetallics.py (3)

1-11: Well-structured module setup!

The imports are properly organised, and the module's purpose is clearly documented.


38-54: Add error handling for edge cases

The function should handle empty compositions and potential division by zero errors.


97-98: Ensure compatibility with Python versions earlier than 3.10

The use of the strict parameter in the zip function is only available in Python 3.10 and above.

Copy link

codecov bot commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 95.45455% with 5 lines in your changes missing coverage. Please review.

Project coverage is 76.17%. Comparing base (a92e123) to head (ca030ef).
Report is 51 commits behind head on develop.

Files with missing lines Patch % Lines
smact/metallicity.py 93.10% 4 Missing ⚠️
smact/screening.py 87.50% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #367      +/-   ##
===========================================
+ Coverage    75.47%   76.17%   +0.70%     
===========================================
  Files           31       33       +2     
  Lines         2642     2749     +107     
===========================================
+ Hits          1994     2094     +100     
- Misses         648      655       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ryannduma ryannduma dismissed AntObi’s stale review January 17, 2025 19:31

fix has been made, dismissing to request fresh review

@ryannduma ryannduma requested a review from AntObi January 17, 2025 19:32
@ryannduma
Copy link
Collaborator Author

Hey Anthony, thanks for all the reviews and requested changes, hopefully resolved by the changes made today. Tests are passing which is great, only test not working is pre-commit because there's a new Pyright version out for the pre-commit hook. Other than that, I hope this is a step in the right direction for this chemical filter and I look forward to future discussions and contributions towards its enhancement. Have a lovely weekend and a great start at your new job next week! All the very best <3

@AntObi AntObi changed the base branch from master to develop January 19, 2025 15:47
@ryannduma
Copy link
Collaborator Author

Hey Anthony, latest commit just moved the Jupyter notebook example into the examples docs folder, there are no inherent code changes, though the tests are failing cause of something to do with mp_api being none which I think is a separate issue you're working on so don't worry.

@ryannduma ryannduma changed the title SMACT Intermetallics Enhancements SMACT Metallicity Handling Enhancements Jan 29, 2025
metallicity - allowing us to be more general with the features added
opening up the possibility of adding more features for enhanced metal detection in the future
Includes addition of tests, docs, and tutorials changes
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (8)
smact/metallicity.py (3)

38-53: Consider caching compositions for repeated calculations.
Repeatedly calling _ensure_composition converts the string input to a Composition object each time. If performance becomes a concern, consider memoising or reusing the converted object in long workflows.


72-76: Clarify usage of smact-specific metals list.
The smact.metals set might differ from other sources. Provide a clear explanation in your documentation to ensure users understand which elements are considered metals by SMACT.


142-157: Expose weighting as an optional parameter.
The weighting dict in metallicity_score is hard-coded, which limits user customisation. Consider exposing this as a function parameter or a configuration for better flexibility.

smact/tests/test_metallicity.py (1)

7-8: Be cautious mixing unittest and pytest.
Combining unittest.TestCase with pytest features can work, but might complicate debugging or test discovery. Ensure your CI pipeline consistently handles both frameworks.

docs/metallicity_readme.md (2)

201-201: Simplify the wording.

Replace "in conjunction with" with "alongside" or "with" for better readability.

🧰 Tools
🪛 LanguageTool

[style] ~201-~201: ‘in conjunction with’ might be wordy. Consider a shorter alternative.
Context: ...n The metallicity module has been used in conjunction with machine learning approaches to predict ...

(EN_WORDINESS_PREMIUM_IN_CONJUNCTION_WITH)


80-98: Add error handling examples.

The examples for metallicity_score should include cases demonstrating how the function handles invalid inputs (e.g., unknown elements, invalid formulas).

Add examples like:

# Error handling examples
try:
    print(metallicity_score("Xx2O3"))  # Invalid element
except ValueError as e:
    print(f"Error: {e}")

try:
    print(metallicity_score("Fe2O3]"))  # Invalid formula
except ValueError as e:
    print(f"Error: {e}")
smact/screening.py (2)

481-486: Consider reordering the validation checks for better performance.

The metallicity check, which involves more complex calculations, is performed before the simpler metal alloys check. Consider reordering these checks to perform the less computationally intensive checks first:

-    # Check for intermetallic compounds if enabled
-    if check_metallicity:
-        score = metallicity_score(composition)
-        if score >= metallicity_threshold:
-            return True
-
     # Check for simple metal alloys if enabled
     if include_alloys:
         is_metal_list = [elem_s in smact.metals for elem_s in elem_symbols]
         if all(is_metal_list):
             return True
+
+    # Check for intermetallic compounds if enabled
+    if check_metallicity:
+        score = metallicity_score(composition)
+        if score >= metallicity_threshold:
+            return True

464-466: Enhance the docstring for the new parameters.

The descriptions for check_metallicity and metallicity_threshold could be more detailed. Consider adding:

  • Example values for metallicity_threshold
  • Explanation of how these parameters interact with include_alloys
  • Reference to the metallicity scoring system documentation
-        check_metallicity (bool): If True, uses the metallicity scoring system to validate potential metallic/alloy compounds.
-        metallicity_threshold (float): Score threshold (0-1) above which a compound is considered a valid metallic/alloy.
-            Only used if check_metallicity is True.
+        check_metallicity (bool): If True, uses the metallicity scoring system to validate potential
+            metallic/alloy compounds. This provides a more nuanced approach compared to include_alloys.
+            See smact.metallicity module documentation for details on the scoring system.
+        metallicity_threshold (float): Score threshold (0-1) above which a compound is considered
+            a valid metallic/alloy. Only used if check_metallicity is True. Default is 0.7.
+            Example: 0.8 for stricter metallic character, 0.6 for more lenient validation.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6dd89a and 7dcf738.

📒 Files selected for processing (7)
  • docs/metallicity_readme.md (1 hunks)
  • docs/smact.metallicity.rst (1 hunks)
  • docs/smact.rst (1 hunks)
  • docs/tutorials.rst (1 hunks)
  • smact/metallicity.py (1 hunks)
  • smact/screening.py (4 hunks)
  • smact/tests/test_metallicity.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/smact.metallicity.rst
🧰 Additional context used
🪛 LanguageTool
docs/metallicity_readme.md

[style] ~201-~201: ‘in conjunction with’ might be wordy. Consider a shorter alternative.
Context: ...n The metallicity module has been used in conjunction with machine learning approaches to predict ...

(EN_WORDINESS_PREMIUM_IN_CONJUNCTION_WITH)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (9)
smact/metallicity.py (2)

107-141: Check for potential confusion between smact.Element and pymatgen elements.
By re-instantiating elements via Element(el.symbol), you rely on smact’s element definitions rather than the existing pymatgen ones. Confirm that any differences in property data or naming do not cause unintended behaviour.


13-35: Ensure Python version compatibility.
The usage of zip(..., strict=False) (lines 97–98) requires Python 3.10 or later. If support for earlier versions is desired, consider removing strict=False or implementing an alternative strategy.

smact/tests/test_metallicity.py (3)

24-32: Commendable range of metallic examples.
Your selection of intermetallic compounds (Fe3Al, Ni3Ti, Cu3Au, etc.) provides strong coverage for real-world metallic systems. This makes the tests more representative and beneficial.


42-66: Excellent use of boundary testing.
Verifying that the fraction is 1.0, fractional, or 0.0 for different compositions ensures comprehensive coverage of the calculation logic.


160-172: Robust edge case handling.
Testing empty strings and invalid formula inputs is a good practice. This secures your code against unexpected user inputs.

docs/tutorials.rst (1)

13-13: Great addition of the metallicity classification tutorial.
Including the new tutorial reference ensures that users can learn how to apply the metallicity scoring functions in practice.

docs/smact.rst (1)

32-32: LGTM! Documentation updated correctly.

The new smact.metallicity module has been properly added to the list of submodules in alphabetical order.

docs/metallicity_readme.md (1)

95-97: Review the scoring thresholds for non-metallic compounds.

The example scores for non-metallic compounds (NaCl: 0.63, Fe2O3: 0.64) seem quite high, considering they are ionic/covalent compounds. This might lead to false positives when using the default threshold of 0.7. Consider adjusting the scoring weights or providing guidance on appropriate thresholds for different compound types.

smact/screening.py (1)

441-442: Implementation aligns with previous feedback.

The separation of metallicity functionality from include_alloys using check_metallicity parameter aligns well with the previous review feedback about maintaining backwards compatibility.

Also applies to: 481-486

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
requirements.txt (1)

177-178: Consider making CUDA support optional.

The NVIDIA NCCL dependency is currently required for Linux systems. Consider making this an optional dependency to avoid unnecessary installations for users not requiring GPU acceleration.

pyproject.toml (2)

94-94: Consider adding version constraints for ML dependencies.

The new ML dependencies (xgboost, shap) should have version constraints to ensure reproducible builds and prevent compatibility issues.

-ml = ["xgboost","shap","smact[featurisers]"]
+ml = ["xgboost>=2.1.3,<3.0.0", "shap>=0.46.0,<0.47.0", "smact[featurisers]"]

108-108: Document the ML feature requirements.

The addition of smact[ml] to optional dependencies introduces new machine learning capabilities. Consider adding a note in the README or documentation about these optional features and their requirements.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7dcf738 and 7669102.

📒 Files selected for processing (2)
  • pyproject.toml (2 hunks)
  • requirements.txt (12 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (1)
requirements.txt (1)

330-331: Verify SHAP and its dependencies compatibility.

The addition of SHAP and its dependencies (slicer) introduces new machine learning capabilities. Please ensure these versions are compatible with the minimum Python version requirement (3.10) specified in pyproject.toml.

Also applies to: 338-339

✅ Verification successful

SHAP 0.46.0 and its dependencies are compatible with Python 3.10

Both SHAP 0.46.0 (requires Python >=3.9) and its dependency slicer 0.0.8 (requires Python >=3.6) are fully compatible with the project's minimum Python requirement of 3.10.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify package compatibility with Python 3.10
pip debug --verbose | grep -A 5 "Compatible tags"
pip index versions shap==0.46.0 | grep "Requires-Python"
pip index versions slicer==0.0.8 | grep "Requires-Python"

Length of output: 924


Script:

#!/bin/bash
# Fetch package metadata from PyPI
curl -s https://pypi.org/pypi/shap/0.46.0/json | jq -r '.info | {requires_python, requires_dist}'
curl -s https://pypi.org/pypi/slicer/0.0.8/json | jq -r '.info | {requires_python, requires_dist}'

Length of output: 2134

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
docs/examples/metallicity.ipynb (2)

355-384: Enhance the analyze_compound function with type hints and comprehensive docstring.

The function would benefit from better documentation and type safety.

-def analyze_compound(formula):
-    """Perform comprehensive analysis of a compound."""
+def analyze_compound(formula: str) -> None:
+    """Perform comprehensive analysis of a compound.
+    
+    Args:
+        formula: Chemical formula of the compound (e.g., "Fe3Al")
+        
+    Raises:
+        ValueError: If the formula is invalid
+    """

Also, consider handling potential NaN values more gracefully throughout the function.


7-11: Enhance the introduction with version and performance information.

Consider adding:

  1. Minimum version requirements for SMACT and its dependencies
  2. Performance characteristics for large-scale screening
  3. Memory usage guidelines for different compound sizes
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7669102 and a4980b4.

📒 Files selected for processing (2)
  • docs/examples.rst (1 hunks)
  • docs/examples/metallicity.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test (3.12, windows-latest)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.12, ubuntu-latest)
  • GitHub Check: test (3.11, windows-latest)
  • GitHub Check: test (3.11, macos-latest)
  • GitHub Check: test (3.11, ubuntu-latest)
  • GitHub Check: test (3.10, windows-latest)
  • GitHub Check: test (3.10, macos-latest)
  • GitHub Check: test (3.10, ubuntu-latest)
🔇 Additional comments (2)
docs/examples.rst (2)

23-23: Logical placement in the documentation structure.

The addition of the metallicity example to the toctree is well-structured and aligns with the PR's objectives of enhancing metallicity handling capabilities.


23-23: Verify the existence of the metallicity example file.

Please ensure that the referenced file examples/metallicity exists and contains appropriate documentation for the new metallicity features.

✅ Verification successful

Toctree entry correctly references the metallicity example file

The referenced file exists at docs/examples/metallicity.ipynb and contains comprehensive documentation for the new metallicity features.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence and content of the metallicity example file

# Check if the file exists
fd -t f "metallicity" docs/examples/

# If found, display its content
if [ $? -eq 0 ]; then
    echo "File found. Displaying content:"
    cat $(fd -t f "metallicity" docs/examples/)
else
    echo "Warning: Example file not found"
fi

Length of output: 17723

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/examples/metallicity.ipynb (2)

242-277: Enhance validation examples with diverse test cases.

The validation examples effectively demonstrate different paths but could be improved by:

  1. Adding edge cases (e.g., compounds with mixed metallic character)
  2. Including comments explaining why certain compounds pass/fail specific validation paths
  3. Demonstrating the impact of different metallicity thresholds
 from smact.screening import smact_validity

-# Test with different validation methods
-compound = "Fe3Al"
+# Test cases demonstrating different validation scenarios
+test_cases = [
+    "Fe3Al",  # Pure metallic - passes alloy and metallicity checks
+    "Fe2O3",  # Metal oxide - passes standard check
+    "Cu3Au",  # Ordered intermetallic - behaviour varies with threshold
+]

-# 1. Standard validation (no alloy/metallicity check)
-is_valid_standard = smact_validity(
-    compound, use_pauling_test=True, include_alloys=False, check_metallicity=False
-)
-print(f"Is {compound} valid without an alloy check? {is_valid_standard}")
+for compound in test_cases:
+    print(f"\nValidation results for {compound}:")
+    
+    # 1. Standard validation (no alloy/metallicity check)
+    is_valid_standard = smact_validity(
+        compound, use_pauling_test=True, include_alloys=False, check_metallicity=False
+    )
+    print(f"Standard validation: {is_valid_standard}")
+    
+    # 2. With alloy detection
+    is_valid_alloy = smact_validity(
+        compound, use_pauling_test=True, include_alloys=True, check_metallicity=False
+    )
+    print(f"Alloy validation: {is_valid_alloy}")
+    
+    # 3. Test different metallicity thresholds
+    for threshold in [0.5, 0.7, 0.9]:
+        is_valid_metallic = smact_validity(
+            compound,
+            use_pauling_test=True,
+            include_alloys=False,
+            check_metallicity=True,
+            metallicity_threshold=threshold,
+        )
+        print(f"Metallicity validation (threshold={threshold}): {is_valid_metallic}")

355-374: Enhance function documentation and type safety.

The analyze_compound function would benefit from:

  1. Type hints for parameters and return values
  2. A more detailed docstring following NumPy or Google style
  3. Return values instead of just printing results
-def analyze_compound(formula):
-    """Perform comprehensive analysis of a compound."""
+from typing import Dict, Union
+
+def analyze_compound(formula: str) -> Dict[str, Union[float, bool]]:
+    """Perform comprehensive analysis of a compound's metallic properties.
+    
+    Args:
+        formula: Chemical formula of the compound (e.g., "Fe3Al")
+        
+    Returns:
+        Dict containing analysis results with the following keys:
+            - metal_fraction: Fraction of metallic elements
+            - d_electron_fraction: Fraction of d-block elements
+            - distinct_metals: Count of unique metallic elements
+            - pauling_mismatch: Deviation from ideal electronegativity ordering
+            - metallicity_score: Overall metallic character score
+            - valid_standard: Result of standard validation
+            - valid_alloy: Result of alloy validation
+            - valid_metallic: Result of metallicity validation
+            
+    Raises:
+        ValueError: If the formula is invalid
+    """
     # Basic metrics
     metal_frac = get_metal_fraction(formula)
     d_frac = get_d_electron_fraction(formula)
     n_metals = get_distinct_metal_count(formula)
     pauling = get_pauling_test_mismatch(formula)
     score = metallicity_score(formula)

     # Validity checks
     valid_standard = smact_validity(
         formula, use_pauling_test=True, include_alloys=False, check_metallicity=False
     )
     valid_alloy = smact_validity(
         formula, use_pauling_test=True, include_alloys=True, check_metallicity=False
     )
     valid_metallic = smact_validity(
         formula, use_pauling_test=True, include_alloys=False, check_metallicity=True
     )

+    results = {
+        "metal_fraction": metal_frac,
+        "d_electron_fraction": d_frac,
+        "distinct_metals": n_metals,
+        "pauling_mismatch": pauling if not math.isnan(pauling) else None,
+        "metallicity_score": score,
+        "valid_standard": valid_standard,
+        "valid_alloy": valid_alloy,
+        "valid_metallic": valid_metallic
+    }
+
+    # Print results for backwards compatibility
     print(f"Analysis of {formula}:")
-    print(f"Metal fraction: {metal_frac:.2f}")
-    print(f"d-electron fraction: {d_frac:.2f}")
-    print(f"Distinct metals: {n_metals}")
-    print(f"Pauling mismatch: {'nan' if math.isnan(pauling) else f'{pauling:.2f'}")
-    print(f"Metallicity score: {score:.2f}")
-    print(f"Valid (standard): {valid_standard}")
-    print(f"Valid (alloy): {valid_alloy}")
-    print(f"Valid (metallic): {valid_metallic}")
+    for key, value in results.items():
+        if isinstance(value, float):
+            print(f"{key}: {value:.2f}")
+        else:
+            print(f"{key}: {value}")
+
+    return results
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a4980b4 and 94aec86.

📒 Files selected for processing (1)
  • docs/examples/metallicity.ipynb (1 hunks)
🔇 Additional comments (2)
docs/examples/metallicity.ipynb (2)

46-54: Add comprehensive error handling and input validation to example code.

The example code cells would benefit from demonstrating error handling for invalid chemical formulae and showing input validation for edge cases. Additionally, include examples with both string and Composition object inputs.

Here's an example enhancement for the metallicity_score demonstration:

 from smact.metallicity import metallicity_score

-# Metallic compounds - high scores
-print(f"The metallicity score of Fe3Al is {metallicity_score('Fe3Al'):.2f}")  # ~0.85
-print(f"The metallicity score of Ni3Ti is {metallicity_score('Ni3Ti'):.2f}")  # ~0.91
+# Example with string input and error handling
+try:
+    # Metallic compounds - high scores
+    print(f"The metallicity score of Fe3Al is {metallicity_score('Fe3Al'):.2f}")  # ~0.85
+    print(f"The metallicity score of Ni3Ti is {metallicity_score('Ni3Ti'):.2f}")  # ~0.91
+    
+    # Invalid formula
+    print(metallicity_score("NotACompound"))
+except ValueError as e:
+    print(f"Error: {e}")
 
-# Non-metallic compounds - low scores
-print(f"The metallicity score of NaCl is {metallicity_score('NaCl'):.2f}")  # 0.33
-print(f"The metallicity score of Fe2O3 is {metallicity_score('Fe2O3'):.2f}")  # 0.46
-print(f"The metallicity score of SiO2 is {metallicity_score('SiO2'):.2f}")  # 0.17
+# Example with Composition object
+from pymatgen.core import Composition
+
+# Non-metallic compounds - low scores
+compounds = ["NaCl", "Fe2O3", "SiO2"]
+for formula in compounds:
+    comp = Composition(formula)
+    print(f"The metallicity score of {comp.reduced_formula} is {metallicity_score(comp):.2f}")

Also applies to: 80-88, 117-128, 157-165, 197-208


351-351: Replace wildcard import with explicit imports.

Using from smact.metallicity import * is not recommended as it can lead to namespace pollution and make it unclear which functions are being used.

return get_element_fraction(composition, smact.metals)


def get_d_electron_fraction(composition: str | Composition) -> float:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest renaming this to get_d_block_element_fraction or something similar. electron in the function name would imply that we're trying to count the fraction of d electrons available in the compound over all the electrons which is quite different from the intended functionality

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha! I'll fix this in my next commit

from smact.properties import valence_electron_count


def _ensure_composition(composition: str | Composition) -> Composition:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this function is needed. Could you explain if it is?
I believe the Composition class is already robust enough with its error handling.

In each function call in the rest of module, you could probably have comp=Composition(composition) even if composition is an instance of the Composition class.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this might be a redundancy on my end, but I originally wrote this to ensure the composition was a pymatgen composition object even in contexts where they may enter a composition that's otherwise - ill do some testing on my end and possibly get rid of the function should this be the case and incorporate your suggested changes.

get_d_block_element_fraction
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
docs/examples/metallicity.ipynb (2)

46-54: 🛠️ Refactor suggestion

Add error handling and clarify input types.

The example would benefit from demonstrating error handling and being more explicit about input types.

 from smact.metallicity import get_metal_fraction
+from pymatgen.core import Composition
 
-# Pure metallic compound - returns 1.0
-print(f'The metallic fraction of Fe3Al is {get_metal_fraction("Fe3Al")}')  # Works with string formula (& Composition object)
+try:
+    # Example with string input
+    print(f'The metallic fraction of Fe3Al is {get_metal_fraction("Fe3Al")}')
 
-# Mixed compound - returns fraction of how much of the compound is metallic
-print(f'The metallic fraction of Fe2O3 {get_metal_fraction("Fe2O3")}')  # 0.4
+    # Example with Composition object
+    comp = Composition("Fe2O3")
+    print(f'The metallic fraction of {comp.reduced_formula} is {get_metal_fraction(comp)}')
+
+    # Example with invalid input
+    print(get_metal_fraction("NotACompound"))
+except ValueError as e:
+    print(f"Error: {e}")

351-351: 🛠️ Refactor suggestion

Replace wildcard import with explicit imports.

Using wildcard imports can lead to namespace pollution and make it unclear which functions are being used.

-from smact.metallicity import *
+from smact.metallicity import (
+    get_metal_fraction,
+    get_d_block_element_fraction,
+    get_distinct_metal_count,
+    get_pauling_test_mismatch,
+    metallicity_score
+)
🧹 Nitpick comments (8)
smact/metallicity.py (4)

56-61: Enhance the docstring with more details.

While the implementation is correct, the docstring could be more descriptive by including:

  • Input/output types and ranges
  • Example usage
  • Return value interpretation
 def get_metal_fraction(composition: str | Composition) -> float:
-    """Calculate the fraction of metallic elements in a composition.
-
-    Implemented using get_element_fraction helper with smact.metals set.
-    """
+    """Calculate the fraction of metallic elements in a composition.
+
+    Args:
+        composition: Chemical formula as string or pymatgen Composition
+
+    Returns:
+        float: Fraction between 0-1 representing the proportion of metallic elements
+
+    Example:
+        >>> get_metal_fraction("Fe3Al")  # Returns 1.0 (all elements are metals)
+        >>> get_metal_fraction("Fe2O3")  # Returns 0.4 (2/5 elements are metals)
+    """

97-98: Remove redundant strict=False parameter in zip.

The strict parameter is redundant here as False is the default value.

-        for i, (_el1, eneg1) in enumerate(zip(elements, electronegativities, strict=False)):
-            for _el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :], strict=False):
+        for i, (_el1, eneg1) in enumerate(zip(elements, electronegativities)):
+            for _el2, eneg2 in zip(elements[i + 1 :], electronegativities[i + 1 :]):

138-140: Document the rationale for the scale factor.

The scale factor of 3.0 for the Pauling mismatch penalty appears to be a magic number. Consider adding a comment explaining its significance.

-        scale = 3.0
+        # Scale factor of 3.0 chosen as typical Pauling electronegativity
+        # differences >3.0 are rare and indicate extreme ionic character
+        scale = 3.0

107-118: Enhance the docstring with comprehensive details.

The docstring should document the scoring components, weights, and interpretation of the score.

 def metallicity_score(composition: str | Composition) -> float:
-    """Calculate a score (0-1) indicating the degree of a compound's metallic/alloy nature.
-
-    1. Fraction of metallic elements
-    2. Number of distinct metals
-    3. d-block element fraction
-    4. Electronegativity difference (Pauling mismatch)
-    5. Valence electron count proximity to 8
-
-    Args:
-        composition: Chemical formula or pymatgen Composition
-    """
+    """Calculate a score (0-1) indicating the degree of a compound's metallic/alloy nature.
+
+    The score is a weighted sum of the following components:
+    1. Metal fraction (30%): Proportion of metallic elements
+    2. d-block element fraction (20%): Proportion of d-block elements
+    3. Distinct metals count (20%): Number of unique metals (capped at 3)
+    4. Valence electron count (15%): Proximity to 8 electrons
+    5. Pauling mismatch (15%): Electronegativity ordering deviation
+
+    Args:
+        composition: Chemical formula as string or pymatgen Composition
+
+    Returns:
+        float: Score between 0-1 where:
+            - Higher scores (>0.7) suggest metallic/alloy character
+            - Lower scores (<0.5) suggest non-metallic character
+
+    Example:
+        >>> metallicity_score("Fe3Al")  # Returns ~0.85 (metallic)
+        >>> metallicity_score("NaCl")  # Returns ~0.33 (ionic)
+    """
smact/tests/test_metallicity.py (4)

93-109: Add test case for mixed d-block/non-d-block compounds.

The test method covers pure d-block and non-d-block compounds but would benefit from testing mixed compounds.

 def test_get_d_block_element_fraction(self):
     """Test d-block element fraction calculation."""
     # Should be 1.0 for pure transition metal compounds
     self.assertAlmostEqual(
         get_d_block_element_fraction(Composition("Fe2Nb")),
         1.0,
         places=6,
         msg="Expected all d-block elements in Fe2Nb",
     )

     # Should be 0.0 for compounds with no d-block elements
     self.assertAlmostEqual(
         get_d_block_element_fraction(Composition("NaCl")),
         0.0,
         places=6,
         msg="Expected no d-block elements in NaCl",
     )
+
+    # Should be fractional for mixed compounds
+    self.assertAlmostEqual(
+        get_d_block_element_fraction(Composition("Fe3Al")),
+        0.75,
+        places=6,
+        msg="Expected 75% d-block elements in Fe3Al",
+    )

111-127: Add test case for complex multi-metal compounds.

The test method would benefit from testing compounds with more than two metals.

 def test_get_distinct_metal_count(self):
     """Test counting of distinct metals."""
     self.assertEqual(
         get_distinct_metal_count(Composition("Fe3Al")),
         2,
         msg="Expected 2 distinct metals in Fe3Al",
     )
     self.assertEqual(
         get_distinct_metal_count(Composition("NaCl")),
         1,
         msg="Expected 1 distinct metal in NaCl",
     )
     self.assertEqual(
         get_distinct_metal_count(Composition("SiO2")),
         0,
         msg="Expected no metals in SiO2",
     )
+    # Test complex multi-metal compound
+    self.assertEqual(
+        get_distinct_metal_count(Composition("NbTiAlCr")),
+        4,
+        msg="Expected 4 distinct metals in NbTiAlCr",
+    )

129-140: Add test case for compounds with missing electronegativity data.

The test method should verify that NaN is returned when electronegativity data is missing.

 def test_get_pauling_test_mismatch(self):
     """Test Pauling electronegativity mismatch calculation."""
     # Ionic compounds should have high mismatch
     nacl_mismatch = get_pauling_test_mismatch(Composition("NaCl"))

     # Metallic compounds should have lower mismatch
     fe3al_mismatch = get_pauling_test_mismatch(Composition("Fe3Al"))

     self.assertTrue(
         fe3al_mismatch < nacl_mismatch,
         msg=f"Expected lower Pauling mismatch for Fe3Al ({fe3al_mismatch:.2f}) than NaCl ({nacl_mismatch:.2f})",
     )
+
+    # Test compound with missing electronegativity data
+    import math
+    mismatch = get_pauling_test_mismatch(Composition("Og2O"))  # Oganesson has no known electronegativity
+    self.assertTrue(
+        math.isnan(mismatch),
+        msg="Expected NaN for compound with missing electronegativity data",
+    )

142-172: Add test cases for boundary conditions.

The test methods would benefit from testing boundary conditions of the metallicity score.

 def test_metallicity_score(self):
     """Test the overall metallicity scoring function."""
     # Known metallic compounds should score high
     for formula in self.metallic_compounds:
         score = metallicity_score(formula)
         self.assertTrue(
             score > 0.7,
             msg=f"Expected high metallicity score (>0.7) for {formula}, but got {score:.2f}",
         )

     # Non-metallic compounds should score low
     for formula in self.non_metallic_compounds:
         score = metallicity_score(formula)
         self.assertTrue(
             score < 0.5,
             msg=f"Expected low metallicity score (<0.5) for {formula}, but got {score:.2f}",
         )
+
+    # Test boundary cases
+    # Compound with maximum possible score
+    max_score = metallicity_score("Fe2Nb")  # All metrics should be optimal
+    self.assertAlmostEqual(
+        max_score,
+        1.0,
+        places=6,
+        msg="Expected maximum score for optimal compound",
+    )
+
+    # Compound with minimum possible score
+    min_score = metallicity_score("F2")  # All metrics should be minimal
+    self.assertAlmostEqual(
+        min_score,
+        0.0,
+        places=6,
+        msg="Expected minimum score for non-metallic compound",
+    )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b278f3 and ca030ef.

📒 Files selected for processing (3)
  • docs/examples/metallicity.ipynb (1 hunks)
  • smact/metallicity.py (1 hunks)
  • smact/tests/test_metallicity.py (1 hunks)
🔇 Additional comments (8)
smact/metallicity.py (5)

13-35: LGTM! The error handling is comprehensive.

The function provides consistent error handling and user-friendly error messages. The author's explanation about its use case in the past review comments justifies its existence.


38-53: Well-designed helper function that promotes code reuse.

The implementation is efficient and the docstring clearly explains the function's purpose and usage.


64-69: Enhance the docstring with more details.

The implementation is correct and the name has been updated as suggested. However, like get_metal_fraction, the docstring could be more descriptive.

 def get_d_block_element_fraction(composition: str | Composition) -> float:
-    """Calculate the fraction of d-block elements in a composition.
-
-    Implemented using get_element_fraction helper with smact.d_block set.
-    """
+    """Calculate the fraction of d-block elements in a composition.
+
+    Args:
+        composition: Chemical formula as string or pymatgen Composition
+
+    Returns:
+        float: Fraction between 0-1 representing the proportion of d-block elements
+
+    Example:
+        >>> get_d_block_element_fraction("Fe2Nb")  # Returns 1.0 (all d-block)
+        >>> get_d_block_element_fraction("Fe3Al")  # Returns 0.75 (Fe is d-block)
+    """

72-75: Enhance the docstring with more details.

The implementation is efficient but the docstring could be more descriptive.

 def get_distinct_metal_count(composition: str | Composition) -> int:
-    """Count the number of distinct metallic elements in a composition."""
+    """Count the number of distinct metallic elements in a composition.
+
+    Args:
+        composition: Chemical formula as string or pymatgen Composition
+
+    Returns:
+        int: Number of unique metallic elements in the composition
+
+    Example:
+        >>> get_distinct_metal_count("Fe3Al")  # Returns 2
+        >>> get_distinct_metal_count("NbTiAlCr")  # Returns 4
+        >>> get_distinct_metal_count("SiO2")  # Returns 0
+    """

78-87: Enhance the docstring with more details.

The docstring could be more descriptive by including input/output types and examples.

 def get_pauling_test_mismatch(composition: str | Composition) -> float:
-    """Calculate a score for how much the composition deviates from ideal
-    Pauling electronegativity ordering.
-
-    Higher mismatch => more difference (ionic, e.g. NaCl).
-    Lower mismatch => metal-metal bonds (e.g. Fe-Al).
-
-    Returns:
-        float: Mismatch score (0=perfect match, higher=more deviation, NaN=missing data)
-    """
+    """Calculate a score for how much the composition deviates from ideal
+    Pauling electronegativity ordering.
+
+    Higher mismatch indicates more ionic character (e.g., NaCl),
+    while lower mismatch suggests metal-metal bonds (e.g., Fe-Al).
+
+    Args:
+        composition: Chemical formula as string or pymatgen Composition
+
+    Returns:
+        float: Mismatch score where:
+            - 0.0 indicates perfect match (metallic-like bonding)
+            - Higher values indicate more ionic character
+            - NaN if any element lacks known electronegativity
+
+    Example:
+        >>> get_pauling_test_mismatch("Fe3Al")  # Returns ~0.22 (metallic)
+        >>> get_pauling_test_mismatch("NaCl")  # Returns ~2.23 (ionic)
+    """
smact/tests/test_metallicity.py (3)

24-40: LGTM! Well-organized test data setup.

The test compounds are well-chosen with clear comments explaining their significance.


42-66: LGTM! Comprehensive test coverage.

The test method covers all important cases with appropriate assertions and descriptive error messages.


68-91: LGTM! Tests cover edge cases effectively.

The test method effectively validates the function's behaviour with pure metallic, non-metallic, and mixed compounds.

@ryannduma ryannduma requested a review from AntObi January 30, 2025 16:19
Copy link
Collaborator

@AntObi AntObi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@AntObi AntObi merged commit 019a5cd into WMD-group:develop Jan 30, 2025
15 checks passed
@ryannduma ryannduma deleted the smact_intermetallics branch January 30, 2025 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement feature python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants