
[REVIEW]: AutoEIS: An automated tool for analysis of electrochemical impedance spectroscopy using evolutionary algorithms and Bayesian inference #6256

Open

editorialbot opened this issue Jan 22, 2024 · 77 comments

Labels: Python, review, Shell, TeX, Track: 2 (BCM) Biomedical Engineering, Biosciences, Chemistry, and Materials

@editorialbot
Collaborator

editorialbot commented Jan 22, 2024

Submitting author: @ma-sadeghi (Mohammad Amin Sadeghi)
Repository: https://github.com/AUTODIAL/AutoEIS/
Branch with paper.md (empty if default branch): joss
Version: v0.0.17
Editor: @lucydot
Reviewers: @dap-biospec, @DevT-0
Archive: Pending

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/a073ad3ce9719adc3cce19fb586414e4"><img src="https://joss.theoj.org/papers/a073ad3ce9719adc3cce19fb586414e4/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/a073ad3ce9719adc3cce19fb586414e4/status.svg)](https://joss.theoj.org/papers/a073ad3ce9719adc3cce19fb586414e4)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@dap-biospec & @DevT-0, your review will be checklist-based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all, you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @lucydot know.

Please start on your review when you are able, and be sure to complete it within the next six weeks at the very latest.

Checklists

📝 Checklist for @DevT-0

📝 Checklist for @dap-biospec

@editorialbot
Collaborator Author

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.88  T=0.14 s (284.9 files/s, 152802.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          19            578            661           1981
Markdown                         5             74              0            185
Bourne Shell                     2             20             12            151
YAML                             4             14              7            125
TeX                              1              0              0            119
Jupyter Notebook                 1              0          17642            115
TOML                             1              4              0             78
Dockerfile                       1             14              8             46
CSS                              1              7              1             29
DOS Batch                        1              8              1             26
reStructuredText                 4             18             26             15
make                             1              5              7             12
-------------------------------------------------------------------------------
SUM:                            41            742          18365           2882
-------------------------------------------------------------------------------


gitinspector failed to run statistical information for the repository

@editorialbot
Collaborator Author

Wordcount for paper.md is 989

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.21105/joss.02349 is OK
- 10.5281/zenodo.2535951 is OK
- 10.1016/j.electacta.2020.136864 is OK
- 10.1149/1.2044210 is OK
- 10.1149/1945-7111/aceab2 is OK
- 10.1109/TIM.2021.3113116 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@lucydot

lucydot commented Jan 29, 2024

@dap-biospec, @DevT-0 - a reminder prompt to start reviews; your first task is to generate a checklist (see instructions at top of this thread). Any questions, please ask -

Lucy

@Kevin-Mattheus-Moerman
Member

@dap-biospec, @DevT-0 👋

@DevT-0

DevT-0 commented Feb 14, 2024

I will submit my review by this Friday at the latest. Thanks for your patience.

@DevT-0

DevT-0 commented Feb 19, 2024

Auto EIS review final.pdf

Review of AUTOEIS routine for JOSS
This paper presents a comprehensive and highly anticipated analysis tool, AutoEIS, designed to enhance the efficiency of Electrochemical Impedance Spectroscopy (EIS) analysis when employed judiciously.

General Remarks on the Scientific Applicability of AutoEIS
It is imperative for users to recognise that the construction of specific equivalent circuit models for an electrochemical system must be informed and justified by the physicochemical properties of the system itself and be adequately supported by a suite of chemical and microscopic analyses. The frequent reliance on impedance analysis models, justified merely by their fit quality as indicated by high statistical correlation values, has regrettably contributed to a tarnished reputation for electrochemical impedance spectroscopy. For instance, the indiscriminate use of an excessive number of Constant Phase Elements (CPEs) with arbitrary phase values, transmission lines, and similar circuit components can enable the fitting of any data set without yielding scientifically meaningful insights. The code under review, particularly within the Models and main.py files, introduces a "capacitance filter" designed to exclude equivalent circuits devoid of ideal capacitors. A model comprising ideal capacitors that results in a less precise fit is preferable to one incorporating CPEs with arbitrary phase values. The impedance data from highly complex systems may lack the distinct geometric features necessary to identify a suitable and minimalist model. This code does not consider the physical system in any capacity; admittedly, doing so would pose a significant challenge. Nevertheless, it facilitates the fitting of complex impedance data, potentially promoting the continuation of practices that overlook rigorous and scientifically sound analysis.

Moreover, an essential aspect of impedance spectroscopic measurement involves meticulous setup. Unique to electrochemical measurements, the cell constitutes an integral component of the measurement system, with each element and accompanying chemical reactions and physical phenomena within the cell (e.g., adsorption, charge-transfer, mass-transfer or diffusion, field-line geometries, or even simple issues like poor contacts and cabling) having a substantial impact on the data. Consequently, the awareness and consideration of such potential influences are crucial for accurate analysis. Regrettably, impedance measurements yield scant chemical information to identify the origins of artefacts. Thus, assuring the scientific validity of the measurements is challenging. In the context of Kramers-Kronig transform (KKT) validation, ideally, data spanning a broad frequency range would facilitate the application of direct integration KKT validation checks. However, it is important to note that KKT validation may provide a sufficient, but not necessary, condition for data validation. The implementation of linear KKT in this code serves merely to eliminate non-conforming points rather than to verify data validity.

Comments on the Code's Programming
The routine demonstrates commendable adherence to software engineering best practices, encompassing modular design, extensive testing, and thorough documentation. Nevertheless, there are areas for improvement, such as error handling, test coverage, and potential enhancements through techniques like JAX's Just-In-Time (JIT) compilation.
The subsequent sections of this review provide detailed comments on specific sub-routines, highlighting their strengths, weaknesses, and potential improvements. While some observations may seem trivial or perhaps extraneous, others could offer valuable suggestions for future enhancements. The decision to incorporate these comments into revisions of the AutoEIS routine rests with the experienced editors and authors. Finally, I extend my apologies for the delay in reviewing this routine and compiling these comments, with the hope that they will contribute constructively to the refinement of this valuable tool.

0 - README.md
Strengths:

  1. Provides a clear introduction and overview of AutoEIS, including its purpose and target audience, which is good for user engagement and clarity.
  2. Includes badges for workflow status, enhancing visibility into the current state of the codebase's CI/CD pipelines.
  3. Directs users on how to contribute, encouraging community involvement and feedback.

Weaknesses:

  1. The mention of the API not being stable could deter potential users or contributors from adopting or contributing to the project. More confidence could be instilled by detailing the roadmap or expected stability timelines.

Suggestions:

  1. Consider adding a section on "Current Limitations and Future Work" to manage user expectations and encourage contributions in specific areas of need.

1 - __init__.py
Strengths:

  1. Modular Import Structure: The module effectively organises the package's components by importing core functionalities, input/output operations, metrics calculations, parsing capabilities, utilities, and visualisation tools. This modular approach facilitates easy navigation and usage of the package's diverse features.
  2. Centralised Version Control: Including version information directly within the initialisation file ensures that version checks can be easily performed, enhancing the management of dependencies and compatibility checks.

Weaknesses:

  1. Wildcard Imports: The use of from .core import * potentially introduces a risk of namespace pollution, where unnecessary or conflicting names might be imported, leading to less predictable code behaviour and potential clashes with other modules or user-defined variables.
  2. Noqa Flags: The extensive use of # noqa: F401, E402 comments to bypass flake8 linting rules might be indicative of underlying issues with import order or unused imports that could be addressed more systematically.

Improvements:

  1. Refine Import Strategy: To mitigate risks associated with wildcard imports, it's advisable to explicitly list imported entities, thereby enhancing code clarity and maintainability. This change would also aid in understanding the module's dependencies (see the sketch after this list).
  2. Address Linting Warnings: Instead of bypassing linting warnings, a review and potential refactor of the import statements and module structure could resolve underlying issues, leading to cleaner and more compliant code.
  3. Enhanced Documentation: While the module's purpose is relatively straightforward, additional comments or a module docstring explaining the rationale behind the organisation and the role of each imported component could provide valuable context to users and contributors.
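
To illustrate improvement 1, here is a minimal sketch of an explicit import strategy. The re-exported names are taken from functions discussed elsewhere in this review; the exact module layout is an assumption, not AutoEIS's verified structure:

```python
# autoeis/__init__.py -- sketch of explicit imports with a declared public API
# (module layout and re-exported names are illustrative assumptions)
from .core import generate_equivalent_circuits, perform_bayesian_inference
from .io import load_test_dataset

__version__ = "0.0.17"

# An explicit __all__ documents the public surface and avoids the namespace
# pollution risk of `from .core import *`
__all__ = [
    "generate_equivalent_circuits",
    "perform_bayesian_inference",
    "load_test_dataset",
    "__version__",
]
```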

2 - main.py
Strengths:

  1. Simplicity: The code is straightforward, making it easy to understand and maintain. It adheres to the common Python idiom of checking if __name__ == "__main__": to determine if the script is being run as the main program.
  2. Modular Design: Importing the autoeis_installer function from a separate CLI module promotes code modularity and reuse, allowing the CLI functionality to be easily expanded or modified without altering the entry script.

Weaknesses:

  1. Limited Functionality: The current implementation only invokes the autoeis_installer function with a hardcoded prog_name argument. This approach might limit the flexibility of the CLI, as users cannot pass additional arguments or commands directly through the command line when executing the script.
  2. Error Handling: No visible error handling is provided. Any errors or exceptions raised by autoeis_installer or during the import process will lead to an abrupt termination of the script, potentially leaving the user without clear guidance on the issue.

Improvements:

  1. Enhanced CLI Options: Extending the CLI functionality to accept command-line arguments and flags could provide users with more control over the execution of the package. Utilising libraries like argparse or click (already used in the .cli module) could facilitate this enhancement.
  2. Error Handling and Feedback: Implementing error handling and providing meaningful feedback to the user in case of exceptions could improve the user experience. This might include catching specific exceptions and displaying user-friendly error messages or suggestions for resolution (a minimal sketch follows this list).
  3. Logging and Debugging Support: Introducing logging statements and a debug mode could aid in troubleshooting and provide users with more insight into the execution flow and potential issues.
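
To illustrate improvement 2, a minimal sketch of the entry script with basic error handling, assuming the autoeis_installer import described above:

```python
# __main__.py -- sketch of the entry script with basic error handling
import sys

from .cli import autoeis_installer  # imported as in the current script

if __name__ == "__main__":
    try:
        autoeis_installer(prog_name="autoeis")
    except Exception as exc:
        # Report a readable message instead of an abrupt traceback
        print(f"autoeis failed: {exc}", file=sys.stderr)
        sys.exit(1)
```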

3 - .cli.py:
Strengths:

  1. User-Friendly Interface: The use of Click makes the CLI intuitive and easy to use, adhering to common command-line conventions.
  2. Flexibility in Installation: The --ec-path option allows users to specify a local copy of EquivalentCircuits, offering flexibility in managing dependencies.
  3. Clear Command Purpose: The command install is clearly named, indicating its purpose to install Julia dependencies, making it straightforward for users to understand what action will be taken.

Weaknesses:

  1. Error Handling: The script lacks explicit error handling for the installation process, which could leave users without clear guidance in case of failures or exceptions during the installation.
  2. Dependency on External Functions: The script depends on functions from .julia_helpers, which could introduce coupling and dependency issues if those functions are modified or removed.
  3. Limited Functionality: The script currently focuses solely on the installation process, potentially underutilising the capabilities of a CLI in enhancing user interaction with the package.

Improvements:

  1. Enhanced Error Handling: Implement robust error handling and user feedback for the installation process to guide users through resolving potential issues (see the sketch after this list).
  2. Decoupling and Modularity: Ensure that the CLI script and the Julia helpers module are loosely coupled, allowing independent updates and modifications without breaking functionality.
  3. Expand CLI Capabilities: Consider extending the CLI with additional commands and options to cover more functionalities of the AutoEIS package, such as running analyses, managing configurations, or providing help and documentation directly through the command line.
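
For improvements 1 and 2, a hedged sketch of how the install command could report failures through Click rather than raw tracebacks. The julia_helpers call and its signature are assumptions based on the modules reviewed below:

```python
# cli.py -- sketch of the install command with explicit error reporting
import click

from . import julia_helpers  # loosely coupled: only the module is imported


@click.command()
@click.option("--ec-path", default=None, help="Path to a local copy of EquivalentCircuits.jl")
def install(ec_path):
    """Install Julia dependencies for AutoEIS."""
    try:
        julia_helpers.install(ec_path=ec_path)  # hypothetical signature
    except FileNotFoundError as exc:
        raise click.ClickException(f"Julia executable not found: {exc}")
    except Exception as exc:
        raise click.ClickException(f"Installation failed: {exc}")
```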

4 - cli_pyjulia.py:
Strengths:

  1. Project-Specific Installations: The -p or --project option allows for targeted Julia dependency installations within specific Julia projects, enhancing modularity and project management.
  2. Control Over Precompilation: The options --precompile and --no-precompile provide users with control over the precompilation of Julia libraries, offering flexibility based on user preferences and system requirements.
  3. Consistent User Interface: Maintaining a consistent CLI design with autoeis_installer as the main entry point ensures a uniform user experience across different CLI components of the package.

Weaknesses:

  1. Complexity for New Users: The additional options, while powerful, might introduce complexity that could be overwhelming for new or less technical users.
  2. Potential Redundancy: There seems to be an overlap in functionality with .cli.py, which could confuse users about when to use each script.
  3. Documentation and Examples: The script could benefit from inline documentation or examples demonstrating the use of project-specific installations and precompilation options.

Improvements:

  1. Unified CLI Interface: Consider integrating the functionalities of cli_pyjulia.py into .cli.py to centralise CLI interactions and reduce redundancy.
  2. User Guidance: Enhance the script with more detailed help messages, examples, and error messages to guide users through using advanced options effectively.
  3. Interactive CLI Features: Introduce interactive CLI features that guide users through the installation process, making decisions based on user input and system checks to simplify the process for non-expert users.

5 - .core.py

Data Preprocessing (preprocess_impedance_data):

Strengths:

  1. Implements noise reduction and data normalisation, crucial for ensuring the quality and consistency of EIS data before analysis.
  2. Utilises a threshold parameter to flexibly adjust the level of noise reduction based on the dataset's characteristics.

Weaknesses:

  1. Lacks detailed documentation on how the preprocessing impacts different types of EIS data, potentially leaving users uncertain about the appropriateness of the default settings for their specific data.
  2. The fixed threshold value might not be optimal for all datasets, possibly leading to over- or under-filtering in certain cases.

Improvements:

  1. Enhance the documentation to include examples and guidelines on selecting the threshold value.
  2. Introduce adaptive noise reduction techniques that adjust based on the data's characteristics, improving preprocessing outcomes across diverse datasets.

ECM Generation (generate_equivalent_circuits):

Strengths:

  1. Facilitates the generation of ECMs using genetic programming, a robust method for exploring the vast space of possible circuit configurations.
  2. Offers parameters like complexity, population_size, and generations for fine-tuning the genetic programming process, providing users with control over the balance between search thoroughness and computational efficiency.

Weaknesses:

  1. The complexity of the genetic programming approach might be daunting for users unfamiliar with evolutionary algorithms, potentially hindering accessibility.
  2. Parameter tuning requires a good understanding of genetic programming, which might not be straightforward for all users, leading to suboptimal configurations.

Improvements:

  1. Develop a more user-friendly interface for ECM generation that abstracts away some of the complexities of genetic programming, possibly through higher-level presets or automated parameter tuning based on data characteristics.
  2. Include more comprehensive examples and tutorials that demonstrate effective parameter tuning strategies for different types of EIS data.

Bayesian Inference (perform_bayesian_inference):

Strengths:

  1. Employs Bayesian inference to estimate ECM parameters, providing not only point estimates but also uncertainty measures, which are invaluable for rigorous scientific analysis.
  2. Configurable through kwargs_mcmc, allowing users to adjust inference settings like num_warmup and num_samples to balance accuracy and computational load.

Weaknesses:

  1. The implementation assumes a certain level of familiarity with Bayesian statistics and MCMC sampling, which might not be the case for all users, potentially limiting the method's accessibility.
  2. The choice of priors and MCMC settings can significantly impact inference results, yet guidance on these aspects seems limited, which could lead to non-optimal usage.

Improvements:

  1. Provide more extensive documentation and educational materials on the Bayesian inference process within the context of EIS analysis, including how to choose priors and interpret the results.
  2. Implement heuristic methods or provide tools to assist users in selecting appropriate MCMC settings and priors based on their data, enhancing the approachability of Bayesian inference for a broader audience.

Plausibility Filtering (filter_implausible_circuits):

Strengths:

  1. Introduces a necessary step to eliminate physically implausible ECMs from consideration, ensuring that the resulting models are not only statistically but also scientifically valid.
  2. Automates the filtering process, significantly reducing the manual effort required to vet the generated ECMs.

Weaknesses:

  1. The criteria for plausibility might not be transparent or customisable, which could lead to the exclusion of viable models or inclusion of implausible ones based on the predefined thresholds.
  2. Lacks a mechanism to adjust the stringency of the filtering process, which might be necessary for applications with unique plausibility considerations.

Improvements:

  1. Enhance transparency by clearly documenting the plausibility criteria used and their scientific justification, providing users with a better understanding of the filtering process.
  2. Introduce configurable parameters that allow users to adjust the stringency of the plausibility filtering, accommodating a wider range of EIS applications and research needs.
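
Taken together, the four functions reviewed in this section form a pipeline. A hedged usage sketch follows; the function and parameter names are those quoted above, but the exact signatures, argument order, and module paths are assumptions rather than AutoEIS's verified API:

```python
# Sketch of the end-to-end core.py workflow (signatures are assumptions)
import autoeis as ae

freq, Z = ae.io.load_test_dataset()

# 1. Clean the raw spectrum; `threshold` controls the noise filtering
Z, freq = ae.core.preprocess_impedance_data(Z, freq, threshold=5e-4)

# 2. Propose candidate ECMs via genetic programming
circuits = ae.core.generate_equivalent_circuits(
    Z, freq, complexity=12, population_size=100, generations=30
)

# 3. Drop physically implausible candidates
circuits = ae.core.filter_implausible_circuits(circuits)

# 4. Quantify parameter uncertainty for the surviving models
results = ae.core.perform_bayesian_inference(
    circuits, Z, freq, kwargs_mcmc={"num_warmup": 500, "num_samples": 1000}
)
```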

6 - io.py Analysis:
Strengths:

  1. Modularity and Clarity: The module is well-organised with clear, dedicated functions for each type of data interaction, enhancing readability and maintainability.
  2. Ease of Use: Functions like load_test_dataset and load_test_circuits abstract away the complexities of data handling, making it easier for users to access test datasets and circuits for experimentation.
  3. Integration with Pandas and Numpy: Leveraging these libraries for data manipulation and numerical operations ensures efficient and familiar data handling practices for the Python community.

Weaknesses:

  1. Hardcoded Paths: The reliance on hardcoded paths (e.g., in get_assets_path) could reduce the flexibility and portability of the module, especially when the package structure changes or in different deployment environments.
  2. Error Handling: There's a lack of explicit error handling for potential issues like missing files, incorrect formats, or read errors, which could lead to uninformative exceptions for the end-users.
  3. String Evaluation in load_test_circuits: The use of eval to convert stringified lists to Python objects poses a security risk, especially if the function is ever adapted to handle user-provided data.

Improvements:

  1. Configurable Paths: Implement a configuration system or environment variables to allow dynamic specification of asset paths, improving flexibility and adaptability to different environments.
  2. Robust Error Handling: Introduce more comprehensive error handling and validation to gracefully manage and report issues with data files, enhancing user experience and debugging ease.
  3. Safe String Evaluation: Replace eval in load_test_circuits with a safer alternative like ast.literal_eval or custom parsing logic to mitigate potential security risks.
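
For point 3, the drop-in replacement is essentially a one-liner: ast.literal_eval parses only Python literals and raises on anything executable (the circuit string below is an illustrative placeholder, not necessarily the real asset format):

```python
import ast

# eval() would execute arbitrary code embedded in the asset file;
# literal_eval only accepts literals (lists, strings, numbers, ...)
circuits = ast.literal_eval("['R1-[P2,P3]', 'R1-C2']")
```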

Critical Observations:

  1. The function parse_ec_output demonstrates a specialised parsing routine tailored to the output format of EquivalentCircuits.jl. While this tight coupling is efficient, it may limit the module's flexibility with respect to changes in the output format of EquivalentCircuits.jl or the integration of other tools.
  2. The use of regular expressions in parse_ec_output is effective for the current expected format but might require adjustments if the output format becomes more complex or varied. This could introduce maintenance challenges, necessitating a more flexible parsing approach or better standardisation of output formats from EquivalentCircuits.jl.

7 - julia_helpers.py
Strengths:

  1. Integration with Julia: The module facilitates seamless integration with Julia, enabling the use of Julia's high-performance computational capabilities within a Python environment. This cross-language functionality is crucial for leveraging specialised Julia packages like EquivalentCircuits.jl.
  2. Simplification of Julia Installation: The install_julia function abstracts the complexity of setting up Julia, making it more accessible to users unfamiliar with Julia.

Weaknesses:

  1. Dependency Management: The module assumes that Julia and its packages are not already installed or managed in a different environment, which might not align with the user's existing setup.
  2. Error Handling: There is a lack of detailed error handling and user feedback during the installation and setup process. Errors during package installation or Julia setup could lead to uninformative error messages for the end-user.

Improvements:

  1. Enhanced Error Handling and Feedback: Implement more robust error handling mechanisms to provide clear and informative feedback to users during Julia installation and package setup. This could include checking for common issues and suggesting fixes.
  2. Flexible Julia Environment Management: Allow for more flexibility in managing Julia environments, such as detecting existing installations, working within virtual environments, or integrating with package managers like Conda.

8 - cli_pyjulia.py
Strengths:

  1. Flexibility: The CLI provides options to install dependencies in a specific Julia project, offering flexibility for users working with multiple Julia environments or projects.
  2. User Experience: The --quiet option to disable logging and options to control precompilation (--precompile and --no-precompile) enhance user experience by giving users control over the installation process.

Weaknesses:

  1. Error Handling: The script lacks explicit error handling for the installation process. If the installation fails (due to network issues, incorrect Julia setup, etc.), the user might not receive clear guidance on troubleshooting.
  2. Documentation: While the CLI options are described, there's no documentation on the expected outcomes, potential errors, or how the install function interacts with Julia environments. This lack of information might confuse users unfamiliar with Julia or the specifics of the package's requirements.

Improvements:

  1. Enhanced Error Handling: Implement error handling to catch and provide informative messages for common issues during the installation process, such as missing Julia installation, network problems, or permissions issues.
  2. Comprehensive Documentation: Expand the CLI documentation to include examples, common issues, and troubleshooting tips. Detailed docstrings for the install_cli function could also clarify its behaviour and requirements.
  3. Validation Checks: Introduce checks to validate the Julia environment before attempting installation, ensuring prerequisites are met and potentially guiding users through resolving common setup issues.
  4. Feedback Mechanism: Provide real-time feedback during the installation process, such as progress indicators or confirmation messages upon successful completion, to improve user engagement and confidence in the process.

10 - Metrics Module:

Metrics.py

Strengths:

  1. Simplicity and Clarity: The functions are straightforward and easy to understand. Each function has a clear purpose, aligning with standard practices in model evaluation metrics.
  2. Documentation: The docstrings provide a clear description of what each function does, the expected input parameters, and the return values. This is beneficial for users and developers who may refer to this code for understanding or integration purposes.
  3. Handling of Complex Numbers: A notable feature is the handling of complex numbers in the input arrays. This capability is particularly useful in fields like signal processing or electrical engineering, where complex numbers are commonplace.
  4. General Applicability: These functions can be used in a wide range of applications, from simple regression problems to more complex analyses involving complex-valued data. The code does not make assumptions about the specific use case, enhancing its utility across different domains.
  5. Efficiency: The use of NumPy operations ensures that the computations are efficient and can handle large arrays effectively, leveraging NumPy's optimised C and Fortran code.

Weaknesses:

  1. Division by Zero in MAPE: The mape_score function could potentially result in a division by zero if any of the y_true values are zero. This is a common issue with MAPE, and the function lacks a safeguard against this scenario.
  2. Generalisation of R2 Score: The R2 score calculation does not account for the potential issues that can arise with complex numbers, such as the interpretation of the sum of squares due to the absolute value operation. The R2 score's traditional interpretation may not directly translate to complex-valued data.
  3. Error Handling and Validation: There is no explicit input validation or error handling. For example, the functions do not check if the input arrays y_true and y_pred have the same shape, which is a prerequisite for these calculations.
  4. Handling of Edge Cases: None of the functions address potential edge cases or numerical stability issues explicitly. For instance, in the r2_score function, if the total sum of squares (sst) is very close to zero, the division could lead to numerical instability or a misleading R2 score.
  5. Limited Flexibility: The functions are designed with a specific formula and do not offer flexibility in terms of adjusting parameters or accommodating variations of these metrics that might be useful in specific contexts (e.g., weighted versions of these scores).

Improvements:

  1. Enhance Documentation: Adding detailed docstrings for each function, explaining their parameters, return values, and potential errors, would significantly improve usability.
  2. Implement Error Handling: Including error handling mechanisms to catch and manage potential exceptions gracefully would enhance the robustness of the module.
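
To make weaknesses 1, 3, and 4 concrete, here is a sketch of guarded versions of the two metrics. The formulas follow the standard definitions; treating complex inputs via magnitudes is one possible choice, not necessarily the authors':

```python
import numpy as np

def mape_score(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """MAPE with shape validation and a guard against division by zero."""
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    denom = np.abs(y_true)
    mask = denom > np.finfo(float).eps  # skip points where y_true ~ 0
    return 100 * np.mean(np.abs(y_true[mask] - y_pred[mask]) / denom[mask])

def r2_score(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """R2 on magnitudes, guarding against a near-zero total sum of squares."""
    ssr = np.sum(np.abs(y_true - y_pred) ** 2)
    sst = np.sum(np.abs(y_true - np.mean(y_true)) ** 2)
    if sst < np.finfo(float).eps:
        return float("nan")  # R2 is undefined for (near-)constant y_true
    return 1 - ssr / sst
```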

_generate_ecm_parallel_julia

Strengths:

  1. Parallel Processing: Utilises Julia's multiprocessing capabilities, aiming to leverage parallel processing for efficiency gains in generating candidate circuits.
  2. Reproducibility: Attempts to set a random seed for reproducibility, which is crucial in scientific computations to ensure that results can be consistently replicated.

Weaknesses:

  1. Multiprocessing with Random Seeds: The comment # FIXME: This doesn't work when multiprocessing, use @Everywhere instead indicates an unresolved issue with setting random seeds in a multiprocessing environment. This could lead to inconsistencies in the results across different runs.
  2. Error Handling: No error handling is shown for the multiprocessing tasks. In a parallel processing environment, handling errors gracefully is crucial to ensure stability and reliability.
  3. JAX Utilisation: While JAX is mentioned in the context of JIT compilation, this sub-routine does not demonstrate its use. Leveraging JAX's capabilities could potentially enhance performance further.

Improvements:

  1. Resolve Multiprocessing Issue: Address the noted FIXME by implementing the suggested use of @everywhere to correctly initialise random seeds in each Julia process. This will ensure reproducibility across multiprocessing tasks.
  2. Integrate JAX for Performance: Consider integrating JAX for numerical computations within this function. JAX's JIT compilation can significantly speed up array operations, which are likely a core part of generating and evaluating circuits.
  3. Enhance Error Handling: Implement comprehensive error handling to manage potential failures in parallel tasks, ensuring the function can recover or gracefully exit upon encountering issues.
  4. Documentation: Expand the function's documentation to include more details about its parameters, expected behaviour, and any side effects, especially concerning multiprocessing.
  5. JAX's JIT Compilation for Efficiency: JAX's JIT compilation transforms functions to be compiled by XLA (Accelerated Linear Algebra), which can dramatically speed up execution, especially for operations on large arrays or complex numerical computations typical in circuit analysis.

(Following on from point 5 above) Application to _generate_ecm_parallel_julia:

  1. Array Operations: If the generation and evaluation of circuits involve heavy array manipulations, JIT compiling these parts with JAX could reduce computation times.
  2. Batch Processing: JAX excels at vectorised operations. Rewriting parts of the circuit generation process to utilise batch operations could yield significant performance gains.
  3. Hybrid Approach: While the core parallel processing leverages Julia, computational bottlenecks within each process that involve intensive numerical operations could be targeted with JAX's JIT compilation for optimisation.
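
As an illustration of points 1 and 2, a small sketch of jax.jit plus jax.vmap evaluating a batch of candidate parameter sets; the circuit function here is a toy stand-in, not AutoEIS code:

```python
import jax
import jax.numpy as jnp

@jax.jit
def rc_impedance(params, freq):
    """Impedance of a series R0 plus parallel (R1 || C1) cell -- a toy
    stand-in for a generated circuit function."""
    r0, r1, c1 = params
    omega = 2 * jnp.pi * freq
    return r0 + r1 / (1 + 1j * omega * r1 * c1)

# vmap evaluates a whole batch of candidate parameter sets in one call
freq = jnp.logspace(-2, 5, 60)
param_batch = jnp.abs(jax.random.normal(jax.random.PRNGKey(0), (1000, 3)))
Z_batch = jax.vmap(rc_impedance, in_axes=(0, None))(param_batch, freq)
```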

11 - Models.py
Strengths:

  1. Use of Probabilistic Programming: The code leverages Numpyro for Bayesian inference, which is a powerful approach for estimating the uncertainty in model parameters. This is particularly useful in complex systems like electronic circuits where there is inherent noise and uncertainty.
  2. Vectorisation with JAX: By importing JAX and using jax.numpy for operations, the code is positioned to take advantage of JAX's auto-differentiation and its ability to compile and optimise calculations for speed, particularly on GPU or TPU hardware.
  3. Modularity through Functions: The separation of the Bayesian model into two functions (circuit_regression and circuit_regression_wrapped) enhances code reusability and readability. It allows for flexibility in specifying different circuit models or functions without altering the core Bayesian inference logic.
  4. Explicit Priors Definition: The functions require a dictionary of prior distributions as an argument, allowing for explicit and flexible specification of priors for each model parameter. This is a key aspect of Bayesian analysis, providing a clear way to incorporate prior knowledge into the model.
  5. Complex Data Handling: The functions are designed to handle complex impedance data (Z: np.ndarray[complex]), which is common in electronic circuit analysis. This shows that the code is tailored for domain-specific applications, potentially making it a valuable tool for electrical engineers and researchers.

Weaknesses:

  1. Performance Considerations: The line p = jnp.array([numpyro.sample(k, v) for k, v in priors.items()]) involves a Python list comprehension and a subsequent conversion to a JAX array. This pattern can be sub-optimal for performance, as it does not fully leverage JAX's JIT (Just-In-Time) compilation capabilities, which work best with pure JAX operations.
  2. Error Handling and Validation: The code lacks error handling and input validation. For instance, it does not check if the provided circuit string in circuit_regression corresponds to a valid circuit function or if the shapes of Z and freq arrays are compatible.
  3. Hardcoded Distribution Parameters: The observation model's noise parameters (sigma_real and sigma_imag) are sampled from an Exponential distribution with a hardcoded rate of 1.0. This may not be appropriate for all use cases, and the flexibility to specify these as part of the model's inputs could enhance the code's applicability.
  4. Documentation and Comments: While there are docstrings providing a high-level overview, the code could benefit from more detailed comments, especially explaining the Bayesian inference steps and the rationale behind certain design choices (e.g., the use of separate sigma_real and sigma_imag for modelling noise in the real and imaginary parts of Z).
  5. Dependency on External Utility Function: The function circuit_regression depends on utils.generate_circuit_fn(circuit), whose behaviour and implementation are not shown. This external dependency could affect the code's robustness and portability if not properly managed or documented.

Improvements:

  1. User Documentation: Enhancing the documentation for these functions, including examples and explanations of Bayesian concepts, could make this part of the codebase more accessible.
  2. Error Handling: Implementing robust error handling and validation of inputs to the Bayesian models to prevent runtime errors and provide informative error messages.
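
For reference, a hedged sketch of the model with the noise rate exposed as an argument (addressing weakness 3). It keeps the reviewed list-comprehension pattern for clarity, and circuit_fn stands in for the result of utils.generate_circuit_fn:

```python
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist

def circuit_regression(freq, Z, circuit_fn, priors, noise_rate=1.0):
    """Sketch of the Bayesian model with a configurable noise rate.
    circuit_fn maps (params, freq) -> complex impedance; priors is a dict
    of numpyro distributions, as in the reviewed code."""
    p = jnp.array([numpyro.sample(k, v) for k, v in priors.items()])
    Z_pred = circuit_fn(p, freq)
    # noise_rate is now an input instead of the hardcoded 1.0
    sigma_real = numpyro.sample("sigma_real", dist.Exponential(noise_rate))
    sigma_imag = numpyro.sample("sigma_imag", dist.Exponential(noise_rate))
    numpyro.sample("obs_real", dist.Normal(Z_pred.real, sigma_real), obs=Z.real)
    numpyro.sample("obs_imag", dist.Normal(Z_pred.imag, sigma_imag), obs=Z.imag)
```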

12 - Parser Module
Strengths:

  1. Comprehensive Validation: The functions provide thorough validation of circuit strings and parameters, ensuring the integrity of ECM representations.
  2. Modularity: The modular design of parsing functions allows for easy extension and reuse throughout the codebase.

Weaknesses:

  1. RegEx Dependency: Heavy reliance on regular expressions for parsing might make the code difficult to maintain or extend, especially for complex circuit structures.
  2. Error Messaging: While validation checks are in place, the error messages might not always provide clear guidance on how to correct invalid inputs.

Improvements:

  1. Parser Flexibility: Enhancing the parser to handle a wider variety of circuit formats and potentially simplifying the parsing logic to reduce maintenance complexity.
  2. Improved Error Handling: Developing more descriptive error messages and guidance for common parsing errors to improve user experience.

13 - Utility Functions
Strengths:

  1. Utility Diversity: A wide range of utility functions supports different aspects of the package, enhancing code reusability and modularity.
  2. Support Functions: Functions like version assertions ensure compatibility and stability within the Julia and Python ecosystem used by AutoEIS.

Weaknesses:

  1. Utility Overload: The large number of utility functions might overwhelm new contributors or users, potentially obscuring the core functionalities of the package.
  2. Documentation: Some utility functions might benefit from more detailed docstrings that explain their purpose, parameters, and return values in greater detail.

Improvements:

  1. Utility Documentation: Expand the documentation within the utility module to provide clear, concise descriptions and usage examples for each function.
  2. Utility Consolidation: Evaluate the utility functions for opportunities to consolidate or refactor, reducing complexity and enhancing maintainability.

14 - Versioning
Strengths:

  1. Clear Versioning: Explicit versioning supports package stability and compatibility, especially when integrating with external Julia packages.

Weaknesses:

  1. Hardcoded Versions: Hardcoded versions might require manual updates, which could be overlooked, leading to potential compatibility issues.

Improvements:

  1. Dynamic Version Management: Implementing a more dynamic approach to version management, potentially using a versioning tool or script, could streamline updates and ensure consistency across dependencies.
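
One common approach, sketched below, is to declare the version once in the packaging metadata and read it at runtime via importlib.metadata (standard library since Python 3.8):

```python
# __init__.py -- derive the version from package metadata instead of hardcoding
from importlib.metadata import PackageNotFoundError, version

try:
    __version__ = version("autoeis")
except PackageNotFoundError:
    # Package is not installed (e.g., running from a source checkout)
    __version__ = "unknown"
```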

15 - Testing and Validation
Strengths:

  1. Comprehensive Testing: The extensive testing suite covering a wide range of functionalities ensures the robustness and reliability of the package.
  2. Use of Pytest: Leveraging pytest for organising tests enhances the readability and maintainability of test cases.

Weaknesses:

  1. Test Coverage: While extensive, the tests may not cover all edge cases or error conditions, which could lead to potential undetected bugs.
  2. Mocking External Dependencies: The tests appear to rely on actual data and package functionalities, which might benefit from mocking to isolate and test specific components more effectively.

Improvements:

  1. Increase Test Coverage: Expanding the test suite to cover more edge cases, especially for error handling and failure modes, would further enhance the robustness.
  2. Mock External Dependencies: Implementing mocks for external dependencies and data could make tests more reliable and faster, isolating the functionality being tested.
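
A sketch of what improvement 2 could look like with pytest's monkeypatch fixture and Click's test runner. The install command is described above, while the julia_helpers.install attribute and its signature are assumptions for illustration:

```python
# test_cli.py -- sketch of mocking the Julia bridge so CLI tests need no Julia
# (the julia_helpers.install attribute and CLI import path are assumptions)
from click.testing import CliRunner

import autoeis.julia_helpers as julia_helpers
from autoeis.cli import install

def test_install_uses_julia_helpers(monkeypatch):
    calls = []
    # Replace the real installer with a recorder so no Julia is required
    monkeypatch.setattr(julia_helpers, "install", lambda **kwargs: calls.append(kwargs))

    result = CliRunner().invoke(install, ["--ec-path", "/tmp/EquivalentCircuits.jl"])

    assert result.exit_code == 0
    assert calls, "julia_helpers.install was never called"
```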

@lucydot

lucydot commented Feb 20, 2024

Hi @DevT-0 -

Wow, this certainly gets the award for the longest GitHub issue comment I have seen :) You have really gone through each part with a fine-tooth comb - there is a lot of useful-looking feedback.

We need to be clear on what your acceptance blockers (if any) are; we need to understand whether this submission meets the minimum JOSS standard. @DevT-0 - could you generate the checklist (see instructions at top of thread) and tick off which criteria are met and which you feel are not met? Where not met, please indicate what needs to be done to improve.

For more information on each of the checklist criteria you can see the JOSS documentation (https://joss.readthedocs.io/en/latest/reviewer_guidelines.html). It may also be useful to look through another JOSS review as our process is quite different from other journals (for example here is one I am currently editing which is near completion - #6264 (comment)).

@ma-sadeghi

Hi @lucydot, just checking on the status of this review. Thanks for all your efforts!

@lucydot

lucydot commented Feb 28, 2024

Hi @ma-sadeghi I've just started a direct conversation with @DevT-0 outside of this thread and I will update here once I have a reply.

@dap-biospec - what is the status of your review? Are you able to start in the next week?

@DevT-0

DevT-0 commented Feb 28, 2024

Review checklist for @DevT-0

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at https://github.com/AUTODIAL/AutoEIS/?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@ma-sadeghi) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@DevT-0

DevT-0 commented Feb 28, 2024

In the Statement of Need, I disagree with the authors when they state (or at least with my understanding of this statement):
“…However, these tools require the user to feed in an ECM, evaluate its fit to the data, and repeat this process until a satisfactory model is found. This process can be time-consuming and subjective, especially for complex EIS data. AutoEIS addresses this challenge by automating the ECM construction and evaluation process, providing a more objective and efficient approach to EIS data analysis…”
I am not sure what the authors mean by making the process more objective, or by casting the current approach as subjective or more of a dark art. As pointed out in my opening comment on page 1, the interpretation of an impedance spectrum requires that the model accounts for the chemistry and physics of the system, as well as an awareness of the sources of error in the data. Therefore, the analysis has to be guided by a priori knowledge of the physico-chemical composition of the system, and it cannot be the other way round. One of the limitations of impedance spectroscopy is its limited chemical information content; in fact, this "fit and justify" approach is the reason users often end up making wrong inferences from the data. Additionally, this non-scientific approach is being bolstered by the availability of easy-to-use interfaces in many other branches and methods of analysis employing multivariate fitting, for example full-profile crystal structure refinement or the fitting of XPS peaks. Each parameter in such complex routines affects not only the geometric features of the modelled curve but also the associated physics or chemistry. In the case of electrochemical impedance, devised circuit elements like the constant phase element or distributed elements are mathematically very flexible and become a cause for concern about over-fitting when the impedance spectra lack sufficiently sharp geometric features. Ideally, we would construct a representative model of coupled differential equations proxying the different charge-transport processes and then solve them to obtain exact analytical solutions. However, this is much more challenging, leaving equivalent circuit modelling, where circuit elements act as lumped terms, as the primary mode of analysis, and placing the onus on ensuring that the chosen or identified equivalent circuit model is a minimally sufficient depiction of the most important physical or chemical aspects of the system. This has become even more important in the current period of intensive discovery of new battery, fuel cell, and other (photo)electrocatalytic applications, when most users are working with systems that do not comprise well-defined crystallite facets but rather nanocrystalline, porous, composite, and compound materials with undefined surface reconstruction and composition. Further, the inhomogeneous nature of such electrode surfaces (in composition and/or morphology) leads to distributed time constants, and while there exists an assortment of distributed elements, they have all been designed for very particular purposes and make physical sense only for certain values of their parameters.
This is my primary concern, as the code under review does not provide a sufficiently robust set of checks to ensure AutoEIS's suitability for general use. The checks included for removing non-physical models, and the linear KKT, are minimal and insufficient. Additionally, the suitability of different models is ranked on the basis of a single-valued statistical correlation score. This is too simplistic an approach, especially as the data is typically scanned on a log scale. Rather, as argued above, one should target a minimally sufficient model and a good quality of fit by analysing the difference curve across the spectrum, or at the very least across the time-constant windows of the critical processes. Further, electrochemical processes in the emerging applications of current interest are complex and multiscale; often the instrumentation or set-up is too limited to capture high-quality data across a wide enough frequency window, while keeping the sample surface constant, for the inflexion points to develop that would allow confident parameter estimates even for the simplest circuit elements, say in the case of high capacitance with slow mass transport. Lastly, the provided test sets are too limited to demonstrate the versatility of AutoEIS in performing reliable and robust automated analysis of data from complex systems, with unique, minimal, and physically reasonable solutions.
So, while this is a valid and substantial piece of syntactically correct computer programming, and while the suggestions for improving performance when sifting through and processing many different permutations can be set aside to begin with, and while a version of AutoEIS has been published in a peer-reviewed journal (DOI: 10.1149/1945-7111/aceab2), my hesitation to endorse this submission in its as-is state is rooted in the scientific fundamentals of electrochemical impedance spectroscopy, which are challenging for me to ignore.

@DevT-0

DevT-0 commented Feb 28, 2024

I think this will be all from me. I have tried to evaluate this submission as a scientific communication. For the rest, I believe I have given detailed point-wise comments above on which a decision can be made.

@dap-biospec

dap-biospec commented Mar 1, 2024

Review checklist for @dap-biospec

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at https://github.com/AUTODIAL/AutoEIS/?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@ma-sadeghi) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@dap-biospec

@ma-sadeghi @lucydot So far I am unable to verify functionality due to the OSError posted.

@lucydot

lucydot commented Mar 6, 2024

Hi @ma-sadeghi:

  • you can see @dap-biospec comment above re: install with Win64 Python

  • I have read through the previous comment from @DevT-0. They raise an issue around the approach of using data-driven rather than physico-chemical rules for model construction. This seems to be a comment on the fundamental method underlying your code, and not one which can be addressed in this review process. However I can see that the more data-driven approach is adopted widely in the community, and so I do not see as an acceptance blocker to publication. However, if not already done, I suggest you make clear in your documentation the possible pitfalls of data-driven approaches (such as overfitting).

@DevT-0 I can see there are a number of tick boxes which are unticked under Documentation and Software Paper. What needs to be done (at a minimum) so that these can be ticked? There is a "Statement of Need" and "State of the field" (existing related software projects) listed in the paper, so I think these can be ticked off? You also indicate that there is good coverage of testing (it does not need to be complete), so I infer from this that the "automated tests" criterion is also met?

@ma-sadeghi

Hi @lucydot, thanks for the follow-up. I'm a bit bandwidth-strapped this week; I'll start addressing the comments next Tuesday.

@DevT-0

DevT-0 commented Mar 6, 2024

> [quoting @lucydot's comment of Mar 6 above in full]

Hi Lucy, please accept my apologies for any inconvenience and delay my review may have caused. I thought I had followed a thorough review process, during which I compiled 13 pages of detailed and, for the most part, point-wise comments that sufficiently justified ticking off most of the checklist boxes as well as leaving some unticked. But I realise now the stringent compliance an acceptable review must meet for JOSS. At the same time, given the considerable effort and time I have already spent on this review, I am running out of capacity to contribute further. As such, I would understand if my review cannot be used. I appreciate the opportunity to participate in this unique one-time experience. Thanks and best regards, Dev

@lucydot

lucydot commented Mar 7, 2024

Hi @DevT-0 -

JOSS takes a different approach to reviewing than traditional journals: papers are accepted once the reviewers/editors are happy that each item in the checklist has been met. Your contributions are gratefully received - I should be able to take what you have written and infer (alongside my own checks) whether most of the remaining checkboxes can be ticked off.

The key aspect we still need to review (and which is beyond my role as editor) is functionality, which I believe @dap-biospec has been working on, given the issue raised around installation.

Thanks @ma-sadeghi for your update re: availability, and thanks @DevT-0 for your review contributions - we would like to list you as a reviewer in the published paper, as you have already committed much time and expertise to the process, unless you have any objection. However, I won't ping you with any more review-related requests!

Lucy

@lucydot

lucydot commented Mar 7, 2024

Hello all, a quick message to say I will be out of office from tomorrow, and will check back here on the 18th.

@lucydot

lucydot commented Mar 25, 2024

Hi @ma-sadeghi, @dap-biospec - any updates on response to review / review?

@lucydot

lucydot commented Oct 2, 2024

@dap-biospec - are you able to continue with the review?

@dap-biospec

@ma-sadeghi It appears that you attempted to resolve the issues by eliminating the introductory section from the repository altogether. This is not a solution to the bugs in the code. Please include a basic step-by-step tutorial that can be clearly found and followed from the landing page of the repository. It may be prudent to include a verifiable demo section for each claim that you make in the manuscript. Please include all the required steps in Markdown page(s), leaving linked Jupyter notebooks only for illustrations that cannot be included in the Markdown files. As @lucydot rightfully pointed out, please ask an uninitiated colleague to follow the written tutorial without any additional support and verify that they obtain all the results that you claim and expect to see.

@ma-sadeghi

@dap-biospec Thanks for the feedback. Here’s how we’ve addressed your points:

The perform_full_analysis function was originally meant as a convenience helper, but it turned out to add unnecessary maintenance complexity and made debugging harder by masking individual steps. We decided to remove it, but this doesn’t affect AutoEIS’s core functionality.

The "basic usage" section was removed from the README to keep the landing page clean and focused. We’ve linked to the full documentation, which covers everything in detail:


Is there a reason you’d prefer it on the landing page?

On Jupyter notebooks vs. Markdown: Good point. Notebooks can be slow to render. To handle that, we convert them to HTML in the documentation, so they’re easy to view. Users can still grab the actual notebooks from the repo if they want to run them. We avoid Markdown for examples because it’s harder to automate testing for it, and Jupyter notebooks are tested on every push.
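To illustrate what that automated check amounts to, here is a minimal sketch of executing a notebook programmatically so that any failing cell raises an error (this uses nbformat/nbclient and is not necessarily the project's actual test harness):

    # Minimal sketch: execute a notebook programmatically; a failing cell
    # raises CellExecutionError. Not necessarily the project's actual CI code.
    import nbformat
    from nbclient import NotebookClient

    nb = nbformat.read("examples/detailed_workflow.ipynb", as_version=4)
    NotebookClient(nb, timeout=3600).execute()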

Per your and @lucydot's suggestion, we had a colleague independently install and use AutoEIS, and they were able to go through the examples.

Hope these address your concerns.

@lucydot

lucydot commented Oct 10, 2024

@ma-sadeghi - thank you for addressing the points raised; it makes it clear for our reviewers what has changed.

@dap-biospec - it appears that @ma-sadeghi and team have addressed a number of your concerns. Are you able to confirm functionality? I think this is the priority for the review to progress. Once we are happy with core functionality, I can support in reviewing the other aspects (e.g. documentation, JOSS paper, general checks).

@dap-biospec

@ma-sadeghi I attempted to reproduce the first three examples and encountered a variety of errors and inconsistencies in each of "Circuit models 101", "circuit generation", and "detailed workflow". This starts with the lack of arguments in the ae.visualization.set_plot_style() example and continues with runtime errors and unexpected keywords.

It does not appear that the authors performed a literal execution of their tutorial on a clean system. Errors may be due to package dependencies or environment configuration.

It also appears that the authors exported the Jupyter notebook as HTML without correcting the action steps or providing adequate instructions on the expected behavior.

@lucydot at this time functionality is not verifiable.

@dap-biospec

@ma-sadeghi please also note that your "visit examples" link bypasses the installation steps.

@ma-sadeghi

ma-sadeghi commented Nov 3, 2024

@dap-biospec Thank you for taking the time to review.


> @ma-sadeghi I attempted to reproduce the first three examples and encountered various errors and inconsistencies in "Circuit models 101," "circuit generation," and "detailed workflow." Issues include missing arguments for ae.visualization.set_plot_style() and further runtime errors and unexpected keywords.

All of our example notebooks are automatically tested on GitHub virtual machines. These tests are designed to catch any cell that fails to run. The tests are twofold: one is executed on every pull request, while another runs on every push to the repository to update the documentation. All recent tests have passed. However, to double-check, I cloned a fresh repository, created a fresh virtual environment, and successfully ran the "detailed_workflow.ipynb" notebook (which covers the core AutoEIS functionality). I've recorded the process, and you can view it here.

(Regarding set_plot_style, the function should work with default parameters, but you can refer to the API documentation to inspect the optional arguments.)
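For instance, the default invocation is just this one-liner (a minimal sketch):

    import autoeis as ae

    # Defaults should work out of the box; optional styling arguments
    # are listed in the API documentation.
    ae.visualization.set_plot_style()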

> It does not appear that the authors performed a literal execution of their tutorial on a clean system. Errors may be due to package dependency or environment configuration.

As explained above, the examples are tested and run on GitHub virtual machines as part of our CI workflow.

> It also appears that the authors exported the Jupyter notebook as HTML without addressing action steps or providing adequate instructions on the expected behavior.

The snippet below shows the Action responsible for publishing the documentation. This workflow is executed on every push to the repository:

      - name: Build the documentation
        run: |
          uv run jupyter nbconvert \
            --ExecutePreprocessor.timeout=3600 \
            --to notebook --execute --inplace examples/*.ipynb
          cd doc
          uv run make html

Based on this snippet, prior to generating the HTML files, all example notebooks are re-run and their results updated. If any cell fails, the published HTML reflects the errors (I double-checked our online documentation, and there was no error). Additionally, as noted, there's a separate Action (notebooks.yml) that runs for each pull request. This setup ensures that any changes causing failures are caught and cannot be merged.

> @ma-sadeghi please also note that your "visit examples" link bypasses installation steps.

The "Installation" section in README is right before the "Usage" section. The assumption was that prior to diving into examples, users have already gone through the installation step.


If you're experiencing specific issues with your local setup, I’d be glad to help you troubleshoot. Feel free to share details about your environment, dependencies, or any error messages, and I can assist in pinpointing the source of the problem.

@dap-biospec

@ma-sadeghi Thank you for confirming that the HTML content was literally translated from Jupyter notebooks. This likely explains why the HTML instructions do not work, for the same reason that printing dynamic HTML content to PDF does not yield working pages.

Please review the previous request to verify that literal execution of your published HTML instructions yields the expected demonstrations. "Literal" means that a reader who has absolutely no prior knowledge of your package can follow your instructions from this page https://github.com/AUTODIAL/AutoEIS/ verbatim (i.e. copy-paste) and arrive at the same results. Please also note that code validation by script, or auto-generation of HTML from notebooks, is not sufficient to make the "as published" tutorial adequate. Hence the earlier request to test the instructions with an uninitiated human.

@lucydot

lucydot commented Nov 4, 2024

@dap-biospec thank you for your continued efforts testing the code, and @ma-sadeghi for the clear account of the testing and verification setup.

Testing the code via GitHub Actions is a widely used approach, as it should reflect what occurs when the software is freshly installed. It is peculiar that the GH Action runs without error, yet a human following the same procedure gets errors.

Do you think this could be a difference in Python and/or Julia versions, @ma-sadeghi? I can see the GH workflows tests.yml and notebooks.yml use Julia v1.10 and Python v3.10.

@dap-biospec - could you let us know which Python version you are using? Are you running in a clean Python environment (for example, with a virtual environment or conda)? This might help us understand if there are version conflicts.
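For reference, a quick way to report both versions from within Python is the sketch below; it assumes the juliacall bridge that AutoEIS uses to call Julia is installed:

    # Sketch: print the Python and Julia versions in the current environment.
    # Assumes juliacall (the Python-Julia bridge AutoEIS relies on) is installed.
    import sys
    from juliacall import Main as jl

    print("Python:", sys.version.split()[0])
    print("Julia:", jl.seval("string(VERSION)"))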

@dap-biospec

@ma-sadeghi is making a small but consequential omission in the online tutorial section. Code that runs in a Jupyter notebook may not run, or may not produce the same output, outside of it. Hence, a simple HTML export does not work as online documentation. Furthermore, execution of the code by copying and pasting into a plain Python prompt does not yield the described results, as outlined above.

As a publication, this work should be self-sufficient and accessible to any reader, including non-programming chemists.

@ma-sadeghi

@dap-biospec Just to clarify, are you asking whether copying the code cells from the example Jupyter notebooks (.ipynb) and pasting them into a standalone Python script (.py) will produce the same results as running the code directly within the Jupyter notebook?

The reason I ask is that if the user simply downloads the example notebooks (in .ipynb format) and runs them cell by cell, they will get the same results. This is what I demonstrated in the screencast I uploaded to YouTube.

@dap-biospec

@ma-sadeghi I am reviewing the paper/repository at face value. I am following your instructions literally (i.e. HTML pages exported from Jupyter notebooks): copying code snippets and executing them in a Python shell. I am not doing anything that the instructions do not call for.

The fact that line-by-line execution of your examples in plain Python yields exceptions and missing-argument prompts is a separate issue, which indicates that there are unreported or unrecognized differences in the environment, dependencies, etc.

@ma-sadeghi

@dap-biospec I think I got it. Just to recap, so you can correct me if I missed anything: you're asking us to verify whether a user can reproduce the results shown in the example notebooks by simply running the code cells in a Python REPL. I'll work on this over the weekend, and if I run into any issues, I'll sort them out. If all goes well, I'll make a screencast and share it next week.

@lucydot

lucydot commented Nov 11, 2024

@ma-sadeghi - this sounds like a good plan. What you summarise is also what I think @dap-biospec is suggesting.

@lucydot

lucydot commented Nov 22, 2024

Hi @ma-sadeghi - have you got a timescale for completing the above? Thanks - Lucy.

@ma-sadeghi

Hi @lucydot, thanks for your patience. I did successfully reproduce it in a fresh Python REPL. You can find the screencast here.

@dap-biospec

@ma-sadeghi Here are screenshots of the REPL executions. As the screencast is not part of the manuscript, I am going only by the actual posted steps.

First, "Circuits 101" leads to a prompt:
[screenshot omitted]
In the screenshots below, this step was bypassed.

While you mention lcapy and LaTeX in "Circuits 101", this should be clearly spelled out in the installation section, similarly to Julia, if you are relying on them to a similar extent. Alternatively, you can have separate and clearly marked pages with or without visualization.

"Circuit generation":
[screenshot omitted]

"Detailed workflow":
[screenshot omitted]

Please let me know when this is addressed in the code or in the tutorial, and I will repeat the tests.

@lucydot

lucydot commented Dec 7, 2024

Hi @dap-biospec and @ma-sadeghi -

I've had a run through the installation and tutorials myself. In general I try to avoid doing this as editor, but given the considerable time that has passed, and the effort already invested, I think it is sensible to take a more direct approach so that we can wrap this up.

Installation

I created an empty conda environment, installed pip, and then followed the instructions given: pip install -U autoeis. It proceeded for me without a hiccup.

> While you mention lcapy and LaTeX in "Circuits 101", this should be clearly spelled out in the installation section, similarly to Julia, if you are relying on them to a similar extent

I agree with this suggestion from @dap-biospec. To me it makes sense to keep all installation steps on the installation page, @ma-sadeghi. It could be included as an optional sub-section for those who want to visualise the circuit (making clear that for most AutoEIS functionality these additional steps, which are a little more involved than a pip install, are not needed).

Jupyter vs REPL

I think there is an assumption that the tutorials are being run through a Jupyter Notebook - for example, ae.visualization.plot_impedance_combo(freq, Z) does not save (or otherwise display) a plot when following the tutorials in the REPL. I think this assumption is fair for many (perhaps even most) users.

@ma-sadeghi if you are working on the assumption that tutorials are completed within a Jupyter Notebook, you could state this at the top of each tutorial page. To support those who want to use the REPL, you could point to the nice screencast you have developed and demonstrate how to save a figure using the plt.savefig function; I know the vast majority of researchers will know this, but it may help students and those just starting out with Python/matplotlib.
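A minimal sketch of that REPL workflow might look like the following (the toy impedance data here is illustrative, not taken from the tutorial):

    # Sketch for REPL users: save the figure explicitly instead of relying on
    # inline display. The toy impedance data below is illustrative only.
    import numpy as np
    import matplotlib.pyplot as plt
    import autoeis as ae

    freq = np.logspace(-1, 5, 50)                          # frequencies in Hz
    Z = 10 + 1 / (1 / 100 + 1j * 2 * np.pi * freq * 1e-6)  # toy R + (R||C) impedance

    ae.visualization.plot_impedance_combo(freq, Z)
    plt.savefig("impedance_combo.png", dpi=300)  # or plt.show() in a GUI session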

Functionality

@dap-biospec I could not reproduce the errors you had in "Circuit generation" and "Detailed workflow". This difference seems unusual to me; perhaps it is a bug which has been fixed since the review started. I suggest re-installing autoeis (ideally within a virtual environment / conda), if you have not already done so, and re-running these tutorials.

@dap-biospec

@ma-sadeghi I updated autoeis to 0.0.34.
ae.visualization.set_plot_style() still produces a user prompt.
If bypassed, circuit generation via ae.core.generate_equivalent_circuits(freq, Z, **kwargs) completes successfully. I will continue testing further, but the prompt must be addressed in the tutorial or the software.

@dap-biospec

@lucydot For a repository which is used solely by the authors and close colleagues, any assumptions are warranted. A peer-reviewed publication imposes additional requirements on spelling out assumptions, abbreviations, prerequisites, and several other established standards, without which a repository does not rise to the level of a publication. While I can easily guess that ae.visualization.plot_impedance_combo(freq, Z) and similar calls may not have the same output in the REPL as in Jupyter, (a) we cannot exclude electrochemists without an extensive programming background from the audience, and (b) the example code must proceed through all steps without errors.

Furthermore, I have substantial concerns about rolling code changes. The release says 0.0.34, the list says 0.0.36, and the paper submitted for review still says v0.0.17, if I am not mistaken. In my opinion, the submitted paper must be locked to the release under review, with optional information on obtaining the latest version. Otherwise, the accepted paper can point to materially different code, however well-intended the changes were.

@lucydot

lucydot commented Dec 19, 2024

> A peer-reviewed publication imposes additional requirements on spelling out assumptions, abbreviations, prerequisites, and several other established standards, without which a repository does not rise to the level of a publication.

I agree, hence the suggestion to state the assumption of Jupyter Notebook use at the top of the tutorial. I think a case might be made for asking authors to give instructions for both the REPL and Jupyter. @ma-sadeghi - what are your thoughts: are you able to update the tutorials so that they run as written for both the REPL and Jupyter?

> In my opinion, the submitted paper must be locked to the release under review, with optional information on obtaining the latest version.

It is very common for code to be updated and new versions released during the JOSS review process. In fact, this is encouraged; it means the review process is working (the code is being improved through review). When a code passes review, we ask the authors to make a final release, and the version reported in the paper is updated accordingly.

> I will continue testing further

Excellent 👍

@lucydot

lucydot commented Dec 19, 2024

FYI @dap-biospec @ma-sadeghi I am going to take two weeks leave, and will be back on the 6th of Jan.

@lucydot

lucydot commented Jan 28, 2025

@ma-sadeghi - what are your thoughts on the comments above: are you able to update the tutorials so that they run as written for both the REPL and Jupyter?

@dap-biospec - has your testing raised any further issues? If the Jupyter vs REPL issue is resolved, are you happy that the stated functionality is demonstrated?

@ma-sadeghi

ma-sadeghi commented Feb 6, 2025

Hi @lucydot, I apologize for the delay. I agree; here's what we've done to address @dap-biospec's comments:

  • There was a subtle bug that broke REPL workflows. The bug is now fixed. If you're interested in the details: the reason the bug didn't surface in my screencast was that I was using an IPython REPL, not Python's native REPL.
  • That said, because of the interactive nature of typical AutoEIS workflows, we still strongly encourage users to use a proper interactive environment, e.g., IPython Notebook, Jupyter Lab, VS Code, etc. So, we added a prominent warning to both the README and the examples' landing page to communicate this to users.

Hope this addresses the issue.

README

[screenshot omitted]

Examples' landing page

[screenshot omitted]

@lucydot

lucydot commented Feb 12, 2025

@dap-biospec - has your testing raised any further issues? Please see the message above from @ma-sadeghi; they have made it clear that the examples are expected to run in a Jupyter Notebook (with the assumption that basic adaptations to the examples as written would enable use in a Python REPL or script).

@dap-biospec

@ma-sadeghi While the previous errors were cleared, additional errors have been encountered, and the examples are still not reproducible.

  1. "Example notebooks" pages are still a plain HTML conversion of the content of Jupyter notebooks, with no notices that each notebook must be copy/pasted into particular environment or the included URL of notebook to download and run. Thus, HTML examples are misleading.
  2. Example notebooks are not renderable on Github depository and no instructions are provided to a broad scientific community on handling them, whether on the ReadMe page or examples section.
  3. Example calls or code snippets that require graphical environment are still not clearly marked as such. An umbrella warning, provided in the installation section, is not sufficient to identify environment prerequisites. The authors must know environment dependence of each call and should not offload it to the reader.
  4. ae.visualization.draw_circuit(circuit) causes runtime error for unresolved dependency form pdflatex, which is not automatically resolved or addressed in manual installation instructions. Thus, the pattern of implicit dependency expectations, raised several times before, continues. This occurs in both Python command line and Jupyter environments.
  5. In cases where code executes, reported values and circuits are different from those provided in the example notebooks, with no explanation provided for the discrepancy. (ae.core.generate_equivalent_circuits(freq, Z, **kwargs) and ae.core.filter_implausible_circuits(circuits_unfiltered) ) Thus, claimed results are not reproducible.
  6. Runtime errors continue at several steps in the example code, ending up with multiple errors an lock-up at ae.core.perform_bayesian_inference(circuits, freq, Z) step:
    Refining Initial Guess 0% ━━━━━━━━ 0/3 [ 0:00:23 < -:--:-- , ? it/s ]
    [15:57:03] WARNING Removing Julia environment directory: .julia\environments\pyjuliapkg (julia_helpers.py:222)
    Traceback (most recent call last):
      File "D:\Python\Python312\Lib\site-packages\autoeis\julia_helpers.py", line 203, in ensure_julia_deps_ready
        ensure_julia_deps_ready(quiet)
      File "D:\Python\Python312\Lib\site-packages\autoeis\julia_helpers.py", line 189, in ensure_julia_deps_ready
        Main = init_julia(quiet=quiet)
      File "D:\Python\Python312\Lib\site-packages\autoeis\julia_helpers.py", line 70, in init_julia
        from juliacall import Main
      File "D:\Python\Python312\Lib\site-packages\juliacall\__init__.py", line 251, in <module>
        init()
      File "D:\Python\Python312\Lib\site-packages\juliacall\__init__.py", line 227, in init
        res = jl_eval(script.encode('utf8'))
    OSError: [WinError 541541187] Windows Error 0x20474343
    [errors truncated]

Further testing was halted as irrelevant.

@dap-biospec

@lucydot After over a year of effort to bring the manuscript in line with expectations for a peer-reviewed research article, I arrive at the conclusion that the authors are either unable or unwilling to make a niche GitHub repository accessible to a broad research community beyond its current users. Furthermore, I do not believe that the content of this manuscript makes a substantial new contribution beyond what was already published in this article and, thus, does not warrant a separate publication. Based on the lack of novel contribution, the lack of reproducibility, and the poor documentation, I recommend rejection of this manuscript.

@lucydot

lucydot commented Feb 25, 2025

Thank you @dap-biospec for your time reviewing this submission for JOSS.

@lucydot

lucydot commented Feb 27, 2025

A note here to say that I'm in discussion with the authors and editors re: the best way to proceed.
I will be at a conference until the 10th March, so may be slower to reply than usual until I'm back in the office.

@ma-sadeghi

Here's our response to the reviewer's comments:

  1. "Example notebooks" pages are still a plain HTML conversion of the content of Jupyter notebooks, with no notices that each notebook must be copy/pasted into particular environment or the included URL of notebook to download and run. Thus, HTML examples are misleading.

We've explained this a few times before, and it's not clear to us why "on-the-fly executed notebooks, which are then exported to HTML" is not good enough, or how that is misleading.

> 2. Example notebooks are not renderable in the GitHub repository, and no instructions are provided to the broad scientific community on handling them, whether on the README page or in the examples section.

We've stripped the cell outputs from the git-committed notebooks to keep the repository size minimal. All examples are rendered as HTML in our documentation website.

> 3. Example calls or code snippets that require a graphical environment are still not clearly marked as such. An umbrella warning, provided in the installation section, is not sufficient to identify environment prerequisites. The authors must know the environment dependence of each call and should not offload it to the reader.

We've added clear instructions as callouts in both the README and the examples page of our documentation. That should be sufficient.

> 4. ae.visualization.draw_circuit(circuit) causes a runtime error for an unresolved dependency on pdflatex, which is not automatically resolved or addressed in the manual installation instructions. Thus, the pattern of implicit dependency expectations, raised several times before, continues. This occurs in both the Python command line and Jupyter environments.

This has already been mentioned in our "Circuit Models 101" example notebook. From the notebook: "We can visualize the circuit model using the draw_circuit function, which requires the lcapy package to be installed, and a working LaTeX installation. See here for more details."
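To make the prerequisite explicit at the point of use, a guarded call could look like this sketch (the circuit string is a hypothetical example, not from the docs):

    # Sketch: guard draw_circuit against a missing pdflatex (needs lcapy + LaTeX).
    # The circuit string below is a hypothetical example.
    import shutil
    import autoeis as ae

    circuit = "R1-[P2,R3]"
    if shutil.which("pdflatex") is None:
        print("pdflatex not found; install a LaTeX distribution to draw circuits")
    else:
        ae.visualization.draw_circuit(circuit)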

> 5. In cases where code executes, the reported values and circuits are different from those provided in the example notebooks, with no explanation for the discrepancy (ae.core.generate_equivalent_circuits(freq, Z, **kwargs) and ae.core.filter_implausible_circuits(circuits_unfiltered)). Thus, the claimed results are not reproducible.

This is expected. Equivalent circuits are generated using an evolutionary algorithm, which is a stochastic process, so you'd expect a different pool of circuits every time. (Of course, you'd expect the algorithm to converge to a few good candidates if you generate a large pool of circuits via the iters keyword.)
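As a sketch of what steering the stochastic search might look like (the load_test_dataset helper is an assumption based on the docs; check the API reference for exact names):

    # Sketch: generate a larger pool so the stochastic search converges better.
    # "iters" is the keyword mentioned above; other options are in the API docs.
    import autoeis as ae

    freq, Z = ae.io.load_test_dataset()  # assumed example-data loader
    circuits_unfiltered = ae.core.generate_equivalent_circuits(freq, Z, iters=100)
    circuits = ae.core.filter_implausible_circuits(circuits_unfiltered)
    # Individual runs still differ, but good candidates should recur across runs.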

> 6. Runtime errors continue at several steps in the example code, ending up with multiple errors and a lock-up at the ae.core.perform_bayesian_inference(circuits, freq, Z) step: [traceback quoted above]

Our automated unit/integration/notebook tests are all passing as of today. I've also triple-checked on a brand-new Windows machine today. @lucydot We'd appreciate it if you could verify.
