
🐛 Bug: Test command failing? #1487

Closed
GemmaTuron opened this issue Jan 3, 2025 · 19 comments
Labels: bug (Something isn't working)

@GemmaTuron
Member

Describe the bug.

@Abellegese please check when you can; I do not know whether I have something set up wrong in my system or whether there is a version problem.

I am trying the ersilia test command, but I get an error when testing eos3b5e (molweight). I haven't had time to take a detailed look, but here is the log. The same happened with another model I used as an example, eos5guo. I have the latest ersilia code pulled locally and the [test] extension installed. Working on Ubuntu.

eos3b5e_test.txt

Describe the steps to reproduce the behavior

No response

Operating environment

Ubuntu 22.04 LTS

@Abellegese
Contributor

Abellegese commented Jan 3, 2025

I will add more detail soon, but for now I found out it was a bentoml issue on Python 3.10; on Python 3.8 it works perfectly. The error happened because bentoml produces an empty result when generating the info file for the model's bundle tag. Even simply typing bentoml list in the terminal produces no output on Python 3.10, whereas on Python 3.8 it lists all bundle tags found locally.
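A quick way to see the symptom, assuming two conda environments with the respective Python versions (the environment names below are only placeholders):

conda activate ersilia-py38   # placeholder name for a Python 3.8 environment
bentoml list                  # lists the bundle tags stored locally

conda activate ersilia-py310  # placeholder name for a Python 3.10 environment
bentoml list                  # prints nothing, so ersilia ends up with an empty info file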

@Abellegese
Contributor

Abellegese commented Jan 3, 2025

@DhanshreeA has more knowledge on this, but we will tackle it together. I remember we were debugging the same issue three weeks to a month ago. For now, to work around it, reinstall the latest ersilia on Python 3.8:

conda activate ersilia
conda install python=3.8
pip install -e .

@GemmaTuron
Member Author

GemmaTuron commented Jan 4, 2025

Hi @Abellegese, thanks. The Python 3.8 downgrade did the trick for eos3b5e but not for eos5guo, and even then both models eventually fail to run, which worries me.

A few questions:

  • Why is the printed log saying "ERROR" all the time instead of "INFO" when there does not seem to be any error?
    eos3b5e_log_error.txt
  • From the documentation it is not clear what the --inspect flag does. Can you please explain the difference between running ersilia test eos3b5e -d eos3b5e --remote --inspect and ersilia test eos3b5e -d eos3b5e --remote? I see the "inspect" table, but I do not understand why we need a flag for that; couldn't we just make do with test and test deep? In any case, this is not well documented. eos3b5e_remote_inspect.txt
  • In the command ersilia test eos3b5e -d eos3b5e --remote I see a Test Run Summary. Does this command actually test the model or not? From the documentation it shouldn't, but apparently it does? Or does it just give False for "single input run without error" because it has actually not run the command?
    eos3b5e_remote.txt
  • When I try to add the --level deep flag it crashes; I do not know whether it is because of the model or the test. The model is giving an error in the inspect, but it should not, it is just the molecular weight!
  • I have tried to check another model, but it does not even fetch. Please have a look:
    eos5guo_remote_inspect.txt
  • Is there a flag to save the output of the test directly to a file so I don't have to copy it each time? If so, it is not documented. If we don't have this functionality, we should add it.

@Abellegese
Contributor

Abellegese commented Jan 4, 2025

Hi @GemmaTuron, that was a very useful look into the test command. I did some study of the errors, and here are the answers (some with fixes) below:

Q1) There was a bug there and it is fixed now. It was logger.error instead of logger.info, nice catch.
Q2) We consolidated the inspect command and the test command, and added the inspect checks as an extension (keeping checks that the test command does not have). They are listed below:

  • Model repository existence (URL checks)
  • Complete metadata, i.e. (a small illustrative sketch of this check follows at the end of this answer):
    - Determine metadata URL:
    Uses either the JSON metadata file or the YAML metadata file based on the packaging method.
    - Check URL existence:
    Verifies that the appropriate metadata file is accessible at the constructed URL.
    - Fetch metadata:
    Tries to parse the file contents as JSON (or YAML), returning an error if it cannot.
    - Find missing fields:
    Checks for required fields (Publication, Source Code, S3, DockerHub) and notes any that are missing.
    - Check URL validity:
    For each of the required fields in the metadata, checks if the associated URL is valid.
    - Report findings:
    Collects missing fields and invalid URLs in a details list for reporting in the result.
  • Complete Folder Structure: Checks that all necessary files exist in the model repository.
  • Dockerfile checks: Checks whether all the dependencies can be installed with the specified Python version without any version conflicts. This applies to bentoml model types; for the ersilia-pack type, the yml file is checked instead.
  • Computational Performance Tracking: Measures the time spent running the model for 1, 10, and 100 inputs.
  • Extra Files Check: Checks whether unnecessary extra files exist in the model repo.

As you can see, the original design decision behind the inspect command was to check model repos after they are published, and we somehow made it also work locally. I don't know how you managed to see the Inspect Summary table without adding the --inspect flag, since it is not displaying here. Maybe making the inspect checks run by default is a good idea instead of requiring the flag every time, since most of the checks are necessary by default.
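For illustration, here is a minimal sketch of the metadata completeness and URL validity checks described above. It is not the actual ersilia implementation; the function name is hypothetical and only the required field names are taken from the list:

import requests

REQUIRED_FIELDS = ["Publication", "Source Code", "S3", "DockerHub"]

def check_metadata(metadata: dict) -> dict:
    """Report missing required fields and unreachable URLs."""
    missing = [field for field in REQUIRED_FIELDS if not metadata.get(field)]
    invalid_urls = []
    for field in REQUIRED_FIELDS:
        url = metadata.get(field)
        if not url:
            continue
        try:
            # A HEAD request is enough to verify that the URL resolves
            ok = requests.head(url, allow_redirects=True, timeout=10).ok
        except requests.RequestException:
            ok = False
        if not ok:
            invalid_urls.append((field, url))
    return {"missing_fields": missing, "invalid_urls": invalid_urls}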

Q3) The test summary is about the status of whether the checks were run or not. It was intended to spot checks that were not executed (for instance, when the command is run without the -l deep flag). You are absolutely right about Single Input Run without Error and the other checks labelled Run without Error in this table. It is fixed now like this:

  • From Run without Error to Check Executed
    • e.g. Single Input Check Executed; a Status of False now means the check was not executed, which makes sense.
  • Table header changed from Test Run Summary to Test Checks Running Status Summary.
  • For the question "Does this command actually test the model or not?": yes it does, but only the high-level checks mentioned in the documentation. It does not run the model; it performs all the high-level metadata checks, which is why it is fast. We decided to put the deep checks behind a flag to give users a choice. These are the checks performed at the deep level (see the two invocations contrasted after this list):
  • Checks the single input example
  • Runs the example command for this model
  • Consistency checks
  • Runs bash commands (running the model in its isolated env, without ersilia)
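For reference, the two invocations being contrasted here (both commands appear earlier in this thread):

ersilia test eos3b5e -d eos3b5e --remote               # high-level checks only, fast
ersilia test eos3b5e -d eos3b5e --remote --level deep  # also runs the deep checks listed above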

Q4) This was a bug in the example command (or maybe a feature that was upgraded recently, I guess).
- Up until I pushed the final changes of the test command, running the example command from the Python API as below:

def run_example(
    self,
    n_samples: int,
    file_name: str = None,
    simple: bool = True,
    try_predefined: bool = False
):
    """
    Generate example input samples for the model.

    Returns
    -------
    list
        List of generated input samples, e.g. ["CCO", ...].
    """
    return self.example.example(
        n_samples=n_samples,
        file_name=file_name,
        simple=simple,
        try_predefined=try_predefined
    )
This used to return a list of SMILES like this:

["CC[n+]1c(-c2ccccc2)c2cc(N)ccc2c2ccc(N)cc12", "CN1C(=O)N[C@H](Cc2c[nH]c3c(Cl)cccc23)C1=O", ...]

So the single input example and the other checks that need example samples assumed the above data structure. Now I found out that it has this structure instead:

[{'input': 'CCO[C@H]1O[C@@H]2O[C@@]3(C)CC[C@H]4[C@H](C)CC[C@@H]([C@H]1C)[C@@]24OO3'}, {'input': 'Cc1cnc(c[n+]1[O-])C(O)=O'}, ...]

This caused the error, and it is FIXED now.
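As an illustration of the fix, a small helper of this kind (hypothetical, not part of the ersilia API) can accept both output shapes, the plain list of SMILES and the list of {'input': ...} dictionaries:

def extract_smiles(samples):
    """Return a flat list of SMILES strings regardless of the example output shape."""
    smiles = []
    for item in samples:
        if isinstance(item, dict):
            smiles.append(item["input"])  # new structure: {'input': 'CCO'}
        else:
            smiles.append(item)           # old structure: plain SMILES string
    return smiles

# extract_smiles(["CCO", "CCN"])                       -> ["CCO", "CCN"]
# extract_smiles([{"input": "CCO"}, {"input": "CCN"}]) -> ["CCO", "CCN"]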

Q5) The problem was not from ersilia, it was from these errors. It seems it came from the model itself:

Traceback (most recent call last):
  File "/home/abellegese/eos/repository/eos5guo/20250104180906_0463B7/eos5guo/artifacts/framework/code/main.py", line 26, in <module>
    with open(input_file, "r") as f:
         ^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/abellegese/eos/repository/eos5guo/20250104180906_0463B7/eos5guo/artifacts/framework/example_input.csv'

The other error was:

Traceback (most recent call last):
  File "/home/abellegese/eos/repository/eos5guo/20250104180906_0463B7/eos5guo/artifacts/framework/code/main.py", line 6, in <module>
    from rdkit import Chem
  File "/home/abellegese/miniconda3/envs/eos5guo/lib/python3.11/site-packages/rdkit/Chem/__init__.py", line 16, in <module>
    from rdkit.Chem import rdchem
AttributeError: _ARRAY_API not found

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.2.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

This happened due to an incompatibility between the version of NumPy installed in the eos5guo environment (2.2.1) and the compiled extensions (e.g., RDKit) that rely on an earlier version of NumPy (1.x).
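A possible workaround, following the suggestion in the NumPy message itself, is to pin NumPy below 2 in the model environment shown in the traceback:

conda activate eos5guo
pip install "numpy<2"   # downgrade NumPy so the compiled RDKit extensions load again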

Q6) To get the output as a .txt file you can simply use > log.txt, as in the full command below:

ersilia test .... > log.txt
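If part of the log is written to stderr rather than stdout, both streams can be captured with standard shell redirection (this is plain shell behaviour, not an ersilia flag):

ersilia test .... > log.txt 2>&1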

@Abellegese
Contributor

The fixes will be pushed along with the new updates.

@GemmaTuron
Member Author

Hi @Abellegese

Thanks for the clarifications. Some comments to see if I understood:

  • The inspect command is doing checks that can only be made once the model is merged into the eos repository and uploaded to DockerHub and S3, since otherwise these metadata fields are empty. That makes sense; this should be implemented only in the last step of the model workflows. All this needs to be documented.
  • About Q3, indeed the False was misleading, so it is better to have it as you suggest.
  • About Q4, looping in @DhanshreeA here: when was the example command output changed, and why was this error not caught? Whenever these basic things are modified, there should be extensive testing of all the functions that use those commands. In what other places is the example command being used that may now not work?
  • About Q5 and Q6, also for @DhanshreeA: not all the models have an example_input file, so what happens in that case? Is it created automatically? And about the dependencies, this model works in principle, so I do not understand why it has dependency issues now.

I understand, @Abellegese, that all this is on Python 3.8 only? This should also be fixed as soon as possible.
Let me know once the changes are merged so I can test again.

@Abellegese
Contributor

Abellegese commented Jan 5, 2025

Hi @GemmaTuron, I will try to push these changes. Regarding ersilia only working on 3.8 at the moment: I am going through the ruff refactoring I made (about 233 files formatted) to see whether I introduced any code change. I checked 189 files but could not find anything beyond formatting and linting, removing unused variables and imports, fixing variable names, etc. I will check the remaining files in case I made a mistake. We will fix this as soon as possible.

Also, you understood the inspect command I explained correctly.

@GemmaTuron
Member Author

Another question for discussion: should we add a flag to save the output of the test to a more parsable file, for example a .json file, rather than just outputting the log as txt? This could come in handy for workflows. Let me know your thoughts!

@Abellegese
Contributor

Yes, that sounds good. It currently reports as a table if you don't add -v to the ersilia command, which makes the report clean to read. But not all the relevant logs are shown if something fails, for instance. So we can track down all the logs and put them in a JSON file.

@GemmaTuron
Member Author

GemmaTuron commented Jan 5, 2025

But the table is saved in TXT format? Wouldn't it be better to convert the tables to a JSON format for automatic parsing? Then you could get a notification with the specific test that failed. I guess you could do that from a txt as well, but if properly formatted, JSON or YAML would be more versatile?

@Abellegese
Contributor

The test command is not currently saving any output. What you saw previously was saved using the Linux redirection ersilia test ... > result.txt, which writes the terminal output to a .txt file. But I used code (in ersilia-maintenance) that converts the output into a JSON file, for instance:

{
   "Is Github Url Available": {
      "Status": true,
      "Details": "Is Github Url Available Repository exists.Details"
   },
   "Complete Metadata": {
      "Status": true,
      "Details": "Complete Metadata Details Metadata is complete."
   },
   ...
}

We can use such methods to get a detailed, parsable JSON report as well. But we may want to keep displaying the reports in table format in the terminal, right @GemmaTuron?
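As a sketch of that idea (illustrative only, not existing ersilia code), the per-check results could be written to JSON while still printing a short status table in the terminal:

import json

def save_report(results: dict, path: str = "test_report.json") -> None:
    """Dump the check results to JSON and echo a minimal status table."""
    with open(path, "w") as f:
        json.dump(results, f, indent=3)
    for name, entry in results.items():
        print(f"{name:<35} {'PASS' if entry['Status'] else 'FAIL'}")

# save_report({"Complete Metadata": {"Status": True, "Details": "Metadata is complete."}})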

GemmaTuron moved this from On Hold to In Progress in Ersilia Model Hub on Jan 7, 2025
@GemmaTuron
Member Author

More questions:

  • Is the error in the eos5guo model related to its internal versions, or does it clash with the Ersilia conda env itself? It was not clear to me. Why are some package versions hardcoded (for example numpy>=1.26) in the ersilia[test] installation but not others (like rdkit)? NumPy has moved to 2, so should we move to 2 soon as well?

@DhanshreeA
Member

DhanshreeA commented Jan 7, 2025

Re Q4 - Example command updates - this was done quite some time back as part of the CLI clean up. Referencing the relevant PR: #1430. In all places, the example command is used through the CLI and not the Python API, which explains one way this slipped through.

@DhanshreeA
Member

Re Q5 and Q6:

  • Example files: This is something that we're adding as part of both the refactoring for ersilia pack and the new annotation that Miquel is carrying out. In case there's no example input file, the example command will default to generating the input.
  • This model was built quite some time back, so what you see working is its Dockerfile. The NumPy dependency isn't specifically mentioned in the Dockerfile (it should be, otherwise this model will never work from source). I'll update this.

@DhanshreeA
Member

Re 3.8 and 3.10: that one is throwing me off as well; I am trying to reproduce it and nail it down.

@GemmaTuron
Member Author

Re Q4 - Example command updates - this was done quite some time back as part of the CLI clean up. Referencing the relevant PR: #1430. In all places, the example command is used through the CLI and not the Python API, which explains one way this slipped through.

Yep, that makes sense. We need to introduce stronger testing for the Python API, as we use it quite a lot in some applications.

@GemmaTuron
Member Author

@Abellegese and @DhanshreeA
Once the test command is refactored, re-visit the issues I found to ensure they are solved in the new version.

@Abellegese
Contributor

Okay @GemmaTuron we will.

@GemmaTuron
Member Author

I'll close this issue as this is already being tackled in the refactoring.
