🐈 Task: Upgrade descriptor models to newest version #1510

Open
11 of 12 tasks
GemmaTuron opened this issue Jan 15, 2025 · 5 comments

Comments

@GemmaTuron
Member

GemmaTuron commented Jan 15, 2025

Summary

Hi @DhanshreeA

For both the Abaumannii project and the refactoring of ZairaChem, we want to use several descriptor models from Ersilia. We need them to be packaged with FastAPI, and we need to be sure they are working. The list is below:

  • ersilia-compound-embedding (eos2gw4)
  • cc-signaturizer (eos4u6p)
  • unimol-representation (eos39co)
  • grover-embedding (eos7w6n)
  • image-mol-embeddings (eos4avb)
  • kgpgt-embedding (eos8aa5)
  • datamol-basic-descriptors (eos4djh)
  • rdkit-descriptors (eos8a4x)
  • mordred (eos78ao)
  • morgan-counts (eos5axz)
  • erg-fingerprints (eos5guo)
  • whales-descriptor (eos3ae6)

Please can you confirm which models have already been successfully rebuilt? For example I think we were facing issues with eos78ao.
How would you go about refactoring the rest of them?

Objective(s)

Build the specified models with FastAPI and ensure they work using the latest workflow update

Documentation

No response

@DhanshreeA DhanshreeA self-assigned this Jan 15, 2025
@DhanshreeA DhanshreeA moved this from On Hold to In Progress in Ersilia Model Hub Jan 15, 2025
@GemmaTuron
Member Author

GemmaTuron commented Jan 16, 2025

I have checked all the models that are marked and they work! The only ones still missing are eos78ao and eos8aa5.

eos5guo still gives this error at fetch time, but it works:

18:47:40 | DEBUG    | Resolving pack method
18:47:40 | DEBUG    | Resolved pack method: fastapi
b8cb61d5cd292abc2fb647c5a3565019960b99c152adc63621e15bb21438eefd
Successfully copied 3.07kB to /home/gturon/eos/dest/eos5guo/information.json
a35fbf78-63a3-49e7-88e2-8f410db04a0f
18:47:40 | DEBUG    | Copying api_schema_file file from model container
968f2902be5db144bdf1f22f671c4ce43603216e21eda006b3478b931d9bc29e
Successfully copied 13.8kB to /home/gturon/eos/dest/eos5guo/api_schema.json
dae3c57d-6414-44ea-a68b-441ba874cb7d
18:47:41 | DEBUG    | Copying status file from model container
59ed0b6157c02dcdb8d0ccfe63a9c23d453e1d64e019785bee9ff551150933a2
Successfully copied 2.05kB to /home/gturon/eos/dest/eos5guo/status.json
71ceb5d6-acbf-4873-aa09-a9e9ba792b5f
27ddfcb30df3a63843f349c3f7cbdbdbc4fc7fa1fdec4a2f7d9b66550a78dcec
invalid output path: directory "/home/gturon/eos/dest/eos5guo/model/framework/examples" does not exist
c65249d4-8638-4a6b-9414-d8c722282e89
b848c2cbc57fe27d335a9b92dfcf3aff373ebdc282122f99a7b80c89edbe732e
invalid output path: directory "/home/gturon/eos/dest/eos5guo/model/framework/examples" does not exist
04891990-45fe-4ff7-b185-bd977a0b327a
968f1b7ad826369668e09ec974699e427b80759b787ef59d7396bd0dfb8a370e
invalid output path: directory "/home/gturon/eos/dest/eos5guo/model/framework" does not exist
fa3f1bc1-53a9-4927-91f0-48e8550d1925
ac3d9fc32facf9184b76e0e3e55403153a573f0a72d7521cdcb6c690ede287ca
invalid output path: directory "/home/gturon/eos/dest/eos5guo/model/framework" does not exist
5bb7a44f-8730-498d-ae43-a6ac20fcdd79
3713d94af125edbe0759b84a4e475237ea9e18c3181ed641e94d984ee9441ca5
Error response from daemon: Could not find the file /root/example.csv in container 7c302956-f149-47db-b3d3-60ba10d54692
7c302956-f149-47db-b3d3-60ba10d54692
18:47:43 | DEBUG    | Running standard CSV example
18:47:43 | DEBUG    | /home/gturon/eos/dest/eos5guo/example_standard_input.csv
18:47:43 | DEBUG    | /home/gturon/eos/dest/eos5guo/example_standard_output.csv
18:47:44 | DEBUG    | Usage: ersilia [OPTIONS] COMMAND [ARGS]...

  🦠 Welcome to Ersilia! 💊

Options:
  --version      Show the version and exit.
  -v, --verbose  Show logging on terminal when running commands.
  -s, --silent   Do not echo any progress message.
  --help         Show this message and exit.

Commands:
  auth       Log in to ersilia to enter contributor mode.
  catalog    List a catalog of models
  close      Close model
  delete     Delete model from local computer
  example    Generate input examples for the model of interest
  fetch      Fetch model from Ersilia Model Hub
  info       Get model information
  run        Run a served model
  serve      Serve model
  test       Test a model
  uninstall  Uninstall ersilia

18:47:44 | DEBUG    | No need to use Conda!
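
The "invalid output path" lines in the log above read like a copy step failing because the local destination directory (for example `model/framework/examples`) has not been created yet. The thread does not show where Ersilia performs this copy, so the following is only a minimal sketch of the general precaution, with a hypothetical helper name `ensure_parent_dir` that is not part of Ersilia:

```python
import os
import tempfile


def ensure_parent_dir(dest_path):
    """Create the parent directory of dest_path if it is missing and return it.

    Hypothetical helper: call it before copying a file out of a container
    so that the local destination directory is guaranteed to exist.
    """
    parent = os.path.dirname(os.path.abspath(dest_path))
    os.makedirs(parent, exist_ok=True)
    return parent


if __name__ == "__main__":
    # Toy demonstration in a throwaway directory, not in ~/eos.
    with tempfile.TemporaryDirectory() as root:
        dest = os.path.join(root, "model", "framework", "examples", "example.csv")
        print(ensure_parent_dir(dest))
```

Whether this is the actual cause for eos5guo would still need to be confirmed against the fetch code.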

@DhanshreeA
Member

Hey @GemmaTuron all of the models have been successfully rebuilt and pushed.

@GemmaTuron
Member Author

Hi @DhanshreeA

eos78ao is not working (see the issue in its repo: ersilia-os/eos78ao#14), and in any case we should try Mordred Community for this model instead.

@DhanshreeA
Member

I tried out eos78ao with Mordred Community, and it is recent enough not to run into trouble with updated versions of numpy and networkx. However, not all descriptors are calculated yet. Interestingly, when I ran a diff of the two attached output files, I saw some descriptors being calculated in the original version of mordred but not in the community fork, specifically the following:

invalid value encountered in scalar divide (MDEC-24),3.3865457448640486,invalid value encountered in scalar divide (MDEC-34),invalid value encountered in scalar divide (MDEC-44),0.5000000000000001,invalid value encountered in scalar divide (MDEO-12),invalid value encountered in scalar divide (MDEO-22),invalid value encountered in scalar divide (MDEN-11),invalid value encountered in scalar divide (MDEN-12),invalid value encountered in scalar divide (MDEN-13),invalid value encountered in scalar divide (MDEN-22),invalid value encountered in scalar divide (MDEN-23),invalid value encountered in scalar divide

And this varies across molecules. For example,

  1. For the molecule CC(C)CC1=CC=C(C=C1)C(C)C(=O)O, MDEC-14, MDEC-24, MDEC-34, MDEC-44, MDEO-12, MDEO-22, MDEN-11, MDEN-12, MDEN-13, MDEN-22, MDEN-23, MDEN-33 weren't calculated.
  2. For CC(C)CC1=CC=C(C=C1)C(C)C(=O)O, MDEO-11 and MDEN-33 weren't calculated.
  3. For COC1=CC23CCCN2CCC4=CC5=C(C=C4C3C1O)OCO5, MDEC-11, MDEC-44, MDEO-11, MDEN-11, MDEN-12, MDEN-13, MDEN-22, MDEN-23, MDEN-33 weren't calculated.
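
A comparison like the one described above can also be scripted instead of read off a raw diff. This is a minimal sketch, assuming both output files share a `key,input,<descriptor columns>` layout and that a failed descriptor shows up as an empty cell, a NaN, or an error message in place of a number. The function names and the toy rows are illustrative only and are not taken from the attached files:

```python
import csv
import io
import math


def is_failed(value):
    """Treat empty cells, NaN, and non-numeric text (error messages) as failures."""
    try:
        return math.isnan(float(value))
    except (TypeError, ValueError):
        return True


def failed_columns(csv_text, skip=("key", "input")):
    """Return, for each row, the set of descriptor columns that failed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {col for col, val in row.items() if col and col not in skip and is_failed(val)}
        for row in reader
    ]


def only_failing_in_second(first_csv, second_csv):
    """Per row, list descriptors that fail in the second file but not in the first."""
    first = failed_columns(first_csv)
    second = failed_columns(second_csv)
    return [sorted(b - a) for a, b in zip(first, second)]


if __name__ == "__main__":
    # Toy data with made-up values, only to show the mechanics.
    original = "key,input,MDEC-11,MDEC-24\nk1,CCO,1.5,2.0\n"
    community = (
        "key,input,MDEC-11,MDEC-24\n"
        "k1,CCO,1.5,invalid value encountered in scalar divide (MDEC-24)\n"
    )
    print(only_failing_in_second(original, community))
```

Rows are paired by position, so both files are assumed to list the same molecules in the same order.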

It makes sense to go down an imputation approach for this by first fitting it entirely on a dataset like DrugBank, as @miquelduranfrigola has suggested.
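
The thread does not say which imputation strategy is intended, so the following is only a sketch of the fit-then-apply idea using per-descriptor medians as a stand-in. The helper names and the toy values are hypothetical, and a real reference set (such as descriptors computed for DrugBank) would replace the toy rows:

```python
import math
import statistics


def _to_float(value):
    """Return a float, or None when the value is missing, non-numeric, or NaN."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def fit_medians(reference_rows):
    """Compute one median per descriptor from the valid values in the reference set."""
    columns = {}
    for row in reference_rows:
        for name, value in row.items():
            number = _to_float(value)
            if number is not None:
                columns.setdefault(name, []).append(number)
    return {name: statistics.median(values) for name, values in columns.items()}


def impute(row, medians):
    """Replace values that failed to calculate with the fitted median."""
    result = {}
    for name, value in row.items():
        number = _to_float(value)
        result[name] = medians.get(name) if number is None else number
    return result


if __name__ == "__main__":
    # Toy reference rows with made-up values, standing in for a real dataset.
    reference = [
        {"MDEC-11": "1.0", "MDEN-33": "4.0"},
        {"MDEC-11": "3.0", "MDEN-33": "invalid value encountered in scalar divide"},
        {"MDEC-11": "5.0", "MDEN-33": "6.0"},
    ]
    medians = fit_medians(reference)
    print(impute({"MDEC-11": "", "MDEN-33": "2.5"}, medians))
```

Fitting once on the reference set and reusing the stored medians keeps the imputed value for a descriptor identical across runs, regardless of which molecules are in a given input file. A descriptor that never calculates in the reference set has no median and is returned as `None`.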

output-mordred-comm.csv
output-mordred.csv

@DhanshreeA DhanshreeA removed the status in Ersilia Model Hub Jan 22, 2025
@DhanshreeA DhanshreeA removed their assignment Jan 22, 2025
@DhanshreeA DhanshreeA moved this to On Hold in Ersilia Model Hub Jan 22, 2025
@GemmaTuron
Member Author

GemmaTuron commented Jan 26, 2025

Hi @DhanshreeA

Let me rewrite this comment for clarity; apologies if you received an earlier version of it. I am not sure whether this belongs here or elsewhere, but I am posting it here because this issue is open and I am working with descriptors:

I am working on Ubuntu 24.04 LTS. I am running everything on two independent computers to verify the reproducibility of the issues. I have pulled the latest Ersilia code and installed it in a Python 3.12 environment. I have found some issues that might, or might not, be related:

  1. Wrong decision at model fetch time. Upon fetching a model, Ersilia decides it is packed with BentoML rather than FastAPI, and the fetch crashes with the attached error log (eos2gw4_attempts_bentoml.txt). This happens on the NUC but not on Raluy, and I cannot figure out the cause. I understand this will be solved by adding this information as metadata; what are the timelines? Currently I cannot fetch models on my NUC because of this.
  2. Session errors. While running into all these issues, it is possible that some models do not exit properly. I then cannot fetch them again because of open sessions. The only thing I can do is manually delete all the session files with sudo. This happens on both computers. It is not critical, but it is not nice; sessions should be cleaned up. I think this is being worked on?
  3. Error messages from the Python API are not clear. I was copying a piece of test code whose input file name was missing the ".csv" extension. Instead of telling me there was no such file, the pipeline generated an output file that stated the following:
key,input,outcome
UNPROCESSABLE_INPUT,UNPROCESSABLE_INPUT,"[None, None, None, None, ...]

That was very confusing, and I propagated the error when testing in different environments. This probably calls for some work on the Python API. As we integrate Ersilia into more "experimental" pipelines, which will always go through the Python API, this becomes more necessary.
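
One way a caller could guard against this today is to validate the input path before handing it to the Python API. This is a minimal sketch with a hypothetical helper, `check_input_file`, which is not part of Ersilia. It assumes the caller already knows the input is meant to be a file path rather than a SMILES string:

```python
import os
import tempfile


def check_input_file(path, expected_extension=".csv"):
    """Fail early with a clear message instead of producing placeholder output."""
    if not os.path.isfile(path):
        hint = ""
        candidate = path + expected_extension
        # Suggest the likely intended file when only the extension is missing.
        if not path.endswith(expected_extension) and os.path.isfile(candidate):
            hint = f" Did you mean '{candidate}'?"
        raise FileNotFoundError(f"Input file '{path}' does not exist.{hint}")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        real = os.path.join(root, "input.csv")
        with open(real, "w") as handle:
            handle.write("input\nCCO\n")
        print(check_input_file(real))
        try:
            check_input_file(os.path.join(root, "input"))
        except FileNotFoundError as err:
            print(err)
```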

The session error is:

(zairadescribe) gturon@raluy:~/github/ersilia-os/zairachem-docker/02_describe$ ersilia close
No model was served
(zairadescribe) gturon@raluy:~/github/ersilia-os/zairachem-docker/02_describe$ ersilia delete eos2gw4
Traceback (most recent call last):
  File "/home/gturon/miniconda3/envs/zairadescribe/bin/ersilia", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/cli/commands/__init__.py", line 31, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/cli/commands/delete.py", line 94, in delete
    _delete_model_by_id(model_id)
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/cli/commands/delete.py", line 37, in _delete_model_by_id
    can_delete, reason = md.can_be_deleted(model_id)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/hub/delete/delete.py", line 604, in can_be_deleted
    remove_session_dir(mdl_session)
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/utils/session.py", line 105, in remove_session_dir
    shutil.rmtree(session_dir)
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/shutil.py", line 759, in rmtree
    _rmtree_safe_fd(stack, onexc)
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/shutil.py", line 703, in _rmtree_safe_fd
    onexc(func, path, err)
  File "/home/gturon/miniconda3/envs/zairadescribe/lib/python3.12/shutil.py", line 674, in _rmtree_safe_fd
    topfd = os.open(name, os.O_RDONLY | os.O_NONBLOCK, dir_fd=dirfd)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: '/home/gturon/eos/sessions/session_374161/_logs/tmp/ersilia-fuo0nacl'
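
The traceback ends in `shutil.rmtree` raising `PermissionError` on a file inside the session directory, which then aborts `ersilia delete`. This may happen because the file is owned by another user, although the thread does not confirm the cause. The sketch below shows one way such a cleanup could report the problem instead of crashing. The function name `try_remove_session_dir` is hypothetical and is not the code in `ersilia/utils/session.py`:

```python
import os
import shutil
import tempfile


def try_remove_session_dir(session_dir):
    """Remove a session directory; report a permission failure instead of raising."""
    if not os.path.isdir(session_dir):
        return True, "nothing to remove"
    try:
        shutil.rmtree(session_dir)
    except PermissionError as err:
        blocked = err.filename or session_dir
        return False, f"cannot remove '{blocked}': permission denied"
    return True, "removed"


if __name__ == "__main__":
    # Toy session directory in a temporary location, not in ~/eos/sessions.
    root = tempfile.mkdtemp()
    session = os.path.join(root, "session_demo", "_logs", "tmp")
    os.makedirs(session)
    print(try_remove_session_dir(os.path.join(root, "session_demo")))
    shutil.rmtree(root)
```

Returning a status and a message lets the caller decide whether to continue with the deletion or tell the user which path needs manual cleanup.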

Labels
None yet
Projects
Status: On Hold
Development

No branches or pull requests

2 participants