Clean UP & Dockerization eos8451 #1

Closed
GemmaTuron opened this issue Jul 11, 2023 · 12 comments

@GemmaTuron
Member

No description provided.

@GemmaTuron GemmaTuron converted this from a draft issue Jul 11, 2023
@GemmaTuron GemmaTuron moved this to Suggested in Ersilia Model Hub Jul 18, 2023
@simrantan
Collaborator

I have looked into the workflow error for this model, and it appears to be a package issue.

UnsatisfiableError: The following specifications were found to be incompatible with a past
explicit spec that is not an explicit spec in this operation (ld_impl_linux-64):

  - rdkit=2021.03.4 -> boost[version='>=1.74.0,<1.74.1.0a0'] -> libgcc-ng[version='>=10.3.0|>=12|>=7.5.0|>=9.4.0|>=7.3.0|>=4.9|>=11.2.0|>=7.2.0|>=8.4.0']
  - rdkit=2021.03.4 -> boost[version='>=1.74.0,<1.74.1.0a0'] -> libstdcxx-

We can see in the workflow that the error originates from the pinned rdkit version creating an unsatisfiable environment, and, further, that the DescriptaStorus package the model needs also fails to install because of rdkit:


error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [2 lines of output]
      Descriptastorus requires rkdit to function, this is not installable by pip
       see https://rdkit.org/ for more information
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

To solve this, I looked at the source code for DescriptaStorus and found it requires rdkit 2022.3.3, so I changed the corresponding line in the Dockerfile to that version of rdkit and installed it with pip, since DescriptaStorus seems to prefer it. I ran the model locally with these changes and it worked, so I have moved on to refactoring.
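For reference, the change amounts to something like the following Dockerfile lines (a sketch only; the exact base image and surrounding instructions of the eos8451 Dockerfile are not reproduced here):

# replace the conda pin rdkit=2021.03.4 with the pip wheel DescriptaStorus expects
RUN pip install rdkit==2022.3.3
RUN pip install descriptastorus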

@simrantan
Collaborator

@GemmaTuron
I have refactored the model, but I had a question: main.py is in the grover folder, which contains files that main.py uses. I put the grover folder inside the code folder, but I was wondering if the preferred structure is putting main.py directly in the code folder, separate from the grover folder. If so, I can make the change; I just wanted to clarify!

@GemmaTuron
Member Author

Yes, that would be the preferred option: code/main.py
Thanks for checking!
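In other words, something like this layout (the surrounding folders follow the standard template):

eos8451/model/framework/code/main.py
eos8451/model/framework/code/grover/...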

@simrantan
Collaborator

Of course!
I moved main.py out and started working on fixing the paths in the file. This change caused the model to stall out, and I am working on finding the solution. I am leaving it to run for the next several hours to see if it returns, and I am examining my files to see where the error might have occurred, though I think it is rooted in the path changes.

@GemmaTuron
Member Author

Hi @simrantan

Please update this issue when you can

@GemmaTuron GemmaTuron moved this from Suggested to In Progress in Ersilia Model Hub Jul 25, 2023
@simrantan
Collaborator

Yesterday I finally got the fetch to finish running after finding a minor syntax error (a missing space) in the service.py file (a bug I have run into before on a different model and that is difficult to spot).
I am now working on figuring out this issue, which is somewhat confusing because the problem appears to take place in a file I don't have access to.

File "/home/simran/ersilia/ersilia/cli/commands/fetch.py", line 73, in fetch
   _fetch(mf, model_id)
 File "/home/simran/ersilia/ersilia/cli/commands/fetch.py", line 12, in _fetch
   mf.fetch(model_id)
 File "/home/simran/ersilia/ersilia/hub/fetch/fetch.py", line 348, in fetch
   self._fetch_not_from_dockerhub(model_id=model_id)
 File "/home/simran/ersilia/ersilia/hub/fetch/fetch.py", line 270, in _fetch_not_from_dockerhub
   self._pack()
 File "/home/simran/ersilia/ersilia/hub/fetch/fetch.py", line 211, in _pack
   mp.pack()
 File "/home/simran/ersilia/ersilia/hub/fetch/actions/pack.py", line 61, in pack
   self._setup()
 File "/home/simran/ersilia/ersilia/hub/fetch/actions/pack.py", line 23, in _setup
   ServiceFile(folder).rename_service()
 File "/home/simran/ersilia/ersilia/hub/bundle/repo.py", line 63, in rename_service
   with open(file_name, "r") as f:

@GemmaTuron
Member Author

Hi @simrantan

Can you:

  • Add the whole error log
  • Explain the syntax error: where is it and what exactly do you need to change?

@simrantan
Collaborator

Hi, yes!

The syntax error I received was a missing space at the beginning of this line:
lines = ["bash {0}/run.sh {0} {1} {2}".format(
This is what caused the fetch to stall without returning.
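Purely as an illustration of why such a small omission can fail silently (this is a hypothetical sketch, not the actual service.py code): when a shell command is built by string formatting or concatenation, a missing space fuses two tokens, so run.sh is never actually invoked and no output is produced.

import subprocess

run_sh = "/path/to/framework/run.sh"  # hypothetical path

broken = "bash" + run_sh   # "bash/path/to/framework/run.sh" -> command not found, run.sh never runs
fixed = "bash " + run_sh   # "bash /path/to/framework/run.sh" -> run.sh executes as intended

subprocess.run(fixed, shell=True)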

I have been working on the paths, and I found that the service.py error is no longer occurring, but this new error is:

Detailed error:
Model API eos8451:run did not produce an output
WARNING:root:No normalization for BCUT2D_MWHI
WARNING:root:No normalization for BCUT2D_MWLOW
WARNING:root:No normalization for BCUT2D_CHGHI
WARNING:root:No normalization for BCUT2D_CHGLO
WARNING:root:No normalization for BCUT2D_LOGPHI
WARNING:root:No normalization for BCUT2D_LOGPLOW
WARNING:root:No normalization for BCUT2D_MRHI
WARNING:root:No normalization for BCUT2D_MRLOW
[WARNING] Horovod cannot be imported; multi-GPU training is unsupported

  0%|          | 0/3 [00:00<?, ?it/s]
 33%|███▎      | 1/3 [00:00<00:00,  5.02it/s]
100%|██████████| 3/3 [00:00<00:00, 14.95it/s]
Loading training args
Traceback (most recent call last):
  File "/home/simran/eos/repository/eos8451/20230726023000_56DD88/eos8451/artifacts/framework/code/main.py", line 80, in <module>
    avg_preds, test_smiles = make_predictions(args, train_args)
  File "/home/simran/eos/repository/eos8451/20230726023000_56DD88/eos8451/artifacts/framework/code/task/predict.py", line 96, in make_predictions
    scaler, features_scaler = load_scalars(path)
  File "/home/simran/eos/repository/eos8451/20230726023000_56DD88/eos8451/artifacts/framework/code/grover/util/utils.py", line 736, in load_scalars
    state = torch.load(path, map_location=lambda storage, loc: storage)
  File "/home/simran/miniconda3/envs/eos8451/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/simran/miniconda3/envs/eos8451/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/simran/miniconda3/envs/eos8451/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/simran/eos/repository/eos8451/20230726023000_56DD88/eos8451/artifacts/framework/finetune/esol/fold_0/model_0/model.pt'


Hints:
- Visit the fetch troubleshooting site

If this error message is not helpful, open an issue at:
 - https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
 - hello[at]ersilia.io

If you haven't, try to run your command in verbose mode (-v in the CLI)
 - You will find the console log file in: /home/simran/eos/current.log

@GemmaTuron
Member Author

Hi @simrantan

I will not be able to help until you add the whole error log, please.

@simrantan
Collaborator

Sorry! This is the whole error log.
error log 8451.txt

I actually found that the source of the error is a path problem in main.py caused by the folders that were moved, so I am working on fixing it and am almost done. Will update soon!
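As a sketch of the kind of fix involved (the variable names below are assumptions, not the actual main.py), resolving paths relative to the script's own location keeps them valid after files are moved. The finetune folder sits next to code/ under framework/, as the traceback above shows:

import os

# directory containing this script, independent of the current working directory
CODE_DIR = os.path.dirname(os.path.abspath(__file__))

# the checkpoint referenced in the FileNotFoundError, built relative to code/
# rather than hard-coded as an absolute path
checkpoint_path = os.path.abspath(
    os.path.join(CODE_DIR, "..", "finetune", "esol", "fold_0", "model_0", "model.pt")
)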

@simrantan
Collaborator

simrantan commented Jul 27, 2023

I fixed the path issues I was dealing with for the finetune folder, and I'm now working on this issue:

100%|##########| 3/3 [00:11<00:00,  4.05s/it]
100%|##########| 3/3 [00:11<00:00,  3.91s/it]
Saving predictions to /home/simran/eos/repository/eos8451/20230727081117_B64FE8/eos8451/artifacts/framework/example_output.csv

error log:
errorlog8451output.txt

It seems that main.py is taking sys.argv[2] as the output_path in this code snippet:

# Initialize MolVocab
mol_vocab = MolVocab

# input file and output path passed on the command line
input_txt_path = sys.argv[1]
output_path = sys.argv[2]
csv_path = smiles_to_dataframe(input_txt_path)

However, this path did work when I first tested this model after fixing the workflow error (the rdkit install), and it has only stopped working with my refactoring work.
I tested with bash to see if it would work with an output outside of the fetch command, and found this:


WARNING:root:No normalization for BCUT2D_MWHI
WARNING:root:No normalization for BCUT2D_MWLOW
WARNING:root:No normalization for BCUT2D_CHGHI
WARNING:root:No normalization for BCUT2D_CHGLO
WARNING:root:No normalization for BCUT2D_LOGPHI
WARNING:root:No normalization for BCUT2D_LOGPLOW
WARNING:root:No normalization for BCUT2D_MRHI
WARNING:root:No normalization for BCUT2D_MRLOW
[WARNING] Horovod cannot be imported; multi-GPU training is unsupported
Traceback (most recent call last):
  File "eos8451/model/framework/code/main.py", line 78, in <module>
    sf.save_features_main(csv_path,features_path)
  File "/home/simran/eos8451/model/framework/code/scripts/save_features.py", line 131, in save_features_main
    generate_and_save_features(args)
  File "/home/simran/eos8451/model/framework/code/scripts/save_features.py", line 54, in generate_and_save_features
    data = get_data(path=args.data_path, max_data_size=None)
  File "/home/simran/eos8451/model/framework/code/grover/util/utils.py", line 215, in get_data
    data = filter_invalid_smiles(data)
  File "/home/simran/eos8451/model/framework/code/grover/util/utils.py", line 132, in filter_invalid_smiles
    if mol.GetNumHeavyAtoms() == 0:
AttributeError: 'NoneType' object has no attribute 'GetNumHeavyAtoms'

So there is a possibility that the error actually originates from this NoneType error, which would also create an empty output (the main error I am getting).
I am looking into why mol would be None.
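For context, RDKit's MolFromSmiles returns None whenever it cannot parse a string, so a single malformed entry (or, say, a header row accidentally read as data; this is only a guess at the cause, not something confirmed yet) is enough to trigger exactly this error. A minimal sketch:

from rdkit import Chem

for s in ["CCO", "smiles", "not-a-valid-smiles"]:
    mol = Chem.MolFromSmiles(s)
    if mol is None:
        # calling mol.GetNumHeavyAtoms() here would raise the AttributeError above
        print("invalid SMILES:", s)
    else:
        print(s, "heavy atoms:", mol.GetNumHeavyAtoms())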

@simrantan
Collaborator

Using a single input (in a csv file), I've found that running run.sh through bash works!

However, the fetch issue still exists. This shows that there likely isn't an error with the paths, which is good, but it seems like the issue could be in the inputs being fed to the model. Since the fetch is using inputs from Ersilia, I am unsure how to work around this error: if this model only works on some SMILES and not others, how can I still get Ersilia to fetch it?

Should I work on getting the model to work on all kinds of SMILES?

Error log:
error log 8451.txt
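One possible way to handle this (a sketch only, not something the Ersilia template prescribes; run_model below is a hypothetical stand-in for the actual prediction call): parse each input with RDKit first, run the model only on the valid SMILES, and write placeholder rows for the invalid ones so the output keeps one row per input.

from rdkit import Chem

def split_valid(smiles_list):
    """Return the parsable SMILES together with their original positions."""
    valid, positions = [], []
    for i, s in enumerate(smiles_list):
        if Chem.MolFromSmiles(s) is not None:
            valid.append(s)
            positions.append(i)
    return valid, positions

# usage sketch:
# valid, positions = split_valid(all_inputs)
# preds = run_model(valid)                # hypothetical model call
# output = [None] * len(all_inputs)       # placeholders for invalid inputs
# for i, p in zip(positions, preds):
#     output[i] = p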

@GemmaTuron GemmaTuron moved this from In Progress to Reviewed in Ersilia Model Hub Aug 1, 2023
@github-project-automation github-project-automation bot moved this from Reviewed to Done in Ersilia Model Hub Aug 10, 2023