T018 Pipeline #125
Hi @corey-taylor, Thanks a lot for your edits on T018! 🚀 I started with streamlining the text format with the other notebooks, e.g.
Before I continue with the content review, could you please go through the notebook and
Do ping me if in doubt regarding the text format. |
Yup, wasn't sure whether to use html or markdown as the original had a mix of both, hence why I made it consistent at least. Cheers for clarifying and for providing the template. |
All the HTML is now removed, other than the hyperlink tags you mentioned. On execution times, there were already notes with each cell where it takes a while to run but without specifically saying something like Note: this will take x minutes to complete. Is this okay with you? |
@corey-taylor I went through the theory part, reads well, just tried to shorten it further. I mainly included a Small things that need to be checked:
|
@AndreaVolkamer, @corey-taylor Figure 2 is copied from an old lecture slide that I had, but there was no reference for the figure in the slide. Figure 4 is a screenshot from the DoGSiteScorer website, which is mentioned in the caption as "...as detected by the DoGSiteScorer web-service". Figure 8 is a screenshot from the PLIP website, indicated in the caption as "...detected by the PLIP web-service". All other figures are made by myself; that's why there is no reference in the captions. Also, I've noticed that some figures have been removed, but the figure numbering has not been updated, so there are some gaps between the figure numbers. [-> @Armin-Ariamajd: regarding your figure numbering comment, note, I did not remove figures, but some are hidden in the details section] |
@corey-taylor (and @Armin-Ariamajd), nicely done. Again, I mostly shortened the text a bit and added the I do have a few questions or comments for you:
Note:
|
@corey-taylor ready for you again :) Maybe coordinate with @Armin-Ariamajd he might be able to take over some of the questions/comments ...
Hi @Armin-Ariamajd, Thank you for your offer, that would be great indeed! I won't work on this PR until Thursday. If you are free in the meantime, you could take over something from this list:
If you are making changes, could you please push them by Thursday morning or let me know to hold off for a bit longer so that we do not run into conflicts? Thanks again for your help! |
Proteins.Plus is back online; so rolled back the BindingSiteDetection section to its original form
there is no need for an ```else``` statement here, because the ```if``` clause is wrapped inside a while loop with an ```else``` statement. Basically, if the ```if``` statement evaluates to ```True``` then it breaks out of the while loop, otherwise the loop counter is increased by 1. At the end, when the loop counter reaches the defined value (without the ```if``` statement ever evaluating to ```True```), then it executes the ```else``` statement, i.e. it raises a ```ValueError```
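The control flow described in this comment can be sketched roughly as follows. This is only an illustration of Python's `while ... else` construct, not the code from the PR: the function name `poll_until_ready`, its parameters, and the error message are hypothetical.

```python
def poll_until_ready(check_ready, max_attempts=3):
    """Call `check_ready` until it returns True or the attempts run out.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    attempt = 0
    while attempt < max_attempts:
        if check_ready():
            # Success: `break` leaves the loop and skips the `else` branch.
            break
        # No `else` needed on the `if`: a failed check just bumps the counter.
        attempt += 1
    else:
        # Runs only when the loop condition became False without a `break`,
        # i.e. the check never evaluated to True.
        raise ValueError(f"Not ready after {max_attempts} attempts.")
    return attempt  # number of failed attempts before success
```

The `else` belongs to the `while`, so it covers the "all attempts exhausted" case in one place instead of repeating it inside the loop body.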
corrected the docstring of the ```init``` method;
added a new optional parameter called ```frozen_data_filepath```, which can be used to provide a filepath of a CSV file containing the frozen analogs' data (i.e. CID and CanonicalSMILES), in which case, instead of querying the PubChem server, the class instance is built using the frozen data.
Added a demonstration for the ```similarity_search``` function of the ```pubchem``` module.
removed a redundant import of the ```LeadOptimizationPipeline``` class
corrected typo
activated BindingSiteDetection (rolled it back to the original form since ProteinsPlus is back online). Added ```frozen_data_filepath``` parameter to the ```run``` function, which passes it to the ```LigandSimilaritySearch``` class.
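For readers unfamiliar with the "frozen data" idea, here is a minimal sketch of how an optional `frozen_data_filepath` argument might switch between a live query and a stored CSV. Only the parameter name and the two columns (CID and CanonicalSMILES) come from the notes above; the function `load_analogs` and the `query_server` callable are hypothetical and do not reflect the actual `LigandSimilaritySearch` implementation.

```python
import csv


def load_analogs(query_server, frozen_data_filepath=None):
    """Return analog records as a list of dicts with CID and CanonicalSMILES.

    `query_server` is a hypothetical zero-argument callable standing in for
    the live PubChem request; it is only called when no frozen file is given.
    """
    if frozen_data_filepath is None:
        # Live mode: results may change between runs as the server data changes.
        return query_server()
    # Frozen mode: read the stored CSV so every run sees the same analogs.
    with open(frozen_data_filepath, newline="") as handle:
        return [
            {"CID": int(row["CID"]), "CanonicalSMILES": row["CanonicalSMILES"]}
            for row in csv.DictReader(handle)
        ]
```

Reading from the frozen file keeps the set of input ligands reproducible; it does not by itself make later steps such as conformer generation deterministic.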
Thanks a lot, @Armin-Ariamajd, for solving so many open TODOs!! I am summarizing here the last bits before we can merge :)
|
@dominiquesydow and I will knock these off today, hopefully. |
@dominiquesydow, @corey-taylor, You're welcome; I'm not completely finished though. |
Hi @Armin-Ariamajd, could you post a checklist of things that you still would like to do, please? |
Hi @dominiquesydow, I'm in the middle of modifying the talktorials to run with the frozen CIDs, so there are some changes that are still not finalized and that I haven't pushed yet. Also, in the initial test after freezing the CIDs, it seemed to me that the results are still not completely deterministic, so I have to take a closer look at that as well. I'm in the middle of a lecture right now. Today is a rather busy day for me; I'll try to finalize and push the changes later today in the evening, or tomorrow evening at the latest, if that's okay with you. |
Hi @Armin-Ariamajd, Alrighty, we will wait :) By tomorrow morning
|
adapted the notebook for the new way of freezing ligands for both runs.
corrected spacing in the list in "Aim of this talktorial"
proofread again and corrected some wording and formatting
Hi @dominiquesydow, So far, I have:
The still remaining issues are:
Please feel free to let me know if there is anything else I can help you with. P.S. I could not resolve the conflicts, as they do not show up for me on my local machine. |
Hi @Armin-Ariamajd, Thanks a lot for all the work you put in!! We will take it from here and try to resolve the last outstanding issues.
I like the new setup!
Perfect, thanks!
Thanks for boiling the problem down to this. My first thought was also to freeze the PDBQT files. I will give it some more thought.
Looking into this, thanks.
This is a warning from MDAnalysis < 2.0.0; since our environment requires >= 2.0.0 we should be fine. We'll check both warnings with a fresh |
Can't see a way around freezing the pdbqt's. At least with something like Omega, the starting point is the same i.e. if you ask for 50 conformers in one run then 2000 in another, the first 50 of the latter should be identical. This is as opposed to Obabel where if you do multiple runs of 50, each batch of 50 will be slightly different. For rigid docking, this is a really irritating problem as, unlike MD, it's meant to be deterministic. Cheers for hunting it down, @Armin-Ariamajd. |
Closing this PR; superseded by #179. |
Note: This PR has been superseded by #179
Details
Tech
Content review
`utils.py` script to make reading the notebook a bit more fluent
DataFrames
Code review
`a_variable_name` vs `aVariableName`
`black -l 99`
`for i in range(len(list))` (see slides)
`# NBVAL_CHECK_OUTPUT`
`import ...` lines are at the top (practice part) cell, ordered by standard library / 3rd party packages / our own (`teachopencadd.*`)
`opencadd` is not an in-house package any more; it is a third party package
`flake8` on the Python file and `flake8-nb` on the Jupyter notebook