-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add function to check if there is a route with a given set of starting molecules #34
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A very general question: what would happen if we just set the is_purchasable
attribute to False
for all nodes apart from those in starting_mols
, and then computed the has_solution
attribute using run_message_passing
? Would that not give us what we want? Presumably this would give an affirmative answer even if a route exists that uses a subset of starting_mols
, so a related question would be whether we only care about routes that use exactly that set of leaves instead of its subset?
(Even more generally, I wonder how useful this graph-based formulation of this metric is: if a graph contains a combinatorial number of routes, then stating that there exists a route with the right set of leaves seems not so practically useful. But I guess it doesn't hurt to have a function to compute this.)
# Initialize to empty sets, except for nodes with purchasable mols | ||
node_to_mols: dict[ANDOR_NODE, set[Molecule]] = {n: set() for n in graph.nodes()} | ||
for n in graph.nodes(): | ||
if isinstance(n, OrNode): | ||
node_to_mols[n].add(n.mol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "purchasable", but is this reflected in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch: I used to check whether a molecule is purchasable but removed it to make the method more general (i.e. the nodes marked purchasable in the graph don't need to be the starting molecules). I will fix this by changing the comment.
You are right the I thought a bit about how to do this but could not think of a better way than what I coded: my impression is that a dynamic programming solution is not possible here because it there is no clear way to divide the problem into independent subproblems. |
I converted this to a draft PR since some experiments I am running are very slow. I might change the implementation. |
…es are solvable at all).
Just added some extra pruning code to increase the efficiency. I updated the main PR description to reflect this. |
I think I will close this PR because:
We can re-visit and re-open this PR later if we want to add this function. |
First fix: error message for converting an AND/OR graph to a synthesis route would say there was more than one node if in fact there were zero nodes. Second fix: reaction string in `conftest.py`. This was also fixed in #34, but since I am not merging that I decided to fix it here.
In the paper FusionRetro they proposed checking whether the starting molecules for a route found during multi-step planning match the starting materials used in an actual synthesis route. This is essentially a less strict version of checking whether an exact ground-truth route is found. This PR implements code to check for the existence of a route with a specific set of starting materials.
In FusionRetro this metric is called "set-wise exact match", however this name is very vague so I instead called it
is_route_with_starting_mols
. I only wrote a version for AND/OR graphs because these typically contain a prohibitively large number of routes to explicitly enumerate them. The method is essentially a brute-force search with pruning of search branches. The inner workings of the methods are described reasonably-well in the docstrings (but I want to improve the description before merging)I also fixed a random typo in a
conftest.py
docstring that I found while writing tests.TODO: