Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update mdf_to_pytorch to current spec #10

Open
pgleeson opened this issue Apr 7, 2021 · 7 comments
Open

Update mdf_to_pytorch to current spec #10

pgleeson opened this issue Apr 7, 2021 · 7 comments

Comments

@pgleeson
Copy link
Member

pgleeson commented Apr 7, 2021

@patrickstock The current example of mdf json you use is different from the format in the current python api,

https://github.com/ModECI/MDFTests/blob/main/mdf_to_pytorch/example_mdfs/mlp_classifier/mlp_classifier.json

Some differences:

  • it uses lists for functions. Can you update those to use dicts?
  • A "value" attribute is required on the output nodes
  • reciever_port -> receiver_port

Ignoring for now the loading of the weights from h5 for params, it should be possible to load this and export to other formats already, e.g. graphviz with these changes, e.g. using python -m modeci_mdf.export.graphviz mlp_classifier.json 2 with the latest main of mdf (using a hacked version of your file):

Screenshot 2021-04-07 at 19 32 08

patrickstock added a commit that referenced this issue Apr 7, 2021
@patrickstock
Copy link
Contributor

@pgleeson I fixed and pushed item 2 and 3, but I thought we had agreed on lists to contain a node's functions so that an order is implicitly specified by their order in the list?

@pgleeson
Copy link
Member Author

pgleeson commented Apr 8, 2021

@patrickstock This was under discussion here: ModECI/MDF#18, but @kmantel seemed to be coming round to using dictionaries for these. The main point was that putting the functions in a certain order is not a guarantee that that is sufficient to work out the order of evaluation.

As a general point though it might be best to default to the usage/naming/conventions in the Python API if at all possible and wait until any differences with the written spec are hammered out in the issues: https://github.com/ModECI/MDF/labels/specification. Will make it easier to test compatibility of implementations across codebases.

@patrickstock
Copy link
Contributor

@pgleeson Not using lists works for me. Although if we want to unambiguously specify the order of execution of multiple functions in a single node, it seems like a "WhenFinished" condition would be necessary or some other similar control flow. My concern though is that we have only defined NodeSpecific and termination conditions in our docs, and I don't know where the conditions to specify the execution of intra-node functions would go, without breaking these out to being only single-function nodes themselves?

It seems to me we would need to add a category like "IntraNode" conditions, and then use dot-notion naming for these such as:
"Node.Function2":{"type": "WhenFinished", "kwargs":{"dependency":"Node.Function1"}}

If you agree with this I will refactor. If we want to postpone this idea, I can break everything out to single-function nodes for the purpose of this example.

@patrickstock
Copy link
Contributor

Just to clarify, I see how multiple functions to a given node work here: https://github.com/ModECI/MDF/blob/main/examples/Simple.json
Where the output port only specifies returning logistic_1, but my concern is if more than one function were to be run on a given node pass, how to specify that.

@kmantel
Copy link
Contributor

kmantel commented Apr 13, 2021

For Simple.json, the lines

for linear_1:
"variable0": "input_port1",

for logistic_1:
"variable0": "linear_1",

give linear then logistic. This seems better than what I expected in ModECI/MDF#18, but I don't see how this is any more specific in terms of ordering than just ordering in a list. I think if (in the old version) the spec was roughly

[
    "custom_1": "3 * linear_1()",    
    "linear_1": {...},
]

then this is just an invalid model, rather than an ambiguous but valid model.

Either way, It seems fair to require any intra-node functions to execute in an explicit order, once each, specified using this variable0 key, and that if any more complex patterns are needed then the functions should belong to separate nodes.

@pgleeson
Copy link
Member Author

simple_example gv

The above is the graphical representation of the https://github.com/ModECI/MDF/blob/main/examples/Simple.yaml example.

Clearly there is an order in which variables inside the node need to be calculated: input_port1 -> linear_1 -> logistic_1 -> output_1. Even if the functions were mixed up in the yaml file, there would still be only one order in which a program can and should execute these (determined by the values of the attributes, i.e. linear_1 is neded by logistic_1, so evaluate linear_1 before it).

The main question is: If there is enough information in the set of function definitions to unambiguously determine the proper order, should we get the user to annotate/order them to make it easier for programs to execute them?

My personal preference is no; keep the specification files simple, and throw an error if it can't be determined form the values of the arguments. Helper methods in the API can be used for determining the order (just needs to be worked out once on model loading) and this can be used every time the node needs to be evaluated.

In this first version of MDF I would see that the full sequence of evaluations from inputs to functions to outputs in a node is always run to completion, so there should be no issue of anything else in the graph influencing whether a function is run or the order.

@pgleeson
Copy link
Member Author

FYI 1: I've added functionality to the simple scheduler here: https://github.com/ModECI/MDF/blob/cdae3be54e05253400c488758f23d0eb3989373a/src/modeci_mdf/simple_scheduler.py#L132-L157 to work out the proper sequence of function calls in a node. The functions can be written in the Graph in any order, but the EvaluableGraph that's returned here has the EvaluableFunctions in the correct order.

FYI 2: @patrickstock Even without incorporating these changes for functions, could you ensure the mlp_test.py example is working with your code and updated mdf script, and open a new PR? I can't get it running here, and would like to test running an equivalent version of the MLP in MDF built via the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants