Interaction transformation method and Operon. #39
Comments
Hi, if I understand correctly, the question is how to implement autodiff for your custom function.
This requires that all intermediate results are stored in the primal to be used later by the traces. In your case, I would suggest implementing your own Interpreter instead. You could use dual numbers directly, and then you don't need multiple passes. You can probably adapt this code to your needs: operon/test/source/operon_test.hpp Line 210 in 28ac41b
|
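As a rough illustration of the dual-number suggestion, here is a minimal sketch of forward-mode differentiation for a single IT term of the form w * sin(x1^p1 * x2^p2). The names Dual, Sin, PowConstBase and ItTermSin are hypothetical and are not taken from Operon; the choice of sin as the unary function and the restriction to positive inputs are assumptions made to keep the example short.

```cpp
#include <cassert>
#include <cmath>

// Minimal dual number: a value plus its derivative w.r.t. one chosen parameter.
// Hypothetical type for illustration; not Operon's implementation.
struct Dual {
    double val;
    double dot;
};

inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.dot + b.dot}; }
inline Dual operator*(Dual a, Dual b) { return {a.val * b.val, a.dot * b.val + a.val * b.dot}; }
inline Dual Sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.dot}; }

// x^p for a constant base x > 0 and a dual exponent p: d/dp x^p = x^p * ln(x).
inline Dual PowConstBase(double x, Dual p) {
    double v = std::pow(x, p.val);
    return {v, v * std::log(x) * p.dot};
}

// One IT term w * sin(x1^p1 * x2^p2). The result's dot is the derivative
// w.r.t. whichever parameter was seeded with dot = 1.
inline Dual ItTermSin(Dual w, Dual p1, Dual p2, double x1, double x2) {
    assert(x1 > 0.0 && x2 > 0.0); // assumption: log(x) must be defined
    return w * Sin(PowConstBase(x1, p1) * PowConstBase(x2, p2));
}
```

Seeding one parameter with dot = 1 and the others with dot = 0 gives that parameter's derivative in a single evaluation, for example `ItTermSin({2.0, 1.0}, {1.0, 0.0}, {2.0, 0.0}, 0.5, 1.5)` for the derivative with respect to w. This scalar version carries one derivative per evaluation; a dual type whose derivative part is a vector would carry all of them at once.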
I see. Having read the paper, I think what you are saying makes some sense. So in this case, should I store in the primal the result of each interaction (meaning Additionally, have you looked at this method of symbolic regression in the past? Do you think that, if I manage to make it work, the results will be worth it? |
@greenleaf641 as one of the authors of ITEA, I would say go for it :-D but I might be biased |
It would be fun, I think, since I like the idea. For now I'm just changing Operon's existing code; if it works well, I should figure out a way to add this as an additional option alongside Operon's traditional tree structure. |
Btw, you described the IT representation but pointed to the TIR paper. In TIR you would have something like:
and the expression would be
this would be a little bit more difficult to implement, but it often generates smaller and more accurate expressions. |
My mistake. No, I meant the IT method [https://arxiv.org/pdf/1902.03983]; I just linked the wrong paper somehow! |
Would I be correct in saying that in the case of IT, |
I think so, the derivatives for IT expressions are straightforward, like you said |
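For reference, one way to write these derivatives out, using the form from the original post. The symbols w_i, f_i, p_{ij} and z_i are notation introduced here for illustration, and treating the exponents as optimizable coefficients is an assumption:

```latex
\hat{y}(\mathbf{x}) = \sum_{i} w_i \, f_i(z_i),
\qquad
z_i = \prod_{j} x_j^{\,p_{ij}}

\frac{\partial \hat{y}}{\partial w_i} = f_i(z_i),
\qquad
\frac{\partial \hat{y}}{\partial p_{ij}} = w_i \, f_i'(z_i) \, z_i \, \ln x_j
\quad (x_j > 0)
```

Both derivatives depend only on quantities already computed when evaluating the expression (z_i and f_i(z_i)), plus f_i' evaluated at z_i.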
Your idea looks plausible, then you would only need the primal. I suggest starting with some small expressions and checking if the derivatives are correct, like we do in the unit test: https://github.com/heal-research/operon/blob/main/test/source/implementation/autodiff.cpp |
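A small check of this kind can be sketched as follows: compare an analytic derivative against a central finite difference on a tiny expression. This is not the code from Operon's unit test; Term, AnalyticDp1 and NumericDp1 are hypothetical names, and the term w * sin(x1^p1 * x2^p2) with positive inputs is an assumed example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical single IT term: w * sin(x1^p1 * x2^p2), inputs assumed positive.
inline double Term(double w, double p1, double p2, double x1, double x2) {
    return w * std::sin(std::pow(x1, p1) * std::pow(x2, p2));
}

// Analytic derivative w.r.t. p1: w * cos(z) * z * ln(x1), with z = x1^p1 * x2^p2.
inline double AnalyticDp1(double w, double p1, double p2, double x1, double x2) {
    assert(x1 > 0.0 && x2 > 0.0); // assumption: log(x1) must be defined
    double z = std::pow(x1, p1) * std::pow(x2, p2);
    return w * std::cos(z) * z * std::log(x1);
}

// Central finite difference w.r.t. p1, used as an independent reference.
inline double NumericDp1(double w, double p1, double p2, double x1, double x2,
                         double h = 1e-6) {
    return (Term(w, p1 + h, p2, x1, x2) - Term(w, p1 - h, p2, x1, x2)) / (2.0 * h);
}
```

If the two values disagree by more than a small tolerance on a handful of points, the analytic rule (or the way the primal is stored) is likely wrong.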
I think I can train now "somewhat correctly". I have a few questions though. Why does Operon pass arguments by value and not by reference on the mutation/crossover functions? Wouldn't it be more efficient to do the latter? By using this with the "Variations of vertical fluid" dataset as reference I get: Coefficient optimization happens with the scaled data so maybe this makes sense but just to be sure. |
Most of these operators work by performing some changes on a copy of the parent. Therefore, it makes sense to accept the arguments by value. One can also use
I am assuming that you are using the cli programs ( |
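On the pass-by-value point, a minimal sketch of the general C++ idiom (not Operon's actual operator signatures; Genotype and Mutate are hypothetical stand-ins): taking the parent by value gives the operator its own copy to modify, and a caller who no longer needs the parent can move it in to avoid the copy.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical genotype type for illustration; not Operon's Tree class.
using Genotype = std::vector<int>;

// By-value parameter: the function owns its copy, modifies it, and returns it.
inline Genotype Mutate(Genotype g) {
    if (!g.empty()) { g[0] += 1; }
    return g; // a by-value parameter is moved on return, not copied again
}

inline void Example() {
    Genotype parent{1, 2, 3};
    Genotype child = Mutate(parent);            // parent is copied and stays intact
    assert(parent[0] == 1 && child[0] == 2);
    Genotype next = Mutate(std::move(child));   // child's buffer is reused, no copy
    assert(next[0] == 3);
}
```

With a const reference parameter, the function would have to make the same copy internally before modifying it, so by-value costs nothing extra in the copying case and saves a copy in the moving case.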
Correct, the cli nsgp program. But given that I've edited some things to work with IT, it could be a problem on my end. I need to investigate further, since the returned function is not correct without the scaling factor. |
I am trying to see if it is possible to have Operon work not with a tree structure but by using the "Transformation Interaction representation". Briefly, this method aims to simplify the resulting functions by representing each solution not as a tree but as a weighted sum of unary functions, each applied to a product of the input variables raised to some powers. Something like:
weight1 * function1(x1^poly11 * x2^poly12 * ...) + weight2 * function2(x1^poly21 * x2^poly22 * ...) + ...
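To make the form above concrete, here is a minimal sketch of how such an expression could be stored and evaluated for one data row. ItTerm and EvaluateIt are hypothetical names chosen for illustration; they are not Operon types, and this is not the generator or interpreter code from the post.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical representation of one term: weight * unary(prod_j x_j^exponent_j).
struct ItTerm {
    double weight;                        // weight_i
    std::function<double(double)> unary;  // function_i (arity 1)
    std::vector<double> exponents;        // poly_i1, poly_i2, ... one per variable
};

// Evaluate sum_i weight_i * function_i(prod_j x_j^poly_ij) for one data row x.
inline double EvaluateIt(std::vector<ItTerm> const& terms, std::vector<double> const& x) {
    double result = 0.0;
    for (auto const& t : terms) {
        assert(t.exponents.size() == x.size());
        double interaction = 1.0;
        for (std::size_t j = 0; j < x.size(); ++j) {
            interaction *= std::pow(x[j], t.exponents[j]);
        }
        result += t.weight * t.unary(interaction);
    }
    return result;
}
```

For example, with terms 2 * id(x1^1 * x2^2) and 0.5 * sqrt(x1^2 * x2^0) and the row x = (3, 2), the two interactions are 12 and 9, so the expression evaluates to 2 * 12 + 0.5 * 3.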
I made a new generator that will only select functions with arity 1 and instead of a tree will generate a collection of genotypes that have the form described above:
I altered the Interpreter as well so it can evaluate the genotypes of this form; I added this function to the interpreter class:
So then the new ForwardPass can evaluate them like this:
but I'm not sure what to do when it comes to the ForwardTrace and ReverseTrace, which I assume will be necessary for optimizing the coefficients later on.
Why are we multiplying trace with dot, or trace with primal exactly, and do you have any idea on how these two functions should be altered to support transformation interaction instead?