Adding supporting code for blog post #9
base: main
Conversation
function layer = fourierLayer(tWidth,numModes,args)

arguments
    tWidth
    numModes
    args.Name = ""
end
name = args.Name;

net = dlnetwork;

layers = [
    identityLayer(Name="in")
    spectralConvolution1dLayer(tWidth,numModes,Name="specConv")
I would prefer the arguments here to be the other way around, fourierLayer(numModes, tWidth) and spectralConvolution1dLayer(numModes, tWidth). I think that has a better analogy to convolution1dLayer(filterSize, numFilters).
The numFilters and tWidth are analogous in that they both control the hidden size of a linear operation involving the weights. The numModes and filterSize are analogous in that they both control the size/resolution of the convolution filter in some sense, and both can have N entries for an N-d conv or spectral conv.
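For comparison, a sketch of the call pattern being suggested (fourierLayer and spectralConvolution1dLayer are the custom functions in this PR; the variable names are illustrative):

convLayer = convolution1dLayer(filterSize, numFilters);     % built-in: filter size first, then hidden size
specLayer = spectralConvolution1dLayer(numModes, tWidth);   % proposed: modes first, then hidden size
fnoLayer  = fourierLayer(numModes, tWidth);                 % proposed: same ordering for the wrapper layer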
+1 to Ben's comment about the ordering of the inputs!
Changed it to (numModes, tWidth) in the most recent commit. By the way, I got this ordering from the doc example: https://www.mathworks.com/help/deeplearning/ug/solve-pde-using-fourier-neural-operator.html -- should it be changed there too?
classdef spectralConvolution1dLayer < nnet.layer.Layer ...
I expect we could use the same version here and in the 1d FNO example - maybe we should consider a directory of shared tools for the examples.
That's something we'll have to look into in the future, I think, rather than something to try to deal with in this PR.
This is awesome, Mae!
I still need to review the other MLX files, but I wanted to post my initial comments first.
I find it frustrating working with MLX files on GitHub because no one can see the code. This also makes it hard to review.
My personal preference is to work locally in the MLX, then generate a Markdown version which is added as a README, a plain M-code version so that the raw code is readable, and (optionally) the MLX file itself.
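For reference, a sketch of that workflow using the Live Editor export function; the file names are placeholders, and Markdown export requires a reasonably recent MATLAB release:

% Generate a README for GitHub from the live script (placeholder file names).
export("hamiltonianNN.mlx", "README.md", Format="markdown");
% A plain M-code copy can also be produced with Save As > .m in the Live Editor.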
- $\omega_0$ is undefined in the intro.
- There's something wrong with this sentence: "Using automatic differentiation, the enforces uses a custom loss function".
- There is a comment that once the network is trained, predictions are made by numerically approximating Hamilton's equations. Is that true? The AD derivatives are exact. The value of the derivative is a numerical approximation, in the same way that computing, say, sin(2) in MATLAB is a numerical approximation -- but we don't often refer to the latter as an approximation.
- Instead of transposing the inputs, just flip the labels. E.g. instead of inputs = dlarray([q'; p'],"BC"); we can write inputs = dlarray([q p],"CB");.
- In 25a, and possibly earlier releases, it should be possible for the network to be unformatted -- i.e. to not have any labels on the data at all. I propose we do this. Let me know if I can help.
- Use the name modelLoss over modelGradients -- since the primary output of the function is the loss.
- "The original ode45 can't use dlarray..." -- it can! What's not supported are formatted arrays -- all the more reason to use an unformatted network.
- Prefer camel case over underscores, e.g. dqLoss or lossDq instead of loss_dq.
- I don't like the function name dlderivative because it's not functioning as a deep-learning-specific variant of a derivative function, in the same way that dlconv is a DL variant of conv. Also, the function isn't really computing a derivative: it's taking a Hamiltonian $H$ and a set of $q$-$p$ inputs and returning $H_p$ and $H_q$. Looking at the code, it's almost always the case that $H_q$ isn't really what's needed -- instead it's $\dot{p} = -H_q$. So I think we should call the function something like computeDerivativesFromHamiltonian and make the function return $\dot{p}$ and $\dot{q}$ (a sketch of one possible shape follows this list). Since "compute" is kind of a redundant word, we could even go with hamiltonianDerivatives or stateDerivatives?
- Instead of predmodel, how about plain old model?
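As a minimal sketch of what the renamed helper might look like: this assumes an unformatted dlnetwork whose input is a 2-by-numObservations array with q in row 1 and p in row 2, uses dlgradient rather than dljacobian, and must be called inside dlfeval; it is illustrative, not the exact code in this PR.

function [qdot, pdot] = hamiltonianDerivatives(net, qp)
    % qp is an unformatted, traced dlarray of size 2-by-numObservations
    % (assumed layout: q in row 1, p in row 2).
    H = forward(net, qp);    % 1-by-numObservations Hamiltonian values

    % Each H(i) depends only on qp(:,i), so the gradient of the sum gives
    % per-observation partials [dH/dq; dH/dp]. Higher derivatives are
    % enabled so the result can feed into a training loss.
    dH = dlgradient(sum(H, "all"), qp, EnableHigherDerivatives=true);

    qdot =  dH(2, :);        % dq/dt =  dH/dp
    pdot = -dH(1, :);        % dp/dt = -dH/dq
end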
Hi Conor, thank you so much; I made a bunch of these changes.
Regarding the formatting: I removed the formatting (I pushed these changes, hopefully you can see them?), but now I have a bunch of transposes all over the place, so I'm not sure if I did this correctly. For example, in the function hamiltonianDerivatives (was dlderivative), I used to have

dH = dljacobian(H,qp,1);

Now I have

dH = dljacobian(H,qp,2);

and I need to do

dHdq = dH(:,:,1)';
dHdp = dH(:,:,2)';

to compensate. I wasn't expecting them to become tensors; maybe I should flatten them or something?
Further, when I go to make predictions by solving Hamilton's equations, I now need to transpose the x, or else I get a "was expecting input of size 2" error:

f = @(t,x) accOde(dlarray(t),dlarray(x'),net);
Regarding "The original ode45 can't use dlarray..." -- I copied and pasted that from our HNN example on our Github: https://github.com/matlab-deep-learning/SciML-and-Physics-Informed-Machine-Learning-Examples/tree/main/hamiltonian-neural-network, which was created with 22b so it's good you are helping me update the code for new releases!
Do I still need to have the lines

accOde = dlaccelerate(@model);
f = @(t,x) accOde(dlarray(t),dlarray(x'),net);

before calling ode45? I tried f = @(t,x) accOde(t,x',net); and while this worked, it was much slower than converting them to dlarrays first in the function handle.
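For reference, a sketch of the accelerated setup being discussed; the model signature and whether x needs transposing depend on the network's input layout, so treat this as illustrative rather than the exact code in the PR:

accOde = dlaccelerate(@model);                                % cache the traced forward pass
f = @(t,x) extractdata(accOde(dlarray(t), dlarray(x), net));  % convert to dlarray per call, return plain doubles
[tPred, xPred] = ode45(f, tSpan, x0);                         % tSpan and x0 as defined in the script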
The most recent commits add the README.md and .m files for each corresponding live script, as suggested.
Is it absolutely necessary to include this MAT-file (and the others) in the repo? Can the data be created on-the-fly?
We should avoid including data with the repo -- unless it is absolutely necessary for some reason. The last time I looked into this, every time the repo is versioned git records a copy of binary files. This causes the repo to become exorbitantly large and leads to a painful clone experience for users. I think this is particularly important for the SciML repo which houses lots of examples. If every example stored data or networks, the repo would quickly balloon in size and become unusable.
The best approach is for binaries -- whether it is data or saved networks -- to be stored on supportfiles. You can see how we've done that in other repos with large binaries: https://github.com/matlab-deep-learning/resnet-50/blob/317ea186da1cb3d429d53559fed7386548ff8e47/assembleResNet50.m#L18 .
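For illustration, a sketch of that pattern; the file name and URL below are placeholders, not a real support-files location:

dataFile = "pendulumData.mat";                                           % hypothetical file name
dataURL  = "https://ssd.mathworks.com/supportfiles/<path>/" + dataFile;  % placeholder URL
if ~isfile(dataFile)
    websave(dataFile, dataURL);    % download once, then reuse the local copy
end
load(dataFile)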
I requested support files be made. Until then, I have added the files which generate the data. In my live scripts, I add a flag generateData = 0 (data is already generated, don't generate the data) or generateData = 1 (data is not generated, generate the data). However, I noticed that the value of this flag being 0 or 1 changed the results of the training, even though I had rng(0) at the top of my scripts. This issue was only resolved when I added rng(0) before each function that could add randomness, e.g. before calling dlnetwork, before calling shuffle(mbq), before calling trainnet...
I am wondering if this is expected behavior? I have not come across this before. I would have thought that setting rng(0) at the top of the script would be sufficient.
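For reference, a sketch of the per-call reseeding workaround described above (the calls are illustrative and follow the names used in this thread):

rng(0); net = dlnetwork(layers);                        % seed before weight initialization
rng(0); shuffle(mbq);                                   % seed before shuffling the minibatchqueue
rng(0); net = trainnet(data, net, lossFcn, options);    % seed before training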
…g m files and md files for each subdirectory
… add startup script
Supporting code for the blog post Physics-Informed Machine Learning: Methods and Implementation