As a user coming to this repo, you are struck by the awesome speedup compared to Flux, but you are still left wondering what "small" really means. The network in the example is indeed small. Is there some limit on the number of parameters and/or the depth/width of a network that we can state?
For example, a 5 million parameter fully connected feed-forward network sees no speed gains while a 5000 parameter network does. I'm making up numbers here, but I hope you get the gist.
I know it's hard, but I think guidelines like this would help new users evaluate whether they should use SimpleChains or Flux for their problem.
> For example, a 5 million parameter fully connected feed-forward network sees no speed gains while a 5000 parameter network does. I'm making up numbers here, but I hope you get the gist.
The largest model I ever tried was LeNet on MNIST. At just over 44k parameters, it is a far cry from your 5 million, so I have no idea how it'd perform for a model that large.
Benchmark results for MNIST were shared here: https://julialang.org/blog/2022/04/simple-chains/
At that size, it still did substantially better than the competition on the CPU and was still competitive with Flux + a very beefy GPU.
But with 5 million parameters, you're almost certainly better off on a GPU, which isn't supported by SimpleChains at the moment.
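To put the two sizes mentioned in this thread side by side, here is a rough parameter-count sketch in plain Python. It assumes the classic LeNet-5 layout on 28x28 MNIST images (two unpadded 5x5 convolutions with 6 and 16 filters, each followed by 2x2 pooling, then dense layers of 120, 84, and 10 units); the thread does not spell out the architecture, so treat the layer sizes as an assumption. The wide MLP is a made-up example chosen only to land near 5 million parameters.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights plus one bias per filter for a square convolution."""
    return in_channels * kernel * kernel * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a fully connected layer."""
    return n_in * n_out + n_out


def mlp_params(sizes):
    """Total parameters of a fully connected network with the given layer widths."""
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


# Assumed LeNet-5 layout for 28x28 inputs:
# 28 -> conv5 -> 24 -> pool2 -> 12 -> conv5 -> 8 -> pool2 -> 4, so 4*4*16 = 256 features.
lenet_layers = [
    ("conv1", conv_params(1, 6, 5)),
    ("conv2", conv_params(6, 16, 5)),
    ("dense1", dense_params(4 * 4 * 16, 120)),
    ("dense2", dense_params(120, 84)),
    ("dense3", dense_params(84, 10)),
]
lenet_total = sum(n for _, n in lenet_layers)

# Hypothetical wide MLP, sized only to illustrate the "5 million" scale.
big_mlp_total = mlp_params([1000, 2000, 1500, 10])

if __name__ == "__main__":
    for name, n in lenet_layers:
        print(f"{name:>6}: {n}")
    print(f"LeNet total: {lenet_total}")
    print(f"Wide MLP total: {big_mlp_total}")
    print(f"Ratio: {big_mlp_total / lenet_total:.0f}x")
```

Under these assumptions the LeNet count comes out just over 44k, which matches the figure above, while the hypothetical MLP is roughly two orders of magnitude larger. That gap is the range for which no benchmark has been reported in this thread.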