Save/load from disk (serializing / marshalling) #163
After further research, including Thrift, Avro, Cap'n Proto and FlatBuffers, I've concluded that a binary serializer with a schema would be best:
Suggested by @sherjilozair, .npy file support would be great. Description, specs and C utilities here:
In the same vein, it will be useful to be able to load common files like:
NESM should also be considered; not sure it can be used for memory-mapping though: https://xomachine.github.io/NESM/
* Add Numpy .npy file reader (see #163)
* Add Numpy writer
* Add tests for .npy writer
* openFileStream is only on devel + enable numpy tests
* SomeFloat is only on devel as well
...
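With that reader and writer in place, round-tripping a single tensor through .npy looks roughly like the sketch below. This is a minimal sketch assuming the read_npy/write_npy procs from Arraymancer's numpy IO module; check tests/io/test_numpy.nim (linked further down) for the exact signatures in your version.

```nim
import arraymancer

# Minimal .npy round-trip sketch (assumes read_npy/write_npy from
# Arraymancer's numpy IO module; exact signatures may vary by version).
let t = [[1'f32, 2, 3],
         [4'f32, 5, 6]].toTensor
t.write_npy("example.npy")                  # write to the binary .npy format
let t2 = read_npy[float32]("example.npy")   # read it back with the same dtype
doAssert t == t2
```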
I'll try to find some time in the near future to allow loading of .hdf5 files using nimhdf5 (arraymancer is a dependency anyway). Do you know of any examples of neural networks stored in an .hdf5 file that I can use as a reference?
Lots of examples here: https://github.com/fchollet/deep-learning-models/releases/
Sweet, thanks!
This issue is still open and I am wondering: what would be the canonical way to save/load a neural network defined in Arraymancer in early 2020? HDF5, msgpack, …? Will there be an interface for serialization of NNs for deployment, or do I have to define my own structure? An example for noobs like me would be really helpful, but I will also try myself.
Currently there is no model-wide saving support. For individual tensors you can use Numpy or HDF5. See the tests for usage:
* HDF5: Arraymancer/tests/io/test_hdf5.nim, lines 46 to 105 (at commit 407cae4)
* Numpy: Arraymancer/tests/io/test_numpy.nim, lines 17 to 85 (at commit 407cae4)
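For HDF5 the shape is similar; a hedged sketch follows. The `name` parameter and the defaults below are assumptions based on the test file referenced above, and nimhdf5 must be installed for Arraymancer's HDF5 IO to compile.

```nim
import arraymancer

# Hedged HDF5 round-trip sketch; parameter names are assumptions,
# see tests/io/test_hdf5.nim for the authoritative usage.
let w = randomTensor(3, 4, 1.0'f32)
w.write_hdf5("weights.h5", name = "layer1/weight")
let w2 = read_hdf5[float32]("weights.h5", name = "layer1/weight")
doAssert w == w2
```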
The best way forward would be to implement a serializer to HDF5 that can (de)serialize any type made only of tensors, including support for nested types. I didn't work on the NN part of Arraymancer during 2019 because I'm revamping the low-level routines in Laser and will adopt a compiler approach: https://github.com/numforge/laser/tree/master/laser/lux_compiler. This is to avoid maintaining a code path for CPU, Cuda, OpenCL, Metal, ... I'm also working on Nim core multithreading routines to provide an efficient, lightweight and composable foundation to build multithreaded programs like Arraymancer on top of, as I've had (and still have) multiple OpenMP issues (see: https://github.com/mratsim/weave). And as I have a full-time job as well, I'm really short on time to tackle issues that require careful design and usability tradeoffs.
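As a rough illustration of that idea (not Arraymancer API, just a sketch of "walk any type made only of tensors"), Nim's fieldPairs can flatten nested types into a name-to-tensor table, which any backend (HDF5, one .npy per entry, msgpack, ...) could then persist. All names below are hypothetical.

```nim
import std/tables
import arraymancer

# Recursively walk an object's fields at compile time and collect every
# Tensor[float32] into a table keyed by its dotted path.
proc collectTensors[T: object](obj: T, prefix: string,
                               dest: var Table[string, Tensor[float32]]) =
  for name, field in obj.fieldPairs:
    when field is Tensor[float32]:
      dest[prefix & name] = field
    elif field is object:
      collectTensors(field, prefix & name & ".", dest)

type
  Linear = object
    weight, bias: Tensor[float32]
  TinyNet = object
    fc1, fc2: Linear

var params = initTable[string, Tensor[float32]]()
let net = TinyNet(
  fc1: Linear(weight: zeros[float32](4, 3), bias: zeros[float32](4)),
  fc2: Linear(weight: zeros[float32](2, 4), bias: zeros[float32](2)))
collectTensors(net, "", params)
# params now holds "fc1.weight", "fc1.bias", "fc2.weight", "fc2.bias"
```

Deserialization would walk the fields the same way and assign each tensor back from the table.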
For practicality, it could make sense to provide this functionality in a limited manner (explained below) rather than waiting to come up with a perfect solution that covers all cases, especially considering that this is a very central feature and the issue has been open for almost three years. I would be interested in helping you with it. I think it is much more critical to save model parameters than network topology and context information, for example by providing a function similar to PyTorch's
The major problem with this approach is the way the
Please let me know if you would be interested in a solution like this; if yes, I would gladly take the issue and provide a more concrete design before moving on with an implementation.
@mratsim Forgot to mention: thanks for the great library, by the way.
Ran into this too, running the
Here's all that's needed for the handwritten digits example. Define helpers for Conv2DLayer and LinearLayer:

```nim
import arraymancer, streams, msgpack4nim

proc pack_type*[ByteStream](s: ByteStream, layer: Conv2DLayer[Tensor[float32]]) =
  let weight: Tensor[float32] = layer.weight.value
  let bias: Tensor[float32] = layer.bias.value
  s.pack(weight) # let the compiler decide
  s.pack(bias)   # let the compiler decide

proc unpack_type*[ByteStream](s: ByteStream, layer: var Conv2DLayer[Tensor[float32]]) =
  s.unpack(layer.weight.value)
  s.unpack(layer.bias.value)

proc pack_type*[ByteStream](s: ByteStream, layer: LinearLayer[Tensor[float32]]) =
  let weight: Tensor[float32] = layer.weight.value
  let bias: Tensor[float32] = layer.bias.value
  s.pack(weight) # let the compiler decide
  s.pack(bias)   # let the compiler decide

proc unpack_type*[ByteStream](s: ByteStream, layer: var LinearLayer[Tensor[float32]]) =
  s.unpack(layer.weight.value)
  s.unpack(layer.bias.value)

proc loadData*[T](data: var T, fl: string) =
  var ss = newFileStream(fl, fmRead)
  if not ss.isNil():
    ss.unpack(data)
    ss.close()
  else:
    raise newException(ValueError, "no such file?")

proc saveData*[T](data: T, fl: string) =
  var ss = newFileStream(fl, fmWrite)
  if not ss.isNil():
    ss.pack(data)
    ss.close()
```

Then calling:

```nim
var model = ctx.init(DemoNet)
# ... train model ...
model.saveData("test_model.mpack")

## restart model
model.loadData("test_model.mpack")
## continues at last training accuracy
```
A note on the above: MsgPack does pretty well in size compared to pure JSON. The exported msgpack file from above is ~22 MB (or 16 MB when bzipped); converted to JSON it results in an 87 MB file (33 MB when bzipped). Not sure how HDF5 or npy would compare. Probably similar, unless the Tensor type was converted from float32s or some other optimizations occur.
I'm running into what looks to be incomplete saving of a trained model. Saving a fully trained
The
Originally I thought those must not have state and therefore would not need to be stored. But now, with the serialize/deserialize not working as intended, I am not sure. Is there any other state that I would need to ensure is saved to fully serialize a model and deserialize it? Perhaps the deserializing isn't re-packing all the correct fields? Here are the "custom" type overrides for the serialized layers, for reference:

```nim
import arraymancer, streams, msgpack4nim

proc pack_type*[ByteStream](s: ByteStream, layer: Conv2DLayer[Tensor[float32]]) =
  let weight: Tensor[float32] = layer.weight.value
  let bias: Tensor[float32] = layer.bias.value
  s.pack(weight) # let the compiler decide
  s.pack(bias)   # let the compiler decide

proc unpack_type*[ByteStream](s: ByteStream, layer: var Conv2DLayer[Tensor[float32]]) =
  s.unpack(layer.weight.value)
  s.unpack(layer.bias.value)

proc pack_type*[ByteStream](s: ByteStream, layer: LinearLayer[Tensor[float32]]) =
  let weight: Tensor[float32] = layer.weight.value
  let bias: Tensor[float32] = layer.bias.value
  s.pack(weight) # let the compiler decide
  s.pack(bias)   # let the compiler decide

proc unpack_type*[ByteStream](s: ByteStream, layer: var LinearLayer[Tensor[float32]]) =
  s.unpack(layer.weight.value)
  s.unpack(layer.bias.value)
```
AFAIK you're doing the correct thing for weights/bias:
* Arraymancer/src/arraymancer/nn_dsl/dsl_types.nim, lines 63 to 81 (at commit 88edbb6)

For the others, I don't store the shape metadata in the layers (it is transformed away at compile time):
* Arraymancer/src/arraymancer/nn_dsl/dsl_types.nim, lines 43 to 57 (at commit 88edbb6)

but I probably should, to ease serialization.
Ok, thanks, that's good to know the weights/biases seem correct. There's a good chance I am missing a part of the serialization or messing up the prediction. All of the re-serialized Tensor values appear to be correct. One last question: is there anything special for the

```nim
var model = ctx.init(DemoNet)
...
model.loadData(model_file_path)
```

The
Eventually that would be nice. I currently am just redefining the model code, which works for my use case.
* Arraymancer/src/arraymancer/autograd/autograd_common.nim, lines 59 to 70 (at commit 1a2422a)

The
* Arraymancer/src/arraymancer/autograd/autograd_common.nim, lines 44 to 57 (at commit 1a2422a)

and then, as we pass through layers, a record of the layers applied is appended to
The
The
The
Thanks, I tried reading dsl.nim but got lost on where things were defined. Based on those code snippets, the only place that I'm not sure is set up correctly is
If I understand it correctly, the
I'm only saving the
edit: Looking through this more, it doesn't appear so. I think the model is being saved/restored correctly. It may be a bug in how I'm ordering my tests when doing predictions. The
Probably not terribly useful long-term, but for rough purposes you might try https://github.com/disruptek/frosty. It's kinda designed for "I know what I'm doing" hacks and it could help your differential diagnosis. I used msgpack4nim but wanted more of a fire-and-forget solution that I could trust.
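In case it helps, basic frosty usage is roughly the following. The module path and the freeze/thaw argument order are assumptions from memory and may differ between frosty versions, so check its README before relying on this.

```nim
import std/streams
import frosty/streams   # assumption: module layout differs across frosty versions

type Checkpoint = object
  note: string
  weights: seq[float32]

# Serialize a plain object to disk ...
var w = newFileStream("model.frosty", fmWrite)
freeze(w, Checkpoint(note: "demo", weights: @[0.1'f32, 0.2, 0.3]))
w.close()

# ... and read it back.
var restored: Checkpoint
var r = newFileStream("model.frosty", fmRead)
thaw(r, restored)
r.close()
```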
Any Update? |
I'm so at a loss for saving and loading models... Respectfully, how are we supposed to use arraymancer for deep learning without being able to do this?
Things I learned from trying to solve this problem all day; hope it helps someone. In order to save/load the weights and biases of your model, you'll first need to define these manually.

Working test example:

```nim
template weightInit(shape: varargs[int], init_kind: untyped): Variable = ...
proc newExampleNetwork(ctx: Context[Tensor[float32]]): ExampleNetwork = ...
proc forward(network: ExampleNetwork, x: Variable): Variable = ...
```

Then, you'll need to create your save/load procs. I'll save you the headache here as well: use numpy files. Long story short, forget about hdf5, and the others aren't as efficient.

Working test example:

```nim
proc load(ctx: Context[Tensor[float32]]): ExampleNetwork = ...
```

At some point in the future I'll work on getting the
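To make the approach concrete, here is a hypothetical sketch of numpy-backed save/load procs for a manually defined network. Everything below (the ExampleNetwork fields, the file names, the directory handling) is an assumption, not the original poster's code; adapt it to however you defined your own network.

```nim
import arraymancer

# Hypothetical sketch only: each parameter is persisted as its own .npy file
# and re-attached to the autograd Context on load.
type ExampleNetwork = object
  hidden_w, hidden_b: Variable[Tensor[float32]]
  output_w, output_b: Variable[Tensor[float32]]

proc save(net: ExampleNetwork, dir: string) =
  net.hidden_w.value.write_npy(dir & "/hidden_w.npy")
  net.hidden_b.value.write_npy(dir & "/hidden_b.npy")
  net.output_w.value.write_npy(dir & "/output_w.npy")
  net.output_b.value.write_npy(dir & "/output_b.npy")

proc load(ctx: Context[Tensor[float32]], dir: string): ExampleNetwork =
  # ctx.variable(..., requires_grad = true) re-registers each restored tensor
  # with the autograd context so training can resume afterwards.
  result.hidden_w = ctx.variable(read_npy[float32](dir & "/hidden_w.npy"), requires_grad = true)
  result.hidden_b = ctx.variable(read_npy[float32](dir & "/hidden_b.npy"), requires_grad = true)
  result.output_w = ctx.variable(read_npy[float32](dir & "/output_w.npy"), requires_grad = true)
  result.output_b = ctx.variable(read_npy[float32](dir & "/output_b.npy"), requires_grad = true)
```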
Format to be defined:
* Non-binary (will certainly have size issues)
* Binary