Hitting 2GB protobuf limit in sparse transforms #177
I looked into this a bit, but there's not an obvious solution. The issue is that during the Keras model construction it's building a symbolic graph, which happens in TensorFlow's graph mode (which has the 2GB limit). The bottleneck in #160 occurs in a different (eager) part of the process, which is why we could resolve that one by switching to eager mode.

You could work around this by manually splitting up your matrix into smaller pieces. Something like:

```python
weimat = wattsstrogatz_adjacencies(n_neurons)

n_split = 5
split_neurons = n_neurons // n_split
for i in range(n_split):
    # take the i-th block of rows from the full weight matrix
    split_weimat = weimat[i * split_neurons : (i + 1) * split_neurons]
    if sparse:
        transform = nengo.transforms.Sparse(
            (split_neurons, n_neurons),
            init=split_weimat,
        )
    else:
        transform = split_weimat.toarray()

    # connect all the neurons to just the i-th block of neurons
    nengo.Connection(
        ens.neurons,
        ens.neurons[i * split_neurons : (i + 1) * split_neurons],
        transform=transform,
        synapse=0.1,
    )
```

(Caveat: I haven't tested this thoroughly.) Note that when you do this you will also need to disable the operator merging. You will pay some performance penalty when splitting up connections like that, but hopefully not too bad. There isn't much else we can do until TensorFlow does something about that underlying 2GB limit, I don't think.
That's a good idea. It will work for me. I think the penalty will come at build time, not run time, and I can live with that.

I have read a bunch of Stack Overflow threads and developer forums about this protobuf limit, and I still don't understand why this data structure is being used by TF. What are the advantages? Why aren't they outweighed by the obvious disadvantage? Are there no 2GB constants used in machine learning? That's not a Nengo DL thing, so we don't have to answer it here, but any insight from a neuromorphic-getting-into-ML perspective would be appreciated.

For Nengo DL, all I would suggest is putting that allocation within a try/except to give the user more information. Thanks
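A rough sketch of the kind of wrapper being suggested here, assuming the failure surfaces as a `ValueError` from `tf.constant`; the helper name and the message check are hypothetical, not part of NengoDL:

```python
import tensorflow as tf


def constant_with_hint(values, name=None):
    """Hypothetical wrapper around tf.constant that adds a hint about the
    protobuf limit when a constant is too large to embed in the graph."""
    try:
        return tf.constant(values, name=name)
    except ValueError as e:
        # Assumption: TensorFlow's error message mentions the 2GB limit.
        if "2GB" in str(e):
            raise ValueError(
                "This constant is too large to serialize into the TensorFlow "
                "graph (protobuf has a 2GB limit); consider splitting the "
                "transform/connection into smaller pieces."
            ) from e
        raise
```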
The plan is to add some documentation about the 2GB protobuf limit and possible workarounds as part of the same memory "tips and tricks" documentation discussed in #178. Going to close this so that we have a single place to track any updates there, but feel free to reopen if there is anything else that isn't addressed!
This is related to #160, which was solved, so I'm hoping the fix will be manageable. Before, it was a 10k x 10k `Dense` transform; now, it is a `Sparse` transform with 500M nonzeros (even at float32, 500M values alone come to roughly 2 GB). Is it possible to use the approach of #163 and apply it to these lines? nengo-dl/nengo_dl/op_builders.py, line 505 (at 9fb7854).
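For context, a minimal sketch of the kind of connection that triggers this; the sizes and the use of a random SciPy sparse matrix are placeholders for the actual Watts-Strogatz adjacency matrix mentioned above:

```python
import nengo
import scipy.sparse

# Placeholder sizes; in the real model the sparse weight matrix has ~500M
# nonzeros, which is what pushes the serialized graph past the 2GB limit.
n_neurons = 10000
weimat = scipy.sparse.random(n_neurons, n_neurons, density=0.01, format="csr")

with nengo.Network() as net:
    ens = nengo.Ensemble(n_neurons, 1)
    nengo.Connection(
        ens.neurons,
        ens.neurons,
        transform=nengo.transforms.Sparse((n_neurons, n_neurons), init=weimat),
        synapse=0.1,
    )
```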
Here is the command, the trace, and some debug lines (at the `ipdb>` prompt) pointing out the offending variable.

Environment:
- TF 2.3.0
- Nengo DL master
- Nengo 3.0.0 (PyPI)