I am trying to run the ResNet-32 model on the CIFAR-100 dataset. The only change I made to "Deep_Residual_Learning_CIFAR-10.py" is the training data, but it fails with the following error:
Starting training...
Traceback (most recent call last):
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
RuntimeError: error getting worksize: CUDNN_STATUS_BAD_PARAM
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "resnet.py", line 390, in <module>
    main(**kwargs)
  File "resnet.py", line 319, in main
    train_err += train_fn(inputs, targets)
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/theano/compile/function_module.py", line 917, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
RuntimeError: error getting worksize: CUDNN_STATUS_BAD_PARAM
Apply node that caused the error: GpuDnnConv{algo='small', inplace=True, num_groups=1}(GpuContiguous.0, GpuContiguous.0, GpuAllocEmpty{dtype='float32', context_name=None}.0, GpuDnnConvDesc{border_mode='half', subsample=(1, 1), dilation=(1, 1), conv_mode='cross', precision='float32', num_groups=1}.0, Constant{1.0}, Constant{0.0})
Toposort index: 399
Inputs types: [GpuArrayType<None>(float32, 4D), GpuArrayType<None>(float32, 4D), GpuArrayType<None>(float32, 4D), <theano.gof.type.CDataType object at 0x7fa464893a20>, Scalar(float32), Scalar(float32)]
Inputs shapes: [(128, 3, 32, 32), (16, 3, 3, 3), (128, 16, 32, 32), 'No shapes', (), ()]
Inputs strides: [(12288, 4096, 128, 4), (108, 36, 12, 4), (65536, 4096, 128, 4), 'No strides', (), ()]
Inputs values: ['not shown', 'not shown', 'not shown', <capsule object NULL at 0x7fa3997c61e0>, 1.0, 0.0]
Outputs clients: [[GpuElemwise{sub,no_inplace}(GpuDnnConv{algo='small', inplace=True, num_groups=1}.0, InplaceGpuDimShuffle{x,0,x,x}.0), GpuContiguous(GpuDnnConv{algo='small', inplace=True, num_groups=1}.0), GpuElemwise{sub,no_inplace}(GpuDnnConv{algo='small', inplace=True, num_groups=1}.0, GpuElemwise{Composite{(((i0 / i1) / i2) / i3)}}[]<gpuarray>.0)]]
Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "resnet.py", line 390, in <module>
    main(**kwargs)
  File "resnet.py", line 267, in main
    prediction = lasagne.layers.get_output(network)
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/lasagne/layers/helper.py", line 197, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/lasagne/layers/conv.py", line 352, in get_output_for
    conved = self.convolve(input, **kwargs)
  File "/home/changchen/anaconda3/lib/python3.6/site-packages/lasagne/layers/conv.py", line 650, in convolve
    **extra_kwargs)
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
It seems something went wrong in the convolution operation. Could you please give me some advice? Thanks a lot.
Does this also happen with the original CIFAR-10 code? Sometimes it helps to enforce a different cuDNN algorithm, or to let cuDNN choose one automatically, using:
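The exact flags suggested in this reply are not preserved in the thread. As a rough sketch of the general idea, and assuming the Theano config options `dnn.conv.algo_fwd`, `dnn.conv.algo_bwd_filter` and `dnn.conv.algo_bwd_data` are the ones meant, the algorithm choice could be overridden from Python before Theano is imported. The helper name `with_dnn_algo` is hypothetical and only used for illustration:

```python
import os

# Assumption: these three Theano config options control which cuDNN
# algorithm is used for the forward pass and the two backward passes.
DNN_ALGO_FLAGS = (
    "dnn.conv.algo_fwd",
    "dnn.conv.algo_bwd_filter",
    "dnn.conv.algo_bwd_data",
)


def with_dnn_algo(algo, existing=""):
    """Return a THEANO_FLAGS string with `algo` set for all three conv passes.

    Flags already present in `existing` are kept, except for the cuDNN
    algorithm flags, which are replaced.
    """
    kept = [
        flag for flag in existing.split(",")
        if flag and flag.split("=", 1)[0] not in DNN_ALGO_FLAGS
    ]
    overrides = ["%s=%s" % (name, algo) for name in DNN_ALGO_FLAGS]
    return ",".join(kept + overrides)


# This has to run before `import theano`, because Theano reads
# THEANO_FLAGS once at import time.
os.environ["THEANO_FLAGS"] = with_dnn_algo(
    "guess_once", os.environ.get("THEANO_FLAGS", "")
)
```

Here `guess_once` asks cuDNN to pick an algorithm heuristically instead of the `small` algorithm shown in the failing `GpuDnnConv` node; `time_once` would be another value to try. The same string can be passed on the command line through the `THEANO_FLAGS` environment variable when launching `resnet.py`.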