-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix invalid mode changes during tests #2511
Conversation
Similarly, FloatGPU and DoubleGPU are replaced by a new type GPUDevice<T>.
338447f
to
6cedd62
Compare
…Test and GPUStochasticPoolingLayerTest
These new classes can be used to implement test cases that are only running on the GPU or the CPU. The goal is to move all calls to Caffe::set_mode() inside the test framework, to discourage any test to change the mode halfway through the execution, which is documented to be illegal.
6cedd62
to
68133e7
Compare
This looks good to me. Independent of deciding what's right for mode and device in general, templated tests seem like a more robust approach to making sure the mode is right. Exactly what to do with mode and device is an ongoing conversation, but I think diffusing mode + device to I don't know that we ever fully converged on this however. @longjon @jeffdonahue comment if you remember any threads. |
LGTM too. Not sure what we'll end up doing with |
Fix invalid mode changes during tests
Some existing tests are modifying the Caffe mode halfway through the execution, this is documented to be invalid:
https://github.com/BVLC/caffe/blob/8df472/include/caffe/common.hpp#L140-L143
If, for performance reasons, host memory is allocated through
cudaMallocHost
, changing the mode halfway can cause a pointer returned bycudaMallocHost
to be freed byfree(2)
, resulting in undefined behavior. The reciprocal is also possible. Another possible issue is that if some tests incorrectly assume that the default mode is CPU, the test could actually run on the GPU if the previous test clobbered the global mode. See the full analysis of this issue in #2398The solution is, IMHO, to forbid calls to
Caffe::set_mode()
in individual test cases, this function should only be called by the test framework in order to limit the risks of a misuse. To achieve this, the following patch set reuses the existingMultiDeviceTest
class and similarly add new classesGPUDeviceTest
andCPUDeviceTest
. In the case where we need to share code between CPU and GPU tests, the shared test code can directly derive from classMultiDeviceTest
but derived classes needs to be defined for CPU and GPU.