Fix invalid mode changes during tests #2511

Merged
merged 13 commits into from
May 30, 2015

flx42
Contributor

@flx42 flx42 commented May 26, 2015

Some existing tests modify the Caffe mode halfway through execution; this is documented to be invalid:
https://github.com/BVLC/caffe/blob/8df472/include/caffe/common.hpp#L140-L143

If, for performance reasons, host memory is allocated through cudaMallocHost, changing the mode halfway can cause a pointer returned by cudaMallocHost to be freed by free(2), resulting in undefined behavior. The reverse is also possible. Another possible issue: if some tests incorrectly assume that the default mode is CPU, they could actually run on the GPU if the previous test clobbered the global mode. See the full analysis of this issue in #2398.
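To make the failure mode concrete, here is a minimal sketch of how a mode change between allocation and release picks the wrong deallocator. Everything in it (`Mode`, `Allocator`, `HostBuffer`, `allocate_host`, `release_host`, `g_mode`) is a hypothetical stand-in, not Caffe's actual code. Both paths use `malloc`/`free` so that it runs without CUDA, and the sketch only reports the mismatch rather than triggering undefined behavior.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins for illustration; not Caffe's actual API.
enum class Mode { CPU, GPU };
enum class Allocator { Malloc, PinnedHost };

struct HostBuffer {
  void* ptr;
  Allocator allocated_with;  // which path produced ptr
};

// Process-wide mode, analogous to a global singleton setting.
static Mode g_mode = Mode::CPU;

// The allocator that the *current* global mode selects.
inline Allocator allocator_for_current_mode() {
  return g_mode == Mode::GPU ? Allocator::PinnedHost : Allocator::Malloc;
}

inline HostBuffer allocate_host(std::size_t bytes) {
  // Real code would call cudaMallocHost on the PinnedHost path.
  // Here both paths use malloc so the sketch runs without CUDA.
  return HostBuffer{std::malloc(bytes), allocator_for_current_mode()};
}

// Releases the buffer and returns true if the release path chosen by the
// current mode matches the path that allocated it.
inline bool release_host(HostBuffer& buf) {
  const bool matched = buf.allocated_with == allocator_for_current_mode();
  std::free(buf.ptr);  // safe here only because both paths used malloc
  buf.ptr = nullptr;
  return matched;
}

// Reproduces the problematic sequence: allocate in GPU mode, release in CPU mode.
inline bool mode_switch_causes_mismatch() {
  g_mode = Mode::GPU;
  HostBuffer buf = allocate_host(64);
  g_mode = Mode::CPU;  // mode clobbered between allocation and release
  return !release_host(buf);
}
```

In real code the mismatched branch would hand a pinned pointer to free(2), or a malloc'ed pointer to the CUDA deallocator, which is where the undefined behavior comes from.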

IMHO, the solution is to forbid calls to Caffe::set_mode() in individual test cases; this function should only be called by the test framework, to limit the risk of misuse. To achieve this, the following patch set reuses the existing MultiDeviceTest class and similarly adds new classes GPUDeviceTest and CPUDeviceTest. Where code needs to be shared between CPU and GPU tests, the shared test code can derive directly from MultiDeviceTest, but derived classes need to be defined for CPU and GPU.
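A rough sketch of that fixture layout, assuming a stripped-down stand-in for the `Caffe` singleton and no test framework. The device tags `CPUDevice`/`GPUDevice` and the `SharedLayerTest` classes are hypothetical names used for illustration; in the actual test suite the fixtures would also derive from the test framework's base class and may be templated differently.

```cpp
#include <cassert>

// Stand-in for the global mode singleton; only the pieces needed here.
class Caffe {
 public:
  enum Brew { CPU, GPU };
  static void set_mode(Brew mode) { mode_ = mode; }
  static Brew mode() { return mode_; }

 private:
  static Brew mode_;
};
Caffe::Brew Caffe::mode_ = Caffe::CPU;

// Device tags (hypothetical names) carrying the mode as a compile-time constant.
struct CPUDevice { static const Caffe::Brew device = Caffe::CPU; };
struct GPUDevice { static const Caffe::Brew device = Caffe::GPU; };

// The only place where set_mode() is called: the fixture constructor.
template <typename Device>
class MultiDeviceTest {
 protected:
  MultiDeviceTest() { Caffe::set_mode(Device::device); }
  virtual ~MultiDeviceTest() {}
};

// Fixtures for tests that are specific to one device.
class CPUDeviceTest : public MultiDeviceTest<CPUDevice> {};
class GPUDeviceTest : public MultiDeviceTest<GPUDevice> {};

// Shared test code derives from MultiDeviceTest directly...
template <typename Device>
class SharedLayerTest : public MultiDeviceTest<Device> {
 public:
  // A real test body would go here; it never touches set_mode().
  Caffe::Brew ObservedMode() const { return Caffe::mode(); }
};

// ...and one derived class is defined per device.
class CPUSharedLayerTest : public SharedLayerTest<CPUDevice> {};
class GPUSharedLayerTest : public SharedLayerTest<GPUDevice> {};
```

Because every fixture sets the mode in its constructor, a test no longer depends on whatever mode the previous test left behind.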

flx42 added 2 commits May 26, 2015 12:17
@flx42 flx42 force-pushed the fix_illegal_mode_changes branch from 338447f to 6cedd62 Compare May 26, 2015 19:50
@flx42 flx42 force-pushed the fix_illegal_mode_changes branch from 6cedd62 to 68133e7 Compare May 26, 2015 20:56
@shelhamer
Member

This looks good to me. Independent of deciding what's right for mode and device in general, templated tests seem like a more robust approach to making sure the mode is right.

Exactly what to do with mode and device is an ongoing conversation, but I think diffusing mode + device to Nets, Layers, and Solvers and making it immutable is reasonable. The dismantling of the singleton / diffusing of the handles is on the charts as #1500 (at least for Net).

I don't know that we ever fully converged on this, however. @longjon @jeffdonahue comment if you remember any threads.

@jeffdonahue
Contributor

LGTM too. Not sure what we'll end up doing with mode but this looks like it could only make any transition away from what we have now smoother (by centralizing/decreasing the number of references to mode in the code base).

shelhamer added a commit that referenced this pull request May 30, 2015
Fix invalid mode changes during tests
@shelhamer shelhamer merged commit 3cc9bac into BVLC:master May 30, 2015
@flx42 flx42 deleted the fix_illegal_mode_changes branch June 8, 2015 21:10