Added gpu_memory::buffer class to help in no-pool config #68

Merged
merged 1 commit into from
Nov 4, 2015

Conversation

@borisfom borisfom commented Nov 3, 2015

Using the temp_buffer class lets layers retain their memory, as we did before.
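
To illustrate the idea of a buffer that retains its memory, here is a minimal host-side sketch. It is not the code from this PR: the class name `ScratchBuffer` and its `reserve`/`release`/`data`/`size` members are hypothetical, and `std::malloc`/`std::free` stand in for whatever device allocator is used when no pool is configured. It only shows the pattern of a layer-owned object that reallocates when a request outgrows what it already holds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a retained scratch buffer (not the PR's class).
// It owns one allocation and reuses it while requests fit, so a layer that
// keeps a ScratchBuffer member holds on to its memory between calls.
// std::malloc/std::free are placeholders for the real device allocator.
class ScratchBuffer {
 public:
  ScratchBuffer() = default;
  ~ScratchBuffer() { release(); }

  // Non-copyable: exactly one owner per allocation.
  ScratchBuffer(const ScratchBuffer&) = delete;
  ScratchBuffer& operator=(const ScratchBuffer&) = delete;

  // Ensure at least `bytes` are available. Returns true if a new
  // allocation was made, false if the existing one was reused.
  bool reserve(std::size_t bytes) {
    if (bytes <= size_) return false;  // current allocation is big enough
    release();
    ptr_ = std::malloc(bytes);
    assert(ptr_ != nullptr);
    size_ = bytes;
    return true;
  }

  // Give the memory back; safe to call on an empty buffer.
  void release() {
    std::free(ptr_);
    ptr_ = nullptr;
    size_ = 0;
  }

  void* data() const { return ptr_; }
  std::size_t size() const { return size_; }

 private:
  void* ptr_ = nullptr;
  std::size_t size_ = 0;
};
```

In this sketch a layer would call `reserve` with its workspace size on every forward pass; only the first call, or a call asking for more than is currently held, pays for an allocation.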

@lukeyeager
Member

I built this with Make and without CNMeM. It built fine and passed all the caffe tests. But then I tried to train AlexNet on it with DIGITS and got this error:

F1103 16:11:35.832607 25846 cudnn_conv_layer.cu:48] Check failed: status == CUDNN_STATUS_SUCCESS (5 vs. 0)  CUDNN_STATUS_INVALID_VALUE
*** Check failure stack trace: ***
@     0x7fa4afd20daa  (unknown)
@     0x7fa4afd20ce4  (unknown)
@     0x7fa4afd206e6  (unknown)
@     0x7fa4afd23687  (unknown)
@     0x7fa4b0466941  caffe::CuDNNConvolutionLayer<>::Forward_gpu()
@     0x7fa4b0443f51  caffe::Net<>::ForwardFromTo()
@     0x7fa4b04442c7  caffe::Net<>::ForwardPrefilled()
@     0x7fa4b042ad5f  caffe::Solver<>::Test()
@     0x7fa4b042b53e  caffe::Solver<>::TestAll()
@     0x7fa4b042b679  caffe::Solver<>::Step()
@     0x7fa4b042bfc5  caffe::Solver<>::Solve()
@           0x40913b  train()
@           0x406ab5  main
@     0x7fa4af232ec5  (unknown)
@           0x4071ed  (unknown)
@              (nil)  (unknown)

I'm going to rebuild with DEBUG symbols to see if I can find any more information ...

@lukeyeager
Member

Actually, it didn't pass all the Caffe tests. I spoke too soon.

[ RUN      ] CuDNNConvolutionLayerTest/1.TestGradientCuDNN
F1103 16:15:24.877022 26361 cudnn_conv_layer.cu:48] Check failed: status == CUDNN_STATUS_SUCCESS (5 vs. 0)  CUDNN_STATUS_INVALID_VALUE
*** Check failure stack trace: ***
    @     0x2b4c431e5daa  (unknown)
    @     0x2b4c431e5ce4  (unknown)
    @     0x2b4c431e56e6  (unknown)
    @     0x2b4c431e8687  (unknown)
    @     0x2b4c44c83061  caffe::CuDNNConvolutionLayer<>::Forward_gpu()
    @           0x462f56  caffe::Layer<>::Forward()
    @           0x4a15e3  caffe::GradientChecker<>::CheckGradientSingle()
    @           0x4a264b  caffe::GradientChecker<>::CheckGradientExhaustive()
    @           0x6c811e  caffe::CuDNNConvolutionLayerTest_TestGradientCuDNN_Test<>::TestBody()
    @           0x826293  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x81cf77  testing::Test::Run()
    @           0x81d01e  testing::TestInfo::Run()
    @           0x81d125  testing::TestCase::Run()
    @           0x820468  testing::internal::UnitTestImpl::RunAllTests()
    @           0x8206f7  testing::UnitTest::Run()
    @           0x4582fb  main
    @     0x2b4c459eaec5  (unknown)
    @           0x45eeb9  (unknown)
    @              (nil)  (unknown)
make: *** [runtest] Aborted (core dumped)

@borisfom
Author

borisfom commented Nov 4, 2015

Can you restart this one? I just pushed a fix.


@lukeyeager
Member

> Can you restart this one?

It restarted when you pushed a commit.

lukeyeager added a commit that referenced this pull request Nov 4, 2015
Added gpu_memory::buffer class to help in no-pool config
@lukeyeager lukeyeager merged commit 83401d6 into NVIDIA:caffe-0.14 Nov 4, 2015