Allow using encoded images in Datum, LevelDB, LMDB #1239
Conversation
@s-gupta Could you give it a try and let me know if it is faster for you?
Thanks @sguada, this has been on my wishlist for a while. Looking forward to wasting far less disk when it's merged. One thing I'm not sure about is the
@jeffdonahue I tried to keep to the I will fix the warnings and want to merge soon, since this is also very useful for parallel #1148, since it greatly decreases the memory needed to map multiple LMDBs in memory.
Why disable resizing the images for storage? You will want to be careful about the quality factor used, but storing resized images can drastically speed up the decode. Most encoders will default to q85, which also means 4:2:0 sampling in JPEG, so it could change sampling and artifacting in the input image a little; however, 4:2:0 increases the compression ratio a lot, which is one of the main reasons for the large change in final size at q85. Also, decode, especially of progressive images, can be more expensive than one would think, so we might want more decode threads, at least for "fat node" cases where there is a lot of horsepower to feed. Secondarily, why do we use OpenCV for image decode and not GraphicsMagick/ImageMagick, which can be faster (sometimes much faster because of parallel DCT support) and more flexible on decode (and encode)? I see cv::Mats in the code in a few places, so if that is really needed it makes sense.
Although, not resizing makes the database more useful for exploring different layer-1 input sizes...
@thatguymike thanks for your comments.
@thatguymike GraphicsMagick looks pretty good; unfortunately it has a Blob class that could conflict with Caffe::Blob.
@jeffdonahue I have removed the TIMING flag and changed
Cool, looks good to me, but not sure if #1238 should be merged first.
@shelhamer I think this PR is ready to go, so I'm not sure either if #1238 should be merged first.
@@ -142,6 +146,11 @@ void DataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
// This function is used to create a thread that prefetches the data.
template <typename Dtype>
void DataLayer<Dtype>::InternalThreadEntry() {
  Timer batch_timer;
I'm pretty sure this Timer implementation isn't thread safe when Caffe::mode() == Caffe::GPU. It frequently causes a segfault in the CUDA library on my system. It's implemented using the cudaEvent* API, which records events with respect to the active CUDA stream:
http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__EVENT_ga324d5ce3fbf46899b15e5e42ff9cfa5.html
"Records an event. If stream is non-zero, the event is recorded after all preceding operations in stream have been completed; otherwise, it is recorded after all preceding operations in the CUDA context have been completed."
src/caffe/util/benchmark.cpp
void Timer::Start() {
  if (!running()) {
    if (Caffe::mode() == Caffe::GPU) {
#ifndef CPU_ONLY
      CUDA_CHECK(cudaEventRecord(start_gpu_, 0));
#else
      NO_GPU;
#endif
    } else {
      start_cpu_ = boost::posix_time::microsec_clock::local_time();
    }
    running_ = true;
    has_run_at_least_once_ = true;
  }
}
In any case, I'm pretty sure you don't want your prefetch timer to be synchronized with what's happening on the GPU.
@kmatzen I have never had any problems using that
For reference, when storing ImageNet resized to 256x256 into an LMDB with
(force-pushed from 7c910e8 to 1493661)
@kmatzen Could you try the new version with
@kmatzen is the owner of unifying the usages of data stores. He will clean up all the existing interfaces anyway. Why not merge this first?
(force-pushed from 4f5714f to 3033da8)
Conflicts: tools/convert_imageset.cpp
Added root_folder to WindowDataLayer to locate images
…print mean_values
Make timing more precise using double and microseconds
Allow using encoded images in Datum, LevelDB, LMDB
I'm confused about what happened here; @jeffdonahue suggested that Is that what both parties intended?

I'm in agreement with @jeffdonahue's concern here (and in fact it doesn't seem like there is Makefile support for

Also, just glancing at the code, it seems that the data transformations that we worked so carefully to de-duplicate have doubled again, once for

As with #1238, this seems like a lot of code merged pretty quickly, and my general concerns at #1238 (comment) apply.
@longjon sorry, I thought I had removed all Yeah, @jeffdonahue and I agreed to remove all the
@longjon Regarding the de-duplication of code for the different Transforms, depending on whether it is a Datum, cv::Mat, or Blob that you are transforming, I couldn't find a way to do it while remaining efficient.
@@ -179,9 +179,102 @@ void DataTransformer<Dtype>::Transform(const vector<Datum> & datum_vector,
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const cv::Mat& cv_img,
    Blob<Dtype>* transformed_blob) {
  Datum datum;
I liked this better as 3 lines. Is the cv::Mat pipeline speed so crucial / does it make such a difference to first transform to Datum?
@sguada re: timing: Good. I haven't had a chance to try it out yet, but the improvements to

Re: the transformation code (I guess there are actually three copies of it right now...): the differences I see right now between the three cases are:

There's no reason we can't abstract over those things (without introducing any copying). For dealing with the different memory layout, one specifies strides along each axis; this is how cuDNN works, how numpy works internally, and something Eigen could probably do for us (though I am not suggesting that we introduce it here).
It's a bit more complicated than that, cv::Mat can be discontinuous in

Sergio
@sguada From the documentation at http://docs.opencv.org/modules/core/doc/basic_structures.html#mat, it seems pretty clear that So I don't see any need for
@longjon thanks, I think I understand what you mean now, so I will give it a try, and ask you if I have doubts. :)
Allow using encoded images in Datum, LevelDB, LMDB
This PR allows storing the image files directly in the DB, which saves a lot of space and sacrifices only a little bit of speed (not noticeable since it is done during prefetching).
./build/tools/convert_imageset -encoded=true