Allow using encoded images in Datum, LevelDB, LMDB #1239

Merged: 11 commits merged into BVLC:dev on Oct 16, 2014

Conversation

@sguada (Contributor) commented Oct 8, 2014

This PR allows storing the image files directly in the DB, which saves a lot of space and sacrifices only a little speed (not noticeable, since decoding is done during prefetching).

  • To convert a set of images and leave them encoded (it just copies the raw image file), do:
    ./build/tools/convert_imageset -encoded=true
    (a decoding sketch follows this list)
  • Added a TIMING flag to Makefile to see if prefetch is fast enough compared to the forward/backward pass.
    • To use it, add TIMING := 1 to your Makefile.config
    • This displays the time used in prefetch (reading images + transforming images)
    • The time used in the forward pass
    • The time used in the backward pass
  • Sped up and updated the code in WindowDataLayer:
    • It now reads and transforms the window images into blobs faster.
    • It has an option to cache_images in memory, in case of a slow HDD. It caches the encoded images to save memory.
    • It now also accepts mean_values instead of mean_file.
  • This PR also speeds up the transformations applied to cv::Mat, which should make it easy to add new transformations done directly on cv::Mat using OpenCV.
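
As a rough illustration of what "leaving the image encoded" means at read time, here is a minimal sketch (not the actual Caffe code; the helper name is made up) of decoding an encoded Datum's bytes with OpenCV during prefetch:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>  // cv::imdecode lives here in OpenCV 2.x
#include <string>
#include <vector>

// Hypothetical helper: decode the raw JPEG/PNG bytes stored in a Datum's
// data field into a BGR cv::Mat, entirely in memory (no temporary file).
cv::Mat DecodeToMat(const std::string& encoded_bytes) {
  std::vector<uchar> buf(encoded_bytes.begin(), encoded_bytes.end());
  cv::Mat img = cv::imdecode(buf, CV_LOAD_IMAGE_COLOR);
  return img;  // empty Mat if decoding failed
}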

@sguada (Contributor, Author) commented Oct 8, 2014

@s-gupta Could you give it a try and let me know if it is faster for you?

@jeffdonahue (Contributor) commented:
Thanks @sguada, this has been on my wishlist for a while. Looking forward to wasting far less disk when it's merged.

One thing I'm not sure about is the TIMING parts. As useful as they are for benchmarking prefetching and such, IMO they clutter the code in some places -- like ForwardBackward, which goes from 4 clean lines to 11 lines that are mostly about timing. Maybe we could hold off on merging those commits, or figure out a slightly cleaner way to do it? Ideally we'd find a way to improve the caffe time tool such that we can actually benchmark prefetching, but that might be pretty difficult.

@sguada (Contributor, Author) commented Oct 9, 2014

@jeffdonahue I tried to keep the TIMING code to a minimum, but it still clutters the code in some places.
So I can remove all the TIMING code and leave it for a future PR. Probably just move it to the caffe time tool.

I will fix the warnings and want to merge soon, since this is also very useful for the parallel work in #1148, as it greatly decreases the memory needed to map multiple LMDBs into memory.

@sguada sguada mentioned this pull request Oct 9, 2014
@thatguymike (Contributor) commented:
Why disable resizing the images for storage? You will want to be careful about the quality factor used, but storing resized images can drastically speed up the decode. Most encoders will default to q85, which also means 4:2:0 subsampling in JPEG, so it could change the sampling and artifacting in the input image a little; however, 4:2:0 increases the compression ratio a lot, which is one of the main reasons for the large change in final size at q85. Also, decoding, especially of progressive images, can be more expensive than one would think, so we might want more decode threads, at least for "fat node" cases where there is a lot of horsepower to feed.

Secondarily, why do we use OpenCV for image decoding and not GraphicsMagick/ImageMagick, which can be faster (sometimes much faster because of parallel DCT support) and more flexible on decode (and encode)? I see cv::Mats in the code in a few places, so if that is really needed it makes sense.

@thatguymike (Contributor) commented:
Although, not resizing makes the database more useful for exploring different layer 1 input sizes...

@sguada (Contributor, Author) commented Oct 9, 2014

@thatguymike thanks for your comments.
The reason for not allowing resizing of the images while storing them encoded is that it just stores the raw file, without any operation. But you can resize the images in parallel before creating the DB using your preferred method.
Regarding OpenCV, currently we only use it for loading images and doing simple transformations, like cropping or resizing. So if you, or someone else, is willing to add support for GraphicsMagick/ImageMagick for that, I will be happy to include that option.

@sguada (Contributor, Author) commented Oct 9, 2014

@thatguymike GraphicsMagick looks pretty good; unfortunately it has a Blob class that could conflict with Caffe::Blob.
@jeffdonahue should I leave the code for timing the prefetching? I think the other timing could be done in caffe time, but that cannot access the prefetch timing. I found it very useful to know the prefetch time during my tests and while optimizing the code.

@shelhamer (Member) commented:
@sguada this should be coordinated with #1238. Perhaps the new interfaces should come first, and then the finished version of this idea once all the discussion is in?

I'm excited for the shrunken DBs!

@sguada (Contributor, Author) commented Oct 10, 2014

@jeffdonahue I have removed the TIMING flag and changed caffe time to compute the time per layer while doing forward/backward passes over all the layers. This way we will get a better sense of the actual time spent per layer.
I have left the timing for prefetch in the data layers, but it only prints the result when debug is on.

@jeffdonahue (Contributor) commented:
Cool, looks good to me, but not sure if #1238 should be merged first.

@sguada (Contributor, Author) commented Oct 11, 2014

@shelhamer I think this PR is ready to go, so I'm not sure either whether #1238 should be merged first.

@sguada sguada mentioned this pull request Oct 11, 2014
@shelhamer (Member) commented:
@sguada let's merge #1238 first for the interfaces then merge this for the encoding and timing if that's alright with you. Please take a look at #1238 and merge, then rebase and merge your work in this PR.

@kmatzen kmatzen mentioned this pull request Oct 12, 2014
@@ -142,6 +146,11 @@ void DataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
// This function is used to create a thread that prefetches the data.
template <typename Dtype>
void DataLayer<Dtype>::InternalThreadEntry() {
Timer batch_timer;
@kmatzen (Contributor) commented on the diff:

I'm pretty sure this Timer implementation isn't thread safe when Caffe::mode() == Caffe::GPU. It frequently causes a segfault in the cuda library on my system.

It's implemented using the cudaEvent* API which records events with respect to the active cuda stream.
http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__EVENT_ga324d5ce3fbf46899b15e5e42ff9cfa5.html
"Records an event. If stream is non-zero, the event is recorded after all preceding operations in stream have been completed; otherwise, it is recorded after all preceding operations in the CUDA context have been completed."

src/caffe/util/benchmark.cpp

void Timer::Start() {
  if (!running()) {
    if (Caffe::mode() == Caffe::GPU) {
#ifndef CPU_ONLY
      CUDA_CHECK(cudaEventRecord(start_gpu_, 0));
#else
      NO_GPU;
#endif
    } else {
      start_cpu_ = boost::posix_time::microsec_clock::local_time();
    }   
    running_ = true;
    has_run_at_least_once_ = true;
  }
}

In any case, I'm pretty sure you don't want your prefetch timer to be synchronized with what's happening on the GPU.

@sguada (Contributor, Author) commented Oct 13, 2014

@kmatzen I have never had any problems using that Timer with CUDA. I'm not sure about the desired behaviour for the GPU timers; since we don't use streams, I think waiting until all the streams are done makes sense.
However, I see the problem you mentioned for code that runs on the CPU, like prefetching, which should not be waiting for other code running on the GPU to finish.
Maybe we should have a CPUTimer that runs on the CPU independently of Caffe::mode().
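
A minimal sketch of that CPUTimer idea (illustrative only, not the code that ended up in the PR): it measures wall-clock time on the CPU regardless of Caffe::mode(), so prefetch timing never touches CUDA events.

#include <boost/date_time/posix_time/posix_time.hpp>

// Times on the CPU only, mirroring the CPU branch of Timer::Start() above.
class CPUTimer {
 public:
  void Start() { start_ = boost::posix_time::microsec_clock::local_time(); }
  // Milliseconds elapsed since Start().
  float MilliSeconds() const {
    boost::posix_time::ptime now =
        boost::posix_time::microsec_clock::local_time();
    return (now - start_).total_milliseconds();
  }
 private:
  boost::posix_time::ptime start_;
};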

@sguada (Contributor, Author) commented Oct 13, 2014

For reference, when storing ImageNet resized to 256x256, an LMDB with encoded=true occupies 10-12x less space than with encoded=false. Prefetching times are a bit slower with encoded=true, although still well under the forward-backward pass time.

@sguada sguada force-pushed the encoded branch 2 times, most recently from 7c910e8 to 1493661 Compare October 13, 2014 07:29
@sguada (Contributor, Author) commented Oct 13, 2014

@kmatzen Could you try the new version with CPUTimer and see if it still fails for you?

@kloudkl (Contributor) commented Oct 13, 2014

@kmatzen owns the work of unifying the data store usage. He will clean up all the existing interfaces anyway. Why not merge this first?

@sguada sguada force-pushed the encoded branch 2 times, most recently from 4f5714f to 3033da8 Compare October 13, 2014 17:59
sguada added a commit that referenced this pull request Oct 16, 2014
Allow using encoded images in Datum, LevelDB, LMDB
@sguada sguada merged commit e354478 into BVLC:dev Oct 16, 2014
@sguada sguada deleted the encoded branch October 16, 2014 00:17
@longjon (Contributor) commented Oct 18, 2014

I'm confused about what happened here; @jeffdonahue suggested that ForwardBackward not be invaded with TIMING ifdefs, and it seems @sguada "remove[d] the TIMING flag", but the changes to ForwardBackward are merged in dev.

Is that what both parties intended? I'm in agreement with @jeffdonahue's concern here (and in fact it doesn't seem like there is Makefile support for TIMING anyway).

Also, just glancing at the code, it seems that the data transformations that we worked so carefully to de-duplicate have doubled again, once for Datum and once for cv::Mat. What's going on here?

As with #1238, this seems like a lot of code merged pretty quickly, and my general concerns at #1238 (comment) apply.

@sguada (Contributor, Author) commented Oct 18, 2014

@longjon sorry, I thought I had removed all the TIMING code, but it seems I forgot one place. Done in #1320.

Yeah, @jeffdonahue and I agreed to remove all the TIMING code and instead improve caffe time, which I did.

@sguada (Contributor, Author) commented Oct 18, 2014

@longjon Regarding the de-duplication of the Transform code, depending on whether it is a Datum, cv::Mat, or Blob that you are transforming, I couldn't find a way to do it while staying efficient.
Since Datum and cv::Mat organize the data in memory differently, I couldn't find a way to handle both with a single piece of code. I experimented with a simple way to unify them in https://github.com/BVLC/caffe/pull/1239/files#diff-5 but it was slower due to copying a cv::Mat back into a Datum.

@@ -179,9 +179,102 @@ void DataTransformer<Dtype>::Transform(const vector<Datum> & datum_vector,
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const cv::Mat& cv_img,
Blob<Dtype>* transformed_blob) {
Datum datum;
A Member commented on the diff:

I liked this better as 3 lines. Is the cv::Mat pipeline speed so crucial / does it make such a difference to first transform to Datum?

@longjon (Contributor) commented Oct 18, 2014

@sguada re: timing: Good. I haven't had a chance to try it out yet, but the improvements to caffe time seem right to me in spirit.

Re: the transformation code (I guess there are actually three copies of it right now...): the differences I see right now between the three cases are: Datum and Blob have different types, and Blob has an extra dimension; cv::Mat has a different axis ordering.

There's no reason we can't abstract over those things (without introducing any copying). For dealing with the different memory layout, one specifies strides along each axis; this is how cuDNN works, how numpy works internally, and something Eigen could probably do for us (though I am not suggesting that we introduce it here).

@sguada (Contributor, Author) commented Oct 18, 2014

It's a bit more complicated than that; a cv::Mat can be discontinuous in memory, and there are different types to cast. But I guess it is possible to unify it all with a lot of ifs.

Sergio

@longjon (Contributor) commented Oct 18, 2014

@sguada From the documentation at http://docs.opencv.org/modules/core/doc/basic_structures.html#mat, it seems pretty clear that cv::Mat uses a strided format just like a numpy ndarray, or Blob, or Datum. (Yes, it can be discontinuous, but it still uses a strided addressing scheme (rather than, say, storing offsets, which would actually need different indexing code (though note that the indexing can (and should?) be abstracted away separately from the transformation itself...)).)

So I don't see any need for if casing, really, just a function (template) that takes strides as input, with a type parameter for uint8 vs Dtype storage. If it's not clear, I'm happy to add the unification to my own agenda, but I won't get to it for a while.
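
For what it's worth, here is a sketch of the stride-based unification being discussed (names and signature are illustrative, not a proposal for the actual interface): a single copy routine parameterized by per-axis strides can walk either a CHW Datum/Blob or an HWC cv::Mat with the same indexing code.

// Copies channels x height x width elements from src into dst in CHW order,
// addressing src through the given per-axis strides and casting element types.
template <typename SrcType, typename Dtype>
void StridedCopy(const SrcType* src, Dtype* dst,
                 int channels, int height, int width,
                 int c_stride, int h_stride, int w_stride) {
  for (int c = 0; c < channels; ++c) {
    for (int h = 0; h < height; ++h) {
      for (int w = 0; w < width; ++w) {
        *dst++ = static_cast<Dtype>(
            src[c * c_stride + h * h_stride + w * w_stride]);
      }
    }
  }
}
// Datum (CHW, uint8 data): c_stride = height * width, h_stride = width, w_stride = 1.
// Continuous cv::Mat (HWC): c_stride = 1, h_stride = width * channels, w_stride = channels.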

@sguada (Contributor, Author) commented Oct 18, 2014

@longjon thanks, I think I understand what you mean now, so I will give it a try and ask you if I have doubts. :)

RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014
Allow using encoded images in Datum, LevelDB, LMDB
@ih4cku ih4cku mentioned this pull request May 9, 2016