Amazon Machine Instance (AMI) on EC2 #5

Closed
beniz opened this issue May 31, 2015 · 36 comments

@beniz
Collaborator

beniz commented May 31, 2015

Providing an AMI is a good idea to ease deployment for some users.

Related links of interest:
- Deprecated Caffe AMI: https://github.com/BVLC/caffe/wiki/Ubuntu-14.04-ec2-instance
- Newly contributed Caffe AMI (that includes Torch): http://blog.titocosta.com/post/110345699197/public-~ec2-ami-with-torch-and-caffe-deep-learning
- Another Caffe AMI and Docker: BVLC/caffe#1092 (comment)
- An AMI for deep learning, and that contains Caffe: https://www.kaggle.com/forums/f/208/getting-started/t/11505/an-aws-ami-mainly-for-deep-learning

After a long wait, the official AMIs with support for Caffe, XGBoost and Tensorflow backends are available for both GPU and CPU:

See https://deepdetect.com/products/ami/ for thorough documentation.

@beniz
Collaborator Author

beniz commented Jun 7, 2015

A fresher AMI for Caffe with CUDA 7 is described below, along with the script to create it:
https://github.com/BVLC/caffe/wiki/Caffe-on-EC2-Ubuntu-14.04-Cuda-7
This could provide a good enough starting point.

@revilokeb

Hi @beniz, I have built an EC2 AMI with deepdetect v0.1, but I have not yet managed to access the server over the internet, only on localhost:someport from another terminal as described in your tutorial. I am pretty sure that I set up the security groups on AWS correctly (e.g. simply allowing all traffic from my IP for the moment). Any hint on how I have to start ./main/dede -host ? -port XYZ, or how this might be configured?

@revilokeb

OK got it: ./main/dede -host 0.0.0.0 -port XYZ

@beniz
Collaborator Author

beniz commented Jul 3, 2015

Thanks @revilokeb this is good news. Will you be able to share both the recipe to create the image and the image itself ?

@revilokeb

No prob, can do this. But first would like to check that everything is working as it should, will post it here in a few days.

@revilokeb

I have put together a public AWS AMI with CUDA 7, cuDNN2 and deepdetect v0.1 in EU (Ireland) with ID ami-4c00423b. Be aware that this AMI can only be found in EU (Ireland), no other geolocation, but you might copy it to any other that has got g2.2xlarge instances.

When instantiating the AMI you might restrict your security group to the relevant IPs and one TCP port XXXX. If you would like to access the API via Internet remember to either attach an elastic IP to your instance or choose "Auto assign Public IP" 'Enable'.

Having logged into the instance you might then start deepdetect by cd ~/deepdetect/build; ./main/dede -host 0.0.0.0 -port XXXX

Now from your local machine you might open a browser and paste http://YYY.YYY.YYY.YYY:XXXX/info into it (YYY.YYY.YYY.YYY being the public IP of your AWS instance), or use curl as in the docs. You should see something like
{"status":{"code":200,"msg":"OK"},"head":{"method":"/info","version":"0.1","branch":"master","commit":"271b129d2b921ca7a1ce941534e31f1879f248f7","services":[]}}.

(in case you are worried about the libdc1394 error, see e.g. https://groups.google.com/forum/#!topic/digits-users/uvQpHooD6WY)
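
The /info check described above can also be scripted. This is a minimal sketch, assuming curl is installed on the local machine; the function names info_url and check_info and the example address 203.0.113.10 are made up for illustration and are not part of deepdetect.

```shell
# Build the /info URL for a deepdetect server (host and port are placeholders).
info_url() {
  local host="$1" port="$2"
  printf 'http://%s:%s/info\n' "$host" "$port"
}

# Query the endpoint and succeed only if the reply reports status code 200.
# -s silences the progress meter, --max-time bounds the wait in seconds.
check_info() {
  local reply
  reply="$(curl -s --max-time 10 "$(info_url "$1" "$2")")" || return 1
  case "$reply" in
    *'"code":200'*) echo "server is up" ;;
    *) echo "unexpected reply: $reply" >&2; return 1 ;;
  esac
}

# Usage (replace with the public IP and port of your own instance):
#   check_info 203.0.113.10 8080
```

If check_info times out, the security group or the -host 0.0.0.0 setting are the first things to look at.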

@beniz
Collaborator Author

beniz commented Jul 6, 2015

@revilokeb thanks, this is again great news!
I would be interested in setting up a page, typically on the wiki, with the instructions to (re)build the image from scratch. The rationale is that DD is under development and continuous improvement. This would allow us to maintain an AMI with reasonable freshness :)

If you agree, I will share your message on the dedicated thread on the caffe-users mailing list as this was a direct user request.

@revilokeb

@beniz sure go ahead, I will also provide step-by-step instructions when I have time

@revilokeb

I have made available another public AWS AMI with CUDA 7, cuDNN2 and deepdetect on latest commit (8b196e7..., as of Sep 1 15) in EU (Ireland) with ID ami-91092ee6. I am going to remove the previous AMI with ID ami-4c00423b soon.

For those who would like to build it themselves, the steps are straightforward:

  1. On AWS pick an ubuntu 14.04 64 bit AMI which is publicly available
  2. Follow the recipe here: http://www.joyofdata.de/blog/gpu-powered-deeplearning-with-nvidia-digits/, i.e. 1. update, 2. install CUDA 7 and 3. install cuDNN (not strictly required but recommended; be careful to choose the right version of cuDNN for caffe which currently is v2 for the dede branch)
  3. Then follow instructions on http://deepdetect.com/overview/installing/

If you are running into problems use the above AMI or drop a note.

@beniz
Collaborator Author

beniz commented Feb 21, 2016

This is a placeholder comment to note that the current AMIs found on some servers are out of date and should not be used until new ones are available.

@revilokeb

I have created a new publicly available AWS AMI in EU (Ireland) with ID ami-0564d876 that has got deepdetect commit 8d30a5e on ubuntu 14.04, CUDA 7.5 and cuDNN v4 installed.

As compared to a simple install I have in addition performed the first steps of setting up an imagenet classifier (http://www.deepdetect.com/tutorials/imagenet-classifier/), i.e. I have already done the following:

  1. cd build/caffe_dd/src/caffe_dd
  2. ./scripts/download_model_binary.py models/bvlc_googlenet/
  3. cd deepdetect; mkdir models; mkdir models/imgnet
  4. mv build/caffe_dd/src/caffe_dd/models/bvlc_googlenet/bvlc_googlenet.caffemodel models/imgnet
  5. cp datasets/imagenet/corresp_ilsvrc12.txt models/imgnet/corresp.txt

The AWS AMI allows anyone to directly start a GoogleNet classification service on the 1000 ILSVRC categories in about the simplest manner possible and at almost no cost, as follows:

  1. Log in to your AWS console, select EC2, and change to EU (Ireland) in the upper right corner if you are in a different region (the following assumes you already have a key pair; if not, get one)
  2. Under Images/AMIs search for "ami-0564d876" in public images, select that AMI and press "Launch"
  3. Then proceed with the following launch steps: in Step 2 select g2.2xlarge instance, in Step 3 set "Auto-assign Public IP" to "Enable", simply continue in Step 4 and Step 5 without changing anything, in Step 6 "Create a new security Group" and set "ssh" to "My IP" and "Add Rule", then "Custom TCP Rule" to "Port Range" e.g. 5010 on "My IP", in Step 7 simply Launch, select the appropriate key and get the instance going
  4. Go back to EC2 Dashboard, check the running instance and note the IP of the instance. Then, using your key and your terminal (or Putty if you are on Windows) ssh into the instance: ssh -i path_to_your_key ubuntu@IP_of_the_instance
  5. On your instance start the deepdetect server as follows: screen, cd deepdetect/build/main, ./dede -host 0.0.0.0 -port 5010, ctrl a-d (to detach the session)
  6. Start the service:
    curl -X PUT "http://0.0.0.0:5010/services/imageserv" -d '{"mllib":"caffe","description":"image classification service","type":"supervised","parameters":{"input":{"connector":"image"},"mllib":{"gpu":true,"template":"googlenet","nclasses":1000}},"model":{"templates":"../templates/caffe/","repository":"../../models/imgnet"}}'

You are done!

Now you can use the service with any jpg you like as follows (from your EC2 instance):
curl -X POST "http://0.0.0.0:5010/predict" -d '{"service":"imageserv","parameters":{"input":{"width":224,"height":224},"output":{"best":3}},"data":["http://i.ytimg.com/vi/0vxOhd4qlnA/maxresdefault.jpg"]}'

Just insert the URL of the images that you want to query instead of http://i.ytimg.com/vi/0vxOhd4qlnA/maxresdefault.jpg
in the above. If you are querying from your own IP ("My IP" in 3. above) just replace 0.0.0.0 by the IP of the Amazon instance you have created (in 4. above).
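
To avoid editing the JSON by hand for every image, the predict call can be wrapped in a small helper. This is only a sketch under a few assumptions: predict_payload and predict are hypothetical names, the 224x224 input size and the /predict route are copied from the call above, and the image URL is inserted without any JSON escaping, so it must not contain double quotes or backslashes.

```shell
# Build the JSON body for /predict.
# Arguments: service name, image URL, number of top classes to return (default 3).
predict_payload() {
  local service="$1" image_url="$2" best="${3:-3}"
  printf '{"service":"%s","parameters":{"input":{"width":224,"height":224},"output":{"best":%s}},"data":["%s"]}\n' \
    "$service" "$best" "$image_url"
}

# Send the request to a running deepdetect server.
# Arguments: host, port, service name, image URL.
predict() {
  local host="$1" port="$2" service="$3" image_url="$4"
  curl -s -X POST "http://${host}:${port}/predict" \
    -d "$(predict_payload "$service" "$image_url")"
}

# Usage, from the instance itself:
#   predict 0.0.0.0 5010 imageserv "http://i.ytimg.com/vi/0vxOhd4qlnA/maxresdefault.jpg"
```

Building the body with printf inside single quotes keeps the inner double quotes intact, which is what the server needs in order to parse the JSON.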

All the above is directly taken from http://www.deepdetect.com/tutorials/imagenet-classifier/, I have only added information when describing how to start and access the Amazon AMI that I have made publicly available.

What does the fun cost? A g2.2xlarge in Ireland costs $0.702 per hour (https://aws.amazon.com/de/ec2/pricing/). Executing the above will take less than 15 min, or maybe 30 min if you are doing it for the first time. Then there is still a bit of time left to query other images and enjoy a little bit of deep learning and the fast API server while staying below 1 USD (you can possibly do it at a tenth of that cost if you go for spot instances...)
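
The cost estimate is simple arithmetic: hourly price times minutes used, divided by 60. A small sketch with awk, using the on-demand price quoted above; instance_cost is a hypothetical helper name, and the calculation ignores per-hour rounding by AWS, storage and data transfer.

```shell
# Cost in USD of running an instance for a number of minutes.
# Arguments: hourly price in USD, minutes used.
instance_cost() {
  awk -v price="$1" -v minutes="$2" 'BEGIN { printf "%.3f\n", price * minutes / 60 }'
}

instance_cost 0.702 30   # first-time walkthrough, about 30 minutes
instance_cost 0.702 60   # a full hour
```

For 30 minutes this gives 0.351, so even a slow first run plus some experimenting stays under 1 USD at that price.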

@beniz
Collaborator Author

beniz commented Feb 25, 2016

wow, this is dope thanks :) How could we help you have the AMI generated automatically after every new commit on master ? I've got reports from users having difficulties with dd, then I figured they were using outdated AMIs. Thanks again, great addition!

@beniz
Collaborator Author

beniz commented Feb 25, 2016

@revilokeb Note that there's a front user interface in JS available from the repo in demo/imgdetect. If it is ever needed it'd be very easy to set up I believe: add nginx and the configuration from the Readme.

This would yield something like:
[screenshot: dd_sshot]

@ankon

ankon commented Feb 25, 2016

This is awesome! Building the AMI could possibly be done using packer, as part of a simple CI configuration.

@revilokeb

@beniz good point, I will definitely look into automation of making the AMI. BTW, a while ago I tried nginx and as you mention it is not difficult at all to do. But I could as well put it onto the AMI in one of the next AMI versions so that users just have to perform a minimal number of steps
@ankon Thanks for pointing me to packer, I will give it a try!

@revilokeb

To the AMI that I built and made available yesterday I have added support for all big residual nets (50, 101, 152); this is available as another public AWS AMI with ID ami-7b00bc08 in EU (Ireland). So anyone can now run a classification service on GPU on the 1000 ILSVRC categories using the ILSVRC 2015 winning net of 152 layers (http://arxiv.org/abs/1512.03385) simply by using the above steps and then starting the service as follows:

Starting the service for 152-layer residual net:
curl -X PUT "http://0.0.0.0:5010/services/imageserv" -d '{"mllib":"caffe","description":"image classification service","type":"supervised","parameters":{"input":{"connector":"image"},"mllib":{"gpu":true,"template":"resnet_152","nclasses":1000}},"model":{"templates":"../templates/caffe/","repository":"../../models/imgnet_resnet152"}}'

(for 50-layer or 101-layer net just replace any 152 by 101 or 50)

Predictions can then be done as before:
curl -X POST "http://0.0.0.0:5010/predict" -d '{"service":"imageserv","parameters":{"input":{"width":224,"height":224},"output":{"best":3}},"data":["http://i.ytimg.com/vi/0vxOhd4qlnA/maxresdefault.jpg"]}'

Running the very deep residual nets on the 4GB GPUs of g2.2xlarge for predictions is achieved by reducing the batch size in deploy.prototxt to 4 (101, 152) / 8 (50). For example when predicting with the 152-layer residual net nvidia-smi shows a GPU Memory Usage of 3115MB.

Network weight files have been taken from https://github.com/KaimingHe/deep-residual-networks, license (MIT) has been added.

@beniz
Collaborator Author

beniz commented Feb 26, 2016

@revilokeb dope comes back :) This is great again, thanks!

@revilokeb

@ankon Tried Packer for automating AMI build, but so far not successful: hashicorp/packer#3365, for some reason Packer build produces linker error in the very last build step of deepdetect (while manual build with identical steps starting at identical AMI has got no such problems). If you happen to have an idea what could be behind it let me know...

@beniz
Collaborator Author

beniz commented Mar 18, 2016

@revilokeb as you have noted in the packer issue, this is the classic error message you get when LD_LIBRARY_PATH is not set properly, e.g. to:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64

I'm not familiar with packer but maybe you could add the line above to the configuration file https://gist.github.com/revilokeb/d9d90998edd7c1dc6ac7 that you have pointed to ?
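
When the export line ends up in a provisioning script that may run more than once, it helps to append the CUDA directory only if it is missing. A minimal sketch; append_ld_path is a hypothetical helper, and whether a given Packer provisioner passes the variable on to later build steps is not verified here.

```shell
# Append a directory to LD_LIBRARY_PATH unless it is already listed.
append_ld_path() {
  local dir="$1"
  case ":${LD_LIBRARY_PATH:-}:" in
    *":${dir}:"*) ;;   # already present, nothing to do
    *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}" ;;
  esac
  export LD_LIBRARY_PATH
}

append_ld_path /usr/local/cuda/lib64
```

Unlike a plain export with a leading $LD_LIBRARY_PATH, this does not produce an empty first entry when the variable was unset before.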

@revilokeb

@beniz Yes, had done this yesterday without success, but re-did it now and: it is working. Shame on me. Probably did some typo or whatever. Anyway, thanks so much ;-) (this does however not really solve the more fundamental packer issue with paths as far as I can see, however I will update that issue, too)

So the process could now be automated. As every build incurs costs for instantiating a GPU instance at AWS, I would suggest building a fresh AMI at reasonable time intervals, say every week. I could do this if you want and make the AMI publicly available. If you would like to have it built more often I could of course provide you with all pieces of information so that you can do it on an AWS account of your own. What do you think?

Second question is what exactly should be on the AMI? The AMI that I had manually built a month ago had googlenet and residual nets readily available for prediction. You have brought up the frontend, which could also be set up easily. Anything else that could make sense for a potential user of such an AMI to have readily available?

Some ideas:

  • some standard dataset for benchmarking (license has to be watched however)
  • some additional "standard" convnet models such as inception v3 (but not available in Caffe to my knowledge yet)
  • ?

@beniz
Collaborator Author

beniz commented Mar 18, 2016

So the process could now be automated.

Thanks, this is highly welcome. Is

packer build -var 'aws_access_key=...' -var 'aws_secret_key=...' dede.json

enough to build the whole thing ? I do confess I have not yet spent any time with packer.

As every build incurs costs for instantiating a GPU instance at AWS, I would suggest building a fresh AMI at reasonable time intervals, say every week. I could do this if you want and make the AMI publicly available. If you would like to have it built more often I could of course provide you with all pieces of information so that you can do it on an AWS account of your own. What do you think?

I would certainly appreciate it if you could update the AMI at your own pace for now, and there's no doubt that we'll move the process onto our side in the near future, with more timely updates. The full recipe is always welcome.

Anything else that could make sense for a potential user of such AMI to have readily available

Maybe some of the models from http://www.deepdetect.com/applications/model/ could be of interest. Among the often requested ones are the clothing and the gender models. From http://www.deepdetect.com/applications/text_model/ the english sentiment model has generated some interest. Other than that we are open to requests that would be of common use, beginning with anything that could be of interest to you.

  • some standard dataset for benchmarking (license has to be watched however)

Sure, there's a way to do this easily if you are interested. When building with unit tests (-DBUILD_TESTS=ON), cmake downloads a set of reference datasets against which many critical functions are tested. This may not be ideal, but the datasets can then be found in build/examples/; they include mnist, newsgroup20, forest_type and sflare.

  • some additional "standard" convnet models such as inception v3

It is kind of partially available now, see https://github.com/smichalowski/google_inception_v3_for_caffe. If this is of interest to you, you can open a dedicated issue for having inception v3 provided as a template. If you have the ability to help with reviewing the prototxt, try the architecture on small datasets and report back, it would be appreciated. Building the whole final model on Imagenet data is something we can do, or weights could be transferred from an existing TF model for instance.

@revilokeb

I have now set up a weekly cron job to create a fresh AMI with the most recent deepdetect master each Sunday at 11pm. In addition the AMI has the following features:

  • Googlenet as well as ResNet 50, 101 and 152 can all be immediately used for ImageNet 1k prediction as described above
  • User interface for ImageNet classification can be directly used (see below for how to)
  • All image models from http://www.deepdetect.com/applications/model/ can be directly used for prediction: classification of age, bags, buildings, clothing, fabric, footwear, furnitures, gender, sports, trees (see below for how to)
  • English sentiment model from http://www.deepdetect.com/applications/text_model/ can be directly used for prediction (see below for how to)

You can find the publicly available AMI in EU Ireland (eu-west-1); the AMI name is "deepdetect_revilokeb_{{timestamp}}", where {{timestamp}} is the UNIX timestamp of the build time. So if you want the most recent build, take the one with the highest timestamp (I will typically leave one or two previous versions publicly available). The first automatically generated AMI today has the AMI name deepdetect_revilokeb_1458665864 (ami-d79f1ba4); I have removed all manually built ones from before.
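
Picking the AMI with the highest timestamp can be scripted. The sketch below assumes names following the deepdetect_revilokeb_{{timestamp}} scheme, one per line on standard input; latest_ami_name is a hypothetical helper, and the AWS CLI call in the comment needs configured credentials.

```shell
# Read AMI names (one per line) and print the one with the highest
# UNIX timestamp suffix. Names that do not match the scheme are ignored.
latest_ami_name() {
  grep -E '^deepdetect_revilokeb_[0-9]+$' | sort -t _ -k 3,3n | tail -n 1
}

# Listing the public names with the AWS CLI and feeding them in:
#   aws ec2 describe-images --region eu-west-1 \
#     --filters "Name=name,Values=deepdetect_revilokeb_*" \
#     --query 'Images[].Name' --output text | tr '\t' '\n' | latest_ami_name
```

The sort is numeric on the third underscore-separated field, so it keeps working if the timestamp ever gains a digit.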

How to use the user interface:

  1. Set Security Groups at AWS appropriately, i.e. edit Inbound Rules and set Type: Http, Protocol: TCP, Port Range: 80, Source: IP_FROM_WHICH_YOU_ACCESS
  2. On your instance: sudo service nginx start
  3. Start deepdetect: screen, then cd ~/deepdetect/build/main, then ./dede, then ctrl a-d
  4. Start service (Googlenet or any ResNet) similar to above: e.g.
    curl -X PUT "http://0.0.0.0:8080/services/imageserv" -d '{"mllib":"caffe","description":"image classification service","type":"supervised","parameters":{"input":{"connector":"image"},"mllib":{"gpu":true,"template":"resnet_152","nclasses":1000}},"model":{"templates":"../templates/caffe/","repository":"../../models/imgnet_resnet152"}}'
  5. Open your favourite browser and enter IP of your instance as follows: IP_OF_YOUR_INSTANCE/index.html

How to use other image models, with gender classification as an example (others can be used similarly):

  1. On your instance start deepdetect, i.e. screen, then cd ~/deepdetect/build/main, then ./dede -host 0.0.0.0 -port 5010, then ctrl a-d
  2. Initiate service:
    curl -X PUT "http://0.0.0.0:5010/services/imageserv" -d '{"mllib":"caffe","description":"gender classification service","type":"supervised","parameters":{"input":{"connector":"image"},"mllib":{"gpu":true,"template":"gender","nclasses":2}},"model":{"templates":"../templates/caffe/","repository":"../../models/gender"}}'
  3. Make prediction:
    curl -X POST "http://0.0.0.0:5010/predict" -d '{"service":"imageserv","parameters":{"input":{"width":224,"height":224},"output":{"best":1}},"data":["https://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Francois_Hollande_2015.jpeg/800px-Francois_Hollande_2015.jpeg"]}'

Other services can be set up in exactly the same way; the only differences are as follows:

  1. Replace "gender" in the above with what you would like to classify, i.e. bags, buildings, clothing etc
  2. Replace the number of classes in step 2 by the number of classes shown on the website http://www.deepdetect.com/applications/model/
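
The two substitutions can be captured in a helper that takes the model name and its number of classes. This is a sketch that mirrors the gender call above, so it assumes every model uses its own name both as template and as directory under ../../models; service_payload and create_service are hypothetical names.

```shell
# Build the service-creation body for one of the pre-installed image models.
# Arguments: model name (e.g. gender, clothing), number of classes.
service_payload() {
  local model="$1" nclasses="$2"
  printf '{"mllib":"caffe","description":"%s classification service","type":"supervised","parameters":{"input":{"connector":"image"},"mllib":{"gpu":true,"template":"%s","nclasses":%s}},"model":{"templates":"../templates/caffe/","repository":"../../models/%s"}}\n' \
    "$model" "$model" "$nclasses" "$model"
}

# Create the service on a running server.
# Arguments: host, port, service name, model name, number of classes.
create_service() {
  local host="$1" port="$2" name="$3" model="$4" nclasses="$5"
  curl -s -X PUT "http://${host}:${port}/services/${name}" \
    -d "$(service_payload "$model" "$nclasses")"
}

# Usage:
#   create_service 0.0.0.0 5010 imageserv gender 2
```

The number of classes still has to be looked up on the model page for each model.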

How to use the English sentiment model:

  1. On your instance start deepdetect, i.e. screen, then cd ~/deepdetect/build/main, then ./dede -host 0.0.0.0 -port 5010, then ctrl a-d
  2. Initiate service:
    curl -X PUT 'http://0.0.0.0:5010/services/sent_en' -d '{"mllib":"caffe","description":"English sentiment classification","type":"supervised","parameters":{"input":{"connector":"txt","characters":true,"alphabet":"abcdefghijklmnopqrstuvwxyz0123456789,;.!?'\''","sequence":140},"mllib":{"nclasses":2,"template":"english_sentiment"}},"model":{"templates":"/home/ubuntu/deepdetect/build/templates/caffe/","repository":"/home/ubuntu/deepdetect/models/english_sentiment"}}'
  3. Make prediction:
    curl -X POST 'http://0.0.0.0:5010/predict' -d '{"service":"sent_en","parameters":{"mllib":{"gpu":true}},"data":["Thats so great"]}'

For lack of time I haven't tested all the above models (but I have run a few plausibility checks on gender, clothing, bags, buildings, english_sentiment). In case there is an error lurking somewhere please drop a note.

I will publish the steps how to automatically create those AMIs as soon as I have time. If there is a good idea what could be usefully added to the AMI let me know...

Remarks:

  • I observed that the age model does not comprise a "train_val" and a "solver", those might need to be added manually for the moment
  • When trying the english sentiment model I noticed a few "unintuitive" results, such as "data":["I am loving it"] -> "prob":0.5171114802360535,"cat":"negative", pls let me know if there is something wrong with that model

@beniz
Collaborator Author

beniz commented Mar 23, 2016

@revilokeb outstanding work, thank you! If you don't mind we'll link AMI users to your last comment for the time being but we'll monitor and help take care of any demand for help, issues, etc...

Would you have any data and insights that would help users of the AMI regarding:

  • recommendation for selecting the instance type, especially wrt CPU and GPU ?
  • any insights on the performances, such as single image classification time on googlenet or any related task (CPU and GPU) ?

Regarding the JS frontend, it needs to be patched in order to work properly with the non-Imagenet models. Are you using it for your own purposes by any chance ? The reason I'm asking is that we have a patch for making it work with potentially any model, but we could target it a bit better with some user feedback. The patch has been on hold for a while so it might be time to put it out.

Remarks well noted. The English sentiment model is rough, definitely best used for finetuning on similar or other tasks, and average for sentiment, though I had a user report that it is on par with some other models based on social media data.

Let me know if there are elements we can help with.

@revilokeb

Instance Types:
There are currently 2 different current-generation GPU instances available in eu-west-1 (as in most other regions): g2.2xlarge (1 GK104, 4GB GPU RAM) and g2.8xlarge (4 GK104, 4 x 4GB GPU RAM). Net on-demand prices in eu-west-1 are $0.702 per hour for g2.2xlarge and $2.808 per hour for g2.8xlarge (https://aws.amazon.com/ec2/pricing/).

Other instance types at AWS (except for some previous generation CG1 instances) currently do not have GPU.

The AMIs I am preparing here have appropriate NVIDIA drivers, CUDA 7.5 toolkit and cuDNN v4 installed, and compile deepdetect with CUDA and cuDNN.

Therefore if someone chooses to instantiate the AMI on a non-GPU instance (such as m3.xlarge) there will be errors when doing prediction (independent of setting the service to CPU or GPU by e.g. "mllib":{"gpu":true}). Errors will look as follows when doing prediction: {"status":{"code":500,"msg":"InternalError","dd_code":1007,"dd_msg":"src/caffe/layers/cudnn_conv_layer.cpp:52 / Check failed (custom): (error) == (cudaSuccess)"}}

Thus it only makes sense to instantiate the AMI with either g2.2xlarge or g2.8xlarge. As the latter is much more expensive, it only pays off if you make good use of multi-GPU, e.g. by training on multi-GPU or by running many services at the same time and spreading them over different GPUs (the latter I haven't tried so far).
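
A quick check before starting dede can catch a wrong instance type early. This sketch relies on nvidia-smi, which ships with the NVIDIA driver, and on its -L option, which lists one line per GPU; gpu_status is a hypothetical helper name.

```shell
# Print "gpu" if the NVIDIA driver reports at least one GPU, "no-gpu" otherwise.
gpu_status() {
  if command -v nvidia-smi >/dev/null 2>&1 \
     && nvidia-smi -L 2>/dev/null | grep -q '^GPU '; then
    echo "gpu"
  else
    echo "no-gpu"
  fi
}

if [ "$(gpu_status)" = "no-gpu" ]; then
  echo "No GPU detected: predictions with this CUDA/cuDNN build are expected to fail." >&2
fi
```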

Performance:
Here I am reporting the times as reported by deepdetect of second or later images on g2.2xlarge running the gender pre-trained model (googlenet 2014). Be aware that prediction of the first image after setting up the service can take much longer.

On CPU - single image from wikipedia
{"status":{"code":200,"msg":"OK"},"head":{"method":"/predict","service":"imageserv","time":2162.0},"body":{"predictions":{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Francois_Hollande_2015.jpeg/800px-Francois_Hollande_2015.jpeg","classes":{"last":true,"prob":1.0,"cat":"men"}}}}

On GPU - single image from wikipedia
{"status":{"code":200,"msg":"OK"},"head":{"method":"/predict","service":"imageserv","time":185.0},"body":{"predictions":{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Francois_Hollande_2015.jpeg/800px-Francois_Hollande_2015.jpeg","classes":{"last":true,"prob":1.0,"cat":"men"}}}}

On CPU - multiple images from wikipedia
{"status":{"code":200,"msg":"OK"},"head":{"method":"/predict","service":"imageserv","time":8620.0},"body":{"predictions":[{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/d/d9/Donald_Tusk_2013-12-19.jpg/800px-Donald_Tusk_2013-12-19.jpg","classes":{"last":true,"prob":1.0,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/8/8d/President_Barack_Obama.jpg/800px-President_Barack_Obama.jpg","classes":{"last":true,"prob":0.9999995231628418,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Francois_Hollande_2015.jpeg/800px-Francois_Hollande_2015.jpeg","classes":{"last":true,"prob":1.0,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/8/80/Beata_Szydlo_2015.jpg/800px-Beata_Szydlo_2015.jpg","classes":{"last":true,"prob":0.9982165694236755,"cat":"women"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/9/96/Charles_Michel_%28politician%29.jpg","classes":{"last":true,"prob":1.0,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/4/49/Angela_Merkel_Security_Conference_February_2015_%28cropped%29.jpg","classes":{"last":true,"prob":0.9999743700027466,"cat":"women"}}]}}

On GPU - multiple images from wikipedia
{"status":{"code":200,"msg":"OK"},"head":{"method":"/predict","service":"imageserv","time":267.0},"body":{"predictions":[{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/8/80/Beata_Szydlo_2015.jpg/800px-Beata_Szydlo_2015.jpg","classes":{"last":true,"prob":0.9982165694236755,"cat":"women"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/d/d9/Donald_Tusk_2013-12-19.jpg/800px-Donald_Tusk_2013-12-19.jpg","classes":{"last":true,"prob":1.0,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/8/8d/President_Barack_Obama.jpg/800px-President_Barack_Obama.jpg","classes":{"last":true,"prob":0.9999995231628418,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Francois_Hollande_2015.jpeg/800px-Francois_Hollande_2015.jpeg","classes":{"last":true,"prob":1.0,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/9/96/Charles_Michel_%28politician%29.jpg","classes":{"last":true,"prob":1.0,"cat":"men"}},{"uri":"https://upload.wikimedia.org/wikipedia/commons/4/49/Angela_Merkel_Security_Conference_February_2015_%28cropped%29.jpg","classes":{"last":true,"prob":0.9999743700027466,"cat":"women"}}]}}

You can drive the above times down a lot by e.g. hosting your image server in the proximity of your server running deepdetect (given this is possible for your task). This can be achieved on AWS e.g. by placing your deepdetect server and your image server into an AWS placement group (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html). In such a setup I have achieved the following times for some custom-trained googlenet 2014:
GPU single image: ~45ms
GPU multi image: <20ms per image
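
To compare runs, the "time" field in the head of each reply (milliseconds) can be pulled out and divided by the number of images. A sketch using sed and awk only, so no JSON tool is needed; reply_time_ms and per_image_ms are hypothetical helper names, and the sed pattern assumes a single "time" field in the reply.

```shell
# Extract the "time" value (milliseconds) from a deepdetect reply on stdin.
reply_time_ms() {
  sed -n 's/.*"time":\([0-9.]*\).*/\1/p'
}

# Average time per image. Arguments: total milliseconds, number of images.
per_image_ms() {
  awk -v total="$1" -v count="$2" 'BEGIN { printf "%.1f\n", total / count }'
}

per_image_ms 8620 6   # CPU, six images
per_image_ms 267 6    # GPU, six images
```

With the figures reported above this gives 1436.7 ms per image on CPU and 44.5 ms on GPU, i.e. the GPU batch is roughly 32 times faster, while the single-image case (2162 ms vs 185 ms) is about 12 times faster.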

Frontend:
I have tried the frontend only once a few months ago, and now again while setting up the scripts. However I have occasionally used the caffe demo for displaying trained models / results, which can serve a similar purpose. So in fact I believe it would be really valuable if the frontend allowed replacing the service easily (I have tried, but failed when allowing myself to spend a couple of minutes on it). When using the caffe demo I adjusted small things here and there, such as the display of top-1, top-3 or top-5. Also, a couple of times I was doing multi-label regression and needed to display several numbers next to a picture. Another time I displayed the pictures most similar to an uploaded picture based on high-level feature similarity. But of course it is probably impossible and not necessary to provide all such things out of the box. If one were however able to replace the service easily, this would in my eyes definitely be a nice and helpful thing.

@beniz
Collaborator Author

beniz commented Mar 24, 2016

Excellent, thank you very much. It is a very thorough review, it's good to see that the software appears to be useful to you.

My attempt to bring some more information on some points and discuss some others is below.

running many services at the same time and spreading them to different GPUs (the latter I havent tried so far)

This is something we rely on daily: the gpuid parameter assigns a given service and job to a target GPU. The only downside that I know of is that you may starve one of the GPUs if the storage is too slow, especially when training more than one service at once. One way to mitigate this issue is to load the data from different disks or sources.
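
As an illustration, a service-creation body that pins a service to one GPU could be built as below. Only the parameter name gpuid comes from the comment above; placing it next to "gpu" inside parameters.mllib is an assumption to check against the API documentation, and service_payload_on_gpu is a hypothetical helper.

```shell
# Build a service-creation body that targets one GPU.
# Arguments: template, number of classes, model repository, GPU id.
# Assumption: "gpuid" sits inside parameters.mllib next to "gpu".
service_payload_on_gpu() {
  local template="$1" nclasses="$2" repository="$3" gpuid="$4"
  printf '{"mllib":"caffe","description":"image classification service","type":"supervised","parameters":{"input":{"connector":"image"},"mllib":{"gpu":true,"gpuid":%s,"template":"%s","nclasses":%s}},"model":{"templates":"../templates/caffe/","repository":"%s"}}\n' \
    "$gpuid" "$template" "$nclasses" "$repository"
}

# Two services spread over the first two GPUs of a multi-GPU instance:
service_payload_on_gpu googlenet 1000 ../../models/imgnet 0
service_payload_on_gpu resnet_152 1000 ../../models/imgnet_resnet152 1
```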

GPU single image: ~45ms
GPU multi image: <20ms per image

This sounds good to me. There still might be a slight overhead due to the server. I'm interested to know whether this can be a bottleneck in practice. Not sure where it exactly stands wrt pure Caffe to be very honest.

So in fact I believe it would be really valuable if the frontend allowed to replace the service easily (I have tried, but failed when allowing myself to spend a couple of minutes on it)
...
If one were however able to replace the service easily, this would in my eyes definitely be a nice and helpful thing

This is a very good point. I believe it should deserve a dedicated ticket and that someone on our side could take care of. This could go along with a pure JS client btw, which is something on our roadmap. Measuring practical needs such as this one is useful to us in order to prioritize some of the planned features. Our main JS developer is soon to get back to the world of diapers, but will see who can jump in.

Also, a couple of times I was doing multi-label regression and needed to display several numbers next to a picture

Were you using dd for multi-label regression ? If yes, there are several ways to achieve this, and only one is actually supported, so any feedback is welcome.

Another time I displayed most similar pictures to an uploaded picture based on high-level feature similarity. But of course it is probably impossible and not necessary to provide all such things out of the box.

Have you looked at https://github.com/beniz/deepdetect/tree/master/demo/imgsearch by any chance already ? Having a similar (or same, but configurable) UI as the one targeted at image classification but for this similarity search purpose is something that has been suggested. It should be pretty straightforward, and if you have code to share, it could prove useful. Don't hesitate if you have any remarks on image similarity search, its accuracy and packaging.

@beniz
Collaborator Author

beniz commented Apr 1, 2016

@revilokeb, hi! Interested in the recipe for AMI, and moving forward when you have time to share.

@revilokeb

@beniz sure, you are right, I am a bit behind. I'll try posting everything on the weekend. I think what we need is a complete step-by-step guide starting at the standard Ubuntu AMI available at AWS, right? This would allow users to do two things: either set up deepdetect from scratch, or automate the process and do what I am currently doing when building the deepdetect AMI, including those models which are immediately usable.

@beniz
Collaborator Author

beniz commented Apr 1, 2016

@revilokeb yes, and from there, there is a good chance we will modify this and push it onto the AWS Marketplace to get a dedicated landing page and to spread the image onto more datacenters. My understanding is that users would benefit from a quick 1-click install as well. Let me know your thoughts btw.

@revilokeb

@beniz I would definitely agree that a 1-click install from the most important regions might help some users.

I would expect that the most important benefits from such AMIs comprise the following:

  1. Having a GPU machine directly available and being able to start testing and gaining experience for as little money as a few EUR / USD
  2. Being able to scale training with various hyperparameter setups
  3. Being able to scale production services easily

I assume users dealing with 2 and/or 3 don't desperately need the AMI; they might be happy with good documentation. If on the other hand you have no GPU machine readily available, I believe the AMI can indeed be of substantial help. I would assume that users belonging to that group start exploring by setting up some predictive service (here the pre-trained models on the current AMI might help a lot), then would love to try some useful frontend. After having played a bit, I would assume that the next step is solving a problem of your own by finetuning a pre-trained net on your own n-class dataset. So I could imagine that facilitating such network finetuning on your own dataset, by making it as out-of-the-box as possible, might be helpful. Not 100% sure though.

What do you think? Who do you think is looking for such AMI? And what is she looking for?

Here is the shell script I am running each Sunday evening from crontab: https://gist.github.com/revilokeb/9ea39cac9df1e1e8044c49b5aa4b0637

This obviously requires that Packer (https://www.packer.io/downloads.html) and the AWS CLI (http://docs.aws.amazon.com/cli/latest/userguide/installing.html) are both installed. The AWS CLI also needs to be appropriately configured with ACCESS_KEY and SECRET_ACCESS_KEY: http://docs.aws.amazon.com/cli/latest/reference/configure/.
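As a small illustration of that prerequisite, a preflight check along these lines could be run before the weekly build. This is only a sketch and is not part of the linked gist: the `check_tools` helper is hypothetical, and it only verifies that the `packer` and `aws` executables are on the PATH, not that credentials are configured.

```python
import shutil


def check_tools(tools=("packer", "aws")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(found):
    """List the tools that could not be located."""
    return [tool for tool, path in found.items() if path is None]


found = check_tools()
missing = missing_tools(found)
if missing:
    print("Missing prerequisites: " + ", ".join(missing))
else:
    print("packer and aws were both found on PATH")
```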

The shell script needs the following packer template: https://gist.github.com/revilokeb/bb6b18a9f3a42405af5286de84c168a5
The template looks more complicated than it really is. What it does is clone and build deepdetect and then move pre-trained models to the right locations. I have put all things that I expect not to change soon, such as CUDA 7.5, cuDNN v4, prerequisites / dependencies and pre-trained models, onto the public AMI deepdetect_base_160322 (ami-4f9f1b3c). If any of those need to change I will need to adjust that AMI manually and reference the new base AMI in the packer script. In my view this seems acceptable because for some things I don't know if automation is easily possible anyway (e.g. cuDNN requires logging into an NVIDIA account). Also, building the AMI is greatly accelerated if one does not repeat, at every build, the steps where nothing has changed (dependencies are of course updated / upgraded each time).
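To make the description above concrete, here is a minimal sketch of what such a Packer template boils down to, generated from Python for readability. It is not the linked gist. Only the base AMI ID and the clone / build / move-models sequence come from this comment; the region, instance type, SSH user, AMI name pattern, model paths, the 20:00 cron time and the script path are assumptions chosen for illustration.

```python
import json

BASE_AMI = "ami-4f9f1b3c"  # deepdetect_base_160322, as named in the comment above


def build_template(base_ami, region="eu-west-1", instance_type="g2.2xlarge"):
    """Return a Packer template (as a dict) with one amazon-ebs builder.

    region and instance_type are placeholder assumptions, not values
    taken from the original gist.
    """
    return {
        "builders": [{
            "type": "amazon-ebs",
            "region": region,
            "source_ami": base_ami,
            "instance_type": instance_type,
            "ssh_username": "ubuntu",  # assumed: default user on Ubuntu AMIs
            "ami_name": "deepdetect_{{timestamp}}",  # hypothetical naming scheme
        }],
        "provisioners": [{
            "type": "shell",
            "inline": [
                # Dependencies are refreshed at every build.
                "sudo apt-get update && sudo apt-get -y upgrade",
                # Clone and build deepdetect.
                "git clone https://github.com/beniz/deepdetect.git",
                "mkdir -p deepdetect/build",
                "cd deepdetect/build && cmake .. && make",
                # Move pre-trained models into place (both paths hypothetical).
                "mv /opt/pretrained_models deepdetect/build/models",
            ],
        }],
    }


def packer_command(template_path):
    """Command line that a weekly build script would run."""
    return ["packer", "build", template_path]


# Crontab entry for Sunday 20:00 (time of day and script path are assumptions).
CRON_LINE = "0 20 * * 0 /path/to/build_ami.sh"

template_json = json.dumps(build_template(BASE_AMI), indent=2)
```

Keeping slow-changing pieces in the base AMI means the provisioner list stays short, which is what makes the weekly rebuild fast.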

I will provide a description of how to build the base AMI, hopefully during the week. If there are any ideas on how to improve those scripts or the whole setup, happy to learn!

@beniz
Collaborator Author

beniz commented Apr 4, 2016

hi @revilokeb thanks again for sharing this very useful piece of work!

What do you think? Who do you think is looking for such AMI? And what is she looking for?

I believe your use cases 1 to 3 do actually cover most users. Most of dd's recurring users appear to be companies in a variety of markets. Easy install of the latest software version with a ready-to-use API endpoint and, as you suggested, some surrounding tools, some of which we are either working on or helping contributors with (1), are key elements I believe. So is scaling, for 2 and 3, to accommodate the training/re-training of models and prediction peaks. We would also like to extend the offer of useful models, and having a 1-click install plus a drop-in new/on-demand model repository to work with would greatly ease setup, which is in line with our aim of making deep learning a commodity.

(1) there's a tool under dev for continuous integration of model iterations & development as well as for visualization of training metrics, let me know if you are interested in joining this conversation as well.

@revilokeb

Hi Emmanuel, I share your view above. Happy to join the conversation on the tool for continuous integration and visualization of training metrics.

@Isaacpm

Isaacpm commented Apr 10, 2016

@revilokeb this is the repo @beniz has mentioned:
https://github.com/Isaacpm/deepgrid

I'm working on it and fixing stuff, any comments and/or help will be greatly appreciated

@revilokeb

@Isaacpm thanks I will have a look, give it a try and comment.

Btw, I have inadvertently deleted the above AMI deepdetect_base_160322 (ami-4f9f1b3c), which is necessary to perform my weekly builds, so it can't be found for the moment (just for info so that people are not looking for it). It is however not necessary for instantiating and running deepdetect on AWS (there is a build from last night: ami-950d8ee6). I will rebuild the base and document the necessary steps along the way, which were still missing anyway. I am thinking of moving the AWS-related documentation to some other place, as this issue is probably getting difficult to follow / understand.

@beniz
Collaborator Author

beniz commented Apr 11, 2016

@revilokeb having the steps documented would be great. You should have access to the wiki at https://github.com/beniz/deepdetect/wiki as a place to move the related documentation. This will save us some time in porting the image to the marketplace and other datacenters.

@Isaacpm

Isaacpm commented Apr 11, 2016

@revilokeb, keep an eye on it. We are changing a lot of stuff and the readme is not correct at all right now; I need to change that as soon as I can. But you can check the code, it's pretty straightforward.

beniz pushed a commit that referenced this issue May 1, 2016
first build with TF + Caffe, without Cuda, rough but fully automated,…
@beniz
Collaborator Author

beniz commented Dec 28, 2016

After a long wait, the official AMIs with support for Caffe, XGBoost and Tensorflow backends are available for both GPU and CPU:

See https://deepdetect.com/products/ami/ for thorough documentation.
