This repository has been archived by the owner on Nov 16, 2019. It is now read-only.

Adding Docker support for CaffeOnSpark #208

Merged: 17 commits, Jan 21, 2017

arundasan91
Contributor

Hello,

Please test the docker file locally. Steps are documented in docker/README.md.

The Docker image is for the CPU version of CaffeOnSpark. I will work on a GPU version soon.

I verified twice on my local machine that it works.

@anfeng, @mriduljain, could you please let me know if this is okay? It was a bit tricky getting Hadoop to work on Docker. I had to refer to "https://hub.docker.com/r/sequenceiq/hadoop-docker/" to finally get it working.

Also, I found that the PATH variables were not being set by the Dockerfile. I had to set them explicitly by calling a bootstrap.sh script when the container is started for the first time (it is not needed when attaching to an already existing container). bootstrap.sh also makes sure that Hadoop DFS and YARN start with the container.
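
As a rough illustration of the bootstrap approach described above, a script along these lines could export the variables and start the daemons. This is only a sketch, not the PR's actual bootstrap.sh: the install paths `/usr/local/hadoop` and `/usr/local/spark` and the variable names are assumptions; only `/opt/CaffeOnSpark` comes from the PR description.

```shell
#!/bin/bash
# Hypothetical bootstrap sketch. The Hadoop and Spark locations below are
# assumptions for illustration, not the paths used in the PR.
export HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
export SPARK_HOME="${SPARK_HOME:-/usr/local/spark}"
export CAFFE_ON_SPARK="${CAFFE_ON_SPARK:-/opt/CaffeOnSpark}"
export PATH="$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$PATH"

# Start HDFS and YARN only if the Hadoop scripts are present, so the
# sketch degrades to a no-op on a machine without Hadoop.
start_hadoop() {
    if [ -x "$HADOOP_HOME/sbin/start-dfs.sh" ] && [ -x "$HADOOP_HOME/sbin/start-yarn.sh" ]; then
        "$HADOOP_HOME/sbin/start-dfs.sh"
        "$HADOOP_HOME/sbin/start-yarn.sh"
    else
        echo "Hadoop scripts not found under $HADOOP_HOME; skipping start" >&2
    fi
}

start_hadoop
```

Keeping the exports in a script that runs at container start means every process launched from it sees the same environment, which matches the behavior described in the comment.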

Please note: parts of the Hadoop setup were taken from "https://hub.docker.com/r/sequenceiq/hadoop-docker/~/dockerfile/"

CaffeOnSpark is in the directory /opt/CaffeOnSpark. Hadoop and Spark are set up correctly. CaffeOnSpark works properly; the MNIST example ran successfully.
Required: passwordless SSH for Hadoop.
Required: start Hadoop DFS and YARN when the container starts, and pass the required environment variables.
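
The passwordless SSH requirement is commonly met by generating a key pair with an empty passphrase and authorizing it for the same account. The function below is a minimal sketch of that pattern, not the PR's actual setup; the function name `setup_passwordless_ssh` and the RSA key type are assumptions.

```shell
# Create a passphrase-less key pair in the given directory and authorize it
# for logins to the same account. Defaults to ~/.ssh.
# (Hypothetical helper; not part of the PR.)
setup_passwordless_ssh() {
    local dir="${1:-$HOME/.ssh}"
    local pub
    mkdir -p "$dir" && chmod 700 "$dir"
    # Generate the key only once so repeated runs do not overwrite it.
    if [ ! -f "$dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N "" -f "$dir/id_rsa"
    fi
    pub="$(cat "$dir/id_rsa.pub")"
    # Append the public key only if it is not already authorized.
    if ! grep -qxF -- "$pub" "$dir/authorized_keys" 2>/dev/null; then
        echo "$pub" >> "$dir/authorized_keys"
    fi
    chmod 600 "$dir/authorized_keys"
}

# Usage inside the container (writes to ~/.ssh):
#   setup_passwordless_ssh
```

The function is only defined here, not called, so sourcing the snippet does not touch an existing `~/.ssh` directory.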
@yahoocla

yahoocla commented Dec 8, 2016

CLA is valid!

@mriduljain
Contributor

mriduljain commented Dec 9, 2016 via email

MAINTAINER arun.das@my.utsa.edu

RUN apt-get update && apt-get install -y software-properties-common
RUN add-apt-repository ppa:openjdk-r/ppa
Contributor Author

This line is repeated on line 37 as well. Sorry about that; I didn't notice it while uploading. It causes no errors, but it is redundant.

@javadba
Contributor

javadba commented Dec 28, 2016

This is a great idea! Thanks for contributing this!

@arundasan91
Contributor Author

@javadba, were you able to get it working? Could you please give it a try?

@anfeng
Contributor

anfeng commented Jan 10, 2017

@arundasan91 Can you expand this PR with GPU support?

@arundasan91
Contributor Author

@anfeng, sure. I was waiting for a reply to confirm that the CPU version is enough. I will add a folder named gpu with a Dockerfile to support NVIDIA GPUs.

I am unsure how to install the correct version of CUDA for a given GPU architecture and compute capability. I will refer to how Caffe itself handles it.

@arundasan91
Contributor Author

@anfeng, please see the changes. A GPU version of the Dockerfile has been added. I tested it in an NVIDIA K80 environment, and it works well.

The only change to note is that GPU builds use nvidia-docker instead of docker. I will go ahead and update the README. Also, the host machine needs the proper NVIDIA drivers installed (CUDA itself is not needed on the host).
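
To make the docker versus nvidia-docker distinction concrete, here is a small sketch that picks the CLI by build flavor and prints the resulting commands instead of running them. The image tag `caffeonspark:<flavor>` and the `docker/<flavor>` directory layout are hypothetical and may not match the README in this PR.

```shell
# Pick the container CLI for a build flavor: "cpu" or "gpu".
docker_cmd() {
    case "$1" in
        gpu) echo "nvidia-docker" ;;
        cpu) echo "docker" ;;
        *)   echo "unknown flavor: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the build and run commands for a flavor.
# The image tag and directory names are assumptions for illustration.
print_commands() {
    local cli
    cli="$(docker_cmd "$1")" || return 1
    echo "$cli build -t caffeonspark:$1 docker/$1"
    echo "$cli run -it caffeonspark:$1"
}

print_commands gpu
```

Printing the commands keeps the sketch safe to run on a machine without Docker or NVIDIA drivers installed.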

Please let me know if any changes or additions are required.

@anfeng
Contributor

anfeng commented Jan 19, 2017

@arundasan91 Good work. Can you add copyright notices to all new source files?
Let's avoid mentioning your name in the files, to be consistent with the other source files.

@javadba Have you tried out this PR?

@arundasan91
Contributor Author

@anfeng, yes, I will. I completely forgot to remove my name from the MAINTAINER tag. Will do.

@arundasan91
Contributor Author

@anfeng, just to be clear, these are the copyright lines, right? Shall I copy them into every file in the PR?

# Copyright 2016 Yahoo Inc.
# Licensed under the terms of the Apache 2.0 license.
# Please see LICENSE file in the project root for terms.
#
# This file.......

Thanks.

@javadba
Contributor

javadba commented Jan 19, 2017 via email

@anfeng
Contributor

anfeng commented Jan 20, 2017

@arundasan91 Yes, for all new files.

@arundasan91
Contributor Author

@anfeng, I added the copyright notice to every file.

I added it to the config/ssh_config file too. This file is copied to the container's .ssh/config. Since lines starting with # are treated as comments, this should be okay.

Could you please take a look at the files and let me know?
Thanks.
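
To illustrate why the header is harmless in an SSH client config, the sketch below writes a small config with the copyright lines on top and then prints only the lines that are not comments or blank. The `Host *` options are illustrative assumptions, not the actual contents of config/ssh_config in this PR.

```shell
# Write a minimal ssh_config-style file with the license header on top.
# The Host block options are illustrative, not the PR's actual settings.
write_ssh_config() {
    cat > "$1" <<'EOF'
# Copyright 2016 Yahoo Inc.
# Licensed under the terms of the Apache 2.0 license.
# Please see LICENSE file in the project root for terms.
Host *
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
EOF
}

# Print only the effective lines: drop lines starting with '#' and
# blank lines, which an SSH client config treats as comments.
effective_lines() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

tmp="$(mktemp)"
write_ssh_config "$tmp"
effective_lines "$tmp"
rm -f "$tmp"
```

Only the `Host` block survives the filter, so the three header lines do not change how the file is interpreted.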

@anfeng
Contributor

anfeng commented Jan 21, 2017

+1 @arundasan91 Excellent work.

@anfeng anfeng merged commit 19df500 into yahoo:master Jan 21, 2017
dillonfzw added a commit to bluemindor/CaffeOnSpark that referenced this pull request Dec 3, 2017
* 523b72e support build in mixed protobuf env
*   2a83f82 Merge branch 'bluemind' into trlcaffe_merge
|\
| * 16ec96f enhance makefile, no functional diff
| * 27edffa fix a bug
| * 379b673 add parameters for lmdbjni, spark, scala versions
| * aa54efe fix a makefile bug which break the build
| * 5f9f1ee build with new dli container image with protobuf and liblmdbjni version flexible
| *   8356ad4 Merge pull request yahoo#5 from degaochu/bluemind
| |\
| | * 316a1bc CaffeOnSpark copy weight file from Caffe is not real copy but move action . Change move file to copy file so that the file can inherit new diretory's acl attibutes
| |/
* | 24c7ab2 support migrated BVLC caffe
* | 8d26850 ankun's change about supporting NCCL enabled caffe
* | 63ecedd switch multigpu to nccl
* | f622620 update caffe-public submodule commit to trlmerge_yahoo2
* | 8c8b448 switch caffe-public to trlmerge_yahoo2
|/
* f8143cd pick up last bluemind caffe with fabric changes; switch caffe-public branch to "bluemind"
*   2d9b8f0 Merge pull request yahoo#4 from sunweisw/bluemind_v0.3
|\
| * dc4d26f Fix hang issue with multiple gpu
|/
*   2627390 Merge pull request yahoo#2 from fuzhiwen/dev
|\
| * 224c7a8 S.138410: lmdb enhancement for CaffeOnSpark, migrated to bluemind v0.3
| * effdcff support build CoS with protobuf v3.2
|/
* ef28f91 formally moved to latest community version of caffeOnSpark 19df500
*   2ccaa1d Merge pull request yahoo#1 from sunweisw/merge-hist
|\
| * 5d69279 Change for ppc64 env Disable test
|/
*   19df500 Merge pull request yahoo#208 from arundasan91/patch-3
5 participants