forked from bguthrie/crane
How to start a hadoop ec2 cluster using crane.
Workflow:
- def a conf map using the conf fn
- def a new Jec2 instance, which stores your credentials
- launch the hadoop cluster
Configuration maps:
creds.clj
{:key "AWS-KEY"
:secretkey "AWS-SECRET-KEY"
:private-key-path "/path/to/private-key"
:key-name "key-name"}
conf.clj
{:image "ami-"
:instance-type :m1.large
:group "ec2-group"
 :instances int
:creds "/path/to/creds.clj-dir"
:push ["/source/path" "/dest/path/file"
"/another/source/path" "/another/dest/path/file"]
:hadooppath "/path/to/hadoop" ;;used for remote pathing
:hadoopuser "hadoop-user" ;;remote hadoop user
:mapredsite "/path/to/local/mapred" ;;used as local templates
:coresite "/path/to/local/core-site"
:hdfssite "/path/to/local/hdfssite"}
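A likely way these maps get loaded (a sketch only; the real conf and creds fns in crane may differ) is to read the file as a clojure map literal:

```clojure
;; Hypothetical reader for the config maps above -- assumes conf.clj
;; and creds.clj each contain a single clojure map literal.
(defn read-conf
  "Read a clojure map literal from a .clj file, e.g. conf.clj or creds.clj."
  [path]
  (read-string (slurp path)))

;; (read-conf "/path/to/creds.clj")
;; => {:key "AWS-KEY" :secretkey "AWS-SECRET-KEY" ...}
```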
Example:
;;read in the config map from conf.clj
crane.cluster> (def my-conf (conf "/path/to/conf.clj-dir"))
;;create new Jec2 class
crane.cluster> (def ec (ec2 (creds "/path/to/creds.clj-dir")))
;;start the cluster
crane.cluster> (launch-hadoop-cluster ec my-conf)
To build:
1. Download and install leiningen: http://github.com/technomancy/leiningen
2. Set up the jclouds maven snapshot repo at http://jclouds.rimuhosting.com/maven2/snapshots
3. $ lein deps
4. $ lein install
5. $ java -cp `ls lib/*|xargs echo|sed 's/ /:/g'`:classes jline.ConsoleRunner clojure.main
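The backtick expression in step 5 just joins every file under lib/ into a colon-separated classpath. The same idiom, demonstrated on a scratch directory (hypothetical jar names, for illustration only):

```shell
# Build a colon-separated classpath from lib/*, as in build step 5.
mkdir -p lib
touch lib/a.jar lib/b.jar
CP=$(ls lib/* | xargs echo | sed 's/ /:/g')
echo "$CP"   # -> lib/a.jar:lib/b.jar
```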
Crane provides core libs for compute clouds, blobstores, and ssh.
Supported compute clouds: ec2, rackspace, rimuhosting, vcloud, terremark, hosting.com
Supported blobstores: s3, rackspace, azure, atmos online
Remote repl and cluster capabilities are built upon the core.
In a couple of cases we borrow the erlang convention of ! to mark "remote" (on another process) execution.
The first argument to ! in erlang is a pid, whereas in our api it is an open ssh session or socket.
There is no ! operator or fn; we just use the ! suffix convention to signal remote execution.
Remote shell execution:
(sh! session "tar -xzf repl.tar.gz")
Remote repl evaluation:
(eval! socket (execute (workflow some-cascading-workflow)))
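Since the README notes that exec-style channels work fine (only the persistent ChannelShell is shaky), an sh!-like fn over an already-open jsch session could be sketched as follows. This is a hypothetical illustration of the ! convention, not crane's actual sh! implementation:

```clojure
;; Sketch of exec-style remote execution over an open jsch Session.
;; Hypothetical helper -- crane's real sh! may differ.
(import '[com.jcraft.jsch Session ChannelExec])

(defn sh!
  "Run cmd on the remote host behind an open jsch session;
   returns the command's stdout as a string."
  [^Session session cmd]
  (let [^ChannelExec ch (.openChannel session "exec")]
    (.setCommand ch cmd)
    (let [in (.getInputStream ch)]
      (.connect ch)
      (try
        (slurp in)
        (finally (.disconnect ch))))))
```

The ! suffix tells the reader the work happens on the remote process, mirroring erlang's send operator.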
TODO:
- The persistent shell session using jsch ChannelShell is shaky at best, although exec works fine.
- eval! might be simpler if we could use LineNumberingPushbackReader; see the comment in remote_repl.clj.
- Add code to poll and pull info from the hadoop tracker url into the repl.