Flume Developer Notes
=====================
Jonathan Hsieh <jon@cloudera.com>
6/29/10
// This is in asciidoc markup
== Introduction
This is meant to be a guide to issues that occur when building,
debugging, and setting up flume as a developer.
== High level directory and file structure.
----
./bin/                    flume scripts
./conf/                   flume configuration files
./lib/                    libraries used by flume
./libbuild/               libraries used by flume for building
./libtest/                libraries used by flume for testing
./src/ahocorasick         a library for multiple string search
./src/antlr               flume config language ANTLR grammar files
./src/gen-java            autogenerated java source files (from antlr/thrift)
./src/java                flume java source code
./src/javaperf            flume performance tests (out of date)
./src/javatest            flume unit tests
./src/javatest-torture    flume reliability tests (out of date)
./src/thrift              flume thrift idl files (for rpc)
./src/webapps             flume webapp jsp source code
----
Files created by build:
----
./build
----
== Files in `.gitignore`
The exclusions in .gitignore are either autogenerated files or
build/eclipse-specific files.
== eclipse project setup.
Run "ant eclipse", then create a new java project in eclipse with the
current directory as the base project directory.
----
./.eclipse default working directory for eclipse
----
Note: eclipse class files are not used by bin/flume. You must either
a) compile via ant for bin/flume to pick up your modified code, or b)
put the eclipse class directories on the flume classpath, e.g.:
FLUME_CLASSPATH=./.eclipse/classes-main:./.eclipse/classes-test bin/flume
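Option b) can be scripted so that only the eclipse output directories that actually exist end up on the classpath. This is a minimal sketch; `eclipse_classpath` is a hypothetical helper, not part of bin/flume, and the directory names are the ones from the note above.

```shell
# Sketch: join the eclipse output directories that exist under BASE
# into a colon-separated classpath string.
# eclipse_classpath is a hypothetical helper, not part of bin/flume.
eclipse_classpath() {
  local base="${1:-.}"
  local cp=""
  local dir
  for dir in "$base/.eclipse/classes-main" "$base/.eclipse/classes-test"; do
    if [ -d "$dir" ]; then
      # Add a ':' separator only when cp already has an entry.
      cp="${cp:+$cp:}$dir"
    fi
  done
  printf '%s\n' "$cp"
}
```

Usage would look like `FLUME_CLASSPATH="$(eclipse_classpath .)" bin/flume`, followed by the usual bin/flume arguments.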
== Building thrift
This will create a repository in ./thrift:
----
git clone git://git.apache.org/thrift.git
cd thrift
git fetch
git checkout -b thrift-0.2.0 origin/tags/thrift-0.2.0
./bootstrap.sh
./configure
make
sudo make install
----
Problem: bootstrap.sh fails with the following error:
----
configure.ac:44: error: possibly undefined macro: AC_PROG_LIBTOOL
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
----
Solution: install libtool
----
sudo apt-get install libtool
----
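Errors like the one above can be caught before running bootstrap.sh by checking that the build tools are on the PATH. A minimal sketch follows; `missing_tools` is a hypothetical helper, and the tool list in the usage line is an assumption about what the thrift build needs, not something taken from the thrift documentation.

```shell
# Sketch: print each named command that is not found on PATH, one per line.
# missing_tools is a hypothetical helper.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, `missing_tools libtoolize autoconf automake make` prints nothing when all four are installed. On Debian and Ubuntu the libtool package normally provides `libtoolize`.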
== Generated source
These files should not be checked in unless their source files are modified.
For ANTLR:
----
src/gen-java/com/cloudera/flume/conf/FlumeDeployLexer.java
src/gen-java/com/cloudera/flume/conf/FlumeDeployParser.java
----
== Running tests with ant.
----
ant test
----
=== Running only specified test cases from ant
----
ant test -Dtestcase=<TestFile>
----
where <TestFile> is a class name without a path or a .java or .class extension.
(How do I specify just a function?)
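To run a handful of test classes one after another, the single-testcase form above can be wrapped in a loop. This is a sketch only: `run_testcases` is a hypothetical helper, the `ANT` override exists just to make dry runs possible, and stopping at the first failure assumes the `test` target exits non-zero when a test fails.

```shell
# Sketch: run `ant test -Dtestcase=NAME` for each NAME given,
# stopping at the first non-zero exit.
# run_testcases and the ANT override are hypothetical.
run_testcases() {
  local name
  for name in "$@"; do
    "${ANT:-ant}" test "-Dtestcase=$name" || return 1
  done
}
```

With `ANT=echo run_testcases TestA TestB` (placeholder class names) the commands are printed instead of run, which is a quick way to check the arguments.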
== origin/master invariants
Should always build.
Ideally, all tests pass.
== Push invariants
We should tag pushes with JIRA numbers.
== Flume's web application
The default setup for flume is to run its servlets from precompiled
jsps. The default configuration points jetty (a jsp server) to
the ./webapps directory. We assume that most
developers and users will work in core, so at the git project root dir,
./webapps is a symlink that points to the build/webapps directory.
This is where static files and the files auto-generated when flume is
compiled are placed. Using this symlink makes the flume webapp use
precompiled jsp pages.
One can debug the jsp pages or have them generated at runtime
(useful for development) by changing this symlink to point to
src/webapps. This directory has subapps and jsp source
code. Debugging is easier in this more dynamic setup because our servlet
container, Jetty, can dynamically compile jsp pages, which speeds up
debugging.
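Switching the symlink back and forth can be done with a small helper. The sketch below assumes it is given the git project root and that ./webapps is either absent or already a symlink; `point_webapps` is a hypothetical name, not an existing flume script.

```shell
# Sketch: repoint ROOT/webapps at TARGET (a directory relative to ROOT).
# point_webapps is a hypothetical helper, not part of flume.
point_webapps() {
  local target="$1"
  local root="${2:-.}"
  local link="$root/webapps"
  if [ ! -d "$root/$target" ]; then
    echo "no such directory: $root/$target" >&2
    return 1
  fi
  # Refuse to touch a real file or directory; only replace a symlink.
  if [ -e "$link" ] && [ ! -L "$link" ]; then
    echo "$link exists and is not a symlink" >&2
    return 1
  fi
  # -n keeps ln from descending into the directory the old link points to.
  ln -sfn "$target" "$link"
}
```

`point_webapps src/webapps` selects the jsp sources for development and `point_webapps build/webapps` restores the precompiled pages. The latter only works after a build has created build/webapps.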
Here are some tips for getting the web apps for Flume Master or Flume
Node running from inside eclipse.
* Make sure `tools.jar` is in your java classpath. If JAVA_HOME is
set to a JDK path (as opposed to a JRE), this jar should be
included. This jar includes the java compiler, which is required to
compile jsps so they can be served on the fly.
* Ant is used to compile jsps. Make sure some version of
ant-launcher.jar and ant-1.x.x.jar is in your build path (if you
are in eclipse, for example). These files live in ./libbuild.
* The default is to point the web app at a precompiled version
of the servlets. There is a hook in flume-site.xml to point
jetty at a directory full of jsps. It assumes that the flume
directory is the base for relative paths; a fully
qualified path can also be used.
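The first tip can be checked from a shell. This sketch assumes a Sun-style JDK layout, where the compiler jar lives at lib/tools.jar under the JDK home; `has_tools_jar` is a hypothetical helper.

```shell
# Sketch: succeed only if DIR looks like a JDK home containing lib/tools.jar.
# has_tools_jar is a hypothetical helper; the lib/tools.jar location is
# the layout used by Sun JDKs of this era.
has_tools_jar() {
  [ -n "$1" ] && [ -f "$1/lib/tools.jar" ]
}
```

For example: `has_tools_jar "$JAVA_HOME" || echo "JAVA_HOME does not point at a JDK" >&2`.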
Environment variables can be set in the +bin/flume-env.sh+ script.
----
# bin/flume-env.sh for Ubuntu installs
export JAVA_HOME=/usr/lib/jvm/java-6-sun
----
Alternatively, instead of using symlinks, one can set the following
property in the system's flume-site.xml file, as shown below.
----
<property>
<name>flume.webapps.path</name>
<value>src/webapps</value>
<description>This is the path used to find the web apps that display
flume node/master data.
Use src/webapps for development.
</description>
</property>
----
=== Problems when compiling JSPs
BUILD FAILED
/home/jon/flume/build.xml:471: java.lang.ExceptionInInitializerError
at org.apache.jasper.JspCompilationContext.createCompiler(JspCompilationContext.java:197)
at org.apache.jasper.JspC.processFile(JspC.java:772)
at org.apache.jasper.JspC.execute(JspC.java:908)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
...
Caused by java.lang.NoClassDefFoundError: org/apache/log4j/Category) (Caused by org.apache.commons.logging.LogConfigurationException: No suitable Log constructor [Ljava.lang.Class;@31554233 for org.apache.commons.logging.impl.Log4JLogger (Caused by java.lang.NoClassDefFoundError: org/apache/log4j/Category))
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:543)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:209)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:351)
at org.apache.jasper.compiler.Compiler.<clinit>(Compiler.java:55)
... 25 more
Make sure that log4j-xxx.jar is in your CLASSPATH.
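A quick way to confirm this is to scan the classpath entries for a jar whose file name starts with `log4j-`. The following is a minimal sketch; `classpath_has_jar` is a hypothetical helper, and it only inspects literal entries (it does not expand wildcard entries such as `lib/*`).

```shell
# Sketch: succeed if the colon-separated classpath CP contains an entry
# whose file name matches PREFIX-*.jar.
# classpath_has_jar is a hypothetical helper.
classpath_has_jar() {
  local prefix="$1"
  local rest="$2:"
  local entry
  while [ -n "$rest" ]; do
    entry="${rest%%:*}"   # first entry
    rest="${rest#*:}"     # everything after the first ':'
    case "${entry##*/}" in
      "$prefix"-*.jar) return 0 ;;
    esac
  done
  return 1
}
```

For example: `classpath_has_jar log4j "$CLASSPATH" || echo "log4j jar missing from CLASSPATH" >&2`.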
== Avro.
Flume uses a post-1.2.0 version of Avro whose reflection support
handles Strings, byte[]s, and fields defined in superclasses.
This requires (in repo):
----
lib/avro-1.2.0-dev.jar # (trunk hash 8911c848 ; more avro 1.3 than 1.2)
lib/paranamer-1.5.jar # extra reflection stuff
lib/jackson-core-asl-1.1.1.jar # json parser
lib/jackson-mapper-asl-1.1.1.jar # json parser
----
== Developer mode.
This is an option in bin/flume for using eclipse-built class files
instead of ant-built class files.
In bash, one would set FLUME_DEVMODE to true:
----
$ declare -x FLUME_DEVMODE=true
----
It is assumed that the eclipse build path is build_eclipse/.
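To illustrate what such a switch amounts to, here is a sketch of how a launcher script could pick its class directory from FLUME_DEVMODE. It is not the actual bin/flume logic: `select_class_dir` is hypothetical, and `build/classes` is only an assumed location for the ant output.

```shell
# Sketch only, not the real bin/flume code: choose a class directory
# based on FLUME_DEVMODE. select_class_dir is hypothetical.
select_class_dir() {
  if [ "${FLUME_DEVMODE:-false}" = "true" ]; then
    printf '%s\n' "build_eclipse"    # eclipse build path named above
  else
    printf '%s\n' "build/classes"    # assumed ant output directory
  fi
}
```

Note that `declare -x FLUME_DEVMODE=true` is the same as `export FLUME_DEVMODE=true`, and that the variable can also be set for a single run by prefixing the command with `FLUME_DEVMODE=true`.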
== Building Windows packages
Building a full windows package and installer executable requires a
few steps. A cygwin environment is currently assumed.
1) Build flume jars ('ant tar').
2) Update the installer script to add versioning information ('ant
winstall-filter'). This generates ./flume.nsi.
3) Run the NSIS compiler on the generated flume.nsi. We've used v2.46 (http://nsis.sourceforge.net/Download).
The current cut does not handle differences between 32-bit and
64-bit versions, does not handle errors properly, and does not check
whether the installer is run as administrator.
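The three steps can be chained in one function run from the project root. This is a sketch: `build_windows_package` and the `RUN` dry-run prefix are hypothetical, and it assumes `makensis` (the NSIS command-line compiler) is on the PATH of the cygwin shell.

```shell
# Sketch: run the three packaging steps in order, stopping at the first failure.
# build_windows_package and RUN are hypothetical; set RUN=echo for a dry run.
build_windows_package() {
  local run="${RUN:-}"
  $run ant tar &&
  $run ant winstall-filter &&
  $run makensis flume.nsi
}
```

`RUN=echo build_windows_package` prints the commands without executing them.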
== License
All source files must include the following header:
/**
* Licensed to Cloudera, Inc. under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. Cloudera, Inc. licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
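Files that lack the header can be listed with find and grep. A minimal sketch, assuming only .java files need checking and matching on the first line of the header; `missing_license` is a hypothetical helper, and generated sources (for example under src/gen-java) may legitimately show up in its output.

```shell
# Sketch: print every .java file under DIR (default: src) that does not
# contain the first line of the license header.
# missing_license is a hypothetical helper.
missing_license() {
  find "${1:-src}" -type f -name '*.java' \
    -exec grep -L 'Licensed to Cloudera, Inc\.' {} +
}
```

Running `missing_license src/java` from the project root prints nothing when every file carries the header.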