forked from facebookarchive/hadoop-20
-
Notifications
You must be signed in to change notification settings - Fork 0
/
YAHOO-CHANGES.txt
506 lines (357 loc) · 21.3 KB
/
YAHOO-CHANGES.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
Yahoo! Distribution of Hadoop Change Log
Patches from the following Apache Jira issues have been applied
to this release in the order indicated. This is in addition to
the patches applied from issues referenced in CHANGES.txt.
yahoo-hadoop-0.20.1-3195383008
HADOOP-6521. Fix backward compatiblity issue with umask when applications
use deprecated param dfs.umask in configuration or use
FsPermission.setUMask(). (suresh)
MAPREDUCE-1372. Fixed a ConcurrentModificationException in jobtracker.
(Arun C Murthy via yhemanth)
MAPREDUCE-1316. Fix jobs' retirement from the JobTracker to prevent memory
leaks via stale references. (Amar Kamat via acmurthy)
MAPREDUCE-1342. Fixed deadlock in global blacklisting of tasktrackers.
(Amareshwari Sriramadasu via acmurthy)
HADOOP-6460. Reinitializes buffers used for serializing responses in ipc
server on exceeding maximum response size to free up Java heap. (suresh)
MAPREDUCE-1100. Truncate user logs to prevent TaskTrackers' disks from
filling up. (Vinod Kumar Vavilapalli via acmurthy)
MAPREDUCE-1143. Fix running task counters to be updated correctly
when speculative attempts are running for a TIP.
(Rahul Kumar Singh via yhemanth)
HADOOP-6151, 6281, 6285, 6441. Add HTML quoting of the parameters to all
of the servlets to prevent XSS attacks. (omalley)
MAPREDUCE-896. Fix bug in earlier implementation to prevent
spurious logging in tasktracker logs for absent file paths.
(Ravi Gummadi via yhemanth)
MAPREDUCE-676. Fix Hadoop Vaidya to ensure it works for map-only jobs.
(Suhas Gogate via acmurthy)
HADOOP-5582. Fix Hadoop Vaidya to use new Counters in
org.apache.hadoop.mapreduce package. (Suhas Gogate via acmurthy)
HDFS-595. umask settings in configuration may now use octal or
symbolic instead of decimal. Update HDFS tests as such. (jghoman)
MAPREDUCE-1068. Added a verbose error message when user specifies an
incorrect -file parameter. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1171. Allow the read-error notification in shuffle to be
configurable. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-353. Allow shuffle read and connection timeouts to be
configurable. (Amareshwari Sriramadasu via acmurthy)
HADOOP-6428. HttpServer sleeps with negative values (cos)
HADOOP-6386. NameNode's HttpServer can't instantiate InetSocketAddress:
IllegalArgumentException is thrown. (cos)
HDFS-781. Namenode metrics PendingDeletionBlocks is not decremented. (suresh)
MAPREDUCE-1185. Redirect running job url to history url if job is already
retired. (Amareshwari Sriramadasu and Sharad Agarwal via sharad)
MAPREDUCE-754. Fix NPE in expiry thread when a TT is lost. (Amar Kamat
via sharad)
MAPREDUCE-896. Modify permissions for local files on tasktracker before
deletion so they can be deleted cleanly. (Ravi Gummadi via yhemanth)
HADOOP-5771. Implements unit tests for LinuxTaskController.
(Sreekanth Ramakrishnan and Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-1124. Import Gridmix3 and Rumen. (cdouglas)
MAPREDUCE-1063. Document gridmix benchmark. (cdouglas)
HDFS-758. Changes to report status of decommissioining on the namenode web
UI. (jitendra)
HADOOP-6234. Add new option dfs.umaskmode to set umask in configuration
to use octal or symbolic instead of decimal. (Jakob Homan via suresh)
MAPREDUCE-1147. Add map output counters to new API. (Amar Kamat via
cdouglas)
MAPREDUCE-1182. Fix overflow in reduce causing allocations to exceed the
configured threshold. (cdouglas)
HADOOP-4933. Fixes a ConcurrentModificationException problem that shows up
when the history viewer is accessed concurrently.
(Amar Kamat via ddas)
MAPREDUCE-1140. Fix DistributedCache to not decrement reference counts for
unreferenced files in error conditions.
(Amareshwari Sriramadasu via yhemanth)
HADOOP-6203. FsShell rm/rmr error message indicates exceeding Trash quota
and suggests using -skpTrash, when moving to trash fails.
(Boris Shkolnik via suresh)
HADOOP-5675. Do not launch a job if DistCp has no work to do. (Tsz Wo
(Nicholas), SZE via cdouglas)
HDFS-457. Better handling of volume failure in Data Node storage,
This fix is a port from hdfs-0.22 to common-0.20 by Boris Shkolnik.
Contributed by Erik Steffl
HDFS-625. Fix NullPointerException thrown from ListPathServlet.
Contributed by Suresh Srinivas.
HADOOP-6343. Log unexpected throwable object caught in RPC.
Contributed by Jitendra Nath Pandey
yahoo-hadoop-0.20.1-3092118007:
MAPREDUCE-1186. Fixed DistributedCache to do a recursive chmod on just the
per-cache directory, not all of mapred.local.dir.
(Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1231. Add an option to distcp to ignore checksums when used with
the upgrade option.
(Jothi Padmanabhan via yhemanth)
yahoo-hadoop-0.20.1-3092118006:
MAPREDUCE-1219. Fixed JobTracker to not collect per-job metrics, thus
easing load on it. (Amareshwari Sriramadasu via acmurthy)
HDFS-761. Fix failure to process rename operation from edits log due to
quota verification. (suresh)
yahoo-hadoop-0.20.1-3092118005:
MAPREDUCE-1196. Fix FileOutputCommitter to use the deprecated cleanupJob
api correctly. (acmurthy)
yahoo-hadoop-0.20.1-3092118004:
HADOOP-6344. rm and rmr immediately delete files rather than sending
to trash, despite trash being enabled, if a user is over-quota. (jhoman)
MAPREDUCE-1160. Reduce verbosity of log lines in some Map/Reduce classes
to avoid filling up jobtracker logs on a busy cluster.
(Ravi Gummadi and Hong Tang via yhemanth)
HDFS-587. Add ability to run HDFS with MR test on non-default queue,
also updated junit dependendcy from junit-3.8.1 to junit-4.5 (to make
it possible to use Configured and Tool to process command line to
be able to specify a queue). Contributed by Erik Steffl.
MAPREDUCE-1158. Fix JT running maps and running reduces metrics.
(sharad)
MAPREDUCE-947. Fix bug in earlier implementation that was
causing unit tests to fail.
(Ravi Gummadi via yhemanth)
MAPREDUCE-1062. Fix MRReliabilityTest to work with retired jobs
(Contributed by Sreekanth Ramakrishnan)
MAPREDUCE-1090. Modified log statement in TaskMemoryManagerThread to
include task attempt id. (yhemanth)
MAPREDUCE-1098. Fixed the distributed-cache to not do i/o while
holding a global lock. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1048. Add occupied/reserved slot usage summary on
jobtracker UI. (Amareshwari Sriramadasu via sharad)
MAPREDUCE-1103. Added more metrics to Jobtracker. (sharad)
MAPREDUCE-947. Added commitJob and abortJob apis to OutputCommitter.
Enhanced FileOutputCommitter to create a _SUCCESS file for successful
jobs. (Amar Kamat & Jothi Padmanabhan via acmurthy)
MAPREDUCE-1105. Remove max limit configuration in capacity scheduler in
favor of max capacity percentage thus allowing the limit to go over
queue capacity. (Rahul Kumar Singh via yhemanth)
MAPREDUCE-1086. Setup Hadoop logging environment for tasks to point to
task related parameters. (Ravi Gummadi via yhemanth)
MAPREDUCE-739. Allow relative paths to be created inside archives.
(mahadev)
HADOOP-6097. Multiple bugs w/ Hadoop archives (mahadev)
HADOOP-6231. Allow caching of filesystem instances to be disabled on a
per-instance basis (ben slusky via mahadev)
MAPREDUCE-826. harchive doesn't use ToolRunner / harchive returns 0 even
if the job fails with exception (koji via mahadev)
HDFS-686. NullPointerException is thrown while merging edit log and
image. (hairong)
HDFS-709. Fix TestDFSShell failure due to rename bug introduced by
HDFS-677. (suresh)
HDFS-677. Rename failure when both source and destination quota exceeds
results in deletion of source. (suresh)
HADOOP-6284. Add a new parameter, HADOOP_JAVA_PLATFORM_OPTS, to
hadoop-config.sh so that it allows setting java command options for
JAVA_PLATFORM. (Koji Noguchi via szetszwo)
MAPREDUCE-732. Removed spurious log statements in the node
blacklisting logic. (Sreekanth Ramakrishnan via yhemanth)
MAPREDUCE-144. Includes dump of the process tree in task diagnostics when
a task is killed due to exceeding memory limits.
(Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-979. Fixed JobConf APIs related to memory parameters to
return values of new configuration variables when deprecated
variables are disabled. (Sreekanth Ramakrishnan via yhemanth)
MAPREDUCE-277. Makes job history counters available on the job history
viewers. (Jothi Padmanabhan via ddas)
HADOOP-5625. Add operation duration to clienttrace. (Lei Xu
via cdouglas)
HADOOP-5222. Add offset to datanode clienttrace. (Lei Xu via cdouglas)
HADOOP-6218. Adds a feature where TFile can be split by Record
Sequence number. Contributed by Hong Tang and Raghu Angadi.
yahoo-hadoop-0.20.1-3041192001
MAPREDUCE-1088. Changed permissions on JobHistory files on local disk to
0744. Contributed by Arun C. Murthy.
HADOOP-6304. Use java.io.File.set{Readable|Writable|Executable} where
possible in RawLocalFileSystem. Contributed by Arun C. Murthy.
yahoo-hadoop-0.20.1-3041192000
MAPREDUCE-270. Fix the tasktracker to optionally send an out-of-band
heartbeat on task-completion for better job-latency. Contributed by
Arun C. Murthy
Configuration changes:
add mapreduce.tasktracker.outofband.heartbeat
MAPREDUCE-1030. Fix capacity-scheduler to assign a map and a reduce task
per-heartbeat. Contributed by Rahuk K Singh.
MAPREDUCE-1028. Fixed number of slots occupied by cleanup tasks to one
irrespective of slot size for the job. Contributed by Ravi Gummadi.
MAPREDUCE-964. Fixed start and finish times of TaskStatus to be
consistent, thereby fixing inconsistencies in metering tasks.
Contributed by Sreekanth Ramakrishnan.
HADOOP-5976. Add a new command, classpath, to the hadoop
script. Contributed by Owen O'Malley and Gary Murry
HADOOP-5784. Makes the number of heartbeats that should arrive
a second at the JobTracker configurable. Contributed by
Amareshwari Sriramadasu.
MAPREDUCE-945. Modifies MRBench and TestMapRed to use
ToolRunner so that options such as queue name can be
passed via command line. Contributed by Sreekanth Ramakrishnan.
yahoo-hadoop-0.20.0-3006291003
HADOOP:5420 Correct bug in earlier implementation
by Arun C. Murthy
HADOOP-5363 Add support for proxying connections to multiple
clusters with different versions to hdfsproxy. Contributed
by Zhiyong Zhang
HADOOP-5780. Improve per block message prited by -metaSave
in HDFS. (Raghu Angadi)
yahoo-hadoop-0.20.0-2957040010
HADOOP-6227. Fix Configuration to allow final parameters to be set
to null and prevent them from being overridden. Contributed by
Amareshwari Sriramadasu.
yahoo-hadoop-0.20.0-2957040007
MAPREDUCE-430 Added patch supplied by Amar Kamat to allow
roll forward on branch to includ externally committed
patch.
yahoo-hadoop-0.20.0-2957040006
MAPREDUCE-768. Provide an option to dump jobtracker configuration in
JSON format to standard output. Contributed by V.V.Chaitanya
yahoo-hadoop-0.20.0-2957040004
MAPREDUCE-834 Correct an issue created by merging this issue with
patch attached to external Jira.
yahoo-hadoop-0.20.0-2957040003
HADOOP-6184 Provide an API to dump Configuration in a JSON format.
Contributed by V.V.Chaitanya Krishna.
MAPREDUCE-745 Patch added for this issue to allow branch-0.20 to
merge cleanly.
yahoo-hadoop-0.20.0-2957040000
MAPREDUCE:478 Allow map and reduce jvm parameters, environment
variables and ulimit to be set separately.
MAPREDUCE:682 Removes reservations on tasktrackers which are blacklisted.
Contributed by Sreekanth Ramakrishnan.
HADOOP:5420 Support killing of process groups in LinuxTaskController
binary
HADOOP-5488 Removes the pidfile management for the Task JVM from the
framework and instead passes the PID back and forth between the
TaskTracker and the Task processes. Contributed by Ravi Gummadi.
MAPREDUCE:467 Provide ability to collect statistics about total tasks and
succeeded tasks in different time windows.
yahoo-hadoop-0.20.0.2949784002:
MAPREDUCE-817. Add a cache for retired jobs with minimal job
info and provide a way to access history file url
MAPREDUCE-814. Provide a way to configure completed job history
files to be on HDFS.
MAPREDUCE-838 Fixes a problem in the way commit of task outputs
happens. The bug was that even if commit failed, the task would be
declared as successful. Contributed by Amareshwari Sriramadasu.
yahoo-hadoop-0.20.0.2902658004:
MAPREDUCE-809 Fix job-summary logs to correctly record final status of
FAILED and KILLED jobs.
http://issues.apache.org/jira/secure/attachment/12414726/MAPREDUCE-809_0_20090728_yhadoop20.patch
MAPREDUCE-740 Log a job-summary at the end of a job, while
allowing it to be configured to use a custom appender if desired.
http://issues.apache.org/jira/secure/attachment/12413941/MAPREDUCE-740_2_20090717_yhadoop20.patch
MAPREDUCE-771 Fixes a bug which delays normal jobs in favor of
high-ram jobs.
http://issues.apache.org/jira/secure/attachment/12413990/MAPREDUCE-771-20.patch
HADOOP-5420 Support setsid based kill in LinuxTaskController.
http://issues.apache.org/jira/secure/attachment/12414735/5420-ydist.patch.txt
MAPREDUCE-733 Fixes a bug that when a task tracker is killed ,
it throws exception. Instead it should catch it and process it and
allow the rest of the flow to go through
http://issues.apache.org/jira/secure/attachment/12413015/MAPREDUCE-733-ydist.patch
MAPREDUCE-734 Fixes a bug which prevented hi ram jobs from being
removed from the scheduler queue.
http://issues.apache.org/jira/secure/attachment/12413035/MAPREDUCE-734-20.patch
MAPREDUCE-693 Fixes a bug that when a job is submitted and the
JT is restarted (before job files have been written) and the job
is killed after recovery, the conf files fail to be moved to the
"done" subdirectory.
http://issues.apache.org/jira/secure/attachment/12412823/MAPREDUCE-693-v1.2-branch-0.20.patch
MAPREDUCE-722 Fixes a bug where more slots are getting reserved
for HiRAM job tasks than required.
http://issues.apache.org/jira/secure/attachment/12412744/MAPREDUCE-722.1.txt
MAPREDUCE-683 TestJobTrackerRestart failed because of stale
filemanager cache (which was created once per jvm). This patch makes
sure that the filemanager is inited upon every JobHistory.init()
and hence upon every restart. Note that this wont happen in production
as upon a restart the new jobtracker will start in a new jvm and
hence a new cache will be created.
http://issues.apache.org/jira/secure/attachment/12412743/MAPREDUCE-683-v1.2.1-branch-0.20.patch
MAPREDUCE-709 Fixes a bug where node health check script does
not display the correct message on timeout.
http://issues.apache.org/jira/secure/attachment/12412711/mapred-709-ydist.patch
MAPREDUCE-708 Fixes a bug where node health check script does
not refresh the "reason for blacklisting".
http://issues.apache.org/jira/secure/attachment/12412706/MAPREDUCE-708-ydist.patch
MAPREDUCE-522 Rewrote TestQueueCapacities to make it simpler
and avoid timeout errors.
http://issues.apache.org/jira/secure/attachment/12412472/mapred-522-ydist.patch
MAPREDUCE-532 Provided ability in the capacity scheduler to
limit the number of slots that can be concurrently used per queue
at any given time.
http://issues.apache.org/jira/secure/attachment/12412592/MAPREDUCE-532-20.patch
MAPREDUCE-211 Provides ability to run a health check script on
the tasktracker nodes and blacklist nodes if they are unhealthy.
Contributed by Sreekanth Ramakrishnan.
http://issues.apache.org/jira/secure/attachment/12412161/mapred-211-internal.patch
MAPREDUCE-516 Remove .orig file included by mistake.
http://issues.apache.org/jira/secure/attachment/12412108/HADOOP-5964_2_20090629_yhadoop.patch
MAPREDUCE-416 Moves the history file to a "done" folder whenever
a job completes.
http://issues.apache.org/jira/secure/attachment/12411938/MAPREDUCE-416-v1.6-branch-0.20.patch
HADOOP-5980 Previously, task spawned off by LinuxTaskController
didn't get LD_LIBRARY_PATH in their environment. The tasks will now
get same LD_LIBRARY_PATH value as when spawned off by
DefaultTaskController.
http://issues.apache.org/jira/secure/attachment/12410825/hadoop-5980-v20.patch
HADOOP-5981 This issue completes the feature mentioned in
HADOOP-2838. HADOOP-2838 provided a way to set env variables in
child process. This issue provides a way to inherit tt's env variables
and append or reset it. So now X=$X:y will inherit X (if there) and
append y to it.
http://issues.apache.org/jira/secure/attachment/12410454/hadoop5981-branch-20-example.patch
HADOOP-5419 This issue is to provide an improvement on the
existing M/R framework to let users know which queues they have
access to, and for what operations. One use case for this would
that currently there is no easy way to know if the user has access
to submit jobs to a queue, until it fails with an access control
exception.
http://issues.apache.org/jira/secure/attachment/12410824/hadoop-5419-v20.2.patch
HADOOP-5420 Support setsid based kill in LinuxTaskController.
http://issues.apache.org/jira/secure/attachment/12414735/5420-ydist.patch.txt
HADOOP-5643 Added the functionality to refresh jobtrackers node
list via command line (bin/hadoop mradmin -refreshNodes). The command
should be run as the jobtracker owner (jobtracker process owner)
or from a super group (mapred.permissions.supergroup).
http://issues.apache.org/jira/secure/attachment/12410619/Fixed%2B5643-0.20-final
HADOOP-2838 Now the users can set environment variables using
mapred.child.env. They can do the following X=Y : set X to Y X=$X:Y
: Append Y to X (which should be taken from the tasktracker)
http://issues.apache.org/jira/secure/attachment/12409895/HADOOP-2838-v2.2-branch-20-example.patch
HADOOP-5818. Revert the renaming from FSNamesystem.checkSuperuserPrivilege
to checkAccess by HADOOP-5643. (Amar Kamat via szetszwo)
https://issues.apache.org/jira/secure/attachment/12409835/5818for0.20.patch
HADOOP-5801. Fixes the problem: If the hosts file is changed across restart
then it should be refreshed upon recovery so that the excluded hosts are
lost and the maps are re-executed. (Amar Kamat via ddas)
https://issues.apache.org/jira/secure/attachment/12409834/5801-0.20.patch
HADOOP-5643. HADOOP-5643. Adds a way to decommission TaskTrackers
while the JobTracker is running. (Amar Kamat via ddas)
https://issues.apache.org/jira/secure/attachment/12409833/Fixed+5643-0.20
HADOOP-5419. Provide a facility to query the Queue ACLs for the
current user. (Rahul Kumar Singh via yhemanth)
http://issues.apache.org/jira/secure/attachment/12409323/hadoop-5419-v20.patch
HADOOP-5733. Add map/reduce slot capacity and blacklisted capacity to
JobTracker metrics. (Sreekanth Ramakrishnan via cdouglas)
http://issues.apache.org/jira/secure/attachment/12409322/hadoop-5733-v20.patch
HADOOP-5738. Split "waiting_tasks" JobTracker metric into waiting maps and
waiting reduces. (Sreekanth Ramakrishnan via cdouglas)
https://issues.apache.org/jira/secure/attachment/12409321/5738-y20.patch
HADOOP-4842. Streaming now allows specifiying a command for the combiner.
(Amareshwari Sriramadasu via ddas)
http://issues.apache.org/jira/secure/attachment/12402355/patch-4842-3.txt
HADOOP-4490. Provide ability to run tasks as job owners.
(Sreekanth Ramakrishnan via yhemanth)
http://issues.apache.org/jira/secure/attachment/12409318/hadoop-4490-br20-3.patch
https://issues.apache.org/jira/secure/attachment/12410170/hadoop-4490-br20-3.2.patch
HADOOP-5442. Paginate jobhistory display and added some search
capabilities. (Amar Kamat via acmurthy)
http://issues.apache.org/jira/secure/attachment/12402301/HADOOP-5442-v1.12.patch
HADOOP-3327. Improves handling of READ_TIMEOUT during map output copying.
(Amareshwari Sriramadasu via ddas)
http://issues.apache.org/jira/secure/attachment/12399449/patch-3327-2.txt
HADOOP-5113. Fixed logcondense to remove files for usernames
beginning with characters specified in the -l option.
(Peeyush Bishnoi via yhemanth)
http://issues.apache.org/jira/secure/attachment/12409317/hadoop-5113-0.18.txt
HADOOP-2898. Provide an option to specify a port range for
Hadoop services provisioned by HOD.
(Peeyush Bishnoi via yhemanth)
http://issues.apache.org/jira/secure/attachment/12409316/hadoop-2898-0.20.txt
HADOOP-4930. Implement a Linux native executable that can be used to
launch tasks as users. (Sreekanth Ramakrishnan via yhemanth)
http://issues.apache.org/jira/secure/attachment/12409402/hadoop-4930v20.patch