Update README.txt #116

Closed
wants to merge 8 commits into from
4 changes: 4 additions & 0 deletions README.txt
@@ -1,3 +1,7 @@

This project was forked while I was recently reading the Hadoop source code; it is mainly used to annotate the source.


For the latest information about Hadoop, please visit our website at:

http://hadoop.apache.org/core/
2 changes: 1 addition & 1 deletion hadoop-common-project/hadoop-common/pom.xml
@@ -616,7 +616,7 @@
<goal>javah</goal>
</goals>
<configuration>
<javahPath>${env.JAVA_HOME}/bin/javah</javahPath>
<javahPath>${java.home}/bin/javah</javahPath>
<javahClassNames>
<javahClassName>org.apache.hadoop.io.compress.zlib.ZlibCompressor</javahClassName>
<javahClassName>org.apache.hadoop.io.compress.zlib.ZlibDecompressor</javahClassName>
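The pom.xml change swaps `${env.JAVA_HOME}` (a shell environment variable) for `${java.home}` (the running JVM's own system property). The two can point to different directories; notably, on JDK 8 `java.home` typically resolves to the `jre/` subdirectory of the JDK. A quick standalone check of both values:

```java
public class JavaHomeCheck {
    public static void main(String[] args) {
        // java.home is the running JVM's installation directory (the jre/
        // subdirectory on JDK 8, the JDK root on JDK 9+).
        System.out.println("java.home = " + System.getProperty("java.home"));
        // JAVA_HOME is only a shell environment variable and may be unset.
        System.out.println("JAVA_HOME = " + System.getenv("JAVA_HOME"));
    }
}
```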
@@ -22,9 +22,27 @@

@InterfaceAudience.LimitedPrivate({"MapReduce"})
@InterfaceStability.Unstable

/**
* Spill-file index, used to quickly locate the corresponding partition when
* a Reducer requests its data. Each spill file has one index; indexes are
* stored in a dedicated buffer (analogous to the circular buffer used for
* map output).
*/
public class IndexRecord {

/**
* Start offset of the partition's data (in bytes)
*/
public long startOffset;

/**
* Raw (uncompressed) length of the partition's data (in bytes)
*/
public long rawLength;

/**
* On-disk length of the partition's data; if compressed, this is the
* compressed length (in bytes)
*/
public long partLength;

public IndexRecord() { }
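The three fields above can be seen in a small standalone sketch (an IndexRecord-like class is re-declared here so the snippet compiles without Hadoop): consecutive records partition one spill file, and each record's startOffset is the previous record's startOffset plus its partLength.

```java
// Standalone sketch (no Hadoop dependency) of how IndexRecord-style
// triples describe the layout of partitions inside one spill file.
public class SpillIndexSketch {
    static class Record {
        long startOffset;  // where the partition's bytes begin in the file
        long rawLength;    // uncompressed length of the partition data
        long partLength;   // on-disk length (== rawLength if uncompressed)
        Record(long start, long raw, long part) {
            startOffset = start; rawLength = raw; partLength = part;
        }
    }

    // Build an index from the on-disk sizes of each partition's data.
    static Record[] buildIndex(long[] partSizes) {
        Record[] index = new Record[partSizes.length];
        long offset = 0;
        for (int i = 0; i < partSizes.length; i++) {
            // Assume no compression here, so rawLength == partLength.
            index[i] = new Record(offset, partSizes[i], partSizes[i]);
            offset += partSizes[i];
        }
        return index;
    }

    public static void main(String[] args) {
        Record[] idx = buildIndex(new long[]{100, 250, 75});
        // Partition 2 starts right after partitions 0 and 1 (100 + 250).
        System.out.println(idx[2].startOffset);  // 350
    }
}
```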
@@ -37,6 +37,17 @@
*
* <p>Note: If you require your Partitioner class to obtain the Job's
* configuration object, implement the {@link Configurable} interface.</p>
*
* The Partitioner partitions the map output keys, deciding which reducer each
* key (and its records) is sent to.
*
* In other words, it specifies which reducer handles each key.
*
* All keys sent to the same reducer form one partition.
*
* If the job has only one reducer, the framework does not create a
* Partitioner for it.
*
* Which Partitioner a job uses is determined by its configuration; if the
* Partitioner's logic needs the job configuration, it can access it by
* implementing the Configurable interface.
*
* @see Reducer
*/
@@ -50,6 +61,9 @@ public abstract class Partitioner&lt;KEY, VALUE&gt; {
*
* <p>Typically a hash function on all or a subset of the key.</p>
*
* Each partition (reducer) has an integer id; this method returns the id of
* the partition the given key belongs to. The total number of partitions
* passed in equals the job's number of reducers.
*
* @param key the key to be partitioned.
* @param value the entry value.
* @param numPartitions the total number of partitions.
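The getPartition contract described above can be sketched standalone (no Hadoop dependency), mirroring the logic of Hadoop's default HashPartitioner: hash the key, mask off the sign bit, and take the remainder modulo the partition count, so equal keys always land on the same reducer.

```java
// Standalone sketch of the getPartition contract, in the style of
// Hadoop's default HashPartitioner.
public class HashPartitionSketch {
    // Returns a partition id in [0, numPartitions) for the given key.
    static int getPartition(Object key, int numPartitions) {
        // Mask off the sign bit so the result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int p = getPartition("hello", 4);
        System.out.println(p);  // always in [0, 4)
        // Equal keys always map to the same partition (and hence reducer).
        assert p == getPartition("hello", 4);
    }
}
```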
@@ -30,7 +30,7 @@
/**
* Reduces a set of intermediate values which share a key to a smaller set of
* values.
*
*
* <p><code>Reducer</code> implementations
* can access the {@link Configuration} for the job via the
* {@link JobContext#getConfiguration()} method.</p>
@@ -114,7 +114,12 @@
* }
* }
* </pre></blockquote>
*
* Template methods for implementing the Reducer's logic: run() is the entry
* point, and setup, reduce, and cleanup are the three template methods; any
* of them can be overridden to change the behavior.
*
* The job configuration can be obtained via the Context's getConfiguration()
* method.
*
*
* @see Mapper
* @see Partitioner
*/
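The setup/reduce/cleanup structure described above is the classic template-method pattern. A minimal standalone sketch (no Hadoop dependency; MiniReducer and its in-memory Map input are illustrative stand-ins for Reducer and its Context):

```java
import java.util.*;

// Standalone sketch of the template-method pattern behind Reducer:
// run() fixes the call order, subclasses override the hooks.
public class ReducerTemplateSketch {
    static abstract class MiniReducer<K, V> {
        protected void setup() { }                       // optional hook
        protected abstract void reduce(K key, List<V> values);
        protected void cleanup() { }                     // optional hook

        // Entry point: setup once, reduce per key group, cleanup once.
        public final void run(Map<K, List<V>> grouped) {
            setup();
            for (Map.Entry<K, List<V>> e : grouped.entrySet())
                reduce(e.getKey(), e.getValue());
            cleanup();
        }
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        grouped.put("a", Arrays.asList(1, 2, 3));
        grouped.put("b", Arrays.asList(10));
        // Only reduce() is overridden; the other hooks keep their defaults.
        new MiniReducer<String, Integer>() {
            @Override protected void reduce(String key, List<Integer> values) {
                int sum = 0;
                for (int v : values) sum += v;
                System.out.println(key + "=" + sum);  // a=6 then b=10
            }
        }.run(grouped);
    }
}
```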
@@ -23,6 +23,8 @@

/**
* Enum for map, reduce, job-setup, job-cleanup, task-cleanup task types.
*
* Task type.
*/
@InterfaceAudience.Public
@InterfaceStability.Stable
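A standalone sketch of the five task types the javadoc enumerates (the constant names mirror Hadoop's enum in org.apache.hadoop.mapreduce.TaskType; this copy is only illustrative):

```java
// Standalone sketch mirroring the task types the javadoc enumerates.
public class TaskTypeSketch {
    enum TaskType { MAP, REDUCE, JOB_SETUP, JOB_CLEANUP, TASK_CLEANUP }

    public static void main(String[] args) {
        // name()/valueOf() round-trip, as used when task types are
        // written to and read back from job history or wire formats.
        TaskType t = TaskType.valueOf("REDUCE");
        System.out.println(t.name());  // REDUCE
    }
}
```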