Hadrian MR
Jim Pivarski edited this page Nov 18, 2015 · 7 revisions
TODO...
Help message...
% hadoop jar target/hadrian-mr-TRUNK-jar-with-dependencies.jar --help
Usage: hadoop jar hadrian-mr.jar [options] input output

  input
        input path specification
  output
        output directory, must not yet exist
  -m <value> | --mapper <value>
        location of mapper PFA
  -r <value> | --reducer <value>
        location of reducer PFA
  -i | --identity-reducer
        use an identity reducer (key-grouping and possibly secondary sort, but no reducer action)
  -n <value> | --num-reducers <value>
        number of reducers (must be at least 1 if --reducer or --identity-reducer is used)
  -s <value> | --snapshot <value>
        output a snapshot of a reducer cell/pool after processing each key, rather than the reducer engine's output (pools take precedence over cells in case of name conflicts)
  --help
        print this help message
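The options listed above can be assembled programmatically. The following is a minimal sketch, not part of Hadrian-MR itself: the function name `build_hadrian_mr_command`, the file names `mapper.pfa`, `input/*.avro` and `output` are hypothetical, and the only constraint it checks is the one stated in the help text (`--num-reducers` must be at least 1 when a reducer is used). Whether `--reducer` and `--identity-reducer` may be combined is not stated here, so the sketch does not check it.

```python
import shlex

# Jar path copied from the help invocation above.
DEFAULT_JAR = "target/hadrian-mr-TRUNK-jar-with-dependencies.jar"


def build_hadrian_mr_command(input_path, output_path, mapper=None, reducer=None,
                             identity_reducer=False, num_reducers=None,
                             snapshot=None, jar=DEFAULT_JAR):
    """Return the argv list for `hadoop jar ... [options] input output`."""
    uses_reducer = reducer is not None or identity_reducer
    # The help text requires at least one reducer when a reducer is in use.
    if uses_reducer and num_reducers is not None and num_reducers < 1:
        raise ValueError("--num-reducers must be at least 1 when "
                         "--reducer or --identity-reducer is used")

    argv = ["hadoop", "jar", jar]
    if mapper is not None:
        argv += ["--mapper", mapper]
    if reducer is not None:
        argv += ["--reducer", reducer]
    if identity_reducer:
        argv.append("--identity-reducer")
    if num_reducers is not None:
        argv += ["--num-reducers", str(num_reducers)]
    if snapshot is not None:
        argv += ["--snapshot", snapshot]
    # Positional arguments come last, as in the usage line.
    argv += [input_path, output_path]
    return argv


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    command = build_hadrian_mr_command("input/*.avro", "output",
                                       mapper="mapper.pfa",
                                       identity_reducer=True,
                                       num_reducers=2)
    print(shlex.join(command))
```

The returned list can be passed directly to `subprocess.run` on a machine where `hadoop` is installed; `shlex.join` is used only to print a shell-safe version of the command.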
Hadrian-MR in "score" mode runs a PFA-encoded scoring engine as a
mapper and a PFA-encoded scoring engine as a reducer.
The output type of the mapper must be a record with two fields: "key"
and "value". The key must either be a string or a record containing a
string-valued "groupby" field. If the key is a string, that string
will be used for grouping with no secondary sort. If the key is a
record, its groupby field is used for grouping and the whole record is
used for secondary sort (according to the normal record-sorting Avro
rules).
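To make the two key forms concrete, here is a small local simulation of grouping and secondary sort, written as a sketch rather than a description of Hadrian-MR's internals. The helper names (`grouping_string`, `secondary_sort_tuple`, `shuffle`) and the `timestamp` field are hypothetical. Avro compares records field by field in declaration order, which the sketch approximates with a tuple of field values; sorting by the grouping string first is an assumption made here so that each group stays contiguous.

```python
from itertools import groupby


def grouping_string(key):
    """A string key is used as-is; a record key is grouped by its 'groupby' field."""
    if isinstance(key, str):
        return key
    return key["groupby"]


def secondary_sort_tuple(key, field_order):
    """Approximate Avro record ordering: compare fields in declaration order.

    String keys have no secondary sort, so they all map to the empty tuple.
    """
    if isinstance(key, str):
        return ()
    return tuple(key[name] for name in field_order)


def shuffle(mapper_outputs, field_order=()):
    """Group mapper outputs by key and order the values within each group."""
    ordered = sorted(
        mapper_outputs,
        key=lambda rec: (grouping_string(rec["key"]),
                         secondary_sort_tuple(rec["key"], field_order)),
    )
    return [
        (group, [rec["value"] for rec in records])
        for group, records in groupby(ordered, key=lambda rec: grouping_string(rec["key"]))
    ]


if __name__ == "__main__":
    # Record-valued keys: grouped by "groupby", ordered by the whole record.
    outputs = [
        {"key": {"groupby": "b", "timestamp": 2}, "value": "b2"},
        {"key": {"groupby": "a", "timestamp": 3}, "value": "a3"},
        {"key": {"groupby": "b", "timestamp": 1}, "value": "b1"},
        {"key": {"groupby": "a", "timestamp": 1}, "value": "a1"},
    ]
    print(shuffle(outputs, field_order=("groupby", "timestamp")))
    # [('a', ['a1', 'a3']), ('b', ['b1', 'b2'])]
```

With string keys, `field_order` can be left empty; values within a group then keep their arrival order in this sketch, because Python's `sorted` is stable.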
The input type of the reducer must be a record with the same structure
as the mapper output.
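The type rules above can be checked against Avro schemas before submitting a job. The sketch below is illustrative only: the schema names (`MapperOutput`, `ReducerInput`, `SortKey`), the `timestamp` field and the `double` value type are made up, and reading "the same structure" as "same field names and types, ignoring record names" is an interpretation, not something this page confirms. It handles only plainly written schemas (for example `"string"`, not `{"type": "string"}`).

```python
import json


def check_mapper_output(schema):
    """Raise ValueError unless the schema is a record with 'key' and 'value' fields
    whose key is a string or a record with a string-valued 'groupby' field."""
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("mapper output must be a record")
    fields = {f["name"]: f["type"] for f in schema["fields"]}
    if set(fields) != {"key", "value"}:
        raise ValueError("mapper output must have exactly the fields 'key' and 'value'")
    key = fields["key"]
    if key == "string":
        return "string"            # grouping only, no secondary sort
    if isinstance(key, dict) and key.get("type") == "record":
        for f in key["fields"]:
            if f["name"] == "groupby" and f["type"] == "string":
                return "record"    # grouping plus secondary sort
    raise ValueError("key must be a string or a record with a string 'groupby' field")


def structure(avro_type):
    """Reduce a schema to field names and types, ignoring record names."""
    if isinstance(avro_type, dict) and avro_type.get("type") == "record":
        return ("record", tuple((f["name"], structure(f["type"]))
                                for f in avro_type["fields"]))
    if isinstance(avro_type, dict):
        return json.dumps(avro_type, sort_keys=True)
    if isinstance(avro_type, list):          # union
        return tuple(structure(t) for t in avro_type)
    return avro_type


def same_structure(mapper_output, reducer_input):
    return structure(mapper_output) == structure(reducer_input)


# Hypothetical schemas for illustration.
KEY = {"type": "record", "name": "SortKey",
       "fields": [{"name": "groupby", "type": "string"},
                  {"name": "timestamp", "type": "long"}]}
MAPPER_OUTPUT = {"type": "record", "name": "MapperOutput",
                 "fields": [{"name": "key", "type": KEY},
                            {"name": "value", "type": "double"}]}
REDUCER_INPUT = {"type": "record", "name": "ReducerInput",
                 "fields": [{"name": "key", "type": KEY},
                            {"name": "value", "type": "double"}]}

if __name__ == "__main__":
    print(check_mapper_output(MAPPER_OUTPUT))
    print(same_structure(MAPPER_OUTPUT, REDUCER_INPUT))
```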
Licensed under the Hadrian Personal Use and Evaluation License (PUEL).