Hadrian MR

TODO...

Help message:

% hadoop jar target/hadrian-mr-TRUNK-jar-with-dependencies.jar --help
Usage: hadoop jar hadrian-mr.jar [options] input output

  input
        input path specification
  output
        output directory, must not yet exist
  -m <value> | --mapper <value>
        location of mapper PFA
  -r <value> | --reducer <value>
        location of reducer PFA
  -i | --identity-reducer
        use an identity reducer (key-grouping and possibly secondary sort, but no reducer action)
  -n <value> | --num-reducers <value>
        number of reducers (must be at least 1 if --reducer or --identity-reducer is used)
  -s <value> | --snapshot <value>
        output a snapshot of a reducer cell/pool after processing each key, rather than the reducer engine's output (pools take precedence over cells in case of name conflicts)
  --help
        print this help message
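
As a rough illustration of how the options above combine, the following Python sketch assembles a command line and applies the one constraint the help text states (at least one reducer when --reducer or --identity-reducer is used). The function name `build_command` and the PFA file names are hypothetical, and treating every option as optional is an assumption, since the help text does not say which options are required.

```python
from typing import List, Optional

# Jar path as it appears in the help example above.
JAR = "target/hadrian-mr-TRUNK-jar-with-dependencies.jar"


def build_command(
    input_path: str,
    output_path: str,
    mapper: Optional[str] = None,
    reducer: Optional[str] = None,
    identity_reducer: bool = False,
    num_reducers: Optional[int] = None,
    snapshot: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: build an argv list using only flags from the help text."""
    uses_reducer = reducer is not None or identity_reducer
    # Constraint stated in the help: at least 1 reducer if a reducer is used.
    if uses_reducer and num_reducers is not None and num_reducers < 1:
        raise ValueError("--num-reducers must be at least 1 when a reducer is used")

    cmd = ["hadoop", "jar", JAR]
    if mapper is not None:
        cmd += ["--mapper", mapper]
    if reducer is not None:
        cmd += ["--reducer", reducer]
    if identity_reducer:
        cmd.append("--identity-reducer")
    if num_reducers is not None:
        cmd += ["--num-reducers", str(num_reducers)]
    if snapshot is not None:
        cmd += ["--snapshot", snapshot]
    # Positional arguments come last: input, then output.
    cmd += [input_path, output_path]
    return cmd


if __name__ == "__main__":
    # File and directory names here are placeholders, not real paths.
    print(" ".join(build_command("in", "out", mapper="mapper.pfa",
                                 reducer="reducer.pfa", num_reducers=2)))
```

The resulting list could be passed to `subprocess.run`; remember that the output directory must not already exist.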

Hadrian-MR in "score" mode runs one PFA-encoded scoring engine as the
mapper and another PFA-encoded scoring engine as the reducer.

The output type of the mapper must be a record with two fields: "key"
and "value".  The key must be either a string or a record containing a
string-valued "groupby" field.  If the key is a string, that string
is used for grouping, with no secondary sort.  If the key is a
record, its "groupby" field is used for grouping and the whole record
is used for secondary sort (according to Avro's normal record-sorting
rules).

The input type of the reducer must be a record with the same structure
as the mapper output.
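
To make the two allowed key shapes concrete, here is a minimal Python sketch that inspects an Avro schema (written as a Python dict) and reports which grouping behavior the rules above imply. It is not part of Hadrian-MR; the function name, the record names, the "timestamp" field, and the "double" value type are all hypothetical choices for illustration.

```python
def _is_string(avro_type) -> bool:
    """True for the Avro string type in either shorthand or object form."""
    return avro_type == "string" or (
        isinstance(avro_type, dict) and avro_type.get("type") == "string"
    )


def _fields(record: dict) -> dict:
    """Map field name -> field type for an Avro record schema."""
    return {f["name"]: f["type"] for f in record.get("fields", [])}


def classify_mapper_output(schema: dict) -> str:
    """Hypothetical check of the mapper output rules described above."""
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("mapper output must be a record")
    fields = _fields(schema)
    if set(fields) != {"key", "value"}:
        raise ValueError('record must have exactly the fields "key" and "value"')

    key = fields["key"]
    if _is_string(key):
        return "grouping only"
    if isinstance(key, dict) and key.get("type") == "record":
        if _is_string(_fields(key).get("groupby")):
            return "grouping with secondary sort"
    raise ValueError('key must be a string or a record with a string "groupby" field')


# String key: grouped by the string itself, no secondary sort.
simple = {
    "type": "record", "name": "MapperOutput",
    "fields": [{"name": "key", "type": "string"},
               {"name": "value", "type": "double"}],
}

# Record key: grouped by "groupby", sorted by the whole key record.
with_sort = {
    "type": "record", "name": "MapperOutput",
    "fields": [
        {"name": "key", "type": {
            "type": "record", "name": "Key",
            "fields": [{"name": "groupby", "type": "string"},
                       {"name": "timestamp", "type": "long"}]}},
        {"name": "value", "type": "double"},
    ],
}

if __name__ == "__main__":
    print(classify_mapper_output(simple))
    print(classify_mapper_output(with_sort))
```

Since the reducer input must have the same structure as the mapper output, the same check could be applied to the reducer's input schema.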

Licensed under the Hadrian Personal Use and Evaluation License (PUEL).
