ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected #42

Open
dbl001 opened this issue Oct 10, 2014 · 2 comments


dbl001 commented Oct 10, 2014

Hi,

I'm getting errors running this simple Pig test:

David-Laxers-MacBook-Pro:pig davidlaxer$ cat test.pig
/* Set Home Directory - where we install software */
%default HOME `echo \$HOME`

REGISTER /Users/davidlaxer/pig-0.13.0/build/ivy/lib/Pig/avro-1.7.5.jar
REGISTER /Users/davidlaxer/pig-0.13.0/build/ivy/lib/Pig/json-simple-1.1.jar
REGISTER /Users/davidlaxer/pig-0.13.0/contrib/piggybank/java/piggybank.jar

/* DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();*/

/* Load the emails in avro format (edit the path to match where you saved them) using the AvroStorage UDF from Piggybank */
messages = LOAD '/tmp/test_mbox' USING org.apache.pig.piggybank.storage.avro.AvroStorage();

DESCRIBE messages;
EXPLAIN messages;
ILLUSTRATE messages;
lmt = LIMIT messages 100;
dump messages;

STORE messages INTO '/tmp/messages' USING org.apache.pig.piggybank.storage.avro.AvroStorage();

Mac OS X 10.9.5

$ java -version
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)

$ pig -version
Apache Pig version 0.13.0 (r1606446)
compiled Jun 29 2014, 02:27:58

$ virtualenv --version
1.11.6
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ env | grep PIG
PIG_HOME=/Users/davidlaxer/pig-0.13.0
PIG_CLASSPATH=/users/davidlaxer/hadoop-2.3.0-src/src/conf
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ env | grep HADOOP
HADOOP_HOME=/Users/davidlaxer/hadoop-2.3.0-src
HADOOP_CONF_DIR=/Users/davidlaxer/hadoop-2.3.0-src/src/conf

$ head /tmp/test_mbox/part-1.avro
Objavro.schema?{"fields": [{"doc": "", "type": ["null", "string"], "name": "message_id"}, {"doc": "", "type": ["null", "string"], "name": "thread_id"}, {"type": ["string", "null"], "name": "in_reply_to"}, {"type": ["string", "null"], "name": "subject"}, {"type": ["string", "null"], "name": "body"}, {"type": ["string", "null"], "name": "date"}, {"type": {"fields": [{"doc": "", "type": ["null", "string"], "name": "real_name"}, {"doc": "", "type": ["null", "string"], "name": "address"}], "type": "record", "name": "from"}, "name": "from"}, {"doc": "", "type": ["null", {"items": ["null", {"fields": [{"doc": "", "type": ["null", "string"], "name": "real_name"}, {"doc": "", "type": ["null", "string"], "name": "address"}], "type": "record", "name": "to"}], "type": "array"}], "name": "tos"}, {"doc": "", "type": ["null", {"items": ["null", {"fields": [{"doc": "", "type": ["null", "string"], "name": "real_name"}, {"doc": "", "type": ["null", "string"], "name": "address"}], "type": "record", "name": "cc"}], "type": "array"}], "name": "ccs"}, {"doc": "", "type": ["null", {"items": ["null", {"fields": [{"doc": "", "type": ["null", "string"], "name": "real_name"}, {"doc": "", "type": ["null", "string"], "name": "address"}], "type": "record", "name": "bcc"}], "type": "array"}], "name": "bccs"}, {"doc": "", "type": ["null", {"items": ["null", {"fields": [{"doc": "", "type": ["null", "string"], "name": "real_name"}, {"doc": "", "type": ["null", "string"], "name": "address"}], "type": "record", "name": "reply_to"}], "type": "array"}], "name": "reply_tos"}], "type": "record", "name": "Email"}avro.codenullZB?;?ԑ???LY????CAG+1DhTqe94W5mof4ZYpM5XCMK8nRazcC7L6H4ySt8s4GQL4Hw@mail.gmail.com&1480877866094699206b11ae7ec2-c20f-4dcf-a3fc-a9190cc3415b@continuum.io?Re: [Anaconda Support] ipython notebook fails to launch in Anaconda 2.1?rHi, I got a similar problem with ipython notebook. But once I defined
PYTHONPATTH point to the site-packages. All work well. In the past, it
sounds no need to define PYTHONPATH, so maybe something changed.

BTW, I installed miniconda for python 3.4.1, and ipython, ipython-notebook.

Hope it helps,

Wanli Wu

...

(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ !p
pig -l /tmp -x local -w -v test.pig
2014-10-10 01:20:26,141 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(41)) - Trying ExecType : LOCAL
2014-10-10 01:20:26,145 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(43)) - Picked LOCAL as the ExecType
2014-10-10 01:20:26,265 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-10-10 01:20:26,265 [main] INFO org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1412922026132.log
2014-10-10 01:20:26.727 java[80092:1003] Unable to load realm info from SCDynamicStore
2014-10-10 01:20:26,735 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-10-10 01:20:27,409 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /Users/davidlaxer/.pigbootup not found
2014-10-10 01:20:27,441 [main] INFO org.apache.pig.tools.parameters.PreprocessorContext - Executing command : echo $HOME
2014-10-10 01:20:27,611 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 01:20:27,611 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-10-10 01:20:27,612 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-10 01:20:27,934 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 01:20:28,114 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 01:20:28,258 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 01:20:31,764 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
messages: {message_id: chararray,thread_id: chararray,in_reply_to: chararray,subject: chararray,body: chararray,date: chararray,from: (real_name: chararray,address: chararray),tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)},ccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},bccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},reply_tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)}}
2014-10-10 01:20:32,830 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}

-----------------------------------------------

New Logical Plan:

-----------------------------------------------

messages: (Name: LOStore Schema: message_id#26:chararray,thread_id#27:chararray,in_reply_to#28:chararray,subject#29:chararray,body#30:chararray,date#31:chararray,from#32:tuple(real_name#33:chararray,address#34:chararray),tos#35:bag{ARRAY_ELEM#36:tuple(real_name#37:chararray,address#38:chararray)},ccs#39:bag{ARRAY_ELEM#40:tuple(real_name#41:chararray,address#42:chararray)},bccs#43:bag{ARRAY_ELEM#44:tuple(real_name#45:chararray,address#46:chararray)},reply_tos#47:bag{ARRAY_ELEM#48:tuple(real_name#49:chararray,address#50:chararray)})
|
|---messages: (Name: LOLoad Schema: message_id#26:chararray,thread_id#27:chararray,in_reply_to#28:chararray,subject#29:chararray,body#30:chararray,date#31:chararray,from#32:tuple(real_name#33:chararray,address#34:chararray),tos#35:bag{ARRAY_ELEM#36:tuple(real_name#37:chararray,address#38:chararray)},ccs#39:bag{ARRAY_ELEM#40:tuple(real_name#41:chararray,address#42:chararray)},bccs#43:bag{ARRAY_ELEM#44:tuple(real_name#45:chararray,address#46:chararray)},reply_tos#47:bag{ARRAY_ELEM#48:tuple(real_name#49:chararray,address#50:chararray)})RequiredFields:null

-----------------------------------------------

Physical Plan:

-----------------------------------------------

messages: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-1
|
|---messages: Load(/tmp/test_mbox:org.apache.pig.piggybank.storage.avro.AvroStorage) - scope-0

--------------------------------------------------

Map Reduce Plan

--------------------------------------------------

No MR jobs. Fetch only.
2014-10-10 01:20:33,366 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 01:20:33,368 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-10 01:20:33,443 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[LoadTypeCastInserter, StreamTypeCastInserter], RULES_DISABLED=[AddForEach, ColumnMapKeyPrune, FilterLogicExpressionSimplifier, GroupByConstParallelSetter, LimitOptimizer, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter]}
2014-10-10 01:20:33,575 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-10-10 01:20:33,624 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-10-10 01:20:33,625 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-10-10 01:20:33,683 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2014-10-10 01:20:33,689 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2014-10-10 01:20:33,689 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-10-10 01:20:33,690 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2014-10-10 01:20:34,210 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: messages[11,11] C: R:
2014-10-10 01:20:34,226 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 01:20:34,227 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2014-10-10 01:20:34,310 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-10-10 01:20:34,316 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
2014-10-10 01:20:34,320 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.pig.piggybank.storage.avro.PigAvroInputFormat.listStatus(PigAvroInputFormat.java:96)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:375)
at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:146)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.setUp(POLoad.java:95)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNextTuple(POLoad.java:123)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:202)
at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:259)
at org.apache.pig.pen.ExampleGenerator.readBaseData(ExampleGenerator.java:223)
at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:155)
at org.apache.pig.PigServer.getExamples(PigServer.java:1282)
at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:810)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:802)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:381)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:608)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Details also at logfile: /private/tmp/pig_1412922026132.log
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ pig -x local
2014-10-10 08:49:33,765 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(41)) - Trying ExecType : LOCAL
2014-10-10 08:49:33,794 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(43)) - Picked LOCAL as the ExecType
2014-10-10 08:49:33,866 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-10-10 08:49:33,866 [main] INFO org.apache.pig.Main - Logging error messages to: /Users/davidlaxer/Agile_Data_Code/ch03/pig/pig_1412948973860.log
2014-10-10 08:49:33,897 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /Users/davidlaxer/.pigbootup not found
2014-10-10 08:49:34,217 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 08:49:34,221 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-10-10 08:49:34,223 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-10 08:49:34.824 java[83134:1003] Unable to load realm info from SCDynamicStore
2014-10-10 08:49:34,844 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-10-10 08:49:35,111 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-10-10 08:49:35,117 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> REGISTER /Users/davidlaxer/pig-0.13.0/build/ivy/lib/Pig/avro-1.7.5.jar
2014-10-10 08:49:58,340 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 08:49:58,342 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
grunt> REGISTER /Users/davidlaxer/pig-0.13.0/build/ivy/lib/Pig/json-simple-1.1.jar
2014-10-10 08:49:58,562 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 08:49:58,565 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
grunt> REGISTER /Users/davidlaxer/pig-0.13.0/contrib/piggybank/java/piggybank.jar
2014-10-10 08:49:58,745 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 08:49:58,754 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
grunt>
grunt> /* DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();*/
grunt>
grunt> /* Load the emails in avro format (edit the path to match where you saved them) using the AvroStorage UDF from Piggybank */
grunt> messages = LOAD '/tmp/test_mbox' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
2014-10-10 08:49:59,251 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 08:49:59,253 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
grunt> DESCRIBE messages;
messages: {message_id: chararray,thread_id: chararray,in_reply_to: chararray,subject: chararray,body: chararray,date: chararray,from: (real_name: chararray,address: chararray),tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)},ccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},bccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},reply_tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)}}
grunt> EXPLAIN messages;
2014-10-10 08:50:22,370 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}

-----------------------------------------------

New Logical Plan:

-----------------------------------------------

messages: (Name: LOStore Schema: message_id#51:chararray,thread_id#52:chararray,in_reply_to#53:chararray,subject#54:chararray,body#55:chararray,date#56:chararray,from#57:tuple(real_name#58:chararray,address#59:chararray),tos#60:bag{ARRAY_ELEM#61:tuple(real_name#62:chararray,address#63:chararray)},ccs#64:bag{ARRAY_ELEM#65:tuple(real_name#66:chararray,address#67:chararray)},bccs#68:bag{ARRAY_ELEM#69:tuple(real_name#70:chararray,address#71:chararray)},reply_tos#72:bag{ARRAY_ELEM#73:tuple(real_name#74:chararray,address#75:chararray)})
|
|---messages: (Name: LOLoad Schema: message_id#51:chararray,thread_id#52:chararray,in_reply_to#53:chararray,subject#54:chararray,body#55:chararray,date#56:chararray,from#57:tuple(real_name#58:chararray,address#59:chararray),tos#60:bag{ARRAY_ELEM#61:tuple(real_name#62:chararray,address#63:chararray)},ccs#64:bag{ARRAY_ELEM#65:tuple(real_name#66:chararray,address#67:chararray)},bccs#68:bag{ARRAY_ELEM#69:tuple(real_name#70:chararray,address#71:chararray)},reply_tos#72:bag{ARRAY_ELEM#73:tuple(real_name#74:chararray,address#75:chararray)})RequiredFields:null

-----------------------------------------------

Physical Plan:

-----------------------------------------------

messages: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-1
|
|---messages: Load(/tmp/test_mbox:org.apache.pig.piggybank.storage.avro.AvroStorage) - scope-0

--------------------------------------------------

Map Reduce Plan

--------------------------------------------------

No MR jobs. Fetch only.
grunt> ILLUSTRATE messages;
2014-10-10 08:50:22,620 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-10-10 08:50:22,622 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 08:50:22,641 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-10 08:50:22,710 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[LoadTypeCastInserter, StreamTypeCastInserter], RULES_DISABLED=[AddForEach, ColumnMapKeyPrune, FilterLogicExpressionSimplifier, GroupByConstParallelSetter, LimitOptimizer, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter]}
2014-10-10 08:50:22,793 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-10-10 08:50:22,812 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-10-10 08:50:22,812 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-10-10 08:50:22,846 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2014-10-10 08:50:22,854 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2014-10-10 08:50:22,883 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-10-10 08:50:22,883 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2014-10-10 08:50:23,137 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: messages[1,11] C: R:
2014-10-10 08:50:23,139 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-10-10 08:50:23,140 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-10 08:50:23,141 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2014-10-10 08:50:23,180 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-10-10 08:50:23,184 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
Details at logfile: /Users/davidlaxer/Agile_Data_Code/ch03/pig/pig_1412948973860.log
grunt> lmt = LIMIT messages 100;
grunt> dump messages;
2014-10-10 08:50:23,309 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-10-10 08:50:23,320 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[LoadTypeCastInserter, StreamTypeCastInserter], RULES_DISABLED=[AddForEach, ColumnMapKeyPrune, FilterLogicExpressionSimplifier, GroupByConstParallelSetter, LimitOptimizer, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter]}
2014-10-10 08:50:23,372 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2014-10-10 08:50:23,409 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2081: Unable to setup the load function.
Details at logfile: /Users/davidlaxer/Agile_Data_Code/ch03/pig/pig_1412948973860.log
grunt>
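
A possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: `org.apache.hadoop.mapreduce.JobContext` is a concrete class in Hadoop 0.20.x/1.x and an interface from 0.21 onward (including 2.x). An `IncompatibleClassChangeError` raised from `PigAvroInputFormat.listStatus` would fit a piggybank.jar compiled against the Hadoop 1 line but run against the hadoop-2.3.0 install shown in the environment above. The Python sketch below only encodes that version rule; `jobcontext_kind` and `same_jobcontext_abi` are hypothetical helper names, not part of Pig or Hadoop.

```python
def jobcontext_kind(hadoop_version: str) -> str:
    """Return 'class' or 'interface' for JobContext in a given Hadoop version.

    Rule assumed here: 0.20.x and 1.x ship JobContext as a class;
    0.21+ and 2.x+ ship it as an interface.
    """
    parts = hadoop_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if major == 0:
        return "class" if minor <= 20 else "interface"
    return "class" if major == 1 else "interface"


def same_jobcontext_abi(compiled_against: str, running_on: str) -> bool:
    """True when a jar built for one version sees the same kind of JobContext at run time."""
    return jobcontext_kind(compiled_against) == jobcontext_kind(running_on)


if __name__ == "__main__":
    # Hypothetical pairing: a jar built for the Hadoop 1 line, run on 2.3.0.
    print(same_jobcontext_abi("1.0.4", "2.3.0"))
```

If the mismatch is the cause, rebuilding piggybank.jar against the Hadoop 2 line would be the thing to try. Pig's Ant build selects that line with the `hadoopversion` property (value `23`), though this has not been verified against the setup in this report.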


dbl001 commented Oct 12, 2014

(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ !v
vi test.pig
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ pig -l /tmp -x local -w -v test.pig
2014-10-12 00:19:36,173 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(41)) - Trying ExecType : LOCAL
2014-10-12 00:19:36,175 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(43)) - Picked LOCAL as the ExecType
2014-10-12 00:19:36,243 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-10-12 00:19:36,243 [main] INFO org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1413091176168.log
2014-10-12 00:19:36.583 java[13484:1003] Unable to load realm info from SCDynamicStore
2014-10-12 00:19:36,587 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-10-12 00:19:37,220 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /Users/davidlaxer/.pigbootup not found
2014-10-12 00:19:37,273 [main] INFO org.apache.pig.tools.parameters.PreprocessorContext - Executing command : echo $HOME
2014-10-12 00:19:37,387 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:19:37,388 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-10-12 00:19:37,391 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-12 00:19:37,760 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:19:37,901 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:19:38,003 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:19:38,802 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
messages: {message_id: chararray,thread_id: chararray,in_reply_to: chararray,subject: chararray,body: chararray,date: chararray,from: (real_name: chararray,address: chararray),tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)},ccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},bccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},reply_tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)}}
2014-10-12 00:19:39,538 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
-----------------------------------------------

New Logical Plan:

-----------------------------------------------
messages: (Name: LOStore Schema: message_id#26:chararray,thread_id#27:chararray,in_reply_to#28:chararray,subject#29:chararray,body#30:chararray,date#31:chararray,from#32:tuple(real_name#33:chararray,address#34:chararray),tos#35:bag{ARRAY_ELEM#36:tuple(real_name#37:chararray,address#38:chararray)},ccs#39:bag{ARRAY_ELEM#40:tuple(real_name#41:chararray,address#42:chararray)},bccs#43:bag{ARRAY_ELEM#44:tuple(real_name#45:chararray,address#46:chararray)},reply_tos#47:bag{ARRAY_ELEM#48:tuple(real_name#49:chararray,address#50:chararray)})
|
|---messages: (Name: LOLoad Schema: message_id#26:chararray,thread_id#27:chararray,in_reply_to#28:chararray,subject#29:chararray,body#30:chararray,date#31:chararray,from#32:tuple(real_name#33:chararray,address#34:chararray),tos#35:bag{ARRAY_ELEM#36:tuple(real_name#37:chararray,address#38:chararray)},ccs#39:bag{ARRAY_ELEM#40:tuple(real_name#41:chararray,address#42:chararray)},bccs#43:bag{ARRAY_ELEM#44:tuple(real_name#45:chararray,address#46:chararray)},reply_tos#47:bag{ARRAY_ELEM#48:tuple(real_name#49:chararray,address#50:chararray)})RequiredFields:null
-----------------------------------------------

Physical Plan:

-----------------------------------------------
messages: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-1
|
|---messages: Load(/tmp/test_mbox:org.apache.pig.piggybank.storage.avro.AvroStorage) - scope-0

--------------------------------------------------

Map Reduce Plan

--------------------------------------------------
No MR jobs. Fetch only.
2014-10-12 00:19:39,948 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:19:39,950 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-12 00:19:39,969 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[LoadTypeCastInserter, StreamTypeCastInserter], RULES_DISABLED=[AddForEach, ColumnMapKeyPrune, FilterLogicExpressionSimplifier, GroupByConstParallelSetter, LimitOptimizer, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter]}
2014-10-12 00:19:40,024 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-10-12 00:19:40,044 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-10-12 00:19:40,044 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-10-12 00:19:40,070 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2014-10-12 00:19:40,150 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2014-10-12 00:19:40,150 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-10-12 00:19:40,151 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2014-10-12 00:19:40,454 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: messages[11,11] C: R:
2014-10-12 00:19:40,457 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:19:40,458 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2014-10-12 00:19:40,492 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-10-12 00:19:40,500 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
2014-10-12 00:19:40,500 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.pig.piggybank.storage.avro.PigAvroInputFormat.listStatus(PigAvroInputFormat.java:96)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:146)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.setUp(POLoad.java:95)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNextTuple(POLoad.java:123)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:202)
at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:259)
at org.apache.pig.pen.ExampleGenerator.readBaseData(ExampleGenerator.java:223)
at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:155)
at org.apache.pig.PigServer.getExamples(PigServer.java:1282)
at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:810)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:802)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:381)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:608)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Details also at logfile: /private/tmp/pig_1413091176168.log
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ !v
vi test.pig
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ pig -l /tmp -x local -w -v test.pig
2014-10-12 00:20:23,747 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(41)) - Trying ExecType : LOCAL
2014-10-12 00:20:23,749 INFO [main] pig.ExecTypeProvider (ExecTypeProvider.java:selectExecType(43)) - Picked LOCAL as the ExecType
2014-10-12 00:20:23,818 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-10-12 00:20:23,818 [main] INFO org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1413091223742.log
2014-10-12 00:20:24.117 java[13512:1003] Unable to load realm info from SCDynamicStore
2014-10-12 00:20:24,122 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-10-12 00:20:24,796 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /Users/davidlaxer/.pigbootup not found
2014-10-12 00:20:24,852 [main] INFO org.apache.pig.tools.parameters.PreprocessorContext - Executing command : echo $HOME
2014-10-12 00:20:25,076 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:20:25,077 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-10-12 00:20:25,079 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-12 00:20:25,386 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:20:25,542 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:20:25,623 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:20:26,599 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
messages: {message_id: chararray,thread_id: chararray,in_reply_to: chararray,subject: chararray,body: chararray,date: chararray,from: (real_name: chararray,address: chararray),tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)},ccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},bccs: {ARRAY_ELEM: (real_name: chararray,address: chararray)},reply_tos: {ARRAY_ELEM: (real_name: chararray,address: chararray)}}
2014-10-12 00:20:27,338 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
#-----------------------------------------------

New Logical Plan:

#-----------------------------------------------
messages: (Name: LOStore Schema: message_id#26:chararray,thread_id#27:chararray,in_reply_to#28:chararray,subject#29:chararray,body#30:chararray,date#31:chararray,from#32:tuple(real_name#33:chararray,address#34:chararray),tos#35:bag{ARRAY_ELEM#36:tuple(real_name#37:chararray,address#38:chararray)},ccs#39:bag{ARRAY_ELEM#40:tuple(real_name#41:chararray,address#42:chararray)},bccs#43:bag{ARRAY_ELEM#44:tuple(real_name#45:chararray,address#46:chararray)},reply_tos#47:bag{ARRAY_ELEM#48:tuple(real_name#49:chararray,address#50:chararray)})
|
|---messages: (Name: LOLoad Schema: message_id#26:chararray,thread_id#27:chararray,in_reply_to#28:chararray,subject#29:chararray,body#30:chararray,date#31:chararray,from#32:tuple(real_name#33:chararray,address#34:chararray),tos#35:bag{ARRAY_ELEM#36:tuple(real_name#37:chararray,address#38:chararray)},ccs#39:bag{ARRAY_ELEM#40:tuple(real_name#41:chararray,address#42:chararray)},bccs#43:bag{ARRAY_ELEM#44:tuple(real_name#45:chararray,address#46:chararray)},reply_tos#47:bag{ARRAY_ELEM#48:tuple(real_name#49:chararray,address#50:chararray)})RequiredFields:null
#-----------------------------------------------

Physical Plan:

#-----------------------------------------------
messages: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-1
|
|---messages: Load(/tmp/test_mbox:org.apache.pig.piggybank.storage.avro.AvroStorage) - scope-0

#--------------------------------------------------

Map Reduce Plan

#--------------------------------------------------
No MR jobs. Fetch only.
2014-10-12 00:20:27,675 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:20:27,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-10-12 00:20:27,704 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[LoadTypeCastInserter, StreamTypeCastInserter], RULES_DISABLED=[AddForEach, ColumnMapKeyPrune, FilterLogicExpressionSimplifier, GroupByConstParallelSetter, LimitOptimizer, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter]}
2014-10-12 00:20:27,985 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-10-12 00:20:28,013 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-10-12 00:20:28,013 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-10-12 00:20:28,096 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2014-10-12 00:20:28,105 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2014-10-12 00:20:28,106 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-10-12 00:20:28,106 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2014-10-12 00:20:28,514 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: messages[11,11] C: R:
2014-10-12 00:20:28,516 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-12 00:20:28,518 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2014-10-12 00:20:28,584 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-10-12 00:20:28,589 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
2014-10-12 00:20:28,589 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.pig.piggybank.storage.avro.PigAvroInputFormat.listStatus(PigAvroInputFormat.java:96)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:146)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.setUp(POLoad.java:95)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNextTuple(POLoad.java:123)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:202)
at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:259)
at org.apache.pig.pen.ExampleGenerator.readBaseData(ExampleGenerator.java:223)
at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:155)
at org.apache.pig.PigServer.getExamples(PigServer.java:1282)
at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:810)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:802)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:381)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:608)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Details also at logfile: /private/tmp/pig_1413091223742.log
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ pig -secretDebugCmd
Find hadoop at /Users/davidlaxer/hadoop-2.3.0/bin/hadoop
dry run:
HADOOP_CLASSPATH: /Users/davidlaxer/pig-0.13.0/conf:/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/lib/tools.jar:/users/davidlaxer/hadoop-2.3.0-src/src/conf:/Users/davidlaxer/hadoop-2.3.0/src/conf:/Users/davidlaxer/pig-0.13.0/lib/accumulo-core-1.5.0.jar:/Users/davidlaxer/pig-0.13.0/lib/accumulo-fate-1.5.0.jar:/Users/davidlaxer/pig-0.13.0/lib/accumulo-server-1.5.0.jar:/Users/davidlaxer/pig-0.13.0/lib/accumulo-start-1.5.0.jar:/Users/davidlaxer/pig-0.13.0/lib/accumulo-trace-1.5.0.jar:/Users/davidlaxer/pig-0.13.0/lib/avro-1.7.5.jar:/Users/davidlaxer/pig-0.13.0/lib/avro-mapred-1.7.5.jar:/Users/davidlaxer/pig-0.13.0/lib/avro-tools-1.7.5-nodeps.jar:/Users/davidlaxer/pig-0.13.0/lib/groovy-all-1.8.6.jar:/Users/davidlaxer/pig-0.13.0/lib/hbase-0.94.1.jar:/Users/davidlaxer/pig-0.13.0/lib/jruby-complete-1.6.7.jar:/Users/davidlaxer/pig-0.13.0/lib/js-1.7R2.jar:/Users/davidlaxer/pig-0.13.0/lib/json-simple-1.1.jar:/Users/davidlaxer/pig-0.13.0/lib/jython-standalone-2.5.3.jar:/Users/davidlaxer/pig-0.13.0/lib/piggybank.jar:/Users/davidlaxer/pig-0.13.0/lib/protobuf-java-2.4.0a.jar:/Users/davidlaxer/pig-0.13.0/lib/zookeeper-3.4.5.jar:/Users/davidlaxer/pig-0.13.0/pig-0.13.0-withouthadoop-h2.jar:
HADOOP_OPTS: -Xmx1000m -Dpig.log.dir=/Users/davidlaxer/pig-0.13.0/logs -Dpig.log.file=pig.log -Dpig.home.dir=/Users/davidlaxer/pig-0.13.0
HADOOP_CLIENT_OPTS: -Xmx1000m -Dpig.log.dir=/Users/davidlaxer/pig-0.13.0/logs -Dpig.log.file=pig.log -Dpig.home.dir=/Users/davidlaxer/pig-0.13.0
/Users/davidlaxer/hadoop-2.3.0/bin/hadoop jar /Users/davidlaxer/pig-0.13.0/pig-0.13.0-withouthadoop-h2.jar

(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ env | grep HADOOP
HADOOP_HOME=/Users/davidlaxer/hadoop-2.3.0
HADOOP_CONF_DIR=/Users/davidlaxer/hadoop-2.3.0/src/conf
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ vi ~/.bash_profile
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ source !$
source ~/.bash_profile
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ pig -version
Apache Pig version 0.13.0 (r1606446)
compiled Jun 29 2014, 02:27:58
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ hadoop version
Hadoop 2.3.0
Subversion http://svn.apache.org/repos/asf/hadoop/common -r 1567123
Compiled by jenkins on 2014-02-11T13:40Z
Compiled with protoc 2.5.0
From source with checksum dfe46336fbc6a044bc124392ec06b85
This command was run using /Users/davidlaxer/hadoop-2.3.0/share/hadoop/common/hadoop-common-2.3.0.jar
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ java -version

java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$ cat test.pig
/* Set Home Directory - where we install software */
%default HOME `echo \$HOME`

REGISTER /Users/davidlaxer/pig-0.13.0/lib/avro-1.7.5.jar
REGISTER /Users/davidlaxer/pig-0.13.0/lib/json-simple-1.1.jar
REGISTER /Users/davidlaxer/pig-0.13.0/lib/piggybank.jar

/* DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();*/

/* Load the emails in avro format (edit the path to match where you saved them) using the AvroStorage UDF from Piggybank */
messages = LOAD '/tmp/test_mbox' USING org.apache.pig.piggybank.storage.avro.AvroStorage();

DESCRIBE messages;
EXPLAIN messages;
ILLUSTRATE messages;
lmt = LIMIT messages 100;
dump messages;

STORE messages INTO '/tmp/messages' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
(virtualenv)David-Laxers-MacBook-Pro:pig davidlaxer$

dbl001 commented Oct 13, 2014

Pig Stack Trace

ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.pig.piggybank.storage.avro.PigAvroInputFormat.listStatus(PigAvroInputFormat.java:96)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:146)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.setUp(POLoad.java:95)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNextTuple(POLoad.java:123)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:202)
at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:259)
at org.apache.pig.pen.ExampleGenerator.readBaseData(ExampleGenerator.java:223)
at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:155)
at org.apache.pig.PigServer.getExamples(PigServer.java:1282)
at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:810)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:802)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:381)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:608)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Does this imply that the Avro UDF is using a deprecated API, and that this is what causes the Java exception?

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
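
One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: `IncompatibleClassChangeError` is raised by the JVM at run time when a class has a different shape from the one the calling code was compiled against, which is a different thing from the compile-time deprecation notes above. `org.apache.hadoop.mapreduce.JobContext` is a class in Hadoop 1.x and an interface in Hadoop 2.x, so a `piggybank.jar` built against Hadoop 1.x could fail this way on Hadoop 2.3.0. The small Java program below reports which form of a class is visible on the current classpath. The class name `ClassKindCheck` and the helper `kindOf` are hypothetical and are not part of Pig or Hadoop.

```java
// Hypothetical diagnostic helper; not part of Pig or Hadoop.
public class ClassKindCheck {

    /**
     * Returns "interface", "class", or "missing" for a fully qualified
     * type name, as seen by this class's class loader.
     */
    static String kindOf(String className) {
        try {
            // initialize=false: only load the type, do not run static initializers.
            Class<?> c = Class.forName(
                    className, false, ClassKindCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the error message.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + kindOf(name));
    }
}
```

Compiled with `javac ClassKindCheck.java` and run as `java -cp "$(hadoop classpath):." ClassKindCheck`, this should report `interface` against the Hadoop 2.3.0 jars shown above. If it does, rebuilding piggybank against Hadoop 2 would be the next thing to try.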
