ArrayIndexOutOfBoundsException #5

flamingofugang · 2016-07-14T14:07:08Z

I get the following exception when I run the Hadoop MapReduce job:

Sampling started
16/07/14 09:25:29 INFO input.FileInputFormat: **Total input paths to process : 0**
16/07/14 09:25:29 INFO partition.InputSampler: Using 0 samples
16/07/14 09:25:29 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/07/14 09:25:29 INFO compress.CodecPool: Got brand-new compressor [.deflate]
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
        at org.apache.hadoop.mapreduce.lib.partition.InputSampler.writePartitionFile(InputSampler.java:340)
        at org.rdfhdt.mrbuilder.HDTBuilderDriver.runDictionaryJob(HDTBuilderDriver.java:242)
        at org.rdfhdt.mrbuilder.HDTBuilderDriver.main(HDTBuilderDriver.java:112)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Here is the code snippet causing the exception:
InputSampler.writePartitionFile(job, new InputSampler.IntervalSampler<Text, Text>(this.conf.getSampleProbability()));

It seems the input files are not found (the log reports 0 input paths to process). I created an 'input' directory and put N-Triples '.nt' files in it.
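
For what it's worth, `ArrayIndexOutOfBoundsException: 0` is consistent with the sampler returning an empty array ("Using 0 samples") and the split-point step then indexing into it. Below is a minimal, self-contained Java sketch of that failure mode. It is not the Hadoop source: `EmptySampleSketch`, `pickSplitPoints`, and `pickSplitPointsChecked` are hypothetical names, and the evenly spaced selection is a simplified assumption about how split points are chosen from the samples.

```java
import java.util.Arrays;

// Hypothetical, simplified model of choosing partition split points from samples.
// It only illustrates why an empty sample array fails on index 0.
class EmptySampleSketch {

    // Picks numPartitions - 1 evenly spaced keys from the sorted samples.
    static String[] pickSplitPoints(String[] samples, int numPartitions) {
        String[] sorted = samples.clone();
        Arrays.sort(sorted);
        String[] splits = new String[numPartitions - 1];
        float stepSize = sorted.length / (float) numPartitions;
        for (int i = 1; i < numPartitions; i++) {
            int k = Math.round(stepSize * i);
            // With zero samples, stepSize is 0, k is 0, and sorted[0] does not exist.
            splits[i - 1] = sorted[k];
        }
        return splits;
    }

    // Hypothetical guard: fail early with a readable message when nothing was sampled.
    static String[] pickSplitPointsChecked(String[] samples, int numPartitions) {
        if (samples.length == 0) {
            throw new IllegalStateException(
                "No samples collected: check that the input path exists and contains files");
        }
        return pickSplitPoints(samples, numPartitions);
    }

    public static void main(String[] args) {
        try {
            pickSplitPoints(new String[0], 2);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Unchecked call failed: " + e);
        }
        try {
            pickSplitPointsChecked(new String[0], 2);
        } catch (IllegalStateException e) {
            System.out.println("Checked call failed: " + e.getMessage());
        }
    }
}
```

If this is what is happening, the exception is a symptom rather than the cause: the thing to verify is that the job's input path actually resolves to files on the file system the job reads from.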

Any idea?

Best,
Gang

artob · 2016-08-08T15:33:12Z

@flamingofugang Just to check, is this a regression given the changes in the last two months, or is this the furthest you have gotten so far in making HDT-MR actually work?

flamingofugang · 2016-08-08T16:22:58Z

There is still an issue with the LZO compression library; I will report on it later.

artob · 2016-08-11T06:22:02Z

Related pull request: #4

artob · 2016-08-11T07:54:50Z

@flamingofugang Does your pull request #4 resolve this?

flamingofugang · 2016-08-11T18:38:21Z

The Java program takes an LZO-compressed N-Triples file as input, and, as far as I understand, the LZO file should be indexed.

I changed the pom.xml to depend on a locally built hadoop-lzo package, with the native LZO library available.

I recommend explaining this in the README file:

First, the user needs to install lzo and lzop.
Second, build the hadoop-lzo package: https://github.com/twitter/hadoop-lzo
Then register that JAR in the local .m2 repository and build this hdt-mr package.

tangina-sultana · 2020-07-02T10:46:02Z

Hi, can you share the installation process of HDT-MR?

artob added the help wanted label Aug 8, 2016

artob mentioned this issue Aug 11, 2016

Use Hadoop 2 and package executable JAR #4

Merged
