Apache commons dependency issue #133

Closed
ardlema opened this issue Jul 15, 2019 · 6 comments

ardlema commented Jul 15, 2019

Expected Behavior

I'm trying to use the library to read an Excel file with the following command:

val df = spark.read.format("com.crealytics.spark.excel").option("dataAddress","PARTA").option("useHeader","true").load("myfileXLSX")

Current Behavior

I'm getting the following NoClassDefFoundError:

Caused by: java.lang.NoClassDefFoundError: org/apache/commons/collections4/IteratorUtils
  at shadeio.poi.openxml4j.util.ZipInputStreamZipEntrySource.getEntries(ZipInputStreamZipEntrySource.java:58)
  at shadeio.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:286)
  at shadeio.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:725)
  at shadeio.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:301)
  at shadeio.poi.xssf.usermodel.XSSFWorkbookFactory.createWorkbook(XSSFWorkbookFactory.java:129)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at shadeio.poi.ss.usermodel.WorkbookFactory.createWorkbook(WorkbookFactory.java:314)
  ... 80 more
Caused by: java.lang.ClassNotFoundException: org.apache.commons.collections4.IteratorUtils
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 90 more

Before executing the previous command I'm importing the following:

import spark.implicits._
import com.crealytics.spark.excel._

If I don't import spark.implicits._ I'm also getting some errors when importing the spark-excel library:

error: missing or invalid dependency detected while loading class file 'DataColumn.class'.
Could not access term poi in package org.apache,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'DataColumn.class' was compiled against an incompatible version of org.apache.
error: missing or invalid dependency detected while loading class file 'DataColumn.class'.
Could not access term ss in value org.apache.poi,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'DataColumn.class' was compiled against an incompatible version of org.apache.poi.

I'm using version 0.12.0 of the library.
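
To confirm that the class named in the trace really is absent from the classpath, a small check like the one below can help. This is only a minimal sketch: `ClasspathCheck` and `isOnClasspath` are hypothetical names invented for illustration, not part of spark-excel, and the use of the thread's context class loader is an assumption about how the jars were added.

```scala
// Hypothetical helper, not part of spark-excel: reports whether a class can be loaded.
object ClasspathCheck {

  // Class name taken from the ClassNotFoundException in the stack trace above.
  val requiredClasses: Seq[String] = Seq(
    "org.apache.commons.collections4.IteratorUtils"
  )

  // Tries to load the class without initializing it.
  // Assumption: the context class loader is the one that sees the added jars.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, Thread.currentThread().getContextClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit =
    requiredClasses.foreach { name =>
      val status = if (isOnClasspath(name)) "found" else "MISSING"
      println(s"$status: $name")
    }
}
```

In spark-shell the object can be pasted in and run with `ClasspathCheck.main(Array.empty)`; further class names can be appended to `requiredClasses` as new errors appear.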

@bryanwnelson757

Did you ever get this working? I'm having the same problem, though I'm using 0.12.3 for Scala 2.11.

fwani commented Apr 16, 2020

I had the same error.

java.io.IOException: org/apache/commons/collections4/IteratorUtils

I solved this problem by adding commons-collections4-4.1.jar.

After that, I got two other errors, shown below.

# first error
java.lang.NoClassDefFoundError: org/apache/xmlbeans/XmlObject
# solved by adding xmlbeans-3.1.0.jar

# second error
java.lang.NoClassDefFoundError: org/openxmlformats/schemas/drawingml/x2006/main/ThemeDocument
# solved by adding poi-ooxml-schemas-4.1.2.jar

Hope it helps you solve your problem.
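
For anyone building with sbt rather than passing jar files by hand, the three jars above could be declared as dependencies roughly as follows. This is a sketch, not a tested configuration: the versions are simply the ones from the jar file names in this comment, and whether they match the POI version shaded into a particular spark-excel release is an assumption to verify.

```scala
// build.sbt (fragment): the three jars named above, as Maven coordinates.
// Versions are copied from the jar file names in this comment; adjust as needed.
libraryDependencies ++= Seq(
  "org.apache.commons"  % "commons-collections4" % "4.1",   // provides IteratorUtils
  "org.apache.xmlbeans" % "xmlbeans"             % "3.1.0", // provides XmlObject
  "org.apache.poi"      % "poi-ooxml-schemas"    % "4.1.2"  // provides ThemeDocument
)
```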

@AlbertFX91

Thanks @fwani for your help!

@jakeatmsft

I was able to solve it by adding all the recommended jars.

@quanghgx
Collaborator

Thanks @jakeatmsft and @fwani for your solutions.
Added these to a wiki page Dependencies to help others with similar issues.
Sincerely,

@grajee-everest

grajee-everest commented Nov 24, 2021

Thanks @jakeatmsft and @fwani for your solutions. Added these to a wiki page Dependencies to help others with similar issues. Sincerely,

What does "Beside the bundled jars" mean? I thought the only ones needed are the 5 (4) mentioned here in the red rectangle.

image

See my issue - #464

Spark Excel, as of August 2021, beside the bundled jars, needs the following dependencies:

spark-excel_2.12 (or build with, for example sbt -Dspark.testVersion=3.1.2 assembly)
poi-ooxml (already bundled with spark-excel.jars)
poi-ooxml-schemas
xmlbeans
commons-collections4

@quanghgx, @jakeatmsft and @fwani
