issue with spark-excel jar #344

Closed
tufanrakshit opened this issue Feb 11, 2021 · 7 comments

Comments

@tufanrakshit

Hi,
I am using the code below to read a multi-tab .xlsx file.
I am using Spark 3.0.1, Scala 2.11 and spark-excel_2.11-0.8.2.jar.
Previously I tried with 2.13.1, but that failed as well.

Snippet:
val df_cmdb = spark
  .read
  .format("com.crealytics.spark.excel")
  .option("dataAddress", "'CMDB'!B3:C35") // Optional, default: "A1"
  .option("header", "true") // Required
  .option("treatEmptyValuesAsNulls", "false") // Optional, default: true
  .option("usePlainNumberFormat", "false") // Optional, default: false. If true, format the cells without rounding and scientific notation
  .option("inferSchema", "false") // Optional, default: false
  .option("addColorColumns", "true") // Optional, default: false
  .option("timestampFormat", "MM-dd-yyyy HH:mm:ss") // Optional, default: yyyy-mm-dd hh:mm:ss[.fffffffff]
  .option("maxRowsInMemory", 20) // Optional, default: None. If set, uses a streaming reader, which can help with big files
  .option("excerptSize", 10) // Optional, default: 10. If set and the schema is inferred, number of rows to infer the schema from
  //.option("workbookPassword", "pass") // Optional, default: None. Requires unlimited-strength JCE for older JVMs
  //.schema(myCustomSchema) // Optional, default: either the inferred schema, or all columns are Strings
  .load("C:\Project\MSF_RP_SCES_Backup_Report_PWC_20210202.xlsx")

Error: Exception in thread "main" java.lang.IllegalArgumentException: Parameter "location" is missing in options.

Which Scala version and which jar file should I use?
Can you please also tell me where to find the jar?

Many thanks

@nightscape
Owner

You definitely need double backslashes \\ in the path. Not sure if that is the problem though.
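As a minimal sketch of the backslash point above: in a regular Scala string literal a backslash starts an escape sequence, so a Windows path needs either doubled backslashes, a triple-quoted literal, or forward slashes. The object name `PathEscaping` is hypothetical; the file path is the one from the question. Whether Spark accepts each form on a given Windows setup is not verified here.

```scala
object PathEscaping {
  // Regular literal: every backslash has to be doubled.
  val escaped: String = "C:\\Project\\MSF_RP_SCES_Backup_Report_PWC_20210202.xlsx"

  // Triple-quoted literal: backslashes are taken as written.
  val raw: String = """C:\Project\MSF_RP_SCES_Backup_Report_PWC_20210202.xlsx"""

  // Forward slashes avoid escaping altogether.
  val forward: String = "C:/Project/MSF_RP_SCES_Backup_Report_PWC_20210202.xlsx"

  def main(args: Array[String]): Unit = {
    // The first two literals denote the same string.
    println(escaped == raw)
  }
}
```

Any of the three values could then be passed to `.load(...)` in the snippet from the question.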

@tufanrakshit
Author

What is the latest version of the jar that is available on Maven Central, please? Also, is it compatible with Spark 3.0.1, and which Scala version should I use?

@nightscape
Owner

@tufanrakshit
Author

You were right, it was my mistake in the path.
I have changed a few things: I now have Spark 3.0.1, Hadoop 3.2, Scala 2.12.13, and the jar I am using is spark-excel_2.12-0.13.6.
The jar is added to my module.

I am now getting the following stack trace:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/xmlbeans/XmlException
at shadeio.poi.xssf.usermodel.XSSFWorkbookFactory.createWorkbook(XSSFWorkbookFactory.java:97)
at shadeio.poi.xssf.usermodel.XSSFWorkbookFactory.createWorkbook(XSSFWorkbookFactory.java:147)
at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:256)
at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:221)
at com.crealytics.spark.excel.DefaultWorkbookReader.$anonfun$openWorkbook$1(WorkbookReader.scala:49)
at scala.Option.fold(Option.scala:251)
at com.crealytics.spark.excel.DefaultWorkbookReader.openWorkbook(WorkbookReader.scala:49)
at com.crealytics.spark.excel.WorkbookReader.withWorkbook(WorkbookReader.scala:14)
at com.crealytics.spark.excel.WorkbookReader.withWorkbook$(WorkbookReader.scala:13)
at com.crealytics.spark.excel.DefaultWorkbookReader.withWorkbook(WorkbookReader.scala:45)
at com.crealytics.spark.excel.ExcelRelation.excerpt$lzycompute(ExcelRelation.scala:31)
at com.crealytics.spark.excel.ExcelRelation.excerpt(ExcelRelation.scala:31)
at com.crealytics.spark.excel.ExcelRelation.headerColumns$lzycompute(ExcelRelation.scala:102)
at com.crealytics.spark.excel.ExcelRelation.headerColumns(ExcelRelation.scala:101)
at com.crealytics.spark.excel.ExcelRelation.$anonfun$inferSchema$1(ExcelRelation.scala:163)
at scala.Option.getOrElse(Option.scala:189)
at com.crealytics.spark.excel.ExcelRelation.inferSchema(ExcelRelation.scala:162)
at com.crealytics.spark.excel.ExcelRelation.<init>(ExcelRelation.scala:35)
at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:35)
at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:13)
at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:8)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:344)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:297)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:286)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:286)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:232)
at Data_Import$.main(Data_Import.scala:35)
at Data_Import.main(Data_Import.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.xmlbeans.XmlException
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 29 more

@nightscape
Owner

Have you manually added the jar to the project?
I would not recommend that. Use a build tool like SBT, Gradle or Maven to resolve the dependencies properly.
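A minimal sketch of what that could look like with SBT, assuming the versions mentioned earlier in this thread (Scala 2.12.13, Spark 3.0.1, spark-excel 0.13.6). The `%%` operator appends the Scala binary version to the artifact name, so this resolves `spark-excel_2.12`. Declared this way, the build tool should also fetch the library's transitive dependencies, which a manually added jar does not bring along.

```scala
// build.sbt (sketch; versions are the ones quoted in this thread)
scalaVersion := "2.12.13"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"   % "3.0.1",
  "com.crealytics"   %% "spark-excel" % "0.13.6"
)
```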

@quanghgx
Collaborator

Hi @tufanrakshit

It seems that the latest issue is about the dependency jars.
There are solutions from @jakeatmsft and @fwani in #133, and we have also put this into a wiki page here.

Please take a look, and feel free to reopen this ticket in case of further issues.
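For anyone debugging a similar dependency problem, here is a small sketch of a classpath check. It only tests whether a class can be loaded by name; the object and method names (`ClasspathCheck`, `isOnClasspath`) are hypothetical, and the class name is the one reported missing in the stack trace above.

```scala
import scala.util.Try

object ClasspathCheck {
  // True if the named class can be loaded by the current class loader.
  // Class.forName throws ClassNotFoundException otherwise, which Try captures.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  def main(args: Array[String]): Unit = {
    val name = "org.apache.xmlbeans.XmlException"
    println(s"$name available: ${isOnClasspath(name)}")
  }
}
```

Running this from the same module as the Spark job shows whether the missing class is visible before the Excel reader is invoked.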

@tufanrakshit
Author

tufanrakshit commented Aug 25, 2021 via email

3 participants