Unable to run a job with main file pointing to an S3 bucket #2301

nownikhil · 2024-10-31T00:06:58Z

What question do you want to ask?

[ Y ] ✋ I have searched the open/closed issues and my issue is not listed.
Similar issue found: Unable to use S3 File as mainApplicationFile #996

Error

Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2688)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3431)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466)
    at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
    at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1831)
    at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:727)
    at org.apache.spark.util.DependencyUtils$.downloadFile(DependencyUtils.scala:264)
    at org.apache.spark.deploy.k8s.KubernetesUtils$.loadPodFromTemplate(KubernetesUtils.scala:103)
    ... 18 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2592)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2686)

Additional context

This is happening because the spark-operator image doesn't include the hadoop-aws jar. Is there a recommended way to pull jars from S3?
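
One way to confirm that diagnosis is to scan the image's jar directory for the class named in the stack trace. The following is only a rough sketch: the default directory /opt/spark/jars and the helper name jars_containing are assumptions, not something taken from the operator's documentation, and it needs to run somewhere Python can read the jars (for example, against a copy extracted from the image).

```python
import sys
import zipfile
from pathlib import Path

# Archive entry for the class reported missing in the stack trace.
S3A_CLASS_ENTRY = "org/apache/hadoop/fs/s3a/S3AFileSystem.class"


def jars_containing(jars_dir, class_entry=S3A_CLASS_ENTRY):
    """Return the sorted names of jars in jars_dir that contain class_entry.

    Hypothetical helper for illustration; a jar is just a zip archive,
    so the standard zipfile module is enough to inspect it.
    """
    hits = []
    for jar in sorted(Path(jars_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # /opt/spark/jars is an assumed default; pass the real path as an argument.
    jars_dir = sys.argv[1] if len(sys.argv) > 1 else "/opt/spark/jars"
    found = jars_containing(jars_dir)
    if found:
        print("S3AFileSystem provided by:", ", ".join(found))
    else:
        print(f"no jar in {jars_dir} provides {S3A_CLASS_ENTRY}")
```

An empty result would be consistent with the ClassNotFoundException above.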

ujjawal-khare · 2024-11-12T17:14:02Z

I'm facing the same issue. The only solution I found was to bake the Hadoop jar into the operator image, but I'm not sure this is the right approach, since we will have dynamic jars coming in.
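
If baking the jar in is the route taken, the hadoop-aws version generally has to match the Hadoop version already bundled with Spark. Below is a minimal sketch of working out which jar to fetch, assuming the image ships a hadoop-client-api or hadoop-common jar whose file name carries the version, and that the jar is pulled from Maven Central. Both helper names are hypothetical, and "3.3.4" in the usage note is only an illustrative value, not the version from this issue.

```python
import re

# Maven Central's standard repository layout.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"

# Matches e.g. hadoop-client-api-3.3.4.jar or hadoop-common-3.3.4.jar.
_HADOOP_JAR = re.compile(r"^hadoop-(?:client-api|common)-(\d+(?:\.\d+)+)\.jar$")


def hadoop_version_from_jars(jar_names):
    """Return the Hadoop version inferred from jar file names, or None."""
    for name in jar_names:
        match = _HADOOP_JAR.match(name)
        if match:
            return match.group(1)
    return None


def maven_jar_url(group_id, artifact_id, version, base=MAVEN_CENTRAL):
    """Build the download URL for a jar in a Maven-layout repository."""
    group_path = group_id.replace(".", "/")
    return f"{base}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.jar"


if __name__ == "__main__":
    # Illustrative listing only; use the real contents of the image's jar directory.
    names = ["spark-core_2.12-3.5.0.jar", "hadoop-client-api-3.3.4.jar"]
    version = hadoop_version_from_jars(names)
    if version is not None:
        print(maven_jar_url("org.apache.hadoop", "hadoop-aws", version))
```

hadoop-aws also depends on an AWS SDK bundle jar; the version it expects is declared in the hadoop-aws POM, so that jar would need to be added alongside it. This does not address the concern about dynamic jars, only which jars the image itself would need in order to read from s3a.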