Commit 1cd2bcd

HADOOP-19083. aws sdk is optional in release builds
* update building doc
* LICENSE-binary makes clear it is optional
* hadoop s3guard bucket-info tool reports error better
* docs cover how to install.

It's actually quite hard to manually install; unless we can give better
instructions I almost think we'd want to create releases with and without
the AWS SDK. Let's target 3.4.1 for that

Change-Id: I2c91963a21b5c289e05218c2cbce0561b8e48b60
1 parent 8516483 commit 1cd2bcd

5 files changed: +60 -9 lines changed

BUILDING.txt

Lines changed: 9 additions & 1 deletion
@@ -146,7 +146,7 @@ Maven build goals:
  * Run clover : mvn test -Pclover
  * Run Rat : mvn apache-rat:check
  * Build javadocs : mvn javadoc:javadoc
- * Build distribution : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar][-Preleasedocs][-Pyarn-ui]
+ * Build distribution : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar][-Preleasedocs][-Pyarn-ui][-Pawssdk]
  * Change Hadoop version : mvn versions:set -DnewVersion=NEWVERSION

  Build options:
@@ -159,6 +159,7 @@ Maven build goals:
  * Use -Pyarn-ui to build YARN UI v2. (Requires Internet connectivity)
  * Use -DskipShade to disable client jar shading to speed up build times (in
    development environments only, not to build release artifacts)
+ * Use -Pawssdk to include the AWS V2 SDK in the release distribution

  YARN Application Timeline Service V2 build options:

@@ -374,6 +375,13 @@ Create binary distribution with native code:

   $ mvn package -Pdist,native -DskipTests -Dtar

+Create binary distribution with AWS SDK:
+
+  $ mvn package -Pdist,awssdk -DskipTests -Dtar
+
+This ensures that the hadoop-aws sdk has all its dependencies,
+but does approximately double the size of the tar file.
+
 Create source distribution:

   $ mvn package -Psrc -DskipTests

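After building or unpacking a distribution, it may be useful to confirm whether the AWS SDK bundle actually ended up in it. The following is a minimal sketch, not part of this commit. It assumes the bundle JAR lives under `share/hadoop/tools/lib` (the directory the documentation change in this commit names for manual installs; that a `-Pawssdk` build places it there too is an assumption) and that the file follows the usual Maven naming `bundle-<version>.jar`. The class name `DistributionSdkCheck` is hypothetical.

```java
import java.io.File;

/** Sketch: report whether an unpacked Hadoop distribution contains an AWS SDK bundle JAR. */
final class DistributionSdkCheck {

  // Assumption: same directory the docs in this commit give for manual installation.
  static final String TOOLS_LIB = "share/hadoop/tools/lib";

  /** Returns true if a file named bundle-*.jar exists under the tools lib directory. */
  static boolean hasSdkBundle(File distRoot) {
    File lib = new File(distRoot, TOOLS_LIB);
    // listFiles returns null when the directory does not exist or cannot be read
    File[] jars = lib.listFiles(
        (dir, name) -> name.startsWith("bundle-") && name.endsWith(".jar"));
    return jars != null && jars.length > 0;
  }

  public static void main(String[] args) {
    // Pass the root of the unpacked distribution; defaults to the current directory.
    File root = new File(args.length > 0 ? args[0] : ".");
    System.out.println(hasSdkBundle(root)
        ? "AWS SDK bundle JAR present"
        : "AWS SDK bundle JAR not found under " + TOOLS_LIB);
  }
}
```

The check is by file name only, so another JAR whose name happens to start with `bundle-` would also match.
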
LICENSE-binary

Lines changed: 2 additions & 1 deletion
@@ -363,8 +363,9 @@ org.objenesis:objenesis:2.6
 org.xerial.snappy:snappy-java:1.1.10.4
 org.yaml:snakeyaml:2.0
 org.wildfly.openssl:wildfly-openssl:1.1.3.Final
-software.amazon.awssdk:bundle:jar:2.23.19

+In distributions which include the aws V2 sdk:
+software.amazon.awssdk:bundle:jar:2.23.19

 --------------------------------------------------------------------------------
 This product bundles various third-party components under other open source

dev-support/bin/create-release

Lines changed: 1 addition & 1 deletion
@@ -683,7 +683,7 @@ function signartifacts
       exit 1
     fi
   fi
-}
+}1

 # find root of the source tree
 BIN=$(hadoop_abs "${BASH_SOURCE:-$0}")

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java

Lines changed: 8 additions & 2 deletions
@@ -420,8 +420,14 @@ public int run(String[] args, PrintStream out)
       CommandFormat commands = getCommandFormat();
       URI fsURI = toUri(s3Path);

-      S3AFileSystem fs = bindFilesystem(
-          FileSystem.newInstance(fsURI, getConf()));
+      S3AFileSystem fs;
+      try {
+        fs = bindFilesystem(FileSystem.newInstance(fsURI, getConf()));
+      } catch (NoClassDefFoundError e) {
+        println(out, "Failed to instantiate S3A filesystem due to missing class: %s", e);
+        println(out, "Make sure the AWS v2 SDK is on the classpath");
+        throw e;
+      }
       Configuration conf = fs.getConf();
       URI fsUri = fs.getUri();
       println(out, "Filesystem %s", fsUri);

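The hunk above reacts to a missing SDK after the fact, by catching `NoClassDefFoundError`. A complementary approach is to probe for a class before touching code that needs it. The sketch below is illustrative only and is not part of this commit: the class name `SdkClasspathProbe` is hypothetical, and using `software.amazon.awssdk.services.s3.S3Client` as the representative SDK class is an assumption.

```java
/** Sketch: probe the classpath for a class before using code that depends on it. */
final class SdkClasspathProbe {

  // Assumption: S3Client stands in for "the AWS SDK for Java 2.x is present".
  static final String SDK_PROBE_CLASS = "software.amazon.awssdk.services.s3.S3Client";

  /** Returns true if the named class can be located by this class's loader. */
  static boolean isClassAvailable(String className) {
    try {
      // initialize=false: only locate the class, do not run its static initializers
      Class.forName(className, false, SdkClasspathProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // LinkageError covers NoClassDefFoundError raised by missing transitive classes
      return false;
    }
  }

  /** Builds a message in the spirit of the one printed by the patched tool. */
  static String describe(String className) {
    return isClassAvailable(className)
        ? "Found " + className
        : "Missing " + className + ": make sure the AWS v2 SDK is on the classpath";
  }

  public static void main(String[] args) {
    System.out.println(describe(SDK_PROBE_CLASS));
  }
}
```

A probe like this reports the problem without a stack trace, whereas the catch-and-rethrow in the commit keeps the original error for the caller.
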
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md

Lines changed: 40 additions & 4 deletions
@@ -16,8 +16,6 @@

 <!-- MACRO{toc|fromDepth=0|toDepth=2} -->

-
-
 ## <a name="compatibility"></a> Compatibility


@@ -53,8 +51,46 @@ full details.

 ## <a name="overview"></a> Overview

-Apache Hadoop's `hadoop-aws` module provides support for AWS integration.
-applications to easily use this support.
+Apache Hadoop's `hadoop-aws` module provides support for AWS integration,
+primarily the s3a open source connector to Amazon S3 Storage, including
+Amazon S3 Express One zone storage as well as third-party stores with S3
+compatibility.
+
+## <a name="installation"></a> Installation
+
+### <a name="SDK download"></a> SDK Download
+
+This release uses the AWS SDK for Java 2.0
+
+Unless using a hadoop release with the AWS SDK `bundle.jar` JAR included
+in the binary distribution, the library MUST be downloaded and installed
+into the hadoop distribution.
+
+The exact version of the SDK to be used is listed in the file:
+```
+LICENSE-binary
+```
+The [mvn repository](https://mvnrepository.com/)
+site will list it as a "Compile Dependency" of the
+[hadoop-aws](https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws) artifact.
+
+AWS SDK releases can be downloaded from github at [AWS SDK for Java 2.0](https://github.com/aws/aws-sdk-java-v2)
+
+Or from the [Maven central repository](https://repo1.maven.org/maven2/software/amazon/awssdk/bundle/).
+
+Download the release and place it in the directory `share/hadoop/tools/lib`
+of the hadoop distribution.
+
+* Using an earlier SDK than that this SDK was compiled and tested against
+  will not work.
+* Using a later SDK *should* work, but there are no guarantees.
+* The V1 SDK will not work.
+
+Any project declaring a dependency on `hadoop-aws` in their Maven/Ivy/SBT/Gradle
+build will automatically get the specific version of the AWS SDK which this
+module was compiled against.
+
+### <a name="inclusion-on-classpath"></a> Inclusion on classpath

 To include the S3A client in Apache Hadoop's default classpath:

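The manual installation steps described in the documentation hunk above (find the SDK version in `LICENSE-binary`, fetch the matching bundle, place it in `share/hadoop/tools/lib`) can be sketched as follows. This is a hedged illustration, not part of this commit: the class `SdkBundleLocator` and its methods are hypothetical, the entry format is taken from the `LICENSE-binary` hunk in this commit, and the download URL assumes the standard Maven repository layout under the Maven Central link quoted in the docs. The sketch only computes the URL and target path; it does not download anything.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: derive the AWS SDK bundle download URL and install path from LICENSE-binary. */
final class SdkBundleLocator {

  // Matches an entry such as "software.amazon.awssdk:bundle:jar:2.23.19"
  private static final Pattern ENTRY =
      Pattern.compile("software\\.amazon\\.awssdk:bundle:jar:([0-9][\\w.\\-]*)");

  // Maven Central location quoted in the documentation change
  static final String REPO =
      "https://repo1.maven.org/maven2/software/amazon/awssdk/bundle/";

  /** Extracts the SDK version from the text of LICENSE-binary, if an entry is present. */
  static Optional<String> findVersion(String licenseText) {
    Matcher m = ENTRY.matcher(licenseText);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }

  /** Standard Maven file name: artifactId-version.jar. */
  static String jarName(String version) {
    return "bundle-" + version + ".jar";
  }

  /** Assumes the standard Maven layout: <repo>/<version>/<artifactId>-<version>.jar. */
  static String downloadUrl(String version) {
    return REPO + version + "/" + jarName(version);
  }

  /** Target location inside the distribution, per the installation docs. */
  static String installPath(String hadoopHome, String version) {
    return hadoopHome + "/share/hadoop/tools/lib/" + jarName(version);
  }

  public static void main(String[] args) throws IOException {
    String path = args.length > 0 ? args[0] : "LICENSE-binary";
    String text = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
    Optional<String> version = findVersion(text);
    if (version.isPresent()) {
      System.out.println("Download: " + downloadUrl(version.get()));
      System.out.println("Place at: " + installPath(".", version.get()));
    } else {
      System.out.println("No AWS SDK bundle entry found in " + path);
    }
  }
}
```

Reading the version from `LICENSE-binary` rather than hard-coding it keeps the downloaded JAR aligned with the release, which matters because the docs state that an earlier SDK will not work.
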