S3 file access PoC using Hadoop FS API #1556
Labels
feature
New feature
priority: undecided
Undecided priority to be assigned after discussion
under discussion
Requires consideration before a decision is made whether/how to implement
Comments
dk1844 added the feature, under discussion, and priority: undecided labels on Oct 15, 2020
Intending to base the implementation of this feature on https://github.com/AbsaOSS/spark-s3-writer-poc/pull/3, both in Atum and here in Enceladus.
Resolved in #1586.
benedeki added a commit that referenced this issue on Jan 29, 2021
#1422 and 1423 Remove HDFS and Oozie from Menas
#1422 Fix HDFS location validation
#1424 Add Menas Dockerfile
#1416 hadoop-aws 2.8.5 + s3 aws sdk 2.13.65 compiles.
#1416 - enceladus on S3:
* - all directly-hdfs touching stuff disabled (atum, performance measurements, info files, output path checking)
# Add menasfargate into hosts
# paste
# save & exit (ctrl+O, ctrl+X)
#1416 - enceladus on S3 - (crude) conformance works on s3 (s3 std input, s3 conf output)
* Merge spline 0.5.3 into aws-poc
* Update spline to 0.5.4 for AWS PoC
#1503 Remove HDFS url Validation
* New dockerfile - smaller image
* s3 persistence (atum, sdk fs usage, ...) (#1526)
#1526
* FsUtils divided into LocalFsUtils & HdfsUtils
* PathConfigSuite update
* S3FsUtils with tail-recursive pagination accumulation - now generic with optional short-circuit breakOut
* TestRunnerJob updated to manually cover the cases - should serve as a basis for tests
* HdfsUtils replace by trait DistributedFsUtils (except for MenasCredentials loading & nonSplittable splitting)
* using final version of s3-powered Atum (3.0.0)
* mockito-update version update, scalatest version update
* S3FsUtilsSuite: exists, read, sizeDir(hidden, non-hidden, reucursive), non-splittable (simple, recursive with breakOut), delete (recursive), version find (simple - empty, recursive)
* explicit stubbing fix for hyperdrive
#1556 file access PoC using Hadoop FS API (#1586)
* s3 using hadoop fs api
* s3 sdk usage removed (pom, classes, tests)
* atum final version 3.1.0 used
* readStandardizationInputData(... path: String)(implicit ... fs: FileSystem) -> readStandardizationInputData(input: PathWithFs)
#1554 Tomcat with TLS container in Docker container
#1554 Added envoy config + enabling running unencrypted container
#1499 Add authentication to /lineage + update spline to 0.5.5
#1618 - fixes failing spline 0.5.5 integration by providing compatible commons library version. Test-ran on EMR. (#1619)
#1434 Add new way of serving properties to Docker
#1622: Merge of aws-poc to develop brach
* put back HDFS browser
* put back Oozie
* downgraded Spline
* Scopt 4.0.0
* AWS SDK Exclusion
* ATUM version 3.2.2
Co-authored-by: Saša Zejnilović <zejnils@gmail.com>
Co-authored-by: Daniel Kavan <dk1844@gmail.com>
Co-authored-by: Adrian Olosutean <adi.olosutean@gmail.com>
Co-authored-by: Adrian Olosutean <adrian.olosutean@absa.africa>
Co-authored-by: Jan Scherbaum <kmoj02@gmail.com>
Background
Tony has raised an interesting point regarding the S3 PoC file access, which is currently written against the AWS SDK for S3.
Feature
A PoC attempt should be made to use the Hadoop FS API to gain access to consistency features.
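For illustration only, here is a minimal sketch of what file access through the Hadoop FS API could look like. It is not the code from #1586. The name `PathWithFs` appears in the referenced commit message, but its shape here is an assumption, and `fromPath`, `HadoopFsSketch`, `dirSize` and `pathExists` are hypothetical helpers. Only `FileSystem.get`, `exists` and `getContentSummary` are actual Hadoop calls.

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Assumed shape: pairs a path string with the FileSystem that serves it,
// so callers no longer need a separate implicit FileSystem parameter.
final case class PathWithFs(path: String, fs: FileSystem)

object PathWithFs {
  // Hypothetical factory: Hadoop picks the FileSystem implementation
  // from the URI scheme (for example s3a://, hdfs:// or file://).
  def fromPath(path: String)(implicit conf: Configuration): PathWithFs =
    PathWithFs(path, FileSystem.get(new URI(path), conf))
}

object HadoopFsSketch {
  // True if the path exists on whichever file system backs it.
  def pathExists(input: PathWithFs): Boolean =
    input.fs.exists(new Path(input.path))

  // Total length in bytes of all files under the path.
  def dirSize(input: PathWithFs): Long =
    input.fs.getContentSummary(new Path(input.path)).getLength
}
```

With hadoop-aws on the classpath and credentials configured, the same calls should work for an `s3a://` path (any bucket name would be a placeholder), which is the point of going through the FS API rather than the AWS SDK directly.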