Merge of aws-poc to develop branch #1622

Closed
6 tasks
Zejnilovic opened this issue Dec 17, 2020 · 1 comment · Fixed by #1632
Assignees
Labels
priority: high Critical to the health of the project refactoring Improving code quality, paying off tech debt, aligning API, cleanup of unused code

Comments

@Zejnilovic
Contributor

Description

Merge aws-poc into develop so that only one long-running (living) branch remains.

Feature set

  • Keep HDFS/S3 writer/reader logic
  • Keep new Atum
  • Keep Menas DockerSetup
  • Don’t keep Spline 0.5; keep Spline 0.3
  • Restore HDFS (file) browser in Menas
  • Restore Oozie in Menas
@Zejnilovic Zejnilovic added refactoring Improving code quality, paying off tech debt, aligning API, cleanup of unused code priority: medium Important but not urgent labels Dec 17, 2020
@benedeki benedeki added priority: high Critical to the health of the project and removed priority: medium Important but not urgent labels Dec 18, 2020
@benedeki benedeki self-assigned this Jan 4, 2021
benedeki added a commit that referenced this issue Jan 29, 2021
#1422 and #1423 Remove HDFS and Oozie from Menas

#1422 Fix HDFS location validation

#1424 Add Menas Dockerfile

#1416 hadoop-aws 2.8.5 + s3 aws sdk 2.13.65 compiles.

#1416 - enceladus on S3:
* all code that touches HDFS directly is disabled (Atum, performance measurements, info files, output path checking)
# Add menasfargate into hosts
# paste
# save & exit (ctrl+O, ctrl+X)

#1416 - enceladus on S3 - (crude) conformance works on s3 (s3 std input, s3 conf output)
* Merge spline 0.5.3 into aws-poc
* Update spline to 0.5.4 for AWS PoC

#1503 Remove HDFS url Validation
* New dockerfile - smaller image
* s3 persistence (atum, sdk fs usage, ...) (#1526)

#1526 
* FsUtils divided into LocalFsUtils & HdfsUtils
* PathConfigSuite update
* S3FsUtils with tail-recursive pagination accumulation - now generic with optional short-circuit breakOut
* TestRunnerJob updated to manually cover the cases - should serve as a basis for tests
* HdfsUtils replaced by trait DistributedFsUtils (except for MenasCredentials loading & nonSplittable splitting)
* using final version of s3-powered Atum (3.0.0)
* mockito-update version update, scalatest version update
* S3FsUtilsSuite: exists, read, sizeDir (hidden, non-hidden, recursive), non-splittable (simple, recursive with breakOut), delete (recursive), version find (simple - empty, recursive)
* explicit stubbing fix for hyperdrive
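
The "tail-recursive pagination accumulation" with an "optional short-circuit breakOut" mentioned above can be illustrated roughly as follows. This is only a sketch of the general technique, not the actual S3FsUtils code: `Page`, `PaginationSketch`, `foldPages`, `fetch`, and `combine` are hypothetical names introduced here, and `Page` is a stand-in for a paginated listing response rather than an AWS SDK type.

```scala
import scala.annotation.tailrec

// Hypothetical stand-in for one page of a paginated listing (not an AWS SDK type).
final case class Page[A](items: Seq[A], nextToken: Option[String])

object PaginationSketch {

  /**
   * Accumulates a result over all pages of a listing.
   *
   * @param fetch    loads a page for the given continuation token (None = first page)
   * @param zero     initial accumulator value
   * @param combine  merges the items of one page into the accumulator
   * @param breakOut optional predicate; once it holds for the accumulator,
   *                 no further pages are fetched
   */
  def foldPages[A, B](fetch: Option[String] => Page[A], zero: B)
                     (combine: (B, Seq[A]) => B,
                      breakOut: Option[B => Boolean] = None): B = {

    @tailrec
    def loop(token: Option[String], acc: B): B = {
      val page = fetch(token)
      val updated = combine(acc, page.items)
      val done = breakOut.exists(predicate => predicate(updated))

      page.nextToken match {
        case Some(next) if !done => loop(Some(next), updated) // more pages and no short-circuit
        case _                   => updated                   // last page or breakOut satisfied
      }
    }

    loop(None, zero)
  }
}
```

With such a helper, a recursive directory size would fold every page, while an existence-style check could pass a `breakOut` that stops after the first page that satisfies it.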

#1556 file access PoC using Hadoop FS API (#1586)
* s3 using hadoop fs api
* s3 sdk usage removed (pom, classes, tests)
* atum final version 3.1.0 used
* readStandardizationInputData(... path: String)(implicit ... fs: FileSystem) -> readStandardizationInputData(input: PathWithFs)
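
The signature change in the last item bundles the path with the file system that owns it. A minimal sketch of that idea follows, under loud assumptions: `FileSystemLike` is a hypothetical stand-in for Hadoop's `FileSystem` so that the example runs without Hadoop on the classpath, the fields of `PathWithFs` are guessed for illustration, and `ReaderSketch` with its two methods is invented here and is not the Enceladus reader.

```scala
// Hypothetical stand-in for org.apache.hadoop.fs.FileSystem (assumption, not the Hadoop API).
trait FileSystemLike {
  def scheme: String
  def exists(path: String): Boolean
}

// Assumed shape: the path travels together with the file system it belongs to.
final case class PathWithFs(path: String, fileSystem: FileSystemLike)

object ReaderSketch {

  // Before: the file system is an implicit parameter, resolved independently of the path.
  def describeInputImplicit(path: String)(implicit fs: FileSystemLike): String =
    s"${fs.scheme}:$path exists=${fs.exists(path)}"

  // After: a single argument carries both, so the two cannot be mismatched by accident.
  def describeInput(input: PathWithFs): String =
    s"${input.fileSystem.scheme}:${input.path} exists=${input.fileSystem.exists(input.path)}"
}
```

Passing the pair explicitly helps when one job reads from one file system and writes to another, because no single implicit instance has to be correct for every path.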


#1554 Tomcat with TLS in Docker container

#1554 Added envoy config + enabling running unencrypted container

#1499 Add authentication to /lineage + update spline to 0.5.5

#1618 - fixes failing spline 0.5.5 integration by providing compatible commons library version. Test-ran on EMR. (#1619)

#1434 Add new way of serving properties to Docker

#1622: Merge of aws-poc to develop branch
* put back HDFS browser
* put back Oozie
* downgraded Spline
* Scopt 4.0.0
* AWS SDK Exclusion
* ATUM version 3.2.2

Co-authored-by: Saša Zejnilović <zejnils@gmail.com>
Co-authored-by: Daniel Kavan <dk1844@gmail.com>
Co-authored-by: Adrian Olosutean <adi.olosutean@gmail.com>
Co-authored-by: Adrian Olosutean <adrian.olosutean@absa.africa>
Co-authored-by: Jan Scherbaum <kmoj02@gmail.com>
@benedeki
Collaborator

benedeki commented Feb 3, 2021

Release notes:
#1424 Added Dockerfile for serving Menas, including TLS
#1526 Upgraded to Atum 3.x, which allows writing to AWS S3
#1556 S3 file access using the Hadoop FS API

2 participants