PoC _INFO files generation in S3 #37

dk1844 · 2020-08-19T10:34:06Z

Enceladus expects _INFO files to be generated in the output directory alongside the Spark output.

This feature was originally implemented using the HDFS API; for AWS S3, we need to replicate the functionality. Options are:

The most prominent entry point should be:
AtumImplicits.SparkSessionWrapper(spark) and internally ControlFrameworkState.storeCurrentInfoFile
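
To make the idea concrete, here is a minimal sketch of how the _INFO write could be separated from the storage backend. All names in it (InfoFileStorer, LocalInfoFileStorer, InMemoryObjectStoreStorer) are hypothetical and are not part of Atum; the local-filesystem class only stands in for the current HDFS-based path, and the in-memory class only stands in for an S3 backend so that the example runs without any AWS dependency. A real PoC would presumably call an actual S3 client at that point.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import scala.collection.mutable

// Hypothetical abstraction (not an Atum API): one place that knows how to
// persist the _INFO content next to the Spark output.
trait InfoFileStorer {
  /** Writes `json` as the _INFO file under `outputDir` and returns its location. */
  def store(outputDir: String, json: String): String
}

// Stand-in for the existing filesystem-based behaviour (local FS, not HDFS).
final class LocalInfoFileStorer extends InfoFileStorer {
  override def store(outputDir: String, json: String): String = {
    val dir = Paths.get(outputDir)
    Files.createDirectories(dir) // make sure the output directory exists
    val target = dir.resolve("_INFO")
    Files.write(target, json.getBytes(StandardCharsets.UTF_8))
    target.toString
  }
}

// Stand-in for an S3 backend: objects are kept in a map keyed by "bucket/key".
// Assumption: a real implementation would issue a put-object call here instead.
final class InMemoryObjectStoreStorer(bucket: String) extends InfoFileStorer {
  val objects: mutable.Map[String, String] = mutable.Map.empty

  override def store(outputDir: String, json: String): String = {
    // Object stores have no directories, so the "directory" is just a key prefix.
    val key = s"${outputDir.stripSuffix("/")}/_INFO"
    objects.put(s"$bucket/$key", json)
    s"s3://$bucket/$key"
  }
}
```

With a split like this, the caller (for example whatever storeCurrentInfoFile delegates to) would only need to be handed the right storer for the target location; whether that fits Atum's internals is exactly what the PoC should establish.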


dk1844 self-assigned this Aug 19, 2020

dk1844 changed the title ~~Info files generation in S3~~ PoC _INFO files generation in S3 Aug 19, 2020

Zejnilovic mentioned this issue Aug 25, 2020

Make ATUM within Enceladus work in AWS AbsaOSS/enceladus#1498

Closed

dk1844 linked a pull request Aug 31, 2020 that will close this issue

AWS S3 support for Atum #38

Merged

dk1844 mentioned this issue Sep 8, 2020

Integration tests for S3-based routines #39

Open

dk1844 closed this as completed in #38 Sep 21, 2020
