
Feature/1371 Combined Standardization Conformance Job #1392

Merged

28 commits merged into develop from feature/1371-combined-std-conf-job on Jul 30, 2020

Conversation

AdrianOlosutean
Contributor

@AdrianOlosutean commented Jun 12, 2020

Closes #1371
Closes #1443

Proposal for main function

…dardization-conformance' into feature/1371-combined-std-conf-job

# Conflicts:
#	examples/src/main/scala/za/co/absa/enceladus/examples/CustomRuleSample1.scala
#	examples/src/main/scala/za/co/absa/enceladus/examples/CustomRuleSample2.scala
#	examples/src/main/scala/za/co/absa/enceladus/examples/CustomRuleSample3.scala
#	examples/src/main/scala/za/co/absa/enceladus/examples/CustomRuleSample4.scala
#	examples/src/main/scala/za/co/absa/enceladus/examples/interpreter/rules/custom/UppercaseCustomConformanceRule.scala
#	examples/src/main/scala/za/co/absa/enceladus/examples/interpreter/rules/custom/XPadCustomConformanceRule.scala
#	examples/src/test/scala/za/co/absa/enceladus/examples/interpreter/rules/custom/UppercaseCustomConformanceRuleSuite.scala
#	examples/src/test/scala/za/co/absa/enceladus/examples/interpreter/rules/custom/XPadCustomConformanceRuleSuite.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/common/CommonJobExecution.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/ConformanceExecution.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/ConformanceReader.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/DynamicConformanceJob.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/HyperConformance.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/DynamicInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/InterpreterContext.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/ArrayCollapseInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/ArrayExplodeInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/CastingRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/ConcatenationRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/DropRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/LiteralRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/MappingRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/MappingRuleInterpreterBroadcast.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/MappingRuleInterpreterGroupExplode.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/NegationRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/RuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/SingleColumnRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/SparkSessionConfRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/UppercaseRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/standardization/StandardizationExecution.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/standardization/StandardizationJob.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/standardization/StandardizationReader.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/config/ConformanceConfigSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/ArrayConformanceSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/ChorusMockSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/InterpreterSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/LiteralJoinMappingRuleTest.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/fixtures/NestedStructsFixture.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/fixtures/StreamingFixture.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/CastingRuleSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/NegationRuleSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/RulesSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/TestRuleBehaviors.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/custom/CustomRuleSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/testcasefactories/NestedTestCaseFactory.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/testcasefactories/SimpleTestCaseFactory.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/standardization/StandardizationCobolAsciiSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/standardization/StandardizationCobolEbcdicSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/standardization/StandardizationJsonSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/standardization/StandardizationParquetSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/standardization/StandardizationRerunSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/standardization/config/StandardizationConfigSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/standardization/fixtures/CsvFileFixture.scala
@AdrianOlosutean added the "work in progress" label (Work on this item is not yet finished; mainly intended for PRs) Jun 25, 2020
…' into feature/1371-combined-std-conf-job

# Conflicts:
#	examples/src/main/scala/za/co/absa/enceladus/examples/interpreter/rules/custom/UppercaseCustomConformanceRule.scala
#	examples/src/main/scala/za/co/absa/enceladus/examples/interpreter/rules/custom/XPadCustomConformanceRule.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/common/CommonJobExecution.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/ConformanceExecution.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/PropertiesProvider.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/config/ConformanceConfigInstance.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/DynamicInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/InterpreterContext.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/ArrayCollapseInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/ArrayExplodeInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/CastingRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/ConcatenationRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/DropRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/LiteralRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/MappingRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/MappingRuleInterpreterBroadcast.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/MappingRuleInterpreterGroupExplode.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/NegationRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/RuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/SingleColumnRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/SparkSessionConfRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/UppercaseRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/standardization/PropertiesProvider.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/standardization/StandardizationExecution.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/standardization/config/StandardizationConfigInstance.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/RulesSuite.scala
#	spark-jobs/src/test/scala/za/co/absa/enceladus/conformance/interpreter/rules/custom/CustomRuleSuite.scala
Base automatically changed from feature/1015-extract-common-standardization-conformance to develop July 14, 2020 15:58
…ned-std-conf-job

# Conflicts:
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/ConformanceExecution.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/ConformancePropertiesProvider.scala
@AdrianOlosutean removed the "work in progress" label Jul 14, 2020
@AdrianOlosutean marked this pull request as ready for review July 14, 2020 19:08
@AdrianOlosutean
Contributor Author

I was able to run the combined StandardizationConformanceJob locally, as well as the DynamicConformance one. Can someone confirm that the e2e tests pass too?

)

case class InterpreterContextArgs(datasetName: String,
reportDate: String = "",
Contributor

misaligned

import za.co.absa.enceladus.standardization.config.StandardizationParser
import za.co.absa.enceladus.standardization_conformance.config.StdConformanceConfig

trait StdConformanceExecution extends StandardizationExecution with ConformanceExecution {
Collaborator

Small (and just a suggestion): What about naming it StdAndConfExecution?

import scala.util.Try


case class StdConformanceConfig(datasetName: String = "",
Collaborator

Same as above - what about StdAndConfConfig?

@@ -76,19 +75,21 @@ trait StandardizationExecution extends CommonJobExecution {
// Add the raw format of the input file(s) to Atum's metadata
Atum.setAdditionalInfo("raw_format" -> cmd.rawFormat)

PerformanceMetricTools.addJobInfoToAtumMetadata("std", preparationResult.pathCfg.inputPath, preparationResult.pathCfg.outputPath,
// OutputPath is standardizationPath on the Standardization phase of the combined job
val outputPath = preparationResult.pathCfg.standardizationPath.getOrElse(preparationResult.pathCfg.outputPath)
Collaborator

For any reader not fully familiar with the structure of the code, this would be rather confusing... (similar to ConformanceExecution)

persistStorageLevel: Option[StorageLevel] = None)

object InterpreterContextArgs {
def fromConformanceConfig[T](conformanceConfig: ConformanceParser[T]): InterpreterContextArgs = {
Collaborator

I like that. If considered too long, I don't have a problem with ConfConfigParser.
Btw, same with StandardizationParser

fsUtils: FileSystemVersionUtils): Unit = {
val cmdLineArgs: String = args.mkString(" ")

// StandardizationPath is the input on the Conformance phase of the combined job
val standardizationPath = preparationResult.pathCfg.standardizationPath.getOrElse(preparationResult.pathCfg.inputPath)
Collaborator

For any reader not fully familiar with the structure of the code, this would be rather confusing... (similar to StandardizationExecution)
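The getOrElse resolution quoted in the two diffs above (standardizationPath acting as the Standardization output and as the Conformance input in the combined job) can be sketched as follows. This is a simplified stand-in for illustration only: the PathConfig here and its field names are assumed from the diff, not the PR's exact class.

```scala
// Simplified stand-in for the PR's PathConfig; field names assumed from the diff.
case class PathConfig(inputPath: String,
                      outputPath: String,
                      standardizationPath: Option[String])

object PathResolution {
  // Standardization phase: in the combined job, write to the intermediate
  // standardization path instead of the final output path.
  def stdOutputPath(cfg: PathConfig): String =
    cfg.standardizationPath.getOrElse(cfg.outputPath)

  // Conformance phase: in the combined job, read from the intermediate
  // standardization path instead of the original input path.
  def confInputPath(cfg: PathConfig): String =
    cfg.standardizationPath.getOrElse(cfg.inputPath)
}
```

In a standalone run standardizationPath is None and each phase falls back to its own path, which is exactly why a reader unfamiliar with the combined job may find these two lines confusing.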

@DzMakatun
Contributor

What will be the application name for the combined job?
E.g. standalone Standardization uses the pattern `.appName(s"Standardisation $enceladusVersion ${cmd.datasetName} ${cmd.datasetVersion} ${cmd.reportDate} $reportVersion")`
while standalone Conformance uses
`.appName(s"Dynamic Conformance $enceladusVersion ${cmd.datasetName} ${cmd.datasetVersion} ${cmd.reportDate} $reportVersion")`.
The combined app needs something similar, so that we can identify it by parsing the name string in monitoring.
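The naming pattern under discussion can be captured by one small helper. This is only an illustrative sketch; the AppNames object and its parameters are mine, not the PR's actual code.

```scala
// Illustrative helper, not the PR's actual code: build a Spark appName whose
// leading job-name token(s) let monitoring identify the job type by parsing.
object AppNames {
  def appName(jobName: String,
              enceladusVersion: String,
              datasetName: String,
              datasetVersion: Int,
              reportDate: String,
              reportVersion: Int): String =
    s"$jobName $enceladusVersion $datasetName $datasetVersion $reportDate $reportVersion"
}
```

For example, `AppNames.appName("Dynamic Conformance", "2.0.0", "myDataset", 1, "2020-07-30", 1)` produces "Dynamic Conformance 2.0.0 myDataset 1 2020-07-30 1"; only the jobName prefix differs between the three jobs.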

@AdrianOlosutean
Contributor Author

AdrianOlosutean commented Jul 23, 2020

Good point @DzMakatun, I forgot about that one. I would call them Standardisation, Dynamic Conformance, and Standardization Conformance. I can also keep it consistent and change the one for Standardisation to Standardization.

@@ -164,8 +163,7 @@ trait StandardizationExecution extends CommonJobExecution {
handleEmptyOutput(sourceId)
}

// OutputPath is standardizationPath on the Standardization phase of the combined job
val outputPath = preparationResult.pathCfg.standardizationPath.getOrElse(preparationResult.pathCfg.outputPath)
val outputPath = preparationResult.pathCfg.standardizationPath
Collaborator

Variable not needed. The expression can easily be used directly on line 175.

import za.co.absa.enceladus.utils.fs.FileSystemVersionUtils
import za.co.absa.enceladus.utils.modules.SourcePhase
import za.co.absa.enceladus.utils.udf.UDFLibrary

object StandardizationConformanceJob extends StdConformanceExecution {
object StandardizationAndConformanceJob extends StandardizationExecution with ConformanceExecution {
private val jobName = "Standardization Conformance"
Collaborator

What about?

Suggested change
private val jobName = "Standardization Conformance"
private val jobName = "Standardization and Conformance"

or

Suggested change
private val jobName = "Standardization Conformance"
private val jobName = "Standardization & Conformance"

// die if the output path exists
validateForExistingOutputPath(fsUtils, pathCfg)
validateForExistingOutputPath(fsUtils, outputPath)
Collaborator

I would make this abstract here and implement it in the successors. (The combined job could call both ancestors' versions.)

Contributor Author

Good point
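The suggested pattern (validateOutputPath abstract in the common trait, each job validating its own path, the combined job calling both ancestors' versions) might look roughly like this. Everything here is a simplified stand-in: a Map replaces the real PathConfig/FileSystemVersionUtils, and a recording buffer replaces the real existence check.

```scala
import scala.collection.mutable.ListBuffer

// Illustrative sketch only, not the PR's exact signatures.
trait CommonJobExecution {
  val checked = ListBuffer.empty[String]      // stand-in for the fsUtils check
  def validateOutputPath(paths: Map[String, String]): Unit
  protected def validateIfPathAlreadyExists(path: String): Unit =
    checked += path                           // real code dies if the path exists
}

trait StandardizationExecution extends CommonJobExecution {
  override def validateOutputPath(paths: Map[String, String]): Unit =
    validateIfPathAlreadyExists(paths("standardizationPath"))
}

trait ConformanceExecution extends CommonJobExecution {
  override def validateOutputPath(paths: Map[String, String]): Unit =
    validateIfPathAlreadyExists(paths("publishPath"))
}

object StandardizationAndConformanceJob
    extends StandardizationExecution with ConformanceExecution {
  // Trait linearization would keep only the last override,
  // so the combined job calls both ancestors' versions explicitly:
  override def validateOutputPath(paths: Map[String, String]): Unit = {
    super[StandardizationExecution].validateOutputPath(paths)
    super[ConformanceExecution].validateOutputPath(paths)
  }
}
```

The qualified `super[Trait]` calls are what makes "combined job could call both ancestors' versions" work in Scala.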

Comment on lines 93 to 94
log.info(s"input path: $inputPath")
log.info(s"output path: $outputPath")
Collaborator

I would move this logging to prepareStandardization/prepareConformance and make it specific: "Standardization input path: ..." etc.

Comment on lines 171 to 172
rawPath = buildRawPath(cmd.asInstanceOf[StandardizationConfigParser[StandardizationConformanceConfig]], dataset, reportVersion),
publishPath = buildPublishPath(cmd.asInstanceOf[ConformanceConfigParser[StandardizationConformanceConfig]], dataset, reportVersion),
Collaborator

Nice solution 👍
I would not cast the cmd here, though. Let the function within CommonJobExecution return only the default values, and in the overrides check the class, cast it, and eventually apply the configuration override.


val performance = initPerformanceMeasurer(pathCfg.inputPath)
val performance = initPerformanceMeasurer(inputPath)
Collaborator

Still thinking how to do this... 🤔
The point is that CommonJobExecution should not know about any Conformance, Standardization or their specific configs.

Collaborator

Would also put this line into `prepareStandardization`/`prepareConformance` 🤔

@DzMakatun
Contributor

DzMakatun commented Jul 24, 2020

> Good point @DzMakatun , I forgot about that one. I would call them Standardisation, Dynamic Conformance and Standardization Conformance. I can also keep it consistent and change the one for Standardisation to Standardization

Please keep in mind: the easier the jobs are to parse and distinguish, the better. It's not ideal when the names have a different number of words and the first word is the same in two different jobs. I would go for something like:

  1. Enceladus Standardization ...
  2. Enceladus Conformance ...
  3. Enceladus Combined ...
    3a. Or something like Enceladus Standardization_and_Conformance ...

What do you think?

@benedeki
Collaborator

> Good point @DzMakatun , I forgot about that one. I would call them Standardisation, Dynamic Conformance and Standardization Conformance. I can also keep it consistent and change the one for Standardisation to Standardization
>
> Please, keep in mind, the easier to parse and distinguish the jobs the better. It's not ideal when the names have different number of words and the first word is the same in two different jobs. I would go for something like:
>
> 1. Enceladus Standardization ...
> 2. Enceladus Conformance ...
> 3. Enceladus Combined ...
>    3a. Or something like Enceladus Standardization_and_Conformance ...
>
> What do you think?


I like the suggestion, even though it goes a little bit against your own words: the first word is always Enceladus. It's good that it's unified for all three.
But the underscores are weird. What about Enceladus Standardization&Conformance? (I admit I suggested it already in another comment.)

@DzMakatun
Contributor

> Good point @DzMakatun , I forgot about that one. I would call them Standardisation, Dynamic Conformance and Standardization Conformance. I can also keep it consistent and change the one for Standardisation to Standardization
>
> Please, keep in mind, the easier to parse and distinguish the jobs the better. It's not ideal when the names have different number of words and the first word is the same in two different jobs. I would go for something like:
>
> 1. Enceladus Standardization ...
> 2. Enceladus Conformance ...
> 3. Enceladus Combined ...
>    3a. Or something like Enceladus Standardization_and_Conformance ...
>
> What do you think?


> I like the suggestion, despite it goes little bit against your own words - first words is always Enceladus. It's good that it's unified, for all 3,

There are also other jobs in Spark History, so having the first word Enceladus helps to get the first level of categorization.

> But the underscores are weird. What about Enceladus Standardization&Conformance (I admit suggested it already in other comment)

I see, it could be with '&', or maybe camel notation, like StandardizationConformance or StandardizationAndConformance?

@@ -166,44 +161,37 @@ trait CommonJobExecution {
}
}

protected def getPathConfig[T](cmd: JobConfigParser[T], dataset: Dataset, reportVersion: Int): PathConfig = {
protected def getDefaultPathConfig[T](cmd: JobConfigParser[T], dataset: Dataset, reportVersion: Int): PathConfig = {
Collaborator

Why not make this the implementation of the now-abstract getPathConfig?

@@ -77,6 +81,21 @@ trait ConformanceExecution extends CommonJobExecution {
preparationResult.reportVersion)
}

override def getPathConfig[T](cmd: JobConfigParser[T], dataset: Dataset, reportVersion: Int): PathConfig = {
val pathOverride = cmd.asInstanceOf[ConformanceConfig].publishPathOverride
Collaborator

Swap it with the next line and you don't even need the val. Also less "context switching".
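The default-plus-override shape being discussed (shared defaults in CommonJobExecution, each job's override applying only its own path override, so the common trait never needs to know about Standardization or Conformance configs) might look roughly like this. A Map stands in for the real PathConfig, and the path layouts are invented for illustration.

```scala
// Simplified sketch of the default-plus-override pattern; not the PR's code.
trait CommonJobExecution {
  protected def getDefaultPathConfig(datasetName: String, reportVersion: Int): Map[String, String] =
    Map(
      "rawPath"     -> s"/raw/$datasetName/v$reportVersion",
      "publishPath" -> s"/publish/$datasetName/v$reportVersion"
    )

  // Jobs without overrides just use the defaults.
  def getPathConfig(datasetName: String, reportVersion: Int): Map[String, String] =
    getDefaultPathConfig(datasetName, reportVersion)
}

trait ConformanceExecution extends CommonJobExecution {
  def publishPathOverride: Option[String] = None

  // The override starts from the defaults and only then applies its own path,
  // so no cast of a generic cmd is needed in the common trait.
  override def getPathConfig(datasetName: String, reportVersion: Int): Map[String, String] = {
    val defaults = getDefaultPathConfig(datasetName, reportVersion)
    publishPathOverride.fold(defaults)(p => defaults + ("publishPath" -> p))
  }
}
```

With `Option.fold`, the "swap and drop the val" suggestion collapses to a single expression per override.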

@@ -33,7 +33,8 @@ case class CoalesceRuleInterpreter(rule: CoalesceConformanceRule) extends RuleIn
override def conformanceRule: Option[ConformanceRule] = Some(rule)

def conform(df: Dataset[Row])
(implicit spark: SparkSession, explosionState: ExplosionState, dao: MenasDAO, progArgs: ConformanceConfig): Dataset[Row] = {
(implicit spark: SparkSession, explosionState: ExplosionState, dao: MenasDAO,
Collaborator

If parameters span multiple lines, they should be one per line.

@@ -83,6 +86,21 @@ trait StandardizationExecution extends CommonJobExecution {
dao.getSchema(preparationResult.dataset.schemaName, preparationResult.dataset.schemaVersion)
}

override def getPathConfig[T](cmd: JobConfigParser[T], dataset: Dataset, reportVersion: Int): PathConfig = {
val pathOverride = cmd.asInstanceOf[StandardizationConfig].rawPathOverride
Collaborator

Same as above. Swap the lines, get rid of the val.

}

override def validateOutputPath(fsUtils: FileSystemVersionUtils, pathConfig: PathConfig): Unit = {
validateIfPathAlreadyExists(fsUtils, pathConfig.standardizationPath)
Collaborator

Nice 👍

val jobCmd = cmd.asInstanceOf[StandardizationConformanceConfig]
val rawPathOverride = jobCmd.rawPathOverride
val publishPathOverride = jobCmd.publishPathOverride
val defaultConfig = getDefaultPathConfig(cmd, dataset, reportVersion)
Collaborator

Again, I would make this the first step of the function. Here it's more for consistency with the other versions of the function.

…ned-std-conf-job

# Conflicts:
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/FillNullsRuleInterpreter.scala
#	spark-jobs/src/main/scala/za/co/absa/enceladus/conformance/interpreter/rules/RuleInterpreter.scala
benedeki previously approved these changes Jul 30, 2020
Collaborator

@benedeki left a comment

  • code reviewed
  • pulled
  • built
  • run

The few remarks are really just about code style.

import za.co.absa.enceladus.standardization.interpreter.StandardizationInterpreter
import za.co.absa.enceladus.standardization.interpreter.stages.PlainSchemaGenerator
import za.co.absa.enceladus.utils.fs.FileSystemVersionUtils
import za.co.absa.enceladus.utils.modules.SourcePhase
import za.co.absa.enceladus.utils.performance.PerformanceMetricTools
import za.co.absa.enceladus.utils.performance.{PerformanceMeasurer, PerformanceMetricTools}
Collaborator

Suggested change
import za.co.absa.enceladus.utils.performance.{PerformanceMeasurer, PerformanceMetricTools}
import za.co.absa.enceladus.utils.performance.PerformanceMetricTools

Unused import

if (cmd.rawFormat.equalsIgnoreCase("fixed-width")) {
HashMap("trimValues" -> cmd.fixedWidthTrimValues.map(BooleanParameter))
} else {
HashMap()
}
}

private def getCobolOptions(cmd: StandardizationConfig, dataset: Dataset)(implicit dao: MenasDAO): HashMap[String, Option[RawFormatParameter]] = {
private def getCobolOptions[T](cmd: StandardizationConfigParser[T], dataset: Dataset)(implicit dao: MenasDAO): HashMap[String, Option[RawFormatParameter]] = {
Collaborator

Tiny: Line too long

if (cmd.rawFormat.equalsIgnoreCase("xml")) {
HashMap("rowtag" -> cmd.rowTag.map(StringParameter))
} else {
HashMap()
}
}

private def getCsvOptions(cmd: StandardizationConfig, numberOfColumns: Int = 0): HashMap[String, Option[RawFormatParameter]] = {
private def getCsvOptions[T](cmd: StandardizationConfigParser[T], numberOfColumns: Int = 0): HashMap[String, Option[RawFormatParameter]] = {
Collaborator

Tiny: Line too long

failOnInputNotPerSchema: Boolean = false,

credsFile: Option[String] = None,
keytabFile: Option[String] = None) extends StandardizationConfigParser[StandardizationConformanceConfig]
Collaborator

Tiny: Line too long

Comment on lines 61 to 62
override def withIsCatalystWorkaroundEnabled(value: Option[Boolean]): StandardizationConformanceConfig = copy(isCatalystWorkaroundEnabled = value)
override def withAutocleanStandardizedFolder(value: Option[Boolean]): StandardizationConformanceConfig = copy(autocleanStandardizedFolder = value)
Collaborator

These lines are also too long; here it's perhaps worth marking them as an exception with // scalastyle:ignore

@sonarcloud

sonarcloud bot commented Jul 30, 2020

Kudos, SonarCloud Quality Gate passed!

  • Bugs: A, 0 bugs
  • Vulnerabilities: A, 0 vulnerabilities (0 security hotspots to review)
  • Code smells: A, 1 code smell
  • Coverage: no coverage information
  • Duplication: 0.0%

Contributor

@HuvarVer left a comment

  • review
  • pull
  • build
  • run

I like the speed of the new job.
Tested the separate jobs with Hermes too, and ran the new job. Looks good.

@AdrianOlosutean merged commit 946ccec into develop Jul 30, 2020
@AdrianOlosutean deleted the feature/1371-combined-std-conf-job branch July 30, 2020 12:44