#2022 spark-commons #2023

Merged
merged 6 commits into from
Mar 9, 2022

Conversation

AdrianOlosutean
Contributor

Closes #2022

@AdrianOlosutean AdrianOlosutean self-assigned this Feb 15, 2022
@AdrianOlosutean AdrianOlosutean changed the title #2022 spark-commons init #2022 spark-commons Feb 15, 2022
@AdrianOlosutean AdrianOlosutean added the work in progress Work on this item is not yet finished (mainly intended for PRs) label Feb 15, 2022
@AdrianOlosutean AdrianOlosutean marked this pull request as draft February 15, 2022 14:22
@AdrianOlosutean AdrianOlosutean removed the work in progress Work on this item is not yet finished (mainly intended for PRs) label Feb 17, 2022
@AdrianOlosutean AdrianOlosutean marked this pull request as ready for review February 17, 2022 11:39
dk1844
dk1844 previously approved these changes Feb 25, 2022
Contributor

@dk1844 dk1844 left a comment

LGTM

  • code reviewed
  • pulled
  • built
  • run

.withColumnIfDoesNotExist(InfoDateColumn, to_date(lit(reportDate), ReportDateFormat))
.withColumnIfDoesNotExist(InfoDateColumnString, lit(reportDate))
.withColumnIfDoesNotExist(InfoVersionColumn, lit(reportVersion))
.withColumnIfDoesNotExist(function)(InfoDateColumn, to_date(lit(reportDate), ReportDateFormat))
Contributor

@dk1844 dk1844 Feb 25, 2022

It is late, but I think I take issue with DataFrameImplicits.DataFrameEnhancements.withColumnIfDoesNotExist(ifExists: (DataFrame, String) => DataFrame)(colName: String, colExpr: Column) that originated in AbsaOSS/spark-commons#18, apparently.

IMHO the name of the method simply does not match what it does or the params it requires: the name suggests a no-op when the column already exists, but the method allows an action in that case, too. My vote is either to have two separate methods, withColumnIfDoesNotExist and transformIfColumnExists, or at least to rename this method so that it makes sense.

Here, you can only alleviate the problem by renaming the value called function to something more descriptive, like ifExistsFn or similar (this occurs in multiple files in the codebase).
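
To make the "two separate methods" option concrete, here is a minimal sketch of how the split could look. It is not the spark-commons implementation: the object and class names (DataFrameSplitSketch, DataFrameSplitOps) are hypothetical, and matching the column name case-insensitively is an assumption that mirrors Spark's default resolution.

```scala
import org.apache.spark.sql.{Column, DataFrame}

// Hypothetical sketch only; names and behavior are assumptions, not the spark-commons API.
object DataFrameSplitSketch {

  implicit class DataFrameSplitOps(val df: DataFrame) {

    // Assumption: compare names case-insensitively, like Spark's default column resolution.
    private def hasColumn(colName: String): Boolean =
      df.columns.exists(_.equalsIgnoreCase(colName))

    // Adds the column only when it is missing; a true no-op when it already exists.
    def withColumnIfDoesNotExist(colName: String, colExpr: Column): DataFrame =
      if (hasColumn(colName)) df else df.withColumn(colName, colExpr)

    // Applies `transform` only when the column is present; a no-op otherwise.
    def transformIfColumnExists(colName: String)(transform: (DataFrame, String) => DataFrame): DataFrame =
      if (hasColumn(colName)) transform(df, colName) else df
  }
}
```

With this split, a call such as df.withColumnIfDoesNotExist("b", lit(0)) never needs a callback, and any reaction to an existing column is spelled out separately, e.g. df.transformIfColumnExists("b")((d, c) => d.drop(c)).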

Collaborator

I tend to agree. At a minimum, the parameter order seems wrong, and the ifExists parameter could have a default value.
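
A rough sketch of the reordering idea, assuming the column arguments come first and ifExists moves last with a no-op default. The wrapper names (DataFrameDefaultSketch, DataFrameDefaultOps) are hypothetical, and the signature is only one possible shape, not what spark-commons ships.

```scala
import org.apache.spark.sql.{Column, DataFrame}

// Hypothetical sketch only; this signature is an assumption, not the spark-commons API.
object DataFrameDefaultSketch {

  implicit class DataFrameDefaultOps(val df: DataFrame) {

    // Column arguments first; `ifExists` last, defaulting to "leave the DataFrame untouched".
    def withColumnIfDoesNotExist(
        colName: String,
        colExpr: Column,
        ifExists: (DataFrame, String) => DataFrame = (d, _) => d
    ): DataFrame =
      // Assumption: names are compared case-insensitively, like Spark's default resolution.
      if (df.columns.exists(_.equalsIgnoreCase(colName))) ifExists(df, colName)
      else df.withColumn(colName, colExpr)
  }
}
```

The common case then reads df.withColumnIfDoesNotExist("b", lit(0)), and a caller who cares about the existing column passes a third argument, e.g. df.withColumnIfDoesNotExist("a", lit(0), (d, c) => d.drop(c)).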

…/interpreter/stages/TypeParser.scala

Co-authored-by: Daniel K <dk1844@gmail.com>
dk1844
dk1844 previously approved these changes Feb 28, 2022
Contributor

@dk1844 dk1844 left a comment

LGTM

@benedeki benedeki added the PR:reviewing Only for PR - PR is being reviewed by somebody; blocks merging label Mar 3, 2022
import org.apache.spark.sql.types._
import scala.util.Try

object StructFieldImplicits {
Collaborator

Why did you remove these functions? Within Enceladus they seem pretty useful: the code is simpler and shorter.

Contributor Author

They are already in spark-commons, just slightly renamed and defined on Metadata. I'm not sure it makes sense to duplicate them in Enceladus and spark-commons.

Collaborator

Hm, I see. They are useful, but I like them even better in the original form, being more concise. 🤷‍♂️
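
For readers following the thread, the trade-off can be sketched roughly as below. All method names here (getOptString, getMetadataString) are hypothetical stand-ins; the actual names in Enceladus and spark-commons may differ. The point is only that a Metadata-level helper covers the same ground, while a StructField-level wrapper saves one hop at the call site.

```scala
import org.apache.spark.sql.types.{Metadata, MetadataBuilder, StringType, StructField}
import scala.util.Try

// Hypothetical sketch; helper names are assumptions, not the real library methods.
object MetadataSketch {

  // Helper defined on Metadata (the shape described as living in spark-commons).
  implicit class MetadataOps(val metadata: Metadata) {
    // None when the key is missing or the stored value is not a String.
    def getOptString(key: String): Option[String] =
      Try(metadata.getString(key)).toOption
  }

  // Thin StructField-level wrapper (the more concise, original-style form).
  implicit class StructFieldOps(val field: StructField) {
    def getMetadataString(key: String): Option[String] =
      field.metadata.getOptString(key)
  }

  // Example field carrying a single metadata entry.
  val exampleField: StructField = StructField(
    "info_date", StringType, nullable = true,
    new MetadataBuilder().putString("pattern", "yyyy-MM-dd").build()
  )
}
```

At a call site the difference is exampleField.metadata.getOptString("pattern") versus exampleField.getMetadataString("pattern"); both delegate to the same lookup, so keeping the wrapper is a matter of convenience rather than functionality.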

@benedeki benedeki removed the PR:reviewing Only for PR - PR is being reviewed by somebody; blocks merging label Mar 8, 2022
@sonarcloud

sonarcloud bot commented Mar 8, 2022

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
1 Code Smell (rating A)

No Coverage information
0.0% Duplication

@benedeki benedeki added the PR:no testing needed Only for PR - PR doesn't need to be tested by a tester (person) label Mar 9, 2022
@benedeki
Collaborator

benedeki commented Mar 9, 2022

Refactoring only; it doesn't really need to be (and can't really be) retested.

Collaborator

@benedeki benedeki left a comment

Code looks fine, but my build (unit tests) fails, while develop-ver-3.0 passes.

@AdrianOlosutean AdrianOlosutean merged commit 55696a0 into develop-ver-3.0 Mar 9, 2022
@AdrianOlosutean AdrianOlosutean deleted the feature/2022-spark-commons branch March 9, 2022 20:05
Labels
PR:no testing needed Only for PR - PR doesn't need to be tested by a tester (person)
4 participants