Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gradle: test task configuration changes #32108

Merged
merged 7 commits into from
Nov 3, 2023

Conversation

postamar
Copy link
Contributor

@postamar postamar commented Nov 2, 2023

This PR is a prereq to making source-postgres tests run concurrently in the same JVM to speed them up. What I want to do is define the testcontainer objects in static variables to amortize the cost of their creation, which in some cases is considerable.

Effectively, this PR removes the numberThreads and testExecutionConcurrency gradle properties and replaces it with testExecutionConcurrency:

  • If testExecutionConcurrency is unset, gradle parallelizes the tests as best it can by spawning multiple JVMs. This is the default and it is what we already do.
  • if testExecutionConcurrency is set to a positive integer, gradle only spawns one JVM and JUnit runs tests concurrently in as many threads as this property is set to.
  • if testExecutionConcurrency is set to something else, it's the same as above but with as many threads as there are CPU cores.

The idea here is in the future to set testExecutionConcurrency=-1 for source-postgres and rewrite the tests to be thread-safe (right now they aren't).

This PR also renames the log4j config file in airbyte-commons to log4j2-test, to distinguish it from the other log4j config file in airbyte-cdk:core. I've noticed that in tests these two sometimes get mixed up. This way, the former (which is easier on the eyes) will always be used in tests.

Other quality-of-life changes have been noted via comments.

@postamar postamar requested a review from a team as a code owner November 2, 2023 17:12
Copy link

vercel bot commented Nov 2, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 3, 2023 1:15pm

Copy link
Contributor

github-actions bot commented Nov 2, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan.
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • You've updated the connector's metadata.yaml file any other relevant changes, including a breakingChanges entry for major version bumps. See metadata.yaml docs
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • Migration guide updated in docs/integrations/<source or destination>/<name>-migrations.md with an entry for the new version, if the version is a breaking change. See migration guide example
  • If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

testLogging {
events "passed", "skipped", "failed"
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tasks were not doing anything. We don't have any of these junit tags defined anywhere. Presumably, this is a legacy from before the platform was spun out of this repo.

test {
testLogging {
// TODO: Remove this after debugging
showStandardStreams = true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made it so that we log everything all the time. If it's too verbose we can edit the log4j config instead.

@@ -1,4 +1,3 @@
# currently limit the number of parallel threads until further investigation into the issues \
# where Snowflake will fail to login using config credentials
numberThreads=4
parallelExecutionsPerThread=6
testExecutionConcurrency=4
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems OK in practice, despite it being a reduction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those tests will definitely be IO bound. I'd bump that up to 50...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's some limit (either in the CI runner networking, or in the snowflake warehouse) that we run into beyond a certain parallelism limit. The right number might be somewhere in between :P

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably sbhould look at https://docs.snowflake.com/en/sql-reference/parameters#max-concurrency-level and increase that number on the snowflake side, if we're too low there. Doesn't change the costs of running queries

}
integrationTestJava.configure {
mustRunAfter project.tasks.named('check')
dependsOn project.tasks.matching { it.name == 'assemble' }
}
project.tasks.named('build').configure {
dependsOn integrationTestJava
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this task, and its python equivalent, required by build, because why not.

@edgao
Copy link
Contributor

edgao commented Nov 2, 2023

won't comment on the actual changes, but

rewrite the tests to be thread-safe

🎉 this was one of the big wins on Destinations for our new TypingDeduping tests. If you want to copy any of that stuff, https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/java/airbyte-cdk/typing-deduping/src/testFixtures/java/io/airbyte/integrations/base/destination/typing_deduping/BaseTypingDedupingTest.java#L207-L232 is a decent place to start looking.

(also, https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/java/airbyte-cdk/typing-deduping/src/testFixtures/java/io/airbyte/integrations/base/destination/typing_deduping/BaseTypingDedupingTest.java#L93 feels like a very prescient comment now 😅 )

@postamar
Copy link
Contributor Author

postamar commented Nov 2, 2023

Thanks! This validates my hunch that this approach is going to be worth it.

@postamar
Copy link
Contributor Author

postamar commented Nov 3, 2023

I ran the connectors test separately for destination-snowflake, the only certified connector affected by this change, and they're ✅ except for the increment check which is fine to ignore:
https://github.com/airbytehq/airbyte/actions/runs/6746754485/job/18341383719

@postamar
Copy link
Contributor Author

postamar commented Nov 3, 2023

/approve-and-merge reason="skip CI because of superficial changes to many non-certified connectors"

@octavia-approvington
Copy link
Contributor

After a careful ML study,
we think this looks okay.
imagine code being okay

@postamar postamar enabled auto-merge (squash) November 3, 2023 15:49
@stephane-airbyte
Copy link
Contributor

/approve-and-merge reason="skip CI because of superficial changes to many non-certified connectors. Previous failed because of added branch rules"

@postamar
Copy link
Contributor Author

postamar commented Nov 3, 2023

/approve-and-merge reason="skip CI because of superficial changes to many non-certified connectors"

@octavia-approvington
Copy link
Contributor

This is really good
simply the best

@octavia-approvington octavia-approvington merged commit ef6dbd0 into master Nov 3, 2023
17 of 18 checks passed
@octavia-approvington octavia-approvington deleted the postamar/test-parallelism-and-logging branch November 3, 2023 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants