-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable caching of rest tests which use integ-test distribution #43782
Conversation
Pinging @elastic/es-core-infra |
I like the idea of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without going into any of the details of the PR, I'm wondering if we could avoid the repetition and have a higher level construct that we could adopt to across the board for anything we do not want caching on. Something like:
ignoredForCaching {
setting '...', {}
}
We could have the same for task inputs like system properties as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good Mark! I'm really looking forward to see this live in CI.
Just a few comments.
|
||
import java.util.List; | ||
|
||
public abstract class AbstractLazyPropertyCollection { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be wrong, but it looks like we could use generics instead of Objects
here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean for getNormalizedCollection()
? If so we cannot because implementations might map the value type to any other type for the purposes of input snapshotting. Since this is only used by Gradle for input snapshotting, we only really care about the type at runtime.
|
||
@Override | ||
public Iterator<T> iterator() { | ||
return delegate.stream().peek(this::validate).map(PropertyListEntry::getValue).collect(Collectors.toList()).iterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to collect, streams have an .iterator()
method.
@CacheableTask | ||
public class RestTestRunnerTask extends Test { | ||
|
||
private Collection<ElasticsearchCluster> clusters = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TestClustersPlugin tracks this, we shouldn't add another place that keeps track.
We should come up with an API in the plugin that would allow the task to ask for all the clusters it uses so we keep a single record of truth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That get's a little awkward as then the task would have to keep a reference back to the extension or similar. I think that's bad practice. It's the job of the plugin to do this wiring. The task should not have any knowledge of the test clusters plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The task already cares about the clusters so does know a lot about the plugin.
We talked about being able to use clusters with specific types of tasks only in the past with @rjernst and agreed it's a good idea, it just never got done, so another way to think about it is to consider the task part of testClusters so that it's the only one that can use them. useCluster
could become a method on this task instead of an extension, possibly saving us a bit at config time. All the house-keeping (claims, running cluster etc) could live in an extension so the task can reference it via the Project
. That would make the plugin code more readable too. We have a handful of
Test
tasks using testclusters but those could and should be RestTestRunnerTasks
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still not clear exactly what the proposal here is. If the suggestion is to move logic that exists currently in TestClustersPlugin
into this task then frankly that's a flat 👎 for me. Tasks should be dumb. They should not contain complex logic for managing state or inputs aside from their intended purpose. That code should and is in the plugin. If that means exposing more properties on the task for the plugin to talk to it then so be it. At some point this becomes a philosophical argument but I'm going to remain stubborn here. RestIntegTestTask
is the extreme example of what I'm talking about. A bunch of complex build logic that got sucked up into a task instead of a plugin and it's a nightmare.
I simply don't see the issue with this method. This task implementation is dead simple. All it does is expose a new bean property, and that's its only intended purpose. The task knows nothign about TestClustersPlugin
. All it's aware of is that some collection of ElasticsearhCluster
s are an input to the task. It doesn't care where they come from, how long they live for, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going back and reading your comment I think I might better understand. If the idea is "if you want to use a test cluster you have to configure a RestTestRunnerTask
" I think that's moving in the right direction. In the end what we want is something to replace RestIntegTestTask
, that is, some central configurable DSL element that setups up all this stuff for us (test task, dependencies, test cluster, etc). What thing thing might be I think is still up for a discussion, but it needs to be higher level than a task as eventually I think we'd like this thing to potentially create source sets, etc. I end goal here is that for REST tests a user should simply need to dump YAML files in a folder, tell us what cluster configuration they need and be on their way.
In terms of this PR, RestTestRunnerTask
is meant to solve a very specific problem which is to track an additional input. There's no implied intention for this task to solve the problems I state above, those need to be addressed elsewhere. I think it's worth discussing this more in our weekly tomorrow though as I do think we share the same high-level goals here.
} | ||
|
||
@Nested | ||
@Optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we really need this as it will never be null, or does it treat an empty collection like a null because of @Nested
? The doc only reads that it allows for the value to be "not specified" .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I initially put this in place during testing. It can probably be removed.
String firstPartOfSeed = project.rootProject.testSeed.tokenize(':').get(0) | ||
setting 'thread_pool.repository_azure.max', (Math.abs(Long.parseUnsignedLong(firstPartOfSeed, 16) % 10) + 1).toString() | ||
setting 'thread_pool.repository_azure.max', (Math.abs(Long.parseUnsignedLong(firstPartOfSeed, 16) % 10) + 1).toString(), System.getProperty('ignore.tests.seed') == null ? DEFAULT : IGNORE_VALUE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a new property here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow. There's not new property. We are just conditionally ignoring this when we've instructed the build to ignore the test seed, since this property value is generated from the test seed itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, didn't look careful enough and taught we started using a new system property.
|
||
@InputFiles | ||
@PathSensitive(PathSensitivity.RELATIVE) | ||
private FileCollection getDistributionFiles() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will run before we start clusters, so won't have any of the files we create as part of startup ( like the config file ). That's why we need to track the extra config files.
This all works as intended, just wanted to leave it here for other reviewers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this actually opens up another issue, which is the installed plugins and modules won't be there either. I thought I tested this, but I need to verify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I totally missed the fact that these get installed after the node starts. So my test build scan above "works" because the plugin under test is also on the test classpath but this won't work in all cases as we may install plugins/modules that aren't on the test runtime classpath. I'm lookin into a solution for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a solution here. Essentially, we snapshot the bundle archives for each plugin/module and snapshot it similarly to how we do for the distribution itself by separating JARs from other contents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally missed the fact that these get installed after the node starts
Can you explain this more? We should not be touching plugins or modules after the node starts, and it would not work (ES only loads plugins at startup).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was poorly worded. More accurately, these get installed inside the ElasticsearchNode.start()
method, but before we actually launch the ES process. So no, we don't actually muck with the installation after it's started. The key bit here is that this happens after the testing task begins execution, so input snapshotting has already happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Pr is getting large, but we should probably improve this right after this is merged by having a configuration that inherits from testRuntime track all the plugins and modules that we need to install in the test cluster. That would make this tracking more straight forward as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make this tracking more straight forward as well
I'm not convinced it would. A Configuration
is nothing more than a FileCollection
in this case as these are all unmanaged file dependencies. Also they are .zip
files so we still have to deal with unpacking them and dealing with their contents intelligently. A Configuraiton
would also not support any notion of a remote plugin configured via a URI
so we'd have to manage that as well.
83747b7
to
89180d2
Compare
@mark-vieira you might have missed this one |
3742cfd
to
def871f
Compare
@elasticmachine run elasticsearch-ci/packaging-sample |
I'm pulling the trigger on this so we can get some data in Gradle Enterprise. 🚢 |
This pull request is the initial attempt and making our REST integration tests cacheable. We limit caching at this point only to those tests that leverage our
integ-test
distribution, primarily because cacheability of of tests running against our default distribution would likely be very poor since almost any change to the codebase would manifest in some way in the default distribution.While not a tremendous amount of change we've done a couple of fundamental things here:
We've introduced a
RestTestRunnerTask
which simply extends Gradle'sTest
task and defines a property such that we can track the configured test clusters as an input. We have to do this through subclassing because the Gradle runtime input API (i.e.task.inputs.*
) doesn't have an analog to the@Nested
annotation. Our test cluster configurations are complex and need to allow for nested properties.We've also decorated
ElasticsearchNode
with all the requisite input annotations such that all configuration options exposed byTestClusterConfiguration
are tracked as inputs. We don't need to track everything about the setup of these clusters as any change to the test clusters plugin implementation code would also cause a change in cache key due to them sharing the classpath withRestTestRunnerTask
. We mainly need to track two things a) all the options that you can tweak about a cluster in the build script (i.e.TestClusterConfiguration
) and b) the distribution we are testing against itself.All the options are tracked by adding various getters to
ElasticsearchNode
that return flattened and normalized version of those properties. To simplify a lot of the complexity that existed there around dealing with lazy property evaluation and null-checking, we've introducedLazyPropertyList
andLazyPropertyMap
to do the legwork here. These behave like a regularList
orMap
but internally delegate to a data structure that support lazy evaluation (by means of aSupplier
), and configurable normalization strategies, so that some properties can be ignored, or otherwise appropriately normalized for the purposes of input snapshotting (e.x. aMap<String, String>
should use@Input
for snapshotting entry values but aMap<String, File>
should use@InputFile
).For snapshotting of the distribution itself we've split it into two parts. All the JARs in the distribution which get treated as a runtime classpath, and everything else. By separating the JARs we can use a runtime classpath normalization strategy on them which ignores things like changes in JAR entry order, timestamps and any other classpath resources we've configured to be ignored such as manifests. Without this we'd never get a cache hit because two JARs built in two different builds would never by byte-for-byte identical. For all the other contents of the distribution we just treat as a normal collection of files. For any in installed plugins or modules we take a similar approach, spitting the plugin bundle JAR files from the rest of the content and snapshotting each with the appropriate normalization strategy.
The net result here is that any project whose test cluster is configured to use the integ-test distribution should have it's integration tests cached unless either the subproject itself, or something core upstream (e.x.
:server
or:distribution
) has changed. As an example, here's a build scan showing a build running thecheck
task for all:plugins
projects when a single-line change was made to:plugins:ingest-attachment
.