Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scheduler client that pulls from bucket cache #5605

Merged
merged 4 commits into from
Sep 11, 2021

Conversation

cgardens
Copy link
Contributor

What

  • Part 1 of 2 of this issue: Airbyte Cloud: "Cache" connector containers #5266
  • In response to the get spec call being too slow in Cloud, this PR adds a new scheduler client that will pull the spec from a GCS bucket instead of us needing to run the get spec command on a given connector. Retrieving this spec should be much faster than having to run those connectors.
  • In a separate PR, I need to add the logic where we write to that spec (as part of the build for each connector).
  • See the spec in the issue for more information on how we are thinking about this.

@cgardens cgardens requested a review from davinchia August 25, 2021 01:48
@github-actions github-actions bot added the area/platform issues related to the platform label Aug 25, 2021
Copy link
Contributor

@davinchia davinchia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick work!

I'm assuming you tested the GCS storage calls separately.

implementation project(':airbyte-protocol:models')
implementation project(':airbyte-scheduler:models')
implementation project(':airbyte-scheduler:persistence')
// todo (cgardens) - remove this dep. just needs temporal client.
implementation project(':airbyte-workers')

implementation platform('com.google.cloud:libraries-bom:21.0.0')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems more suited for the top-level build.gradle. Can we skip this for now and specify the version in the dependency itself? We can do a sweep later on to consolidate all this at one go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed.

Comment on lines +101 to +114
if (cachedSpecOptional.isPresent()) {
final long now = Instant.now().toEpochMilli();
final SynchronousJobMetadata mockMetadata = new SynchronousJobMetadata(
UUID.randomUUID(),
ConfigType.GET_SPEC,
null,
now,
now,
true,
Path.of(""));
return new SynchronousResponse<>(cachedSpecOptional.get(), mockMetadata);
} else {
return client.createGetSpecJob(dockerImage);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (cachedSpecOptional.isPresent()) {
final long now = Instant.now().toEpochMilli();
final SynchronousJobMetadata mockMetadata = new SynchronousJobMetadata(
UUID.randomUUID(),
ConfigType.GET_SPEC,
null,
now,
now,
true,
Path.of(""));
return new SynchronousResponse<>(cachedSpecOptional.get(), mockMetadata);
} else {
return client.createGetSpecJob(dockerImage);
}
if (cachedSpecOptional.isPresent()) {
final long now = Instant.now().toEpochMilli();
final SynchronousJobMetadata mockMetadata = new SynchronousJobMetadata(
UUID.randomUUID(),
ConfigType.GET_SPEC,
null,
now,
now,
true,
Path.of(""));
return new SynchronousResponse<>(cachedSpecOptional.get(), mockMetadata);
}
return client.createGetSpecJob(dockerImage);

@cgardens cgardens force-pushed the cgardens/bucket_cache_final branch from d1f91dd to a04808c Compare September 11, 2021 19:16
@cgardens cgardens force-pushed the cgardens/bucket_cache branch from 84bd155 to 9d97662 Compare September 11, 2021 19:19
@cgardens cgardens changed the base branch from cgardens/bucket_cache_final to master September 11, 2021 19:19
@cgardens
Copy link
Contributor Author

I'm assuming you tested the GCS storage calls separately.

Do you mean the reads or the writes?

The writes I'm going to do in a separate PR now. The nice thing about this client is that even if the bucket doesn't exist it is safe to merge. It'll just fall back on actually running the spec.

The read is tested in the unit tests. It's a bit awkward because it is more akin to an integration test.

@cgardens cgardens merged commit 7970e63 into master Sep 11, 2021
@cgardens cgardens deleted the cgardens/bucket_cache branch September 11, 2021 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/platform issues related to the platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants