Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gateway: retry logic for requests to GCS #3836

Merged
merged 6 commits into from
Mar 23, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions packages/apollo-gateway/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
- Support providing a custom logger implementation (e.g. [`winston`](https://npm.im/winston), [`bunyan`](https://npm.im/bunyan), etc.) to capture gateway-sourced console output. This allows the use of existing, production logging facilities or the possibiltiy to use advanced structure in logging, such as console output which is encapsulated in JSON. The same PR that introduces this support also introduces a `logger` property to the `GraphQLRequestContext` that is exposed to `GraphQLDataSource`s and Apollo Server plugins, making it possible to attach additional properties (as supported by the logger implementation) to specific requests, if desired, by leveraging custom implementations in those components respectively. When not provided, these will still output to `console`. [PR #3894](https://github.com/apollographql/apollo-server/pull/3894)
- Drop use of `loglevel-debug`. This removes the very long date and time prefix in front of each log line and also the support for the `DEBUG=apollo-gateway:` environment variable. Both of these were uncommonly necessary or seldom used (with the environment variable also being undocumented). The existing behavior can be preserved by providing a `logger` that uses `loglevel-debug`, if desired, and more details can be found in the PR. [PR #3896](https://github.com/apollographql/apollo-server/pull/3896)
- Fix Typescript generic typing for datasource contexts [#3865](https://github.com/apollographql/apollo-server/pull/3865) This is a fix for the `TContext` typings of the gateway's exposed `GraphQLDataSource` implementations. In their current form, they don't work as intended, or in any manner that's useful for typing the `context` property throughout the class methods. This introduces a type argument `TContext` to the class itself (which defaults to `Record<string, any>` for existing implementations) and removes the non-operational type arguments on the class methods themselves.
- Implement retry logic for requests to GCS [PR #3836](https://github.com/apollographql/apollo-server/pull/3836) Note: coupled with this change is a small alteration in how the gateway polls GCS for updates in managed mode. Previously, the tick was on a specific interval. Now, every tick starts after the round of fetches to GCS completes. For more details, see the linked PR.

## 0.13.2

Expand Down
Original file line number Diff line number Diff line change
@@ -1,17 +1,20 @@
import nock from 'nock';
import { ApolloGateway } from '../..';
import { fetch } from 'apollo-server-env';
import { ApolloGateway, GCS_RETRY_COUNT, getDefaultGcsFetcher } from '../..';
import {
mockLocalhostSDLQuery,
mockStorageSecretSuccess,
mockStorageSecret,
mockCompositionConfigLinkSuccess,
mockCompositionConfigLink,
mockCompositionConfigsSuccess,
mockCompositionConfigs,
mockImplementingServicesSuccess,
mockImplementingServices,
mockRawPartialSchemaSuccess,
mockRawPartialSchema,
apiKeyHash,
graphId,
mockImplementingServices,
mockRawPartialSchema,
mockCompositionConfigLink,
} from './nockMocks';

import loadServicesFromStorage = require("../../loadServicesFromStorage");
Expand All @@ -36,7 +39,7 @@ const service = {
name: String
username: String
}
`
`,
};

const updatedService = {
Expand All @@ -55,11 +58,21 @@ const updatedService = {
name: String
username: String
}
`
}
`,
};

let fetcher: typeof fetch;

beforeEach(() => {
if (!nock.isActive()) nock.activate();

fetcher = getDefaultGcsFetcher().defaults({
retry: {
retries: GCS_RETRY_COUNT,
minTimeout: 0,
maxTimeout: 0,
},
});
});

afterEach(() => {
Expand Down Expand Up @@ -195,3 +208,58 @@ it('Rollsback to a previous schema when triggered', async () => {
await secondSchemaChangeBlocker;
expect(onChange.mock.calls.length).toBe(2);
});

function failNTimes(n: number, fn: () => nock.Interceptor) {
for (let i = 0; i < n; i++) {
fn().reply(500);
}
}

it(`Retries GCS (up to ${GCS_RETRY_COUNT} times) on failure for each request and succeeds`, async () => {
failNTimes(GCS_RETRY_COUNT, mockStorageSecret);
mockStorageSecretSuccess();

failNTimes(GCS_RETRY_COUNT, mockCompositionConfigLink);
mockCompositionConfigLinkSuccess();

failNTimes(GCS_RETRY_COUNT, mockCompositionConfigs);
mockCompositionConfigsSuccess([service.implementingServicePath]);

failNTimes(GCS_RETRY_COUNT, () => mockImplementingServices(service));
mockImplementingServicesSuccess(service);

failNTimes(GCS_RETRY_COUNT, () => mockRawPartialSchema(service));
mockRawPartialSchemaSuccess(service);

const gateway = new ApolloGateway({ fetcher });

await gateway.load({ engine: { apiKeyHash, graphId } });
expect(gateway.schema!.getType('User')!.description).toBe('This is my User');
});

it(`Fails after the ${GCS_RETRY_COUNT + 1}th attempt to reach GCS`, async () => {
failNTimes(GCS_RETRY_COUNT + 1, mockStorageSecret);

const gateway = new ApolloGateway({ fetcher });
await expect(
gateway.load({ engine: { apiKeyHash, graphId } }),
).rejects.toThrowErrorMatchingInlineSnapshot(
`"Could not communicate with Apollo Graph Manager storage: "`,
);
});

it(`Errors when the secret isn't hosted on GCS`, async () => {
mockStorageSecret().reply(
403,
`<Error><Code>AccessDenied</Code>
Anonymous caller does not have storage.objects.get`,
{ 'content-type': 'application/xml' },
);

const gateway = new ApolloGateway({ fetcher });
await expect(
gateway.load({ engine: { apiKeyHash, graphId } }),
).rejects.toThrowErrorMatchingInlineSnapshot(
`"Unable to authenticate with Apollo Graph Manager storage while fetching https://storage.googleapis.com/engine-partial-schema-prod/federated-service/storage-secret/dd55a79d467976346d229a7b12b673ce.json. Ensure that the API key is configured properly and that a federated service has been pushed. For details, see https://go.apollo.dev/g/resolve-access-denied."`,
);
});
69 changes: 45 additions & 24 deletions packages/apollo-gateway/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -162,6 +162,28 @@ type WarnedStates = {
remoteWithLocalConfig?: boolean;
};

export const GCS_RETRY_COUNT = 5;

export function getDefaultGcsFetcher() {
return fetcher.defaults({
cacheManager: new HttpRequestCache(),
// All headers should be lower-cased here, as `make-fetch-happen`
// treats differently cased headers as unique (unlike the `Headers` object).
// @see: https://git.io/JvRUa
headers: {
'user-agent': `apollo-gateway/${require('../package.json').version}`,
},
retry: {
retries: GCS_RETRY_COUNT,
// The default factor: expected attempts at 0, 1, 3, 7, 15, and 31 seconds elapsed
factor: 2,
// 1 second
minTimeout: 1000,
randomize: true,
},
});
}

export class ApolloGateway implements GraphQLService {
public schema?: GraphQLSchema;
protected serviceMap: DataSourceCache = Object.create(null);
Expand All @@ -176,15 +198,7 @@ export class ApolloGateway implements GraphQLService {
private serviceSdlCache = new Map<string, string>();
private warnedStates: WarnedStates = Object.create(null);

private fetcher: typeof fetch = fetcher.defaults({
cacheManager: new HttpRequestCache(),
// All headers should be lower-cased here, as `make-fetch-happen`
// treats differently cased headers as unique (unlike the `Headers` object).
// @see: https://git.io/JvRUa
headers: {
'user-agent': `apollo-gateway/${require('../package.json').version}`
}
});
private fetcher: typeof fetch = getDefaultGcsFetcher();

// Observe query plan, service info, and operation info prior to execution.
// The information made available here will give insight into the resulting
Expand Down Expand Up @@ -417,7 +431,7 @@ export class ApolloGateway implements GraphQLService {
}

this.onSchemaChangeListeners.add(callback);
if (!this.pollingTimer) this.startPollingServices();
if (!this.pollingTimer) this.pollServices();

return () => {
this.onSchemaChangeListeners.delete(callback);
Expand All @@ -428,21 +442,28 @@ export class ApolloGateway implements GraphQLService {
};
}

private startPollingServices() {
if (this.pollingTimer) clearInterval(this.pollingTimer);
private async pollServices() {
if (this.pollingTimer) clearTimeout(this.pollingTimer);

this.pollingTimer = setInterval(async () => {
try {
await this.updateComposition();
} catch (err) {
this.logger.error(err && err.message || err);
}
}, this.experimental_pollInterval || 10000);
try {
await this.updateComposition();
} catch (err) {
this.logger.error(err && err.message || err);
}

// Sleep for the specified pollInterval before kicking off another round of polling
await new Promise(res => {
this.pollingTimer = setTimeout(
res,
this.experimental_pollInterval || 10000,
);
// Prevent the Node.js event loop from remaining active (and preventing,
// e.g. process shutdown) by calling `unref` on the `Timeout`. For more
// information, see https://nodejs.org/api/timers.html#timers_timeout_unref.
this.pollingTimer?.unref();
Comment on lines +460 to +463
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Substantially less critical to have this now that it's not a (forever) interval, but probably worth keeping.

});

// Prevent the Node.js event loop from remaining active (and preventing,
// e.g. process shutdown) by calling `unref` on the `Timeout`. For more
// information, see https://nodejs.org/api/timers.html#timers_timeout_unref.
this.pollingTimer.unref();
this.pollServices();
}

private createAndCacheDataSource(
Expand Down Expand Up @@ -508,7 +529,7 @@ export class ApolloGateway implements GraphQLService {
this.logger.warn(
"A local gateway service list is overriding an Apollo Graph " +
"Manager managed configuration. To use the managed " +
"configuration, do not specifiy a service list locally.",
"configuration, do not specify a service list locally.",
trevor-scheer marked this conversation as resolved.
Show resolved Hide resolved
);
}).catch(() => {}); // Don't mind errors if managed config is missing.
}
Expand Down