Skip to content

Commit

Permalink
Gateway: retry logic for requests to GCS (#3836)
Browse files Browse the repository at this point in the history
Implement gateway retry logic for requests to GCS. Failed requests will retry up to 5 times.

Additionally, this PR adjusts how polling is done in order to prevent the possibility of multiple in-flight updates. The next "tick" only begins after a full round of updating is completed rather than on a perfectly regular interval. Thanks to @abernix for suggesting this change. For more details please see the PR description.
  • Loading branch information
trevor-scheer authored Mar 23, 2020
1 parent 34c7072 commit cdee9d6
Show file tree
Hide file tree
Showing 3 changed files with 121 additions and 31 deletions.
1 change: 1 addition & 0 deletions packages/apollo-gateway/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
- Support providing a custom logger implementation (e.g. [`winston`](https://npm.im/winston), [`bunyan`](https://npm.im/bunyan), etc.) to capture gateway-sourced console output. This allows the use of existing, production logging facilities or the possibiltiy to use advanced structure in logging, such as console output which is encapsulated in JSON. The same PR that introduces this support also introduces a `logger` property to the `GraphQLRequestContext` that is exposed to `GraphQLDataSource`s and Apollo Server plugins, making it possible to attach additional properties (as supported by the logger implementation) to specific requests, if desired, by leveraging custom implementations in those components respectively. When not provided, these will still output to `console`. [PR #3894](https://github.com/apollographql/apollo-server/pull/3894)
- Drop use of `loglevel-debug`. This removes the very long date and time prefix in front of each log line and also the support for the `DEBUG=apollo-gateway:` environment variable. Both of these were uncommonly necessary or seldom used (with the environment variable also being undocumented). The existing behavior can be preserved by providing a `logger` that uses `loglevel-debug`, if desired, and more details can be found in the PR. [PR #3896](https://github.com/apollographql/apollo-server/pull/3896)
- Fix Typescript generic typing for datasource contexts [#3865](https://github.com/apollographql/apollo-server/pull/3865) This is a fix for the `TContext` typings of the gateway's exposed `GraphQLDataSource` implementations. In their current form, they don't work as intended, or in any manner that's useful for typing the `context` property throughout the class methods. This introduces a type argument `TContext` to the class itself (which defaults to `Record<string, any>` for existing implementations) and removes the non-operational type arguments on the class methods themselves.
- Implement retry logic for requests to GCS [PR #3836](https://github.com/apollographql/apollo-server/pull/3836) Note: coupled with this change is a small alteration in how the gateway polls GCS for updates in managed mode. Previously, the tick was on a specific interval. Now, every tick starts after the round of fetches to GCS completes. For more details, see the linked PR.

## 0.13.2

Expand Down
Original file line number Diff line number Diff line change
@@ -1,17 +1,20 @@
import nock from 'nock';
import { ApolloGateway } from '../..';
import { fetch } from 'apollo-server-env';
import { ApolloGateway, GCS_RETRY_COUNT, getDefaultGcsFetcher } from '../..';
import {
mockLocalhostSDLQuery,
mockStorageSecretSuccess,
mockStorageSecret,
mockCompositionConfigLinkSuccess,
mockCompositionConfigLink,
mockCompositionConfigsSuccess,
mockCompositionConfigs,
mockImplementingServicesSuccess,
mockImplementingServices,
mockRawPartialSchemaSuccess,
mockRawPartialSchema,
apiKeyHash,
graphId,
mockImplementingServices,
mockRawPartialSchema,
mockCompositionConfigLink,
} from './nockMocks';

import loadServicesFromStorage = require("../../loadServicesFromStorage");
Expand All @@ -36,7 +39,7 @@ const service = {
name: String
username: String
}
`
`,
};

const updatedService = {
Expand All @@ -55,11 +58,21 @@ const updatedService = {
name: String
username: String
}
`
}
`,
};

let fetcher: typeof fetch;

beforeEach(() => {
if (!nock.isActive()) nock.activate();

fetcher = getDefaultGcsFetcher().defaults({
retry: {
retries: GCS_RETRY_COUNT,
minTimeout: 0,
maxTimeout: 0,
},
});
});

afterEach(() => {
Expand Down Expand Up @@ -195,3 +208,58 @@ it('Rollsback to a previous schema when triggered', async () => {
await secondSchemaChangeBlocker;
expect(onChange.mock.calls.length).toBe(2);
});

function failNTimes(n: number, fn: () => nock.Interceptor) {
for (let i = 0; i < n; i++) {
fn().reply(500);
}
}

it(`Retries GCS (up to ${GCS_RETRY_COUNT} times) on failure for each request and succeeds`, async () => {
failNTimes(GCS_RETRY_COUNT, mockStorageSecret);
mockStorageSecretSuccess();

failNTimes(GCS_RETRY_COUNT, mockCompositionConfigLink);
mockCompositionConfigLinkSuccess();

failNTimes(GCS_RETRY_COUNT, mockCompositionConfigs);
mockCompositionConfigsSuccess([service.implementingServicePath]);

failNTimes(GCS_RETRY_COUNT, () => mockImplementingServices(service));
mockImplementingServicesSuccess(service);

failNTimes(GCS_RETRY_COUNT, () => mockRawPartialSchema(service));
mockRawPartialSchemaSuccess(service);

const gateway = new ApolloGateway({ fetcher });

await gateway.load({ engine: { apiKeyHash, graphId } });
expect(gateway.schema!.getType('User')!.description).toBe('This is my User');
});

it(`Fails after the ${GCS_RETRY_COUNT + 1}th attempt to reach GCS`, async () => {
failNTimes(GCS_RETRY_COUNT + 1, mockStorageSecret);

const gateway = new ApolloGateway({ fetcher });
await expect(
gateway.load({ engine: { apiKeyHash, graphId } }),
).rejects.toThrowErrorMatchingInlineSnapshot(
`"Could not communicate with Apollo Graph Manager storage: "`,
);
});

it(`Errors when the secret isn't hosted on GCS`, async () => {
mockStorageSecret().reply(
403,
`<Error><Code>AccessDenied</Code>
Anonymous caller does not have storage.objects.get`,
{ 'content-type': 'application/xml' },
);

const gateway = new ApolloGateway({ fetcher });
await expect(
gateway.load({ engine: { apiKeyHash, graphId } }),
).rejects.toThrowErrorMatchingInlineSnapshot(
`"Unable to authenticate with Apollo Graph Manager storage while fetching https://storage.googleapis.com/engine-partial-schema-prod/federated-service/storage-secret/dd55a79d467976346d229a7b12b673ce.json. Ensure that the API key is configured properly and that a federated service has been pushed. For details, see https://go.apollo.dev/g/resolve-access-denied."`,
);
});
69 changes: 45 additions & 24 deletions packages/apollo-gateway/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -162,6 +162,28 @@ type WarnedStates = {
remoteWithLocalConfig?: boolean;
};

export const GCS_RETRY_COUNT = 5;

export function getDefaultGcsFetcher() {
return fetcher.defaults({
cacheManager: new HttpRequestCache(),
// All headers should be lower-cased here, as `make-fetch-happen`
// treats differently cased headers as unique (unlike the `Headers` object).
// @see: https://git.io/JvRUa
headers: {
'user-agent': `apollo-gateway/${require('../package.json').version}`,
},
retry: {
retries: GCS_RETRY_COUNT,
// The default factor: expected attempts at 0, 1, 3, 7, 15, and 31 seconds elapsed
factor: 2,
// 1 second
minTimeout: 1000,
randomize: true,
},
});
}

export class ApolloGateway implements GraphQLService {
public schema?: GraphQLSchema;
protected serviceMap: DataSourceCache = Object.create(null);
Expand All @@ -176,15 +198,7 @@ export class ApolloGateway implements GraphQLService {
private serviceSdlCache = new Map<string, string>();
private warnedStates: WarnedStates = Object.create(null);

private fetcher: typeof fetch = fetcher.defaults({
cacheManager: new HttpRequestCache(),
// All headers should be lower-cased here, as `make-fetch-happen`
// treats differently cased headers as unique (unlike the `Headers` object).
// @see: https://git.io/JvRUa
headers: {
'user-agent': `apollo-gateway/${require('../package.json').version}`
}
});
private fetcher: typeof fetch = getDefaultGcsFetcher();

// Observe query plan, service info, and operation info prior to execution.
// The information made available here will give insight into the resulting
Expand Down Expand Up @@ -417,7 +431,7 @@ export class ApolloGateway implements GraphQLService {
}

this.onSchemaChangeListeners.add(callback);
if (!this.pollingTimer) this.startPollingServices();
if (!this.pollingTimer) this.pollServices();

return () => {
this.onSchemaChangeListeners.delete(callback);
Expand All @@ -428,21 +442,28 @@ export class ApolloGateway implements GraphQLService {
};
}

private startPollingServices() {
if (this.pollingTimer) clearInterval(this.pollingTimer);
private async pollServices() {
if (this.pollingTimer) clearTimeout(this.pollingTimer);

this.pollingTimer = setInterval(async () => {
try {
await this.updateComposition();
} catch (err) {
this.logger.error(err && err.message || err);
}
}, this.experimental_pollInterval || 10000);
try {
await this.updateComposition();
} catch (err) {
this.logger.error(err && err.message || err);
}

// Sleep for the specified pollInterval before kicking off another round of polling
await new Promise(res => {
this.pollingTimer = setTimeout(
res,
this.experimental_pollInterval || 10000,
);
// Prevent the Node.js event loop from remaining active (and preventing,
// e.g. process shutdown) by calling `unref` on the `Timeout`. For more
// information, see https://nodejs.org/api/timers.html#timers_timeout_unref.
this.pollingTimer?.unref();
});

// Prevent the Node.js event loop from remaining active (and preventing,
// e.g. process shutdown) by calling `unref` on the `Timeout`. For more
// information, see https://nodejs.org/api/timers.html#timers_timeout_unref.
this.pollingTimer.unref();
this.pollServices();
}

private createAndCacheDataSource(
Expand Down Expand Up @@ -508,7 +529,7 @@ export class ApolloGateway implements GraphQLService {
this.logger.warn(
"A local gateway service list is overriding an Apollo Graph " +
"Manager managed configuration. To use the managed " +
"configuration, do not specifiy a service list locally.",
"configuration, do not specify a service list locally.",
);
}).catch(() => {}); // Don't mind errors if managed config is missing.
}
Expand Down

0 comments on commit cdee9d6

Please sign in to comment.