8.0.0-beta.6 massive performance degradation/connections dropped #11897
Hey, thank you for reporting this! We did bump a bunch of underlying instrumentation, including `@opentelemetry/instrumentation-pg`. One thing you could try is to force it to use v0.38.0 of that package:

```json
{
  "resolutions": {
    "@sentry/node/@opentelemetry/instrumentation-pg": "^0.38.0"
  }
}
```

I think this should work, so you can check if that has an impact on performance 🤔 But it could of course also be another instrumentation, hard to tell. I guess you don't have a public reproduction for this that we could take a look at?
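If you are using npm rather than Yarn, a rough equivalent (an untested sketch, assuming npm 8.3+ support for the `overrides` field) would be:

```json
{
  "overrides": {
    "@sentry/node": {
      "@opentelemetry/instrumentation-pg": "^0.38.0"
    }
  }
}
```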
Same issue with the suggested resolution in place. What's weird is that the CPU usage remains about the same. I would suspect that some instrumentation logic is introducing big overhead, but that does not appear to be the case. I have not had much luck replicating this in isolation, i.e. we are only seeing this when deployed to production, which would indicate that the issue surfaces under high concurrency. Will continue to troubleshoot and report back if we can locate the root cause.
Thanks for the added info! Could you share your `Sentry.init` code?
```ts
import { RequestError } from '@contra/http-client';
import { init, rewriteFramesIntegration } from '@sentry/node';
import { nodeProfilingIntegration } from '@sentry/profiling-node';
import { type Integration } from '@sentry/types';
export { findActiveSpan } from './findActiveSpan.js';
export { getTraceAndBaggage } from './getTraceAndBaggage.js';
export { startSpan } from '@sentry/core';
export {
type Breadcrumb,
captureException,
captureMessage,
close,
setupFastifyErrorHandler,
setUser,
withScope,
} from '@sentry/node';
export { type Extras } from '@sentry/types';
const {
NODE_ENV,
NODE_PROFILING,
// eslint-disable-next-line n/no-process-env
} = process.env;
type ErrorDetail = {
message: string;
path: string[];
};
// Type Guard for ErrorDetail
const isErrorDetail = (error: unknown): error is ErrorDetail => {
return (
typeof error === 'object' &&
error !== null &&
'message' in error &&
'path' in error &&
typeof error.message === 'string' &&
Array.isArray(error.path)
);
};
// Extracts error message from JSON error string
const extractErrorMessage = (errorString: string): string | null => {
try {
const outerArray = JSON.parse(errorString);
const parsed = outerArray[0];
if (Array.isArray(parsed) && parsed.every(isErrorDetail)) {
return parsed.map((error) => error.message).join(', ');
} else {
return parsed.message;
}
} catch {
return null;
}
};
// Extracts path from JSON error string and removes index of edges
const extractErrorPath = (errorString: string): string | null => {
try {
const outerArray = JSON.parse(errorString);
const parsed = outerArray[0];
if (Array.isArray(parsed) && parsed.every(isErrorDetail)) {
return parsed
.map((error) => error.path.join('.').replaceAll(/\d+/gu, ''))
.filter(Boolean)
.join(', ');
} else {
return parsed.path.join('.').replaceAll(/\d+/gu, '');
}
} catch {
return null;
}
};
// Extracts error details
const extractErrorDetails = (error: unknown): Record<string, string | null> => {
let errorMessage: string | null = null;
let errorPath: string | null = null;
errorMessage = extractErrorMessage(typeof error === 'string' ? error : '');
errorPath = extractErrorPath(typeof error === 'string' ? error : '');
return { errorMessage, errorPath };
};
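// Rewrites each fingerprint entry as a '{message, path}' string when it
// parses as a JSON error payload; otherwise the entry is kept unchanged.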
const formatFingerprint = (fingerprint: string[]): string[] => {
return fingerprint.map((item) => {
const { errorMessage, errorPath } = extractErrorDetails(item);
if (errorMessage && errorPath) {
return `{message: "${errorMessage}", path: "${errorPath}"}`;
}
return item;
});
};
export type SentryTracing = {
shutdown: () => Promise<void>;
};
export const createSentry = ({
integrations,
root,
}: {
integrations?: (integrations: Integration[]) => Integration[];
root: string;
}) => {
init({
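    // Derive a custom fingerprint from the original exception so that
    // similar errors are grouped together in Sentry.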
beforeSend(event, hint) {
let fingerprint =
typeof hint.originalException === 'string'
? [hint.originalException]
: hint.originalException instanceof Error
? [hint.originalException?.message]
: undefined;
if (!event.contexts) {
event.contexts = {};
}
if (hint.originalException instanceof RequestError) {
event.contexts.httpRequest = {
method: hint.originalException.request?.options.method,
// eslint-disable-next-line @typescript-eslint/no-explicit-any
source: (hint.originalException as any)?.sourceTrace,
url: hint.originalException.request?.requestUrl,
};
event.contexts.httpResponse = {
body: hint.originalException?.response?.body ?? null,
statusCode: hint.originalException?.response?.statusCode,
};
fingerprint = [
hint.originalException.message,
hint.originalException.request?.options.method ?? '',
// @ts-expect-error TODO why origin does not exist?
String(hint.originalException.request?.options.url?.origin),
];
}
if (fingerprint) {
if (event.contexts.graphql) {
fingerprint.push(event.contexts.graphql.path as string);
} else if (event.contexts.temporal) {
fingerprint.push(event.contexts.temporal.workflowType as string);
}
}
if (fingerprint) {
event.fingerprint = formatFingerprint(fingerprint);
} else {
event.fingerprint = fingerprint;
}
return event;
},
integrations: (defaultIntegrations) => {
const customIntegrations = [
...defaultIntegrations,
rewriteFramesIntegration({
root,
}),
];
if (NODE_PROFILING === 'true') {
customIntegrations.push(nodeProfilingIntegration());
}
if (integrations) {
return integrations(customIntegrations);
}
return customIntegrations;
},
normalizeDepth: 5,
spotlight: NODE_ENV === 'development',
tracesSampleRate: 1,
});
return {
shutdown: async () => {
// This was made redundant by Sentry abstracting away opentelemetry
},
};
};
```
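For reference, a minimal sketch of how this factory might be wired up at startup; the import path, `root` value, and signal handling are assumptions, not part of the shared code:

```ts
// Hypothetical entry point; assumes the module above is exposed as './sentry.js'.
import { createSentry } from './sentry.js';

// Initialize Sentry as early as possible so auto-instrumentation can hook modules.
const sentry = createSentry({ root: process.cwd() });

// Flush pending events on graceful shutdown.
process.on('SIGTERM', async () => {
  await sentry.shutdown();
  process.exit(0);
});
```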
v8, same issue. @mydea I managed to get a trace while we were testing v8. Looks like there are massive gaps between spans.
The resource consumption profile does not change dramatically. This is a snapshot of pre-deployment:
This is after:
Memory usage is definitely higher on average, but nothing that would make me wince.
Just like the last time, we see a big uptick in timeouts while trying to establish a pg connection; the logs are flooded with connection-timeout errors.
@mydea What would be the correct way to disable all auto-instrumentation and re-add it one integration at a time? I know how to do this with OpenTelemetry, but not when it is hidden the way it is with Sentry. What I have tried so far doesn't work.
You can set `defaultIntegrations: false` to disable all default integrations. Or, alternatively, you can also just filter out single integrations like this:

```js
Sentry.init({
  integrations: function (integrations) {
    return integrations.filter((integration) => integration.name !== 'Postgres');
  },
});
```

(similar to what you are already doing in your init, just filtering out the ones you want to exclude). Would be interesting to know if this is only related to the pg instrumentation 🤔 I don't see anything in the trace you sent that appears wrong, at a glance 🤔
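For the bisecting approach asked about above, a rough sketch (assuming the v8 functional integration exports such as `httpIntegration` and `postgresIntegration`; verify the names against the installed SDK version):

```ts
import * as Sentry from '@sentry/node';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  // Disable every default integration, including all auto-instrumentation.
  defaultIntegrations: false,
  // Re-add integrations one at a time until the regression reappears.
  integrations: [
    Sentry.httpIntegration(),
    // Sentry.postgresIntegration(),
    // Sentry.graphqlIntegration(),
  ],
});
```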
Is there an existing issue for this?
How do you use Sentry?
Sentry Saas (sentry.io)
Which SDK are you using?
@sentry/node
SDK Version
8.0.0-beta.6
Framework Version
No response
Link to Sentry event
No response
SDK Setup
N/A
Steps to Reproduce
We upgraded from `8.0.0-alpha.9` to `8.0.0-beta.6`.
Expected Result
Everything to work.
Actual Result
Experienced major performance degradation.
There are no errors per se. However, service performance has ground to a halt.
The most noticeable symptom was occasional errors coming from the `pg` module complaining about timeouts.
Worth noting that we are also heavily relying on Zod. I noticed that it was mentioned in the upgrade changelog.