
In the browser, if a page has many requests and the Chrome tab is switched back and forth frequently, sendBeacon requests get stuck in pending status and the console reports a promise error #3489

Open
yuanman0109 opened this issue Dec 15, 2022 · 12 comments
Labels
bug Something isn't working priority:p1 Bugs which cause problems in end-user applications such as crashes, data inconsistencies, etc

Comments

@yuanman0109

yuanman0109 commented Dec 15, 2022

What happened?

Steps to Reproduce

Visit a page that issues more than 200 requests, switch to other tabs while data is being sent to the server, then switch back quickly

Expected Result

The data is sent successfully and there are no promise errors in the console.

Actual Result

Many sendBeacon requests remain in pending status and the console reports unhandled promise errors.

Additional Details

[screenshots: network panel showing pending sendBeacon requests and console promise errors]

OpenTelemetry Setup Code

import { WebTracerProvider } from '@opentelemetry/sdk-trace-web'
import { Resource } from '@opentelemetry/resources'
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions'
import { ZoneContextManager } from '@opentelemetry/context-zone'
import { registerInstrumentations } from '@opentelemetry/instrumentation'
import { FetchInstrumentation } from '@opentelemetry/instrumentation-fetch'
import { XMLHttpRequestInstrumentation } from '@opentelemetry/instrumentation-xml-http-request'
import { BatchSpanProcessor }  from '@opentelemetry/sdk-trace-base';
import { CustomIdGenerator } from '../utils/generator'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
// ...
provider.register({
  contextManager: new ZoneContextManager(),
})
const collectorOptions = {
  url: 'xxx/api/traces', // url is optional; the exporter falls back to its default if omitted
};
const exporter = new OTLPTraceExporter(collectorOptions);
const spanProcessor = new BatchSpanProcessor(exporter, {
  // The maximum queue size. After the size is reached spans are dropped.
  maxQueueSize: 100,
  // The maximum batch size of every export. It must be smaller or equal to maxQueueSize.
  maxExportBatchSize: 10,
  // The interval between two consecutive exports
  scheduledDelayMillis: 2000,
  // How long the export can run before it is cancelled
  exportTimeoutMillis: 30000,
})
provider.addSpanProcessor(spanProcessor);

package.json

    "@opentelemetry/api": "^1.3.0",
    "@opentelemetry/context-zone": "^1.8.0",
    "@opentelemetry/exporter-trace-otlp-http": "^0.34.0",
    "@opentelemetry/instrumentation": "^0.34.0",
    "@opentelemetry/instrumentation-fetch": "^0.34.0",
    "@opentelemetry/instrumentation-xml-http-request": "^0.34.0",
    "@opentelemetry/otlp-transformer": "^0.34.0",
    "@opentelemetry/resources": "^1.8.0",
    "@opentelemetry/sdk-trace-base": "^1.8.0",
    "@opentelemetry/sdk-trace-web": "^1.8.0",
    "@opentelemetry/semantic-conventions": "^1.8.0",

Relevant log output

No response

@yuanman0109 yuanman0109 added bug Something isn't working triage labels Dec 15, 2022
@dyladan
Member

dyladan commented Dec 15, 2022

I'm sorry it isn't really clear from your report, but is the data actually sent successfully or not? Is the bug just the log statement? Can you include the stacktrace so I can try to track down where this is actually coming from?

@dyladan dyladan added the information-requested Bug is waiting on additional information from the user label Dec 15, 2022
@yuanman0109
Author

> I'm sorry it isn't really clear from your report, but is the data actually sent successfully or not? Is the bug just the log statement? Can you include the stacktrace so I can try to track down where this is actually coming from?

The data is stuck in pending status, and there is a lot of it. I have not verified whether the transmission eventually succeeds, but the console throws promise errors, which skews our exception-collection metrics. Most of the span data comes from the xhr and fetch instrumentations. There is stack information; it currently points here:

const error = new OTLPExporterError(`sendBeacon - cannot send ${body}`);

It may be difficult to troubleshoot this problem from my description alone, because in general users will not switch tabs as aggressively as I did; even if `_finishedSpans` grows very long, the spans are still queued for sending. However, if you switch to another tab while data is being sent, I suspect `visibilitychange` or `pagehide` fires, which triggers the `flushAll` function concurrently. You can simulate the scenario: let the queue get as close to `maxQueueSize` as possible, then switch to another page while data is being sent and quickly switch back. Repeat this several times and watch the console.
[screenshot: stack trace]

@dyladan
Member

dyladan commented Dec 21, 2022

The bug here is most likely the simultaneous execution of the flush, then. I believe the spec says it should not be executed concurrently.
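A guard along these lines (a hypothetical sketch, not the SDK's actual implementation) would coalesce concurrent flush calls so only one export runs at a time:

```javascript
// Hypothetical sketch: make concurrent forceFlush() calls share one
// in-flight promise instead of starting overlapping exports.
class FlushGuard {
  constructor(doFlush) {
    this._doFlush = doFlush;   // async function performing the real export
    this._inflight = null;     // promise of the flush currently running
  }
  forceFlush() {
    if (this._inflight === null) {
      this._inflight = this._doFlush().finally(() => {
        this._inflight = null; // allow the next flush once this one settles
      });
    }
    return this._inflight;     // concurrent callers share the same promise
  }
}
```

With this shape, a `visibilitychange` handler firing while a scheduled flush is already running would simply await the in-flight export rather than start a second one.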

@dyladan dyladan added priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect and removed triage labels Dec 21, 2022
@dyladan
Member

dyladan commented Dec 21, 2022

Labeling as p2 since it doesn't seem to be a crasher or affect your application (please correct me if that's wrong) but may be missing telemetry to the backend.

@isanchen

isanchen commented Jan 5, 2023

We are seeing the same error, and the stacktrace suggests this line navigator.sendBeacon() is returning false here, ultimately causing an unhandled promise rejection to bubble up.

My guess is the data has exceeded the browser limit for queueing the sendBeacon request (https://w3c.github.io/beacon/#return-value):

If the amount of data to be queued exceeds the user agent limit (as defined in HTTP-network-or-cache fetch), this method returns false

BTW we use the BatchSpanProcessor with the following config:

{
  maxQueueSize: 100,
  maxExportBatchSize: 10,
  scheduledDelayMillis: 500,
  exportTimeoutMillis: 30000,
}

If this theory sounds reasonable, we will try reducing the maxQueueSize, maxExportBatchSize and scheduledDelayMillis to see if it helps.

On the other hand, I am not sure whether the SDK emitting unhandled promise rejections is acceptable under the OTel error-handling guidance: https://opentelemetry.io/docs/reference/specification/error-handling/

And finally, is there a way for users to configure error handlers for these errors?

Thanks

@ahayworth

I believe we are also seeing this bug in production. In the interest of not just registering a "me too" comment, we can add the following details:

  • We see this with the metrics SDK, rather than traces
  • We also see stacktraces pointing to the same code

We are seeing around 100 per day, which is a rather small fraction of the metrics requests we are handling; so I do not believe it to be extraordinarily urgent. But, we do see it.

@dyladan
Member

dyladan commented Feb 21, 2023

> We are seeing the same error, and the stacktrace suggests this line navigator.sendBeacon() is returning false here and causing an unhandled promise to bubble up ultimately.
>
> My guess is the data has exceeded the browser limit for queueing the sendBeacon request (https://w3c.github.io/beacon/#return-value):
>
> > If the amount of data to be queued exceeds the user agent limit (as defined in HTTP-network-or-cache fetch), this method returns false
>
> BTW we use the BatchSpanProcessor with the following config:
>
> {
>   maxQueueSize: 100,
>   maxExportBatchSize: 10,
>   scheduledDelayMillis: 500,
>   exportTimeoutMillis: 30000,
> }
>
> If this theory sounds reasonable, we will try reducing the maxQueueSize, maxExportBatchSize and scheduledDelayMillis to see if it helps.

Yes, this seems reasonable. I'm not sure from the wording there whether a single batch is too big or the entire beacon queue has grown too large. If the former, then changing only maxExportBatchSize should suffice.

> On the other hand, I am not sure if the fact that the SDK emitting unhandled promises is acceptable by the OTEL development guidance or not: https://opentelemetry.io/docs/reference/specification/error-handling/

An unhandled promise rejection is equivalent to an unhandled throw and it is not acceptable. This is a bug. Thanks for the additional info and troubleshooting.

> And finally, is there a way for users to configure error handlers for these errors?

There is a global error handler but if we aren't catching the error then we can't send it to the global error handler. You may also be able to use the window unhandledrejection event to log them if you want.
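A sketch of that last suggestion (hypothetical helper, not part of the SDK): listen for `unhandledrejection` and filter on the `sendBeacon` error message quoted earlier in the thread. The `window` parameter is injectable so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch: log (and optionally swallow) unhandled rejections
// coming from the beacon exporter, using the unhandledrejection event.
function installBeaconRejectionLogger(win = window) {
  win.addEventListener('unhandledrejection', (event) => {
    const reason = event.reason;
    const message = reason && reason.message ? reason.message : String(reason);
    if (message.includes('sendBeacon')) {
      console.warn('OTLP beacon export failed:', message);
      event.preventDefault(); // keep it out of the default console error
    }
  });
}
```

Note this only quiets the console; it does not recover the dropped telemetry.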

@dyladan dyladan added priority:p1 Bugs which cause problems in end-user applications such as crashes, data inconsistencies, etc and removed priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect information-requested Bug is waiting on additional information from the user labels Feb 21, 2023
@dyladan
Member

dyladan commented Feb 21, 2023

I'm changing this to p1 since it is throwing unhandled promise rejections

@MSNev
Contributor

MSNev commented Sep 7, 2023

This is also an issue on the "receiving" server.

When a recent browser (older ones work differently) sends a sendBeacon request, it "expects" the receiving server NOT to return a response body, so the server should return a 204 (with no body); this lets the browser "reuse" the existing connection. If the server sends back a 200 with a response body, the browser (Chrome) closes the current connection, forcing a new connection to be established to the receiving server.

This issue is compounded when the "sender" issues multiple sendBeacon requests in a short amount of time, as this causes the browser to create/establish multiple connections to the same domain, with "many" of them ending up in the pending state and often never getting sent.

As mentioned above, older Chromium versions (Chrome, Edge), and specifically the original non-Chromium Microsoft Edge, handle a "200" response without issues, and returning many 204s can actually cause other problems there. For Microsoft's internal SDK we have a configuration that lets the browser instance "detect" these environments; it works with the backend by supplying a query string to "inform" the receiving server that the runtime expects a 204 for the current request (assuming all is OK, i.e. it would normally have returned a 200).

@myieye

myieye commented Oct 9, 2023

It seems to me that the traces should now be sent using the Fetch API with the keepalive option instead of the Beacon API (at least in browsers that support it, which is pretty much everyone except Firefox):

The keepalive option can be used to allow the request to outlive the page. Fetch with the keepalive flag is a replacement for the Navigator.sendBeacon() API.
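A sketch of what that export path could look like (hypothetical helper; the URL is a placeholder and `fetchFn` is injectable for testing):

```javascript
// Sketch: export a payload with fetch + keepalive instead of sendBeacon.
function exportWithKeepalive(url, body, fetchFn = fetch) {
  return fetchFn(url, {
    method: 'POST',
    keepalive: true, // lets the request outlive the page, like sendBeacon
    headers: { 'Content-Type': 'application/json' },
    body,
  }).catch((err) => {
    // Unlike fire-and-forget sendBeacon, failures surface here and can be
    // handled instead of becoming unhandled rejections.
    console.warn('keepalive export failed:', err);
  });
}
```

One advantage over sendBeacon is exactly the error-handling point raised earlier: the returned promise gives the SDK a place to catch failures.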

@MSNev
Contributor

MSNev commented Oct 25, 2023

> fetch api with the keepalive

Fetch with the keepalive flag also has the same 64 KB limitation, so if this is payload-related it is still going to be an issue (for getting the telemetry out of the environment).
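One way to work within that limit (a sketch only; the 60 KiB threshold and the fallback behavior are assumptions, not exporter behavior) is to check the payload size and the sendBeacon return value, falling back to an ordinary fetch when the beacon is refused:

```javascript
// Sketch: prefer sendBeacon for small payloads; fall back to a plain fetch
// (no keepalive, so no 64 KiB cap) when the beacon is refused or the payload
// is over the assumed limit. beacon/fetchFn are injectable for testing.
const BEACON_LIMIT_BYTES = 60 * 1024; // stay under the typical ~64 KiB quota

function sendPayload(url, json,
                     beacon = (u, b) => navigator.sendBeacon(u, b),
                     fetchFn = fetch) {
  const blob = new Blob([json], { type: 'application/json' });
  if (blob.size <= BEACON_LIMIT_BYTES && beacon(url, blob)) {
    return Promise.resolve(); // queued by the browser, nothing more to do
  }
  // Beacon refused (quota exceeded) or payload too large: ordinary request.
  return fetchFn(url, { method: 'POST', body: blob });
}
```

The trade-off is that the plain-fetch fallback does not survive page unload the way a beacon or keepalive request does.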

@istvan-hevele

A potential workaround is to pass headers: {} to the OTLPTraceExporter's options. Because the Beacon API doesn't support custom headers, this causes the library to use XHR instead.
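A sketch of that workaround, matching the setup code earlier in the thread (the URL placeholder is kept as-is):

```javascript
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

// Passing any headers object, even an empty one, makes the exporter choose
// XHR over sendBeacon, because sendBeacon cannot attach custom headers.
const exporter = new OTLPTraceExporter({
  url: 'xxx/api/traces', // placeholder endpoint, as in the setup above
  headers: {},           // non-undefined headers => XHR transport
});
```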
