
Fetch Instrumentation Start and End Times Can Be Off By Up to a Few Milliseconds #3719

Closed
Grunet opened this issue Apr 7, 2023 · 15 comments
Labels: bug (Something isn't working), triage

Grunet commented Apr 7, 2023

What happened?

Steps to Reproduce

I tried using the FetchInstrumentation in Deno, but found that the span for the fetch call I was making lasted longer than its parent span.

  1. Opening up GitHub Codespaces on https://github.com/Grunet/otel-js-deno-experimentation/blob/example-for-fetch-bug-report/
  2. Running deno run -A index.ts
  3. Running curl http://localhost:8080/
  4. Waiting a few seconds
  5. Inspecting the console output and noting that the "HTTP GET" span created by fetch autoinstrumentation ends after the parent "handler" span

Expected Result

The span created by fetch autoinstrumentation should end before its parent span ends

Actual Result

Here is what this looks like in Honeycomb today. I added a manual span called "manualFetch" alongside the autogenerated "HTTP GET" span to make the difference between the two visible.

[Screenshot: 4 horizontal bars, each offset slightly below the previous one. The 2nd bar from the top ends before the top bar ends. However, the 3rd bar from the top ends after the top bar ends. This is incorrect; both the 2nd and 3rd bars should end around the same point, before the top bar ends.]

Additional Details

There are 2 places in the code where I think performance.now() could be used instead of Date.now() to fix this:

  1. When fetch autoinstrumentation creates the span (at https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/opentelemetry-instrumentation-fetch/src/fetch.ts#L201) a performance.now() could be passed in the options for startTime
  2. When fetch autoinstrumentation ends the span (at https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/opentelemetry-instrumentation-fetch/src/fetch.ts#L279) the Date.now() could be replaced by a performance.now()
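As a rough sketch of the idea (the helper name here is illustrative, not part of the actual instrumentation), a wall-clock timestamp with sub-millisecond resolution can be derived from the monotonic clock:

```typescript
// Sketch: deriving an epoch timestamp from the monotonic clock instead of Date.now().
// performance.timeOrigin is the epoch time at which the runtime started, and
// performance.now() is a monotonic, sub-millisecond offset from that origin.
function epochFromMonotonic(): number {
  return performance.timeOrigin + performance.now();
}

const viaDate = Date.now();           // integer milliseconds; can jump if the system clock adjusts
const viaPerf = epochFromMonotonic(); // fractional milliseconds; monotonic between calls
console.log(viaDate, viaPerf);
```

Note the tradeoff Flarna raises later in the thread: the monotonic clock's origin can drift from the wall clock in some environments, which is one reason the SDK moved toward Date.now().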

OpenTelemetry Setup Code

import { registerInstrumentations } from "npm:@opentelemetry/instrumentation";
import { FetchInstrumentation } from 'npm:@opentelemetry/instrumentation-fetch';

import { NodeTracerProvider } from "npm:@opentelemetry/sdk-trace-node";
import { Resource } from "npm:@opentelemetry/resources";
import { SemanticResourceAttributes } from "npm:@opentelemetry/semantic-conventions";
import { BatchSpanProcessor, ConsoleSpanExporter } from "npm:@opentelemetry/sdk-trace-base";
import { OTLPTraceExporter } from "npm:@opentelemetry/exporter-trace-otlp-proto";

import opentelemetry from "npm:@opentelemetry/api";
import { serve } from "https://deno.land/std@0.180.0/http/server.ts";

// autoinstrumentation.ts

registerInstrumentations({
  instrumentations: [new FetchInstrumentation()],
});

// Monkeypatching to get past FetchInstrumentation's dependence on sdk-trace-web, which has runtime dependencies on some browser-only constructs. See https://github.com/open-telemetry/opentelemetry-js/issues/3413#issuecomment-1496834689 for more details
// Specifically for this line - https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-sdk-trace-web/src/utils.ts#L310
globalThis.location = {}; 

// tracing.ts

const resource =
  Resource.default().merge(
    new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: "deno-demo-service",
      [SemanticResourceAttributes.SERVICE_VERSION]: "0.1.0",
    })
  );

const provider = new NodeTracerProvider({
    resource: resource,
});

const consoleExporter = new ConsoleSpanExporter();
provider.addSpanProcessor(new BatchSpanProcessor(consoleExporter));

const traceExporter = new OTLPTraceExporter();
provider.addSpanProcessor(new BatchSpanProcessor(traceExporter));

provider.register();

// Application code

const tracer = opentelemetry.trace.getTracer(
  'deno-demo-tracer'
);

const port = 8080;

const handler = async (request: Request): Promise<Response> => {
  // This call will be autoinstrumented
  const otherSpan = tracer.startSpan('manualFetch');
  await fetch("http://www.example.com");
  otherSpan.end();

  const span = tracer.startSpan(`constructBody`);
  const body = `Your user-agent is:\n\n${request.headers.get("user-agent") ?? "Unknown"}`;
  span.end();

  return new Response(body, { status: 200 });
};

await serve(instrument(handler), { port });

// Helper code

function instrument(handler) {

  async function instrumentedHandler(request) {
    let response;
    await tracer.startActiveSpan('handler', async (span) => {

      response = await handler(request);

      span.end();
    });

    return response;
  }

  return instrumentedHandler;
}

package.json

{
  "version": "2",
  "remote": {
    "https://deno.land/std@0.180.0/async/abortable.ts": "73acfb3ed7261ce0d930dbe89e43db8d34e017b063cf0eaa7d215477bf53442e",
    "https://deno.land/std@0.180.0/async/deadline.ts": "c5facb0b404eede83e38bd2717ea8ab34faa2ffb20ef87fd261fcba32ba307aa",
    "https://deno.land/std@0.180.0/async/debounce.ts": "adab11d04ca38d699444ac8a9d9856b4155e8dda2afd07ce78276c01ea5a4332",
    "https://deno.land/std@0.180.0/async/deferred.ts": "42790112f36a75a57db4a96d33974a936deb7b04d25c6084a9fa8a49f135def8",
    "https://deno.land/std@0.180.0/async/delay.ts": "73aa04cec034c84fc748c7be49bb15cac3dd43a57174bfdb7a4aec22c248f0dd",
    "https://deno.land/std@0.180.0/async/mod.ts": "f04344fa21738e5ad6bea37a6bfffd57c617c2d372bb9f9dcfd118a1b622e576",
    "https://deno.land/std@0.180.0/async/mux_async_iterator.ts": "70c7f2ee4e9466161350473ad61cac0b9f115cff4c552eaa7ef9d50c4cbb4cc9",
    "https://deno.land/std@0.180.0/async/pool.ts": "fd082bd4aaf26445909889435a5c74334c017847842ec035739b4ae637ae8260",
    "https://deno.land/std@0.180.0/async/retry.ts": "5efa3ba450ac0c07a40a82e2df296287b5013755d232049efd7ea2244f15b20f",
    "https://deno.land/std@0.180.0/async/tee.ts": "47e42d35f622650b02234d43803d0383a89eb4387e1b83b5a40106d18ae36757",
    "https://deno.land/std@0.180.0/http/server.ts": "cbb17b594651215ba95c01a395700684e569c165a567e4e04bba327f41197433"
  },
  "npm": {
    "specifiers": {
      "@opentelemetry/api": "@opentelemetry/api@1.4.1",
      "@opentelemetry/exporter-trace-otlp-proto": "@opentelemetry/exporter-trace-otlp-proto@0.37.0_@opentelemetry+api@1.4.1",
      "@opentelemetry/instrumentation": "@opentelemetry/instrumentation@0.37.0_@opentelemetry+api@1.4.1",
      "@opentelemetry/instrumentation-fetch": "@opentelemetry/instrumentation-fetch@0.37.0_@opentelemetry+api@1.4.1",
      "@opentelemetry/resources": "@opentelemetry/resources@1.11.0_@opentelemetry+api@1.4.1",
      "@opentelemetry/sdk-trace-base": "@opentelemetry/sdk-trace-base@1.11.0_@opentelemetry+api@1.4.1",
      "@opentelemetry/sdk-trace-node": "@opentelemetry/sdk-trace-node@1.11.0_@opentelemetry+api@1.4.1",
      "@opentelemetry/semantic-conventions": "@opentelemetry/semantic-conventions@1.11.0"
    },
    "packages": {
      "@opentelemetry/api@1.4.1": {
        "integrity": "sha512-O2yRJce1GOc6PAy3QxFM4NzFiWzvScDC1/5ihYBL6BUEVdq0XMWN01sppE+H6bBXbaFYipjwFLEWLg5PaSOThA==",
        "dependencies": {}
      },
      "@opentelemetry/context-async-hooks@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-Ao1z7p+Au7A10SvQ6NCo5h2dAb3cujy+1VUZrd6gZuqMTxADYEWw/yjDbkHM/NAAaBphDGhqNg2MxGYIdgQs8w==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1"
        }
      },
      "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-aP1wHSb+YfU0pM63UAkizYPuS4lZxzavHHw5KJfFNN2oWQ79HSm6JR3CzwFKHwKhSzHN8RE9fgP1IdVJ8zmo1w==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/semantic-conventions": "@opentelemetry/semantic-conventions@1.11.0"
        }
      },
      "@opentelemetry/exporter-trace-otlp-proto@0.37.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-xOXs28I8EhZpQhjIyJyfDUJUsM6GbsGmQRIwhB5LFsbs1sPKiTWlcXZLJN16Ipeqz9Am8Rgbk5HdL99aQ6SEBA==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/otlp-exporter-base": "@opentelemetry/otlp-exporter-base@0.37.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/otlp-proto-exporter-base": "@opentelemetry/otlp-proto-exporter-base@0.37.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/otlp-transformer": "@opentelemetry/otlp-transformer@0.37.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/resources": "@opentelemetry/resources@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/sdk-trace-base": "@opentelemetry/sdk-trace-base@1.11.0_@opentelemetry+api@1.4.1"
        }
      },
      "@opentelemetry/instrumentation-fetch@0.37.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-PXAbRPOpVGxxH3kOd5j8D6AK7l8Gk8pPDNZGoBijYPbAIBg8SvefGXZlHtrzbTmBjy71IjnIxpl0rAl7QH6IPA==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/instrumentation": "@opentelemetry/instrumentation@0.37.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/sdk-trace-web": "@opentelemetry/sdk-trace-web@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/semantic-conventions": "@opentelemetry/semantic-conventions@1.11.0"
        }
      },
      "@opentelemetry/instrumentation@0.37.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-QAHIYTeVHcvP5NcI8r0WbvF5KCojZSzQLO9G73/OpiXLy/t8hIUXHq0nuuSB5zP5dKQ8h9sORi/3suGBNHnsjw==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "require-in-the-middle": "require-in-the-middle@6.0.0",
          "semver": "semver@7.3.8",
          "shimmer": "shimmer@1.2.1"
        }
      },
      "@opentelemetry/otlp-exporter-base@0.37.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-dLbv7nr7d14xrHzd+S1eW+RpXh7IC0onktc23pwzETh6J7Ytzf0+QwLV5iRatoNtwPU2hX1VGOipwEnC/BjXxg==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1"
        }
      },
      "@opentelemetry/otlp-proto-exporter-base@0.37.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-KU1aJetRLQWwtoiXiuJcsW2H8e6H3qn9gvoT2SLyA3f566C/i2z1aJRqJnO+ZEo87SBxDf/gk2R300PohDOBrQ==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/otlp-exporter-base": "@opentelemetry/otlp-exporter-base@0.37.0_@opentelemetry+api@1.4.1",
          "protobufjs": "protobufjs@7.2.3"
        }
      },
      "@opentelemetry/otlp-transformer@0.37.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-cIzV9x2DhJ5gN0mld8OqN+XM95sDiuAJJvXsRjVuz9vu8TSNbbao/QCKNfJLOXqe8l3Ge05nKzQ6Q2gDDEN36w==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/resources": "@opentelemetry/resources@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/sdk-metrics": "@opentelemetry/sdk-metrics@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/sdk-trace-base": "@opentelemetry/sdk-trace-base@1.11.0_@opentelemetry+api@1.4.1"
        }
      },
      "@opentelemetry/propagator-b3@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-Nnlu2PUSMqB8Lc3OzjznqpwuUdej1LrYYNjHItk9g7jJ9SAjiLdkB1cDALO+xhISTON8VrELh1rh0XIMUts2og==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1"
        }
      },
      "@opentelemetry/propagator-jaeger@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-/9XDcBnB6Y2INMWACs0UniY+aV7LReMRzPN6Q0SI7SlXZLZPTUnaZt51Tb/TyixjzAkHsj86K27XSCv3ctB2UQ==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1"
        }
      },
      "@opentelemetry/resources@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-y0z2YJTqk0ag+hGT4EXbxH/qPhDe8PfwltYb4tXIEsozgEFfut/bqW7H7pDvylmCjBRMG4NjtLp57V1Ev++brA==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/semantic-conventions": "@opentelemetry/semantic-conventions@1.11.0"
        }
      },
      "@opentelemetry/sdk-metrics@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-knuq3pwU0+46FEMdw9Ses+alXL9cbcLUUTdYBBBsaKkqKwoVMHfhBufW7u6YCu4i+47Wg6ZZTN/eGc4LbTbK5Q==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/resources": "@opentelemetry/resources@1.11.0_@opentelemetry+api@1.4.1",
          "lodash.merge": "lodash.merge@4.6.2"
        }
      },
      "@opentelemetry/sdk-trace-base@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-DV8e5/Qo42V8FMBlQ0Y0Liv6Hl/Pp5bAZ73s7r1euX8w4bpRes1B7ACiA4yujADbWMJxBgSo4fGbi4yjmTMG2A==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/resources": "@opentelemetry/resources@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/semantic-conventions": "@opentelemetry/semantic-conventions@1.11.0"
        }
      },
      "@opentelemetry/sdk-trace-node@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-BzcrLl6m10Gc9Za5hNCRAClgpetGd5di5bG4RDChhVfqrecl98ok+BKunWxWxrZUMpXpJqfXteHo0su3SEXvXQ==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/context-async-hooks": "@opentelemetry/context-async-hooks@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/propagator-b3": "@opentelemetry/propagator-b3@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/propagator-jaeger": "@opentelemetry/propagator-jaeger@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/sdk-trace-base": "@opentelemetry/sdk-trace-base@1.11.0_@opentelemetry+api@1.4.1",
          "semver": "semver@7.3.8"
        }
      },
      "@opentelemetry/sdk-trace-web@1.11.0_@opentelemetry+api@1.4.1": {
        "integrity": "sha512-lZwZ7S3aJQLCfsp47LGIu0amceefniPBbVoAX3n1QHng/ld1P4cYIrE4+Lil39xhq8DvyUKEgGO+iazugAUtog==",
        "dependencies": {
          "@opentelemetry/api": "@opentelemetry/api@1.4.1",
          "@opentelemetry/core": "@opentelemetry/core@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/sdk-trace-base": "@opentelemetry/sdk-trace-base@1.11.0_@opentelemetry+api@1.4.1",
          "@opentelemetry/semantic-conventions": "@opentelemetry/semantic-conventions@1.11.0"
        }
      },
      "@opentelemetry/semantic-conventions@1.11.0": {
        "integrity": "sha512-fG4D0AktoHyHwGhFGv+PzKrZjxbKJfckJauTJdq2A+ej5cTazmNYjJVAODXXkYyrsI10muMl+B1iO2q1R6Lp+w==",
        "dependencies": {}
      },
      "@protobufjs/aspromise@1.1.2": {
        "integrity": "sha512-j+gKExEuLmKwvz3OgROXtrJ2UG2x8Ch2YZUxahh+s1F2HZ+wAceUNLkvy6zKCPVRkU++ZWQrdxsUeQXmcg4uoQ==",
        "dependencies": {}
      },
      "@protobufjs/base64@1.1.2": {
        "integrity": "sha512-AZkcAA5vnN/v4PDqKyMR5lx7hZttPDgClv83E//FMNhR2TMcLUhfRUBHCmSl0oi9zMgDDqRUJkSxO3wm85+XLg==",
        "dependencies": {}
      },
      "@protobufjs/codegen@2.0.4": {
        "integrity": "sha512-YyFaikqM5sH0ziFZCN3xDC7zeGaB/d0IUb9CATugHWbd1FRFwWwt4ld4OYMPWu5a3Xe01mGAULCdqhMlPl29Jg==",
        "dependencies": {}
      },
      "@protobufjs/eventemitter@1.1.0": {
        "integrity": "sha512-j9ednRT81vYJ9OfVuXG6ERSTdEL1xVsNgqpkxMsbIabzSo3goCjDIveeGv5d03om39ML71RdmrGNjG5SReBP/Q==",
        "dependencies": {}
      },
      "@protobufjs/fetch@1.1.0": {
        "integrity": "sha512-lljVXpqXebpsijW71PZaCYeIcE5on1w5DlQy5WH6GLbFryLUrBD4932W/E2BSpfRJWseIL4v/KPgBFxDOIdKpQ==",
        "dependencies": {
          "@protobufjs/aspromise": "@protobufjs/aspromise@1.1.2",
          "@protobufjs/inquire": "@protobufjs/inquire@1.1.0"
        }
      },
      "@protobufjs/float@1.0.2": {
        "integrity": "sha512-Ddb+kVXlXst9d+R9PfTIxh1EdNkgoRe5tOX6t01f1lYWOvJnSPDBlG241QLzcyPdoNTsblLUdujGSE4RzrTZGQ==",
        "dependencies": {}
      },
      "@protobufjs/inquire@1.1.0": {
        "integrity": "sha512-kdSefcPdruJiFMVSbn801t4vFK7KB/5gd2fYvrxhuJYg8ILrmn9SKSX2tZdV6V+ksulWqS7aXjBcRXl3wHoD9Q==",
        "dependencies": {}
      },
      "@protobufjs/path@1.1.2": {
        "integrity": "sha512-6JOcJ5Tm08dOHAbdR3GrvP+yUUfkjG5ePsHYczMFLq3ZmMkAD98cDgcT2iA1lJ9NVwFd4tH/iSSoe44YWkltEA==",
        "dependencies": {}
      },
      "@protobufjs/pool@1.1.0": {
        "integrity": "sha512-0kELaGSIDBKvcgS4zkjz1PeddatrjYcmMWOlAuAPwAeccUrPHdUqo/J6LiymHHEiJT5NrF1UVwxY14f+fy4WQw==",
        "dependencies": {}
      },
      "@protobufjs/utf8@1.1.0": {
        "integrity": "sha512-Vvn3zZrhQZkkBE8LSuW3em98c0FwgO4nxzv6OdSxPKJIEKY2bGbHn+mhGIPerzI4twdxaP8/0+06HBpwf345Lw==",
        "dependencies": {}
      },
      "@types/node@18.15.11": {
        "integrity": "sha512-E5Kwq2n4SbMzQOn6wnmBjuK9ouqlURrcZDVfbo9ftDDTFt3nk7ZKK4GMOzoYgnpQJKcxwQw+lGaBvvlMo0qN/Q==",
        "dependencies": {}
      },
      "debug@4.3.4": {
        "integrity": "sha512-PRWFHuSU3eDtQJPvnNY7Jcket1j0t5OuOsFzPPzsekD52Zl8qUfFIPEiswXqIvHWGVHOgX+7G/vCNNhehwxfkQ==",
        "dependencies": {
          "ms": "ms@2.1.2"
        }
      },
      "function-bind@1.1.1": {
        "integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A==",
        "dependencies": {}
      },
      "has@1.0.3": {
        "integrity": "sha512-f2dvO0VU6Oej7RkWJGrehjbzMAjFp5/VKPp5tTpWIV4JHHZK1/BxbFRtf/siA2SWTe09caDmVtYYzWEIbBS4zw==",
        "dependencies": {
          "function-bind": "function-bind@1.1.1"
        }
      },
      "is-core-module@2.11.0": {
        "integrity": "sha512-RRjxlvLDkD1YJwDbroBHMb+cukurkDWNyHx7D3oNB5x9rb5ogcksMC5wHCadcXoo67gVr/+3GFySh3134zi6rw==",
        "dependencies": {
          "has": "has@1.0.3"
        }
      },
      "lodash.merge@4.6.2": {
        "integrity": "sha512-0KpjqXRVvrYyCsX1swR/XTK0va6VQkQM6MNo7PqW77ByjAhoARA8EfrP1N4+KlKj8YS0ZUCtRT/YUuhyYDujIQ==",
        "dependencies": {}
      },
      "long@5.2.1": {
        "integrity": "sha512-GKSNGeNAtw8IryjjkhZxuKB3JzlcLTwjtiQCHKvqQet81I93kXslhDQruGI/QsddO83mcDToBVy7GqGS/zYf/A==",
        "dependencies": {}
      },
      "lru-cache@6.0.0": {
        "integrity": "sha512-Jo6dJ04CmSjuznwJSS3pUeWmd/H0ffTlkXXgwZi+eq1UCmqQwCh+eLsYOYCwY991i2Fah4h1BEMCx4qThGbsiA==",
        "dependencies": {
          "yallist": "yallist@4.0.0"
        }
      },
      "module-details-from-path@1.0.3": {
        "integrity": "sha512-ySViT69/76t8VhE1xXHK6Ch4NcDd26gx0MzKXLO+F7NOtnqH68d9zF94nT8ZWSxXh8ELOERsnJO/sWt1xZYw5A==",
        "dependencies": {}
      },
      "ms@2.1.2": {
        "integrity": "sha512-sGkPx+VjMtmA6MX27oA4FBFELFCZZ4S4XqeGOXCv68tT+jb3vk/RyaKWP0PTKyWtmLSM0b+adUTEvbs1PEaH2w==",
        "dependencies": {}
      },
      "path-parse@1.0.7": {
        "integrity": "sha512-LDJzPVEEEPR+y48z93A0Ed0yXb8pAByGWo/k5YYdYgpY2/2EsOsksJrq7lOHxryrVOn1ejG6oAp8ahvOIQD8sw==",
        "dependencies": {}
      },
      "protobufjs@7.2.3": {
        "integrity": "sha512-TtpvOqwB5Gdz/PQmOjgsrGH1nHjAQVCN7JG4A6r1sXRWESL5rNMAiRcBQlCAdKxZcAbstExQePYG8xof/JVRgg==",
        "dependencies": {
          "@protobufjs/aspromise": "@protobufjs/aspromise@1.1.2",
          "@protobufjs/base64": "@protobufjs/base64@1.1.2",
          "@protobufjs/codegen": "@protobufjs/codegen@2.0.4",
          "@protobufjs/eventemitter": "@protobufjs/eventemitter@1.1.0",
          "@protobufjs/fetch": "@protobufjs/fetch@1.1.0",
          "@protobufjs/float": "@protobufjs/float@1.0.2",
          "@protobufjs/inquire": "@protobufjs/inquire@1.1.0",
          "@protobufjs/path": "@protobufjs/path@1.1.2",
          "@protobufjs/pool": "@protobufjs/pool@1.1.0",
          "@protobufjs/utf8": "@protobufjs/utf8@1.1.0",
          "@types/node": "@types/node@18.15.11",
          "long": "long@5.2.1"
        }
      },
      "require-in-the-middle@6.0.0": {
        "integrity": "sha512-+dtWQ7l2lqQDxheaG3jjyN1QI37gEwvzACSgjYi4/C2y+ZTUMeRW8BIOm+9NBKvwaMBUSZfPXVOt1skB0vBkRw==",
        "dependencies": {
          "debug": "debug@4.3.4",
          "module-details-from-path": "module-details-from-path@1.0.3",
          "resolve": "resolve@1.22.2"
        }
      },
      "resolve@1.22.2": {
        "integrity": "sha512-Sb+mjNHOULsBv818T40qSPeRiuWLyaGMa5ewydRLFimneixmVy2zdivRl+AF6jaYPC8ERxGDmFSiqui6SfPd+g==",
        "dependencies": {
          "is-core-module": "is-core-module@2.11.0",
          "path-parse": "path-parse@1.0.7",
          "supports-preserve-symlinks-flag": "supports-preserve-symlinks-flag@1.0.0"
        }
      },
      "semver@7.3.8": {
        "integrity": "sha512-NB1ctGL5rlHrPJtFDVIVzTyQylMLu9N9VICA6HSFJo8MCGVTMW6gfpicwKmmK/dAjTOrqu5l63JJOpDSrAis3A==",
        "dependencies": {
          "lru-cache": "lru-cache@6.0.0"
        }
      },
      "shimmer@1.2.1": {
        "integrity": "sha512-sQTKC1Re/rM6XyFM6fIAGHRPVGvyXfgzIDvzoq608vM+jeyVD0Tu1E6Np0Kc2zAIFWIj963V2800iF/9LPieQw==",
        "dependencies": {}
      },
      "supports-preserve-symlinks-flag@1.0.0": {
        "integrity": "sha512-ot0WnXS9fgdkgIcePe6RHNk1WA8+muPa6cSjeR3V8K27q9BB1rTE3R1p7Hv0z1ZyAc8s6Vvv8DIyWf681MAt0w==",
        "dependencies": {}
      },
      "yallist@4.0.0": {
        "integrity": "sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A==",
        "dependencies": {}
      }
    }
  }
}

Relevant log output

{
  traceId: "9a9a625e1562e9847ffe97e09fcf1bea",
  parentId: "f80a96f14dc334d8",
  traceState: undefined,
  name: "constructBody",
  id: "5a1974974e18fcf6",
  kind: 0,
  timestamp: 1680829418679000,
  duration: 195,
  attributes: {},
  status: { code: 0 },
  events: [],
  links: []
}
{
  traceId: "9a9a625e1562e9847ffe97e09fcf1bea",
  parentId: undefined,
  traceState: undefined,
  name: "handler",
  id: "f80a96f14dc334d8",
  kind: 0,
  timestamp: 1680829418663000,
  duration: 16340,
  attributes: {},
  status: { code: 0 },
  events: [],
  links: []
}
{
  traceId: "9a9a625e1562e9847ffe97e09fcf1bea",
  parentId: "f80a96f14dc334d8",
  traceState: undefined,
  name: "HTTP GET",
  id: "7ec3993d6f40671c",
  kind: 2,
  timestamp: 1680829418669000,
  duration: 12000,
  attributes: {
    component: "fetch",
    "http.method": "GET",
    "http.url": "http://www.example.com/",
    "http.status_code": 200,
    "http.status_text": "OK",
    "http.host": "www.example.com",
    "http.scheme": "http",
    "http.user_agent": "Deno/1.32.1"
  },
  status: { code: 0 },
  events: [],
  links: []
}
{
  traceId: "5086d5f4412c11a533d81044998ac7d6",
  parentId: undefined,
  traceState: undefined,
  name: "HTTP POST",
  id: "698b36fbc35e09c9",
  kind: 2,
  timestamp: 1680829423694000,
  duration: 94000,
  attributes: {
    component: "fetch",
    "http.method": "POST",
    "http.url": "https://api.honeycomb.io/v1/traces",
    "http.status_code": 200,
    "http.status_text": "OK",
    "http.host": "api.honeycomb.io",
    "http.scheme": "https",
    "http.user_agent": "Deno/1.32.1"
  },
  status: { code: 0 },
  events: [],
  links: []
}
Grunet added the bug (Something isn't working) and triage labels on Apr 7, 2023
Grunet (Contributor, Author) commented Apr 7, 2023

I'm mostly unsure whether this is an appropriate change (or whether there's some reason Date.now() is being used in the fetch instrumentation).

If not, just let me know, and I can look into making a PR for this.

Grunet (Contributor, Author) commented Apr 7, 2023

Actually, that may not be the only thing involved, as I see the end of the span is off by several milliseconds.

Grunet changed the title from "Fetch Instrumentation Rounds Span Start/End Times Up to the Nearest Millisecond" to "Fetch Instrumentation Start and End Times Can Be Off By Up to a Few Milliseconds" on Apr 7, 2023
Grunet (Contributor, Author) commented Apr 7, 2023

For the end time, I wonder if https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/opentelemetry-instrumentation-fetch/src/fetch.ts#L348 is involved. It seems to imply that reader.read() returns a Promise, which would mean the resolution of the monkeypatched fetch promise happens first (https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/opentelemetry-instrumentation-fetch/src/fetch.ts#L367), and only after finishing reading the body does it come back and run endSpanOnSuccess, which eventually sets the end time on the span.

Grunet (Contributor, Author) commented Apr 10, 2023

Maybe it has something to do with the autoinstrumentation library using its own Tracer

this._tracer = trace.getTracer(instrumentationName, instrumentationVersion);

Though I don't see how that would affect things this much

Grunet (Contributor, Author) commented Apr 10, 2023

I'm still not quite sure how to investigate this further, but I'm wondering now if the auto-instrumentation's tracer is somehow shifted so all its times are a few milliseconds off.

I'm not seeing evidence of Deno's performance.timeOrigin drifting or anything though

Grunet (Contributor, Author) commented Apr 11, 2023

I'm not sure why this would matter, but putting the registerInstrumentations call immediately after the provider.register call seems to put the start of the fetch span consistently ~1ms after the start of the parent span, an improvement over the example given above.

Flarna (Member) commented Apr 11, 2023

I haven't read all the messages here, but I guess #3434 (and linked issues) might be of interest to you. With that PR, a move from hrtime to Date.now() was made to solve problems with hrtime drifting in some environments.

t2t2 (Contributor) commented Apr 12, 2023

For the end time, I wonder if https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/opentelemetry-instrumentation-fetch/src/fetch.ts#L348 is involved. It seems to imply that reader.read() returns a Promise, which would mean the resolution of the monkeypatched fetch promise happens first (https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/opentelemetry-instrumentation-fetch/src/fetch.ts#L367), and only after finishing reading the body does it come back and run endSpanOnSuccess, which eventually sets the end time on the span.

This is most likely it. In your example code you're not actually doing anything with the response, which basically means the JS event loop ends up looking like:

  • (fetch end)
  • fetch instrumentation onSuccess/onError
    • queues reader.read().then
    • calls promise resolve
  • small chance of the reader being ready, but it may also get called multiple times, so realistically no
  • goes back to your code after the await (from otherSpan.end(); all the way to return new Response)
  • then fetch instrumentation finishes reading and goes to endSpanOnSuccess
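The ordering above can be sketched with plain promises (stand-in names, not the actual instrumentation source — bodyRead plays the role of reader.read(), patchedFetch the role of the monkeypatched fetch promise):

```typescript
const order: string[] = [];

// Stand-in for reader.read(): resolves on a later task, like real body I/O.
const bodyRead = new Promise<void>((resolve) => setTimeout(resolve, 0));

// Stand-in for the instrumentation's patched fetch promise: it queues the
// body read *before* resolving the outer promise back to the caller.
const patchedFetch = new Promise<void>((resolve) => {
  bodyRead.then(() => order.push("endSpanOnSuccess"));
  resolve();
});

await patchedFetch;
order.push("caller continues (otherSpan.end() ... return new Response)");

await bodyRead;
console.log(order);
// The caller's continuation runs before endSpanOnSuccess, so the parent
// span can end before the child fetch span does.
```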

I made a small change to your example:

const handler = async (request: Request): Promise<Response> => {
  // This call will be autoinstrumented
  const otherSpan = tracer.startSpan('manualFetch');
  const res = await fetch("http://www.example.com");
+ await res.text(); // Also wait for response body to be piped through
  otherSpan.end();

  const span = tracer.startSpan(`constructBody`);
  const body = `Your user-agent is:\n\n${request.headers.get("user-agent") ?? "Unknown"}`;
  span.end();

  return new Response(body, { status: 200 });
};

And I got timestamps that make more sense (manualFetch duration: 178600, HTTP GET duration: 173000)

One could argue that recording the span end time in onSuccess would be better.

dyladan self-assigned this on Apr 12, 2023
Grunet (Contributor, Author) commented Apr 13, 2023

And I got timestamps that make more sense (manualFetch duration: 178600, HTTP GET duration: 173000)

One could argue that recording the span end time in onSuccess would be better.

Yup I'm seeing the traces in Honeycomb look more as expected with your edit.

That, plus the registerInstrumentations positioning, plus the use of Date.now instead of performance.now, I think explains everything I've come across.

If there's anything else I can do to help with this (e.g. trying to PR the end time change), just let me know.

And thank you to everyone who's looked at this!!

Grunet (Contributor, Author) commented Apr 20, 2023

FYI I still don't have a guess for why the start of the fetch autoinstrumented span is so many milliseconds ahead of the start of the parent span.

The call to startSpan seems to happen pretty quickly after the promise is created, and startSpan didn't seem to be doing too much work on the face of it.

dyladan (Member) commented Apr 26, 2023

Posted in the linked website doc issue, but cross-posting here for visibility:

@Grunet I suspect what you're experiencing is latency added by the proxy tracer provider in the API. If you acquire a tracer before registering the TracerProvider you receive a ProxyTracer object which no-ops by default. If you then register a real TracerProvider, the ProxyTracer will proxy calls to the real Tracer returned by the TracerProvider. The advantage of this is that any instrumentation which acquires a Tracer before your SDK is loaded will become active after the SDK is loaded. The disadvantage is that these early-loaded tracers will have a very slight performance penalty due to the proxy function overhead. Registering the provider before registering instrumentations avoids this overhead because the instrumentations get real tracers and you avoid the proxy.
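A minimal model of that proxy pattern (hypothetical class and method names; the real implementation lives in @opentelemetry/api):

```typescript
interface Tracer {
  startSpan(name: string): string;
}

// Before a real provider is registered, the proxy has no delegate and no-ops.
class ProxyTracer implements Tracer {
  private delegate: Tracer | null = null;
  setDelegate(tracer: Tracer): void {
    this.delegate = tracer;
  }
  startSpan(name: string): string {
    // Every call pays a small indirection cost compared to a direct tracer.
    return this.delegate ? this.delegate.startSpan(name) : "noop-span";
  }
}

class SdkTracer implements Tracer {
  startSpan(name: string): string {
    return `real-span:${name}`;
  }
}

const tracer = new ProxyTracer();        // instrumentation acquires a tracer early
const early = tracer.startSpan("fetch"); // SDK not registered yet: no-op
tracer.setDelegate(new SdkTracer());     // provider.register() swaps in the real tracer
const late = tracer.startSpan("fetch");  // now proxied through to the real tracer
console.log(early, late);
```

Registering the provider before registering instrumentations means the instrumentations get the real tracer directly and skip the proxy indirection entirely.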

dyladan (Member) commented Apr 26, 2023

One could argue that recording the span end time in onSuccess would be better.

@t2t2 do you mean ending the span before reading the body? Wouldn't this cause the span to always end too early? I'm not sure I understand what you're arguing for here.

If there's anything else I can do to help with this (e.g. trying to PR the end time change) just let me know

@Grunet absolutely if you think something can be improved here then a PR would be welcome.

t2t2 (Contributor) commented Apr 26, 2023

do you mean ending the span before reading the body?

Nah, just recording the timestamp, as is already done to not include the ResourceObserver timeout. But I think I'd actually retract that opinion after re-reading the fetch documentation:

It returns a Promise that resolves to the Response to that request — as soon as the server responds with headers — even if the server response is an HTTP error status.

So changing it to when the fetch promise resolves would record the span end at headers start, not response end, which I think would be wrong. So I think this issue is more of a user misunderstanding: the parent span did technically end before the response of the child request ended (even if that's only due to how JS is a single-threaded event loop)
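That distinction — the fetch promise resolving when headers arrive, with the body streaming in afterwards — can be sketched with a ReadableStream standing in for the response body (illustrative stand-in, not the instrumentation code):

```typescript
const events: string[] = [];

// Stand-in for a response body that streams in after the headers have arrived.
const body = new ReadableStream<string>({
  start(controller) {
    setTimeout(() => {
      controller.enqueue("chunk");
      controller.close();
    }, 0);
  },
});

events.push("fetch resolved (headers available)");

const reader = body.getReader();
while (!(await reader.read()).done) {
  events.push("body chunk read");
}
events.push("body finished (where the span currently ends)");

console.log(events);
```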

dyladan (Member) commented Apr 26, 2023

So I think this issue is more of a user misunderstanding: the parent span did technically end before the response of the child request ended (even if that's only due to how JS is a single-threaded event loop)

Ok, glad we're of the same mind here.

Grunet (Contributor, Author) commented Apr 30, 2023

So I think this issue is more of a user misunderstanding: the parent span did technically end before the response of the child request ended (even if that's only due to how JS is a single-threaded event loop)

Yup, I think that makes sense to me now with all of the context. Still a touch counterintuitive personally, but the context clears that up.

I think all of my original concerns were addressed so I'll go ahead and close this issue. Thank you everyone for the help!
