Log events sent to collector use incorrect encoding of traceid #3892

Closed
nwilliams-lw opened this issue Jun 13, 2023 · 2 comments · Fixed by #3875
Labels
bug (Something isn't working) · priority:p2 (Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect)

Comments

nwilliams-lw commented Jun 13, 2023

What happened?

Steps to Reproduce

  1. Run an OTel 0.79 collector with an HTTP endpoint exposed
  2. Set up an OTLPLogExporter and point it at the local collector via HTTP
  3. Create a trace span in the current context
  4. Emit a log via the logger

Expected Result

  1. Logs are sent to the collector, including the trace ID

Actual Result

  1. The SDK attempts to send the logs, but the collector rejects them with 400 Bad Request:
{
    "stack": "OTLPExporterError: Bad Request\n    at IncomingMessage.<anonymous> (/Users/nw/dev/src/lp/pam/api/node_modules/@opentelemetry/otlp-exporter-base/src/platform/node/util.ts:131:27)\n    at /Users/nw/dev/src/lp/pam/api/node_modules/@opentelemetry/context-async-hooks/src/AbstractAsyncHooksContextManager.ts:75:49\n    at AsyncLocalStorage.run (node:async_hooks:327:14)\n    at AsyncLocalStorageContextManager.with (/Users/nw/dev/src/lp/pam/api/node_modules/@opentelemetry/context-async-hooks/src/AsyncLocalStorageContextManager.ts:40:36)\n    at IncomingMessage.contextWrapper (/Users/nw/dev/src/lp/pam/api/node_modules/@opentelemetry/context-async-hooks/src/AbstractAsyncHooksContextManager.ts:75:26)\n    at IncomingMessage.emit (node:events:525:35)\n    at IncomingMessage.emit (node:domain:489:12)\n    at endReadableNT (node:internal/streams/readable:1359:12)\n    at processTicksAndRejections (node:internal/process/task_queues:82:21)",
    "message": "Bad Request",
    "name": "OTLPExporterError",
    "data": "{\"code\":3,\"message\":\"readLog.traceId: parse trace_id:invalid length for ID, error found in #10 byte of ...|peUVrgQ==\\\",\\\"spanId\\\":|..., bigger context ...|\\\":0,\\\"flags\\\":1,\\\"traceId\\\":\\\"DWPNFhJlsl6rzxtpeUVrgQ==\\\",\\\"spanId\\\":\\\"ti6B1zBFRX0=\\\"}]}]}]}|...\"}",
    "code": "400"
}

Additional Details

The issue appears to be that the trace IDs sent to the collector as part of the log events are base64 encoded rather than hex encoded.

Looking through the OTel source, there is a toLogRecord function that takes a useHex flag, but I cannot find any path on which useHex resolves to true; it is mostly left undefined and therefore treated as false.

As a quick check, I hardcoded the useHex flag to true. With that change, the log record was successfully sent to the collector and correctly associated with the trace.
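
For illustration, here is a minimal sketch (not part of our application code) showing the two encodings of the trace ID from the error payload above. OTLP/JSON expects the 16-byte ID as a 32-character hex string, whereas the rejected payload carried the 24-character base64 form:

// Illustration only: compare the base64 form that was sent with the hex form
// the collector expects. The trace ID is copied from the error payload above.
const base64TraceId = 'DWPNFhJlsl6rzxtpeUVrgQ==';

// Decode the base64 string to raw bytes, then re-encode as lowercase hex.
const bytes = Buffer.from(base64TraceId, 'base64');
console.log(bytes.length);          // 16, so the ID itself is a valid 16-byte trace ID
console.log(bytes.toString('hex')); // the 32-character hex form the collector accepts

So the ID itself is fine; only its JSON representation is wrong, which matches the observation that hardcoding useHex to true makes the same record acceptable to the collector.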

In our actual setup, we are running a Node Express application and use the NodeSDK functions to auto-instrument it.

Example Collector Config:

extensions:
  health_check:
    check_collector_pipeline:
      enabled: false
  zpages:
    endpoint: 0.0.0.0:55679
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  # https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md
  batch/traces:
  batch/metrics:
  batch/logs:
exporters:
  logging:
    verbosity: normal
  logging/off:
    sampling_initial: 0
    sampling_thereafter: 0
service:
  telemetry:
    logs:
      level: info
    metrics:
      address: ":8888"
  extensions: [ health_check, zpages ]
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ batch/traces ]
      exporters: [ logging ]
    metrics:
      receivers: [ otlp ]
      processors: [ batch/metrics ]
      exporters: [ logging ]
    logs:
      receivers: [ otlp ]
      processors: [ batch/logs ]
      exporters: [ logging ]

Run locally using:
docker run -p 4318:4318 -v $(pwd)/config.yaml:/etc/otelcol-contrib/config.yaml otel/opentelemetry-collector-contrib:0.79.0

OpenTelemetry Setup Code

// node setup

import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';
import { W3CTraceContextPropagator } from "@opentelemetry/core";
import { Resource } from '@opentelemetry/resources';

import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';

import { OTLPTraceExporter as OTLPTraceExporterHttp } from "@opentelemetry/exporter-trace-otlp-http";
import { OTLPMetricExporter as OTLPMetricExporterHttp } from "@opentelemetry/exporter-metrics-otlp-http";

import { NodeSDK } from '@opentelemetry/sdk-node';


export const setupTracing = (serviceName: string) => {
  const resource = Resource.default().merge(
    new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: serviceName,
      // [SemanticResourceAttributes.SERVICE_VERSION]: "0.1.0",
    }),
  );

  const traceExporter = new OTLPTraceExporterHttp({
    url: "http://localhost:4318/v1/traces",
  })

  const sdk = new NodeSDK({
    resource: resource,
    traceExporter: traceExporter,
    textMapPropagator: new W3CTraceContextPropagator(),
    metricReader: new PeriodicExportingMetricReader({
      exporter: new OTLPMetricExporterHttp({
          url: "http://localhost:4318/v1/metrics",
        }
      )
    }),
    instrumentations: [getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-fs': {
        enabled: false,
      }
    })]
  });

  sdk.start();

  // gracefully shut down the SDK on process exit
  process.on('SIGTERM', () => {
    sdk.shutdown()
      .then(() => console.log('Tracing terminated'))
      .catch((error) => console.log('Error terminating tracing', error))
      .finally(() => process.exit(0));
  });
};

// logging setup
import { BatchLogRecordProcessor, ConsoleLogRecordExporter, LoggerProvider, SimpleLogRecordProcessor } from '@opentelemetry/sdk-logs';
import { OTLPLogExporter as OTLPLogExporterHttp } from "@opentelemetry/exporter-logs-otlp-http";
import { Request } from 'express';
import { logs, SeverityNumber, Logger as OtelLogger } from '@opentelemetry/api-logs';

import { DiagConsoleLogger, DiagLogLevel, diag } from '@opentelemetry/api';

diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);

export interface Logger {
  debug(message: string, ...args: any[]): void;

  info(message: string, ...args: any[]): void;

  warn(message: string, ...args: any[]): void;

  error(message: string, ...args: any[]): void;
}

const wrappedLogger = (otelLogger: OtelLogger) : Logger => {
  return {
    debug: (message: string, ...args: any[]): void => {
      otelLogger.emit({
        severityNumber: SeverityNumber.DEBUG,
        severityText: 'DEBUG',
        body: message,
        attributes: { 'log.type': 'custom' },
      });
    },
    info: (message: string, ...args: any[]): void => {
      otelLogger.emit({
        severityNumber: SeverityNumber.INFO,
        severityText: 'INFO',
        body: message,
        attributes: { 'log.type': 'custom' },
      });
    },
    warn: (message: string, ...args: any[]): void => {
      otelLogger.emit({
        severityNumber: SeverityNumber.WARN,
        severityText: 'WARN',
        body: message,
        attributes: { 'log.type': 'custom' },
      });
    },
    error: (message: string, ...args: any[]): void => {
      otelLogger.emit({
        severityNumber: SeverityNumber.ERROR,
        severityText: 'ERROR',
        body: message,
        attributes: { 'log.type': 'custom' },
      });
    }
  }
}


export const createLogger = (
): Logger => {
  const loggerProvider = new LoggerProvider();
  loggerProvider.addLogRecordProcessor(
    new SimpleLogRecordProcessor(new ConsoleLogRecordExporter())
  );


  const loggerExporter = new OTLPLogExporterHttp({
    url: "http://localhost:4318/v1/logs",
  });

  loggerProvider.addLogRecordProcessor(
    new BatchLogRecordProcessor(loggerExporter)
  );

  logs.setGlobalLoggerProvider(loggerProvider);

  return wrappedLogger(logs.getLogger('api-v2-local', '1.0.0'))
};

export const globalServerLogger = createLogger();


Calling globalServerLogger.info('blah') after a trace ID has been populated in the current context will trigger the error.
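
For reference, a minimal sketch of that trigger, assuming the setup code above has already run (the tracer and span names here are arbitrary):

import { trace } from '@opentelemetry/api';

// Emit a log while a span is active so the log record picks up the
// active span's trace context and is exported with its trace ID.
const tracer = trace.getTracer('repro');
tracer.startActiveSpan('repro-span', (span) => {
  globalServerLogger.info('blah'); // this exported record triggers the 400 response
  span.end();
});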

package.json

"dependencies": {
    "@opentelemetry/api": "^1.4.1",
    "@opentelemetry/api-logs": "^0.40.0",
    "@opentelemetry/auto-instrumentations-node": "^0.37.0",
    "@opentelemetry/exporter-logs-otlp-grpc": "^0.40.0",
    "@opentelemetry/exporter-logs-otlp-http": "^0.40.0",
    "@opentelemetry/exporter-metrics-otlp-grpc": "^0.40.0",
    "@opentelemetry/exporter-metrics-otlp-http": "^0.40.0",
    "@opentelemetry/exporter-trace-otlp-grpc": "^0.40.0",
    "@opentelemetry/exporter-trace-otlp-http": "^0.40.0",
    "@opentelemetry/instrumentation-express": "^0.32.3",
    "@opentelemetry/instrumentation-http": "^0.40.0",
    "@opentelemetry/sdk-logs": "^0.40.0",
    "@opentelemetry/sdk-node": "^0.40.0",
    "@opentelemetry/sdk-trace-node": "^1.14.0",
    "aws-param-env": "^2.1.0",
    "aws-sdk": "^2.1393.0",
    "axios": "^1.4.0",
    "body-parser": "^1.20.2",
    "boom": "^7.3.0",
    "class-transformer": "^0.5.1",
    "class-validator": "^0.14.0",
    "compression": "^1.7.4",
    "dotenv-safe": "^8.2.0",
    "express": "^4.18.2",
    "helmet": "^7.0.0",
    "js-yaml": "^4.1.0",
    "jsonwebtoken": "^9.0.0",
    "luxon": "^3.3.0",
    "module-alias": "^2.2.3",
    "newrelic": "^10.2.0",
    "npm-scripts-info": "^0.3.9",
    "raven": "^2.6.4",
    "reflect-metadata": "^0.1.13",
    "ts-node": "^10.9.1"
  },
  "devDependencies": {
    "@types/body-parser": "^1.19.2",
    "@types/boom": "^7.3.2",
    "@types/chai": "^4.3.5",
    "@types/compression": "^1.7.2",
    "@types/dotenv-safe": "^8.1.0",
    "@types/errorhandler": "^1.5.0",
    "@types/express": "^4.17.17",
    "@types/js-yaml": "^4.0.5",
    "@types/jsonwebtoken": "^9.0.2",
    "@types/luxon": "^3.3.0",
    "@types/mocha": "^10.0.1",
    "@types/newrelic": "^9.14.0",
    "@types/node": "^20.2.5",
    "@types/raven": "^2.5.3",
    "@types/supertest": "^2.0.12",
    "@typescript-eslint/eslint-plugin": "^5.59.9",
    "@typescript-eslint/parser": "^5.59.7",
    "chai": "^4.3.7",
    "eslint": "^8.42.0",
    "eslint-config-prettier": "^8.8.0",
    "eslint-plugin-prefer-arrow": "^1.2.2",
    "eslint-plugin-prettier": "^4.2.1",
    "mocha": "^10.2.0",
    "nodemon": "^2.0.22",
    "prettier": "2.8.8",
    "rewiremock": "^3.14.5",
    "supertest": "^6.3.3",
    "ts-mocha": "^10.0.0",
    "typescript": "^5.1.3"
  }

Relevant log output

No response

nwilliams-lw added the bug and triage labels Jun 13, 2023
pichlermarc added the priority:p2 label and removed the triage label Jun 13, 2023
pichlermarc (Member) commented

Hi, thanks for the report. We're already aware of this (see #3875), you can expect this to be fixed in the next release 🙂

pichlermarc linked a pull request Jun 13, 2023 that will close this issue
nwilliams-lw (Author) commented Jun 13, 2023

@pichlermarc

Ahh 🤦 I checked existing issues but not the PRs.

Sorry for the dupe issue - thanks for the work on the project!
