feat(observability): integrate Langfuse for LLM request tracing#791

Merged

ding113 merged 10 commits into dev from feat/langfuse-integration on Feb 15, 2026

Conversation

@ding113 (Owner) commented Feb 15, 2026

Summary

  • Integrate Langfuse as an optional observability backend for tracing all proxy LLM requests
  • Multi-span trace hierarchy: root proxy-request span -> guard-pipeline span + provider-attempt events + llm-call generation (a sketch follows this list)
  • Timing breakdown captures guard pipeline duration, upstream total, TTFB from forward, and token generation time
  • Full usage/cost propagation with usageDetails and costDetails on the LLM generation observation
  • Sampling support via LANGFUSE_SAMPLE_RATE (0-1), disabled by default when keys are not set
  • All Langfuse I/O truncated to configurable LANGFUSE_MAX_IO_SIZE (default 100K chars) to avoid excessive payloads
  • Auth-sensitive headers redacted before tracing
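
For orientation, here is a minimal sketch of what that hierarchy can look like against the @langfuse/tracing v4 API (startObservation is the call the reviews below reference; the asType option, the child/event helpers, and all payload values here are illustrative assumptions, not this PR's code):

import { startObservation } from "@langfuse/tracing";

// Sketch only; the real logic lives in src/lib/langfuse/trace-proxy-request.ts.
function sketchProxyTrace(): void {
  // Root span for the whole proxied request
  const root = startObservation("proxy-request", {
    input: { model: "claude-sonnet-4" }, // hypothetical request summary
  });

  // Child span covering the guard pipeline
  const guards = root.startObservation("guard-pipeline", {});
  guards.end();

  // One event per failed provider attempt (assumed asType value)
  root.startObservation("provider-attempt", { metadata: { reason: "retry_failed" } }, { asType: "event" });

  // Generation observation carrying usage and cost details
  const llm = root.startObservation(
    "llm-call",
    {
      usageDetails: { input: 1200, output: 350 },
      costDetails: { input: 0.0036, output: 0.00525 },
    },
    { asType: "generation" }
  );
  llm.end();
  root.end();
}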

Changes

File | Description
src/lib/langfuse/index.ts | Langfuse client singleton with sampling, lifecycle management
src/lib/langfuse/trace-proxy-request.ts | Core tracing logic - builds multi-span hierarchy
src/app/v1/_lib/proxy/response-handler.ts | Fires trace after response completes (with usage/cost)
src/app/v1/_lib/proxy/session.ts | Added forwardStartTime + recordForwardStart()
src/app/v1/_lib/proxy-handler.ts | Calls recordForwardStart() before forwarding
src/instrumentation.ts | OpenTelemetry NodeSDK + Langfuse trace exporter
src/lib/config/env.schema.ts | Zod-validated Langfuse env vars (see the sketch after this table)
.env.example | Documented Langfuse env vars
tests/unit/langfuse/langfuse-trace.test.ts | 827-line test suite covering trace hierarchy
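
As a sketch of the env validation listed above, using the defaults and bounds quoted elsewhere in this PR (the LANGFUSE_MAX_IO_SIZE range comes from a review comment further down; the exact coercion style is an assumption):

import { z } from "zod";

// Sketch of the Langfuse slice of the env schema; the real one is src/lib/config/env.schema.ts.
const langfuseEnvSchema = z.object({
  LANGFUSE_PUBLIC_KEY: z.string().optional(),
  LANGFUSE_SECRET_KEY: z.string().optional(),
  LANGFUSE_BASE_URL: z.string().default("https://cloud.langfuse.com"),
  // Sampling rate bounded to 0-1, as described above
  LANGFUSE_SAMPLE_RATE: z.coerce.number().min(0).max(1).default(1.0),
  // Parse the string explicitly; z.coerce.boolean() would treat "false" as true
  LANGFUSE_DEBUG: z.enum(["true", "false"]).default("false").transform((v) => v === "true"),
  // Default 100,000 chars, range 1-10,000,000 (per the .env.example review below)
  LANGFUSE_MAX_IO_SIZE: z.coerce.number().int().min(1).max(10_000_000).default(100_000),
});

export type LangfuseEnv = z.infer<typeof langfuseEnvSchema>;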

Test plan

  • bun run build passes
  • bun run lint passes
  • bun run typecheck passes
  • bun run test passes (2326 tests, 0 failures)
  • Manual verification with Langfuse Cloud: set LANGFUSE_PUBLIC_KEY + LANGFUSE_SECRET_KEY, send proxy requests, verify traces appear with correct hierarchy
  • Verify sampling: set LANGFUSE_SAMPLE_RATE=0.5, confirm ~50% of requests are traced
  • Verify disabled state: unset keys, confirm no trace errors in logs

Greptile Summary

Integrates Langfuse as an optional observability backend for tracing all proxy LLM requests through the existing OpenTelemetry pipeline. The integration is cleanly gated behind LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY env vars — when unset, all code paths are no-ops.

  • Adds a multi-span trace hierarchy: root proxy-request span → guard-pipeline child span + provider-attempt events (for failed retries) + llm-call generation observation with usage/cost details
  • Introduces calculateRequestCostBreakdown in cost-calculation.ts to provide per-category cost breakdowns (input, output, cache creation, cache read) for Langfuse, separate from the existing multiplier-applied cost used for billing
  • Captures timing breakdown (guard pipeline duration, upstream total, TTFB from forward, token generation time) and propagates it through trace metadata
  • emitLangfuseTrace is called at all 5 response finalization paths (non-stream success, non-stream error/Gemini, SSE Gemini, SSE Claude, and persistRequestFailure for error/abort)
  • Sampling support via LANGFUSE_SAMPLE_RATE (0-1) using OpenTelemetry's TraceIdRatioBasedSampler
  • Note: The PR description mentions "I/O truncated to configurable LANGFUSE_MAX_IO_SIZE" and "auth-sensitive headers redacted", but the implementation sends raw bodies without truncation and raw client headers without redaction (the code comments explain provider keys are never in session.headers, which is correct). The description appears outdated relative to the final implementation.
  • src/lib/langfuse/index.ts reads env vars directly from process.env instead of using the validated getEnvConfig(), bypassing the Zod schema constraints defined in env.schema.ts

Confidence Score: 4/5

  • This PR is safe to merge — the Langfuse integration is fully optional (disabled by default) and all trace emission is fire-and-forget with error isolation.
  • Score of 4 reflects that the core integration is well-designed with proper error boundaries, non-blocking async emission, and comprehensive test coverage (996-line test suite + 159-line cost breakdown tests). The main concern is the env config bypass in initLangfuse which reads process.env directly instead of the validated schema, though this only affects the optional Langfuse feature. The duplicated raw-cost calculation blocks across response paths are a maintenance concern but not a correctness issue.
  • src/lib/langfuse/index.ts (bypasses env schema validation), src/app/v1/_lib/proxy/response-handler.ts (duplicated cost calculation blocks)

Important Files Changed

Filename | Overview
src/app/v1/_lib/proxy/response-handler.ts | Adds emitLangfuseTrace() calls at all 5 response finalization paths (non-stream, non-stream error, SSE Gemini, SSE Claude, and persistRequestFailure). Raw cost calculation block is duplicated between non-stream and stream paths. Minor redundancy in enabled check.
src/app/v1/_lib/proxy/session.ts | Adds forwardStartTime, forwardedRequestBody fields and recordForwardStart() method. Idempotent, clean implementation.
src/instrumentation.ts | Adds Langfuse init at startup and shutdown flush during graceful termination. Both wrapped in try/catch. Properly positioned in lifecycle.
src/lib/config/env.schema.ts | Adds Zod-validated Langfuse env vars with proper defaults and constraints. LANGFUSE_SAMPLE_RATE correctly bounded to 0-1.
src/lib/langfuse/index.ts | Langfuse singleton init/shutdown. Reads env vars directly from process.env instead of using validated getEnvConfig(), bypassing the schema constraints defined in env.schema.ts.
src/lib/langfuse/trace-proxy-request.ts | Core tracing logic building multi-span hierarchy. Well-structured with proper error handling. Raw request/response bodies sent without truncation (PR description mentions truncation but it was removed).
src/lib/utils/cost-calculation.ts | Adds calculateRequestCostBreakdown that mirrors existing calculateRequestCost logic but returns per-category costs. Correctly handles all tiers (context1m, 200k, cache). Good test coverage. (A breakdown sketch follows this table.)
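
A sketch of the per-category breakdown just described (the four categories come from this PR; the field names and Decimal arithmetic are assumptions, and the 200k/context-1m tier handling the real function implements is omitted):

import Decimal from "decimal.js";

// Assumed shape; the real CostBreakdown type lives in src/lib/utils/cost-calculation.ts.
interface CostBreakdownSketch {
  input: Decimal;
  output: Decimal;
  cacheCreation: Decimal;
  cacheRead: Decimal;
}

// Raw (multiplier 1.0) per-category cost; prices are USD per token here for simplicity.
function sketchCostBreakdown(
  usage: { input: number; output: number; cacheCreation: number; cacheRead: number },
  price: { input: number; output: number; cacheCreation: number; cacheRead: number }
): CostBreakdownSketch {
  return {
    input: new Decimal(usage.input).mul(price.input),
    output: new Decimal(usage.output).mul(price.output),
    cacheCreation: new Decimal(usage.cacheCreation).mul(price.cacheCreation),
    cacheRead: new Decimal(usage.cacheRead).mul(price.cacheRead),
  };
}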

Sequence Diagram

sequenceDiagram
    participant Client
    participant ProxyHandler
    participant GuardPipeline
    participant ProxyForwarder
    participant Upstream as Upstream LLM
    participant ResponseHandler
    participant Langfuse

    Client->>ProxyHandler: LLM Request
    ProxyHandler->>GuardPipeline: Run guards (auth, rate limit, etc.)
    GuardPipeline-->>ProxyHandler: Pass
    Note over ProxyHandler: recordForwardStart()
    ProxyHandler->>ProxyForwarder: Forward request
    Note over ProxyForwarder: Set session.forwardedRequestBody
    ProxyForwarder->>Upstream: HTTP Request
    Upstream-->>ResponseHandler: Response (stream/non-stream)
    Note over ResponseHandler: Calculate cost breakdown
    ResponseHandler-->>Client: Proxied Response

    ResponseHandler-)Langfuse: emitLangfuseTrace (async, fire-and-forget)
    Note over Langfuse: Root span: proxy-request
    Note over Langfuse: Child: guard-pipeline
    Note over Langfuse: Events: provider-attempt (failures)
    Note over Langfuse: Generation: llm-call (usage + cost)

Last reviewed commit: db043d8

@coderabbitai bot commented Feb 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

This change adds Langfuse observability integration to the codebase (OTel initialization, trace reporting, lifecycle management), request cost breakdown and reporting, capture of the session forward time and forwarded request body, the related environment variables and runtime dependencies, plus a large set of unit tests; no public API signatures or major control flow are modified.

Changes

Cohort / File(s) | Summary
Environment & probe configuration (.env.example, src/lib/config/env.schema.ts) | Adds Langfuse environment variables (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL, LANGFUSE_SAMPLE_RATE, LANGFUSE_DEBUG) and Smart/Endpoint probing configuration; validation and defaults added to the schema.
Dependencies (package.json) | New runtime dependencies: @langfuse/client, @langfuse/otel, @langfuse/tracing, @opentelemetry/sdk-node
Langfuse initialization & lifecycle (src/lib/langfuse/index.ts, src/instrumentation.ts) | New Langfuse OTEL integration module (isLangfuseEnabled, initLangfuse, shutdownLangfuse); initialized and shut down via dynamic import, non-blocking, on the startup/shutdown paths. (An isLangfuseEnabled sketch follows this table.)
Trace reporting implementation (src/lib/langfuse/trace-proxy-request.ts) | New proxy-request tracing implementation (TraceContext, traceProxyRequest): builds root/child spans and records timing breakdown, metadata, and cost/usage info; failure-tolerant and async non-blocking.
Response-handler instrumentation & cost propagation (src/app/v1/_lib/proxy/response-handler.ts) | Calls Langfuse tracing (non-blocking) on the non-stream, stream, and error paths; extends cost calculation to produce raw cost and costBreakdown and propagates them to the trace.
Session & forwarding metadata (src/app/v1/_lib/proxy/session.ts, src/app/v1/_lib/proxy/forwarder.ts, src/app/v1/_lib/proxy-handler.ts) | ProxySession gains forwardStartTime, forwardedRequestBody, and a recordForwardStart() method; the serialized body is recorded before forwarding and recordForwardStart() is called at forward time.
Cost breakdown utility & tests (src/lib/utils/cost-calculation.ts, tests/unit/lib/cost-calculation-breakdown.test.ts) | Adds the CostBreakdown type and calculateRequestCostBreakdown(), implementing per-category costs for input/output/cache creation/cache read, with comprehensive unit tests.
Langfuse unit tests & test fixes (tests/unit/langfuse/langfuse-trace.test.ts, tests/unit/proxy/proxy-handler-session-id-error.test.ts) | Adds extensive Langfuse tracing unit tests; stubs recordForwardStart on the session fixture to avoid test breakage.
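
The enablement gate described in that lifecycle module is small; a sketch consistent with the behavior the reviews describe (the real function is in src/lib/langfuse/index.ts and may differ in detail):

// Sketch: Langfuse is enabled only when both keys are present.
export function isLangfuseEnabled(): boolean {
  return Boolean(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY);
}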

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 47.37%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection | ⚠️ Warning | ❌ Merge conflicts detected (14 files):

⚔️ .env.example (content)
⚔️ CHANGELOG.md (content)
⚔️ README.en.md (content)
⚔️ README.md (content)
⚔️ VERSION (content)
⚔️ package.json (content)
⚔️ src/app/v1/_lib/proxy-handler.ts (content)
⚔️ src/app/v1/_lib/proxy/forwarder.ts (content)
⚔️ src/app/v1/_lib/proxy/response-handler.ts (content)
⚔️ src/app/v1/_lib/proxy/session.ts (content)
⚔️ src/instrumentation.ts (content)
⚔️ src/lib/config/env.schema.ts (content)
⚔️ src/lib/utils/cost-calculation.ts (content)
⚔️ tests/unit/proxy/proxy-handler-session-id-error.test.ts (content)

These conflicts must be resolved before merging into dev.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly and accurately summarizes the PR's main change, integrating Langfuse for LLM request tracing, which matches the core goal of the included file changes (Langfuse initialization, tracing logic, session tracking, environment configuration, etc.).
Description check | ✅ Passed | The PR description clearly relates to the changeset, describing Langfuse integration with specific details about trace hierarchy, timing breakdown, and usage/cost propagation.


@gemini-code-assist bot commented

Summary of Changes

Hello @ding113, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust observability solution by integrating Langfuse for comprehensive tracing of all LLM proxy requests. The integration provides deep insights into the lifecycle of each request, from initial processing through the guard pipeline to upstream provider interactions and final response generation. This enhancement allows for better monitoring, debugging, and performance analysis of the LLM proxy service, offering detailed timing, usage, and cost breakdowns without impacting performance due to its asynchronous and optional nature.

Highlights

  • Langfuse Integration: Integrated Langfuse as an optional observability backend to trace all proxy LLM requests, automatically enabling it when API keys are configured.
  • Multi-Span Trace Hierarchy: Implemented a detailed trace hierarchy including a root 'proxy-request' span, a 'guard-pipeline' span, 'provider-attempt' events, and an 'llm-call' generation span.
  • Comprehensive Timing Breakdown: Captured and reported timing metrics such as guard pipeline duration, upstream total duration, Time To First Byte (TTFB) from forwarding, and token generation time.
  • Usage and Cost Propagation: Ensured full propagation of usage metrics and raw cost details (without multipliers) to the LLM generation observation in Langfuse.
  • Configurable Sampling and I/O Truncation: Added support for trace sampling via LANGFUSE_SAMPLE_RATE and truncated all Langfuse I/O to LANGFUSE_MAX_IO_SIZE to prevent excessive payload sizes.
  • Sensitive Header Redaction: Implemented redaction of authentication-sensitive headers before they are sent to Langfuse for tracing, enhancing security.
Changelog
  • .env.example
    • Added new environment variables for Langfuse configuration, including public key, secret key, base URL, sample rate, debug mode, and max I/O size.
  • package.json
    • Added @langfuse/client, @langfuse/otel, @langfuse/tracing, and @opentelemetry/sdk-node dependencies.
  • src/app/v1/_lib/proxy-handler.ts
    • Called session.recordForwardStart() before forwarding the request to the upstream provider to mark the beginning of the forwarding phase.
  • src/app/v1/_lib/proxy/response-handler.ts
    • Introduced emitLangfuseTrace function to asynchronously send trace data to Langfuse.
    • Integrated emitLangfuseTrace calls for non-streaming successful responses, streaming responses, and error/abort paths.
    • Modified cost calculation logic to derive a rawCostUsdStr for Langfuse, excluding any provider-specific cost multipliers.
  • src/app/v1/_lib/proxy/session.ts
    • Added forwardStartTime property to ProxySession to record the timestamp when the guard pipeline finishes and forwarding begins.
    • Implemented recordForwardStart() method to set the forwardStartTime once (a sketch follows this changelog).
  • src/instrumentation.ts
    • Added asynchronous initialization of Langfuse observability during Node.js runtime startup.
    • Included graceful shutdown logic for Langfuse to flush pending spans on process termination.
  • src/lib/config/env.schema.ts
    • Extended the environment variable schema with Zod validations for new Langfuse configuration parameters.
  • src/lib/langfuse/index.ts
    • Created a new module to manage the Langfuse OpenTelemetry SDK singleton, including initialization with sampling and debug logging, and shutdown procedures.
  • src/lib/langfuse/trace-proxy-request.ts
    • Created a new module containing the core logic for traceProxyRequest, responsible for constructing the multi-span trace hierarchy for each proxy request.
    • Implemented header sanitization to redact sensitive information before sending to Langfuse.
    • Added logic to build request body summaries, status categories, and timing breakdowns for trace metadata.
    • Included functionality to truncate large input/output payloads to prevent excessive data transfer to Langfuse.
  • tests/unit/langfuse/langfuse-trace.test.ts
    • Added a comprehensive suite of unit tests (827 lines) to verify the correctness of the Langfuse tracing logic, including span hierarchy, timing, metadata, and error handling.
  • tests/unit/proxy/proxy-handler-session-id-error.test.ts
    • Updated the mock session object to include the recordForwardStart method for compatibility with new tracing logic.
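
A minimal sketch of the idempotent forward-start marker described in the session.ts entries above (the field and method names match this PR; the body is an assumption):

// Sketch only; the real ProxySession is in src/app/v1/_lib/proxy/session.ts.
class ProxySessionSketch {
  forwardStartTime: number | null = null;

  recordForwardStart(): void {
    // First call wins, so a retry loop cannot overwrite the original timestamp.
    if (this.forwardStartTime === null) {
      this.forwardStartTime = Date.now();
    }
  }
}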

@github-actions github-actions bot added the size/XL Extra Large PR (> 1000 lines) label Feb 15, 2026
@github-actions bot commented

🧪 Test Results

Test type | Status
Code quality | ✅
Unit tests | ✅
Integration tests | ✅
API tests | ✅

Overall result: ✅ All tests passed

@greptile-apps bot left a comment

11 files reviewed, 1 comment

Comment on lines +210 to +215
...(ctx.usageMetrics.cache_read_input_tokens != null
? { cache_read_input_tokens: ctx.usageMetrics.cache_read_input_tokens }
: {}),
...(ctx.usageMetrics.cache_creation_input_tokens != null
? { cache_creation_input_tokens: ctx.usageMetrics.cache_creation_input_tokens }
: {}),

Usage keys use snake_case (cache_read_input_tokens) while base keys use lowercase (input, output). Consider standardizing to camelCase for consistency with Langfuse conventions.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!


@github-actions github-actions bot added enhancement New feature or request area:core javascript Pull requests that update javascript code dependencies Pull requests that update a dependency file labels Feb 15, 2026
@gemini-code-assist bot left a comment
Code Review

This pull request introduces a comprehensive integration with Langfuse for observability, covering tracing for the entire proxy request lifecycle, including a multi-span hierarchy, timing breakdowns, and cost propagation. However, a medium-severity security vulnerability was identified due to incomplete redaction of sensitive headers, which could lead to API key leakage for certain providers. To improve maintainability, consider centralizing configuration loading in the Langfuse initialization logic and addressing some code duplication in the response handler. The implementation is robust, with non-blocking trace emission and data truncation, and includes an extensive test suite, ensuring reliability.

Comment on lines 8 to 15
const REDACTED_HEADERS = new Set([
"x-api-key",
"authorization",
"x-goog-api-key",
"anthropic-api-key",
"cookie",
"set-cookie",
]);

Severity: medium (security)

The REDACTED_HEADERS set is missing the api-key header, which is commonly used by providers like Azure OpenAI. This could lead to sensitive API keys being leaked to the Langfuse observability backend in plain text. It is recommended to add api-key and other common authentication headers to the redaction list.

const REDACTED_HEADERS = new Set([
  "x-api-key",
  "authorization",
  "x-goog-api-key",
  "anthropic-api-key",
  "api-key",
  "x-api-token",
  "x-auth-token",
  "cookie",
  "set-cookie",
]);

Comment on lines +760 to +773
// Raw cost without multiplier for Langfuse
if (provider.costMultiplier !== 1) {
const rawCost = calculateRequestCost(
usageMetrics,
priceData,
1.0,
session.getContext1mApplied()
);
if (rawCost.gt(0)) {
rawCostUsdStr = rawCost.toString();
}
} else {
rawCostUsdStr = costUsdStr;
}

Severity: medium

This logic for calculating the raw cost for Langfuse (without the provider's cost multiplier) appears to be duplicated from the handleStream method (lines 1700-1713). To improve maintainability, consider extracting this into a private helper method within the ProxyResponseHandler class.

Comment on lines +1700 to +1713
// Raw cost without multiplier for Langfuse
if (provider.costMultiplier !== 1) {
const rawCost = calculateRequestCost(
usageForCost,
priceData,
1.0,
session.getContext1mApplied()
);
if (rawCost.gt(0)) {
rawCostUsdStr = rawCost.toString();
}
} else {
rawCostUsdStr = costUsdStr;
}

Severity: medium

This is a duplication of the raw cost calculation logic found in handleNonStream (lines 760-773). As suggested in the other comment, refactoring this into a shared helper method would reduce code duplication.

Comment on lines +19 to +69
export async function initLangfuse(): Promise<void> {
if (initialized || !isLangfuseEnabled()) {
return;
}

try {
const { NodeSDK: OtelNodeSDK } = await import("@opentelemetry/sdk-node");
const { LangfuseSpanProcessor: LfSpanProcessor } = await import("@langfuse/otel");

const sampleRate = Number.parseFloat(process.env.LANGFUSE_SAMPLE_RATE || "1.0");

spanProcessor = new LfSpanProcessor({
publicKey: process.env.LANGFUSE_PUBLIC_KEY,
secretKey: process.env.LANGFUSE_SECRET_KEY,
baseUrl: process.env.LANGFUSE_BASE_URL || "https://cloud.langfuse.com",
// Only export spans from langfuse-sdk scope (avoid noise from other OTel instrumentations)
shouldExportSpan: ({ otelSpan }) => otelSpan.instrumentationScope.name === "langfuse-sdk",
});

const samplerConfig =
sampleRate < 1.0
? await (async () => {
const { TraceIdRatioBasedSampler } = await import("@opentelemetry/sdk-trace-base");
return { sampler: new TraceIdRatioBasedSampler(sampleRate) };
})()
: {};

sdk = new OtelNodeSDK({
spanProcessors: [spanProcessor],
...samplerConfig,
});

sdk.start();
initialized = true;

logger.info("[Langfuse] Observability initialized", {
baseUrl: process.env.LANGFUSE_BASE_URL || "https://cloud.langfuse.com",
sampleRate,
debug: process.env.LANGFUSE_DEBUG === "true",
});

if (process.env.LANGFUSE_DEBUG === "true") {
const { configureGlobalLogger, LogLevel } = await import("@langfuse/core");
configureGlobalLogger({ level: LogLevel.DEBUG });
}
} catch (error) {
logger.error("[Langfuse] Failed to initialize", {
error: error instanceof Error ? error.message : String(error),
});
}
}

Severity: medium

To improve maintainability and ensure consistency with configuration validation, it's better to use the getEnvConfig() function to retrieve all Langfuse-related environment variables. This avoids duplicating the parsing and default value logic that is already defined in src/lib/config/env.schema.ts.

export async function initLangfuse(): Promise<void> {
  if (initialized || !isLangfuseEnabled()) {
    return;
  }

  try {
    const { NodeSDK: OtelNodeSDK } = await import("@opentelemetry/sdk-node");
    const { LangfuseSpanProcessor: LfSpanProcessor } = await import("@langfuse/otel");
    const { getEnvConfig } = await import("@/lib/config/env.schema");
    const {
      LANGFUSE_PUBLIC_KEY,
      LANGFUSE_SECRET_KEY,
      LANGFUSE_BASE_URL,
      LANGFUSE_SAMPLE_RATE,
      LANGFUSE_DEBUG,
    } = getEnvConfig();

    spanProcessor = new LfSpanProcessor({
      publicKey: LANGFUSE_PUBLIC_KEY,
      secretKey: LANGFUSE_SECRET_KEY,
      baseUrl: LANGFUSE_BASE_URL,
      // Only export spans from langfuse-sdk scope (avoid noise from other OTel instrumentations)
      shouldExportSpan: ({ otelSpan }) => otelSpan.instrumentationScope.name === "langfuse-sdk",
    });

    const samplerConfig =
      LANGFUSE_SAMPLE_RATE < 1.0
        ? await (async () => {
            const { TraceIdRatioBasedSampler } = await import("@opentelemetry/sdk-trace-base");
            return { sampler: new TraceIdRatioBasedSampler(LANGFUSE_SAMPLE_RATE) };
          })()
        : {};

    sdk = new OtelNodeSDK({
      spanProcessors: [spanProcessor],
      ...samplerConfig,
    });

    sdk.start();
    initialized = true;

    logger.info("[Langfuse] Observability initialized", {
      baseUrl: LANGFUSE_BASE_URL,
      sampleRate: LANGFUSE_SAMPLE_RATE,
      debug: LANGFUSE_DEBUG,
    });

    if (LANGFUSE_DEBUG) {
      const { configureGlobalLogger, LogLevel } = await import("@langfuse/core");
      configureGlobalLogger({ level: LogLevel.DEBUG });
    }
  } catch (error) {
    logger.error("[Langfuse] Failed to initialize", {
      error: error instanceof Error ? error.message : String(error),
    });
  }
}

@coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @.env.example:
- Around line 131-139: Add the missing LANGFUSE_MAX_IO_SIZE environment variable
documentation to .env.example: describe the variable name LANGFUSE_MAX_IO_SIZE,
its default value (100000), allowed range (1-10000000), and its purpose (limits
Langfuse I/O payload size) along with an example value and note that it is
optional; place it near the other LANGFUSE_* entries (alongside
LANGFUSE_SAMPLE_RATE and LANGFUSE_DEBUG) so operators can discover and tune the
setting.

In `@package.json`:
- Around line 47-51: Fix unavailable dependency versions in package.json: change
"@opentelemetry/sdk-node" from "0.212.0" to "^0.211.0", and update the three
Langfuse packages "@langfuse/client", "@langfuse/otel", "@langfuse/tracing" from
"4.6.1" to a version that exists on npm (e.g. "^4.5.1"), or confirm and
substitute the published version you actually need; after saving, run npm/yarn
install to verify there are no version-resolution errors and update the
package-lock.json or yarn.lock lockfile.

In `@src/app/v1/_lib/proxy/response-handler.ts`:
- Around line 1763-1774: The field sseEventCount is incorrectly populated with
chunks.length (raw read() byte-chunk count) in emitLangfuseTrace; update the
call in response-handler.ts to either rename the field to chunkCount to
accurately reflect chunks.length, or compute the actual SSE event count from the
parsed/decoded SSE stream (e.g., count complete SSE events after your SSE
parsing logic) and pass that value as sseEventCount; adjust the
TraceContext/interface usage accordingly (or update Langfuse payload field name)
and ensure the symbol emitLangfuseTrace and the variable chunks are the points
you modify.
🧹 Nitpick comments (10)
src/lib/config/env.schema.ts (1)

131-137: Consider adding .url() validation for LANGFUSE_BASE_URL.

Other URL-type environment variables (such as DSN, line 34) use .url() validation. Adding URL format validation to LANGFUSE_BASE_URL would surface configuration errors at startup rather than at runtime when the connection fails.

Suggested change
-  LANGFUSE_BASE_URL: z.string().default("https://cloud.langfuse.com"),
+  LANGFUSE_BASE_URL: z.string().url().default("https://cloud.langfuse.com"),
src/lib/langfuse/index.ts (2)

28-33: initLangfuse() reads process.env directly, bypassing the Zod validation.

LANGFUSE_SAMPLE_RATE and LANGFUSE_BASE_URL have validation rules defined in env.schema.ts (min/max bounds and defaults), but here they are read straight from process.env with hard-coded defaults. This means:

  1. If a user sets an invalid value (such as LANGFUSE_SAMPLE_RATE=abc), Number.parseFloat returns NaN and Zod's range validation never fires.
  2. The defaults are maintained in two places and can drift out of sync.

Understood that this code needs to run before the env schema is initialized, but at minimum guard sampleRate against NaN.

Suggested NaN guard for sampleRate
     const sampleRate = Number.parseFloat(process.env.LANGFUSE_SAMPLE_RATE || "1.0");
+    // Fallback to 1.0 (no sampling) if value is invalid
+    const effectiveSampleRate = Number.isFinite(sampleRate) ? Math.max(0, Math.min(1, sampleRate)) : 1.0;

Then use effectiveSampleRate instead of sampleRate in the code that follows.


75-91: The sdk and spanProcessor references are not reset on shutdown.

shutdownLangfuse() resets initialized to false but does not set sdk and spanProcessor to null. Calling initLangfuse() again after shutdown is unlikely today, but clearing them is cheap defensive programming.

Suggested change
     initialized = false;
+    sdk = null;
+    spanProcessor = null;
     logger.info("[Langfuse] Shutdown complete")
src/lib/langfuse/trace-proxy-request.ts (3)

130-141: The failedAttempts count may be imprecise.

Line 139 counts every chain item whose reason is not in SUCCESS_REASONS (which contains "initial_selection" and "session_reuse") as a failure. But "initial_selection" is a selection event, not a final outcome; only a subsequent chain item with an outcome reason indicates whether that attempt succeeded or failed. Some intermediate-state chain items may therefore be miscounted as failures.

This does not affect functionality, only the accuracy of the trace metadata. Consider filtering for explicit failure reasons ("retry_failed", "system_error", etc.) instead of taking the complement of the success set.


73-89: JSON.stringify in truncateForLangfuse can throw on circular references.

Line 78 calls JSON.stringify(data) on objects, which can throw a TypeError on circular references. The outer traceProxyRequest has a try/catch, but that would drop the entire trace. Add a local try/catch here so only this one object degrades instead of aborting the whole trace flow.

Suggested defensive change
   if (data != null && typeof data === "object") {
-    const str = JSON.stringify(data);
-    if (str.length > maxChars) {
-      return {
-        _truncated: true,
-        _length: str.length,
-        _preview: str.substring(0, Math.min(maxChars, 2000)),
-      };
+    try {
+      const str = JSON.stringify(data);
+      if (str.length > maxChars) {
+        return {
+          _truncated: true,
+          _length: str.length,
+          _preview: str.substring(0, Math.min(maxChars, 2000)),
+        };
+      }
+    } catch {
+      return { _truncated: true, _error: "unserializable" };
     }
     return data;
   }

339-355: The root span's input/output is set twice, at lines 229-243 and again at lines 341-354.

The startObservation call is already given input and output (lines 229-243), and updateTrace sets nearly identical content again (lines 341-354). The comment says propagateAttributes does not support input/output, but the first startObservation call already set them. If the two calls target different levels (span vs trace), state that distinction explicitly in the comment.

src/app/v1/_lib/proxy/response-handler.ts (3)

42-81: The enabled check in emitLangfuseTrace duplicates the isLangfuseEnabled() logic.

Line 59 checks process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY directly, duplicating isLangfuseEnabled() from src/lib/langfuse/index.ts. Understood that this avoids loading the module when disabled, which is a reasonable design; but if the enablement condition ever changes (for example, a feature flag is added), both places must be updated in sync. At minimum, add a comment noting that this check must stay consistent with isLangfuseEnabled().


760-773: The raw-cost calculation is fully duplicated between the non-stream and stream paths.

The rawCostUsdStr logic in the non-stream path (lines 760-773) and the stream path (lines 1700-1713) is identical: when costMultiplier !== 1, recalculate with a multiplier of 1.0. Extract it into a shared helper to avoid the two copies drifting apart.

Suggested shared helper
function computeRawCost(
  usageMetrics: UsageMetrics,
  priceData: ModelPriceData,
  costMultiplier: number,
  context1mApplied: boolean,
  costUsdStr: string | undefined,
): string | undefined {
  if (costMultiplier !== 1) {
    const rawCost = calculateRequestCost(usageMetrics, priceData, 1.0, context1mApplied);
    return rawCost.gt(0) ? rawCost.toString() : undefined;
  }
  return costUsdStr;
}

Also applies to: 1700-1713


564-575: costUsd is always undefined on the Gemini passthrough path.

Line 570 passes costUsd: undefined. The Gemini passthrough non-stream path uses finalizeRequestStats for statistics, so the Langfuse trace carries no cost information; the same applies to the Gemini passthrough stream path (line 1309). If Gemini request costs should be visible in Langfuse, a separate calculation is needed here. If this is intentional (the Gemini passthrough cost calculation happens inside finalizeRequestStats), add a comment saying so.

tests/unit/langfuse/langfuse-trace.test.ts (1)

793-827: The isLangfuseEnabled tests do not exercise the real implementation.

These two tests substitute the inline expression !!(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY) for a call to the real isLangfuseEnabled function. If the actual implementation changes (for example, adding a LANGFUSE_BASE_URL check or a sample-rate condition), these tests will not catch the regression.

Import and call the real function directly (bypassing the mock above, or in a separate test file).

Suggested approach: test the real implementation in a separate file or via `vi.importActual`
-  test("should return false when env vars are not set", () => {
-    delete process.env.LANGFUSE_PUBLIC_KEY;
-    delete process.env.LANGFUSE_SECRET_KEY;
-
-    // Direct function test (not using the mock)
-    const isEnabled = !!(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY);
-    expect(isEnabled).toBe(false);
-  });
-
-  test("should return true when both keys are set", () => {
-    process.env.LANGFUSE_PUBLIC_KEY = "pk-lf-test-mock";
-    process.env.LANGFUSE_SECRET_KEY = "test-mock-not-real";
-
-    const isEnabled = !!(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY);
-    expect(isEnabled).toBe(true);
-  });
+  // Suggestion: move these tests into a separate test file (e.g. langfuse-enabled.test.ts)
+  // that does not mock `@/lib/langfuse/index`, and import and test isLangfuseEnabled directly.

Comment on lines +131 to +139
# Langfuse Observability (optional, auto-enabled when keys are set)
# Feature: enterprise-grade LLM observability integration that automatically traces the full lifecycle of every proxied request
# - Auto-enabled once PUBLIC_KEY and SECRET_KEY are configured
# - Supports Langfuse Cloud and self-hosted instances
LANGFUSE_PUBLIC_KEY= # Langfuse project public key (pk-lf-...)
LANGFUSE_SECRET_KEY= # Langfuse project secret key (sk-lf-...)
LANGFUSE_BASE_URL=https://cloud.langfuse.com # Langfuse server URL (self-hosted or cloud)
LANGFUSE_SAMPLE_RATE=1.0 # Trace sampling rate (0.0-1.0, default: 1.0 = 100%)
LANGFUSE_DEBUG=false # Enable Langfuse debug logging

⚠️ Potential issue | 🟡 Minor

Missing documentation for LANGFUSE_MAX_IO_SIZE.

The env schema defines LANGFUSE_MAX_IO_SIZE (default 100,000, range 1-10,000,000), but .env.example does not include it. Document it so operators know about this tunable parameter.

Suggested addition
 LANGFUSE_SAMPLE_RATE=1.0                    # Trace sampling rate (0.0-1.0, default: 1.0 = 100%)
 LANGFUSE_DEBUG=false                        # Enable Langfuse debug logging
+LANGFUSE_MAX_IO_SIZE=100000                 # Max I/O size per trace (chars, default: 100000, max: 10000000)
🧰 Tools
🪛 dotenv-linter (4.0.0)

[warning] 135-135: [SpaceCharacter] The line has spaces around equal sign

(SpaceCharacter)


[warning] 135-135: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)


[warning] 136-136: [SpaceCharacter] The line has spaces around equal sign

(SpaceCharacter)


[warning] 136-136: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)


[warning] 137-137: [UnorderedKey] The LANGFUSE_BASE_URL key should go before the LANGFUSE_PUBLIC_KEY key

(UnorderedKey)


[warning] 137-137: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)


[warning] 138-138: [UnorderedKey] The LANGFUSE_SAMPLE_RATE key should go before the LANGFUSE_SECRET_KEY key

(UnorderedKey)


[warning] 138-138: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)


[warning] 139-139: [UnorderedKey] The LANGFUSE_DEBUG key should go before the LANGFUSE_PUBLIC_KEY key

(UnorderedKey)


[warning] 139-139: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)


Comment on lines +47 to +51
"@langfuse/client": "^4.6.1",
"@langfuse/otel": "^4.6.1",
"@langfuse/tracing": "^4.6.1",
"@lobehub/icons": "^2",
"@opentelemetry/sdk-node": "^0.212.0",

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

@opentelemetry/sdk-node npm registry version 0.212.0

💡 Result:

@opentelemetry/sdk-node@0.212.0 does not appear to be published on the npm registry. The latest published npm version shown by multiple package indexes is 0.211.0. [1][2][3]

There is an upstream GitHub release tag experimental/v0.212.0 in the open-telemetry/opentelemetry-js repo, but that doesn’t necessarily mean the @opentelemetry/sdk-node npm package was published at 0.212.0. [4]

What to do

  • Install the latest available: npm i @opentelemetry/sdk-node@0.211.0 [1][2][3]
  • Or verify from your environment: npm view @opentelemetry/sdk-node versions --json | jq 'map(select(.=="0.212.0"))' (should return nothing if it’s unpublished)

Sources: [1][2][3][4]


🌐 Web query:

@langfuse/client @langfuse/otel @langfuse/tracing npm registry version 4.6.1

💡 Result:

I couldn’t find any evidence that version 4.6.1 exists on the npm registry for @langfuse/client, @langfuse/tracing, or @langfuse/otel based on publicly indexable sources.

What I could verify:

  • npm package pages (as indexed) show 4.0.0 for @langfuse/client, @langfuse/tracing, and @langfuse/otel. [1][2][3]
  • A registry mirror (Tessl) shows at least 4.2.0 for @langfuse/otel and @langfuse/tracing. [4][5]
  • A recent community post shows usage of ^4.5.1 for @langfuse/otel and @langfuse/tracing. [6]

To confirm from your environment (authoritative), run:

npm view `@langfuse/client` version
npm view `@langfuse/tracing` version
npm view `@langfuse/otel` version

npm view `@langfuse/client`@4.6.1 version
npm view `@langfuse/tracing`@4.6.1 version
npm view `@langfuse/otel`@4.6.1 version

Sources: [1] [2] [3] [4] [5] [6]


Fix the dependency version numbers: the specified versions do not exist on the npm registry.

  • @opentelemetry/sdk-node@0.212.0 is not published on npm. The latest available version is 0.211.0; update to ^0.211.0.
  • Version 4.6.1 of @langfuse/client, @langfuse/otel, and @langfuse/tracing cannot be verified against public sources. Confirmed versions include 4.0.0, 4.2.0, and 4.5.1. Confirm the actually published version and update accordingly (check whether 4.5.1 meets your needs).

Comment on lines 1763 to 1774

emitLangfuseTrace(session, {
responseHeaders: response.headers,
responseText: allContent,
usageMetrics: usageForCost,
costUsd: rawCostUsdStr,
statusCode: effectiveStatusCode,
durationMs: duration,
isStreaming: true,
sseEventCount: chunks.length,
errorMessage: streamErrorMessage ?? undefined,
});

⚠️ Potential issue | 🟡 Minor

sseEventCount is actually passed chunks.length (the raw chunk count), not the number of SSE events.

Line 1772 passes sseEventCount: chunks.length, the number of raw byte chunks returned by reader.read(), not the number of parsed SSE events. A single read() may contain multiple SSE events, or one event may be split across several read() calls. The field is named sseEventCount in the TraceContext interface and displayed under that name in Langfuse, which can be misleading.

Either rename the field to chunkCount to reflect what it measures, or pass an actual SSE event count (one rough approach is sketched below).

try {
return JSON.parse(text);
} catch {
return text;
@github-actions bot commented Feb 15, 2026

[CRITICAL] [ERROR-SILENT] Silent catch in tryParseJsonSafe

Location: src/lib/langfuse/trace-proxy-request.ts:368

Evidence:

} catch {
  return text;
}

Why this is a problem: When responseText looks like JSON but is malformed/truncated, this suppresses the parse failure and makes tracing output harder to debug.

Suggested fix:

function tryParseJsonSafe(text: string): unknown {
  const trimmed = text.trimStart();
  if (!trimmed.startsWith("{") && !trimmed.startsWith("[")) {
    return text;
  }

  try {
    return JSON.parse(text);
  } catch (error) {
    logger.trace("[Langfuse] Failed to parse JSON output for Langfuse, sending raw text", {
      error: error instanceof Error ? error.message : String(error),
    });
    return text;
  }
}

LANGFUSE_BASE_URL=https://cloud.langfuse.com # Langfuse server URL (self-hosted or cloud)
LANGFUSE_SAMPLE_RATE=1.0 # Trace sampling rate (0.0-1.0, default: 1.0 = 100%)
LANGFUSE_DEBUG=false # Enable Langfuse debug logging

@github-actions bot commented Feb 15, 2026

[MEDIUM] [COMMENT-INCOMPLETE] Missing LANGFUSE_MAX_IO_SIZE in .env.example

Location: .env.example:140

Why this is a problem: Tracing truncation uses LANGFUSE_MAX_IO_SIZE (src/lib/config/env.schema.ts:137, src/lib/langfuse/trace-proxy-request.ts:46), but operators will not discover/tune it via .env.example.

Suggested fix (add to the Langfuse block):

LANGFUSE_MAX_IO_SIZE=100000                 # Max chars of traced input/output payload (default: 100000)

@github-actions bot left a comment

Code Review Summary

This PR adds a large new observability surface area (Langfuse + OpenTelemetry) across the proxy pipeline. One error-handling pattern in the new tracing code silently swallows parse failures, which will hide malformed payloads when tracing is enabled. The environment docs also omit a newly-supported tuning knob.

PR Size: XL

  • Lines changed: 1500
  • Files changed: 11

Split suggestions (recommended for follow-ups):

  • PR 1: dependency + env schema changes (package.json, src/lib/config/env.schema.ts, .env.example)
  • PR 2: Langfuse SDK lifecycle (src/lib/langfuse/index.ts, src/instrumentation.ts)
  • PR 3: tracing payload/content model (src/lib/langfuse/trace-proxy-request.ts)
  • PR 4: proxy integration points (src/app/v1/_lib/proxy-handler.ts, src/app/v1/_lib/proxy/session.ts, src/app/v1/_lib/proxy/response-handler.ts)
  • PR 5: unit tests (tests/unit/langfuse/langfuse-trace.test.ts)

Issues Found

Category | Critical | High | Medium | Low
Logic/Bugs | 0 | 0 | 0 | 0
Security | 0 | 0 | 0 | 0
Error Handling | 1 | 0 | 0 | 0
Types | 0 | 0 | 0 | 0
Comments/Docs | 0 | 0 | 1 | 0
Tests | 0 | 0 | 0 | 0
Simplification | 0 | 0 | 0 | 0

Critical Issues (Must Fix)

  • src/lib/langfuse/trace-proxy-request.ts:368 [CRITICAL] [ERROR-SILENT] tryParseJsonSafe swallows JSON parse errors without logging (Confidence: 97/100)

High Priority Issues (Should Fix)

  • .env.example:140 [MEDIUM] [COMMENT-INCOMPLETE] missing LANGFUSE_MAX_IO_SIZE documentation (Confidence: 88/100)

Review Coverage

  • Logic and correctness
  • Security (OWASP Top 10)
  • Error handling
  • Type safety
  • Documentation accuracy
  • Test coverage
  • Code clarity

Automated review by Codex AI

@github-actions bot left a comment

Code Review Summary

This PR implements Langfuse observability integration with a well-designed multi-span trace hierarchy. The implementation is solid with comprehensive test coverage (827 lines), proper error handling throughout, and appropriate security considerations (header redaction).

PR Size: XL

  • Lines changed: 1500 (1491 additions, 9 deletions)
  • Files changed: 11

Issues Found

Category | Critical | High | Medium | Low
Logic/Bugs | 0 | 0 | 0 | 0
Security | 0 | 0 | 0 | 0
Error Handling | 0 | 0 | 0 | 0
Types | 0 | 0 | 0 | 0
Comments/Docs | 0 | 0 | 0 | 0
Tests | 0 | 0 | 0 | 0
Simplification | 0 | 0 | 0 | 0

Review Coverage

  • Logic and correctness - Clean
  • Security (OWASP Top 10) - Clean (header redaction implemented)
  • Error handling - Clean (all catch blocks log errors appropriately)
  • Type safety - Clean (type assertions documented with SDK limitations)
  • Documentation accuracy - Clean (comments match implementation)
  • Test coverage - Comprehensive (827-line test suite)
  • Code clarity - Good (well-structured with clear separation of concerns)

Implementation Quality Notes

Strengths:

  • Multi-span trace hierarchy (root -> guard-pipeline + provider-attempt events + llm-call generation)
  • Non-blocking fire-and-forget design with error tolerance
  • Configurable sampling via LANGFUSE_SAMPLE_RATE with Zod validation
  • I/O truncation to prevent excessive payloads
  • Auth-sensitive header redaction before tracing
  • Proper lifecycle management with graceful shutdown

Minor observations (below reporting threshold):

  • initLangfuse reads LANGFUSE_SAMPLE_RATE directly from process.env while other config values use getEnvConfig() - this is safe due to Zod validation but slightly inconsistent
  • Type assertions used to work around SDK type limitations are documented inline

Automated review by Claude AI

@github-actions bot left a comment

  • Identified the active PR as #791 and applied the label size/XL (1500 LOC changed across 11 files).
  • Posted 2 inline review comments (diff-only):
    • src/lib/langfuse/trace-proxy-request.ts:368 [CRITICAL] [ERROR-SILENT] empty catch {} in tryParseJsonSafe; suggested a guard + logged fallback.
    • .env.example:140 [MEDIUM] [COMMENT-INCOMPLETE] missing LANGFUSE_MAX_IO_SIZE documentation, despite being used by truncation.
  • Submitted the required “Code Review Summary” review comment on #791, including XL split suggestions and confidence scores.

@ding113 force-pushed the feat/langfuse-integration branch from aaa48a9 to c7fc66c on February 15, 2026 13:50
@github-actions bot commented

🧪 Test Results

Test type | Status
Code quality | ✅
Unit tests | ✅
Integration tests | ✅
API tests | ✅

Overall result: ✅ All tests passed

@coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/response-handler.ts`:
- Around line 567-578: The Langfuse trace for the Gemini passthrough path is
missing cost data: when calling emitLangfuseTrace in response-handler.ts (the
block that already passes responseHeaders, responseText, usageMetrics:
parseUsageFromResponseText(responseText, provider.providerType).usageMetrics,
etc.) compute the costUsd and costBreakdown from the parsed usage before
emitting; i.e., call the same cost-calculation helper used in the
non-passthrough paths (the function that converts usage metrics to cost — search
for usages where costUsd and costBreakdown are set for non-stream and stream
flows) and supply those values to emitLangfuseTrace instead of leaving costUsd
undefined so Gemini passthrough traces include cost and costBreakdown.
- Around line 46-84: The emitLangfuseTrace function currently checks process.env
directly which diverges from trace-proxy-request's isLangfuseEnabled; replace
the direct env check with a call to isLangfuseEnabled() (imported from
"@/lib/langfuse" or "@/lib/langfuse/index") so both use the same enablement
logic, i.e., call isLangfuseEnabled() at the top of emitLangfuseTrace and only
perform the dynamic import/traceProxyRequest when it returns true, preserving
the existing error logging behavior for the import/trace call.

In `@src/lib/langfuse/trace-proxy-request.ts`:
- Around line 26-37: The hasSystemPrompt check in buildRequestBodySummary
currently only treats msg.system as an array; update the logic in
buildRequestBodySummary (which takes a ProxySession and reads
session.request.message as msg) to consider both non-empty strings and non-empty
arrays as a system prompt (e.g., treat typeof msg.system === "string" &&
msg.system.trim().length > 0 OR Array.isArray(msg.system) && msg.system.length >
0 as true); keep existing handling for toolsCount, stream, maxTokens and
temperature unchanged.

In `@tests/unit/langfuse/langfuse-trace.test.ts`:
- Around line 962-995: The tests in the "isLangfuseEnabled" describe block are
checking a local inline expression instead of calling the real isLangfuseEnabled
implementation (the module "@/lib/langfuse/index" is mocked earlier), so update
the block to exercise the real function: either move this describe to a separate
test file that does not mock "@/lib/langfuse/index" or restore the real
implementation inside the block using vi.importActual to import and call
isLangfuseEnabled from "@/lib/langfuse/index"; ensure you remove the inline
!!(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY) checks
and instead call isLangfuseEnabled() so changes to the function (e.g., sampling
logic) are tested.
🧹 Nitpick comments (7)
src/lib/utils/cost-calculation.ts (1)

101-306: Consider extracting a shared calculation core to avoid drift from calculateRequestCost.
The new function largely duplicates the existing billing logic, so future changes can easily diverge. Consider extracting a common bucketing/tiering core, or reusing a single core implementation in both places.

src/app/v1/_lib/proxy/response-handler.ts (2)

746-795: The raw-cost / cost-breakdown calculation is fully duplicated between the non-stream and stream paths.

Lines 746-795 (non-stream) and lines 1698-1747 (stream finalizeStream) compute rawCostUsdStr / costBreakdown almost line-for-line identically. Extract a shared helper (e.g. computeLangfuseCost) to avoid the two copies drifting apart during maintenance.


3071-3083: Langfuse trace fallback in persistRequestFailure: confirm it cannot duplicate the main-path trace.

This is a fallback path that emits a trace when the main path's emitLangfuseTrace did not run. As written, the main-path trace fires at the end of the try block in finalizeStream / handleNonStream, so if an exception is thrown earlier it is never reached and persistRequestFailure normally will not duplicate it. But a future refactor that reorders execution could produce duplicate traces. Consider adding a langfuseTraceEmitted flag on the session for idempotence (a sketch follows below).
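
A minimal sketch of that guard, with the flag name taken from the suggestion above and everything else assumed:

// Hypothetical idempotence guard: emit at most one Langfuse trace per session.
interface SessionWithTraceFlag {
  langfuseTraceEmitted?: boolean;
}

function emitOnce(session: SessionWithTraceFlag, emit: () => void): void {
  if (session.langfuseTraceEmitted) return;
  session.langfuseTraceEmitted = true;
  emit();
}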

src/lib/langfuse/trace-proxy-request.ts (3)

112-145: session.getProviderChain() is called multiple times; consider caching it.

getProviderChain() is called repeatedly at lines 143, 144, 152-154, and 304. The method should be idempotent, but each call may allocate a new array or run internal logic. Cache it once at function entry:

Suggested providerChain caching
     const requestStartTime = new Date(session.startTime);
     const requestEndTime = new Date(session.startTime + durationMs);
+    const providerChain = session.getProviderChain();

     // ...
     const timingBreakdown = {
       // ...
-      failedAttempts: session.getProviderChain().filter((i) => !isSuccessReason(i.reason)).length,
-      providersAttempted: new Set(session.getProviderChain().map((i) => i.id)).size,
+      failedAttempts: providerChain.filter((i) => !isSuccessReason(i.reason)).length,
+      providersAttempted: new Set(providerChain.map((i) => i.id)).size,
     };

74-90: truncateForLangfuse fully serializes objects twice.

Line 79 calls JSON.stringify(data) to measure the size; if it is over the limit, line 84 takes a substring of the first 2000 characters as a preview. For very large objects (such as messages containing base64 images), the initial full serialization temporarily allocates a lot of memory. Since tracing is async and non-blocking it does not add request latency, but under high concurrency it can cause memory spikes.

Consider reusing the single serialization result for the preview:

Reuse the serialization result
 if (data != null && typeof data === "object") {
   const str = JSON.stringify(data);
   if (str.length > maxChars) {
     return {
       _truncated: true,
       _length: str.length,
-      _preview: str.substring(0, Math.min(maxChars, 2000)),
+      _preview: str.substring(0, 2000),
     };
   }
-  return data;
+  return data; // Under limit, original object is fine (Langfuse SDK will serialize it)
 }

In fact str is already cached here, so the preview does reuse it. But when the payload is under the limit, str is discarded and the original data is returned, which the Langfuse SDK will serialize again. Consider returning the already-serialized str directly (if the SDK accepts a string).


184-190: The tags array may contain duplicate values.

When provider.providerType and session.originalFormat are the same (for example, both "claude"), the tag appears twice. Langfuse tolerates duplicate tags, but consider deduplicating via a Set before converting back to an array, as sketched below.
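
A one-line sketch of that deduplication (function and parameter names assumed):

// Sketch: dedupe tags such as providerType and originalFormat before tracing.
function buildTags(providerType: string, originalFormat: string): string[] {
  return [...new Set([providerType, originalFormat])];
}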

tests/unit/langfuse/langfuse-trace.test.ts (1)

470-492: Assert the truncation size limit precisely.

The current assertions, output.length < 200_000 and a check for ...[truncated], are too loose. LANGFUSE_MAX_IO_SIZE defaults to 100_000, and the truncated string is `${data.substring(0, 100_000)}...[truncated]`, about 100_014 characters in total. Assert expect(output.length).toBeLessThanOrEqual(100_000 + 20) to pin the truncation threshold and catch boundary errors in the truncation logic.

Comment on lines +46 to +84
function emitLangfuseTrace(
session: ProxySession,
data: {
responseHeaders: Headers;
responseText: string;
usageMetrics: UsageMetrics | null;
costUsd: string | undefined;
costBreakdown?: CostBreakdown;
statusCode: number;
durationMs: number;
isStreaming: boolean;
sseEventCount?: number;
errorMessage?: string;
}
): void {
if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;

void import("@/lib/langfuse/trace-proxy-request")
.then(({ traceProxyRequest }) => {
void traceProxyRequest({
session,
responseHeaders: data.responseHeaders,
durationMs: data.durationMs,
statusCode: data.statusCode,
isStreaming: data.isStreaming,
responseText: data.responseText,
usageMetrics: data.usageMetrics,
costUsd: data.costUsd,
costBreakdown: data.costBreakdown,
sseEventCount: data.sseEventCount,
errorMessage: data.errorMessage,
});
})
.catch((err) => {
logger.warn("[ResponseHandler] Langfuse trace failed", {
error: err instanceof Error ? err.message : String(err),
});
});
}

🛠️ Refactor suggestion | 🟠 Major

emitLangfuseTrace checks process.env directly, inconsistently with isLangfuseEnabled().

Line 61 reads process.env.LANGFUSE_PUBLIC_KEY / process.env.LANGFUSE_SECRET_KEY directly, while trace-proxy-request.ts internally uses isLangfuseEnabled() (from @/lib/langfuse/index). If the two checks ever diverge (for example, isLangfuseEnabled gains a sample-rate condition), this leads to unnecessary dynamic imports or skipped traces. Use isLangfuseEnabled() in both places.

Suggested change: use isLangfuseEnabled()
+import { isLangfuseEnabled } from "@/lib/langfuse/index";
+
 function emitLangfuseTrace(
   session: ProxySession,
   data: { ... }
 ): void {
-  if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;
+  if (!isLangfuseEnabled()) return;
 
   void import("@/lib/langfuse/trace-proxy-request")

Comment on lines +26 to +37
function buildRequestBodySummary(session: ProxySession): Record<string, unknown> {
const msg = session.request.message as Record<string, unknown>;
return {
model: session.request.model,
messageCount: session.getMessagesLength(),
hasSystemPrompt: Array.isArray(msg.system) && msg.system.length > 0,
toolsCount: Array.isArray(msg.tools) ? msg.tools.length : 0,
stream: msg.stream === true,
maxTokens: typeof msg.max_tokens === "number" ? msg.max_tokens : undefined,
temperature: typeof msg.temperature === "number" ? msg.temperature : undefined,
};
}

⚠️ Potential issue | 🟡 Minor

hasSystemPrompt only checks the array form of the system prompt.

Line 31 uses Array.isArray(msg.system) to decide whether a system prompt exists, but the Claude API's system field can also be a plain string. String-form system prompts are therefore reported as hasSystemPrompt: false in requestSummary.

Suggested fix: also accept the string form
-    hasSystemPrompt: Array.isArray(msg.system) && msg.system.length > 0,
+    hasSystemPrompt: Array.isArray(msg.system) ? msg.system.length > 0 : typeof msg.system === "string" && msg.system.length > 0,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 function buildRequestBodySummary(session: ProxySession): Record<string, unknown> {
   const msg = session.request.message as Record<string, unknown>;
   return {
     model: session.request.model,
     messageCount: session.getMessagesLength(),
-    hasSystemPrompt: Array.isArray(msg.system) && msg.system.length > 0,
+    hasSystemPrompt: Array.isArray(msg.system) ? msg.system.length > 0 : typeof msg.system === "string" && msg.system.length > 0,
     toolsCount: Array.isArray(msg.tools) ? msg.tools.length : 0,
     stream: msg.stream === true,
     maxTokens: typeof msg.max_tokens === "number" ? msg.max_tokens : undefined,
     temperature: typeof msg.temperature === "number" ? msg.temperature : undefined,
   };
 }
🤖 Prompt for AI Agents
In `@src/lib/langfuse/trace-proxy-request.ts` around lines 26 - 37, The
hasSystemPrompt check in buildRequestBodySummary currently only treats
msg.system as an array; update the logic in buildRequestBodySummary (which takes
a ProxySession and reads session.request.message as msg) to consider both
non-empty strings and non-empty arrays as a system prompt (e.g., treat typeof
msg.system === "string" && msg.system.trim().length > 0 OR
Array.isArray(msg.system) && msg.system.length > 0 as true); keep existing
handling for toolsCount, stream, maxTokens and temperature unchanged.

Comment on lines +962 to +995
describe("isLangfuseEnabled", () => {
const originalPublicKey = process.env.LANGFUSE_PUBLIC_KEY;
const originalSecretKey = process.env.LANGFUSE_SECRET_KEY;

afterEach(() => {
// Restore env
if (originalPublicKey !== undefined) {
process.env.LANGFUSE_PUBLIC_KEY = originalPublicKey;
} else {
delete process.env.LANGFUSE_PUBLIC_KEY;
}
if (originalSecretKey !== undefined) {
process.env.LANGFUSE_SECRET_KEY = originalSecretKey;
} else {
delete process.env.LANGFUSE_SECRET_KEY;
}
});

test("should return false when env vars are not set", () => {
delete process.env.LANGFUSE_PUBLIC_KEY;
delete process.env.LANGFUSE_SECRET_KEY;

// Direct function test (not using the mock)
const isEnabled = !!(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY);
expect(isEnabled).toBe(false);
});

test("should return true when both keys are set", () => {
process.env.LANGFUSE_PUBLIC_KEY = "pk-lf-test-mock";
process.env.LANGFUSE_SECRET_KEY = "test-mock-not-real";

const isEnabled = !!(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY);
expect(isEnabled).toBe(true);
});

⚠️ Potential issue | 🟡 Minor

The isLangfuseEnabled describe block does not test the real isLangfuseEnabled function.

Because Lines 66-68 mock @/lib/langfuse/index as () => langfuseEnabled, Lines 980-995 actually exercise the inline expression !!(process.env.X && process.env.Y) rather than the real isLangfuseEnabled() implementation in the module. If the real function's logic changes (e.g., a sample-rate check is added), this test will not catch the regression.

Move this describe block to a separate test file that does not mock @/lib/langfuse/index, or use vi.importActual to restore the real implementation inside the block.
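A minimal sketch of the vi.importActual approach, assuming Vitest and that isLangfuseEnabled is a named export of @/lib/langfuse/index:

// Sketch only: bypass the module-level mock and exercise the real implementation.
// Assumes test/expect/vi come from Vitest (globals or explicit imports).
test("isLangfuseEnabled (real impl) returns false without keys", async () => {
  delete process.env.LANGFUSE_PUBLIC_KEY;
  delete process.env.LANGFUSE_SECRET_KEY;

  // vi.importActual loads the unmocked module despite the earlier vi.mock
  const actual = await vi.importActual<typeof import("@/lib/langfuse/index")>(
    "@/lib/langfuse/index"
  );
  expect(actual.isLangfuseEnabled()).toBe(false);
});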

🤖 Prompt for AI Agents
In `@tests/unit/langfuse/langfuse-trace.test.ts` around lines 962 - 995, The tests
in the "isLangfuseEnabled" describe block are checking a local inline expression
instead of calling the real isLangfuseEnabled implementation (the module
"@/lib/langfuse/index" is mocked earlier), so update the block to exercise the
real function: either move this describe to a separate test file that does not
mock "@/lib/langfuse/index" or restore the real implementation inside the block
using vi.importActual to import and call isLangfuseEnabled from
"@/lib/langfuse/index"; ensure you remove the inline
!!(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY) checks
and instead call isLangfuseEnabled() so changes to the function (e.g., sampling
logic) are tested.

@github-actions
Contributor

🧪 Test Results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

ding113 and others added 8 commits February 15, 2026 22:28
…est tracing

Add optional Langfuse observability integration that auto-enables when
LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY env vars are set. Traces
every proxy request with full context: provider chain, model, usage
metrics, cost, TTFB, headers (auth redacted), special settings, and
error details. Built on OpenTelemetry via @langfuse/otel with async
non-blocking fire-and-forget semantics and zero overhead when disabled.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…e-handler for complete data

The previous Langfuse integration fired traces in proxy-handler.ts
immediately after dispatch() returned. For streaming responses, the
response body hadn't been consumed yet, so responseText, usageMetrics,
costUsd, sseEventCount, and errorMessage were all undefined.

Move trace invocation into response-handler.ts at the exact points
where ALL data has been collected:
- Non-streaming standard path (after response body parsed + cost calculated)
- Non-streaming Gemini passthrough (after finalizeRequestStats)
- Streaming standard path (end of finalizeStream)
- Streaming Gemini passthrough (after finalizeRequestStats)
- Error/abort paths (inside persistRequestFailure)

Also enhance the trace quality:
- Generation input: actual request messages instead of summary object
- Generation output: actual parsed response body instead of empty/status
- Tags: add provider.name and model alongside providerType
- Metadata: add model, originalModel, endpoint, sessionId, keyName,
  requestSequence, sseEventCount, and requestSummary
- Root span output: include costUsd
- Add truncateForLangfuse() helper (configurable via LANGFUSE_MAX_IO_SIZE)
- TraceContext: replace response:Response with responseHeaders:Headers
- Import UsageMetrics type from response-handler
propagateAttributes() does not support input/output fields, and the
root span's input/output do not auto-inherit to the trace level. This
caused Langfuse UI to show Input: undefined and Output: undefined.

Add explicit rootSpan.updateTrace() call to set trace-level input/output
per Langfuse SDK documentation (Solution B: set input/output directly on
the trace).
… names

- Pass actual startTime/endTime to startObservation and end() so root
  span and generation reflect real request duration instead of ~0ms
- Change costDetails key from 'totalUsd' to 'total' per Langfuse SDK
  convention (values are already in USD)
- Change usageDetails keys from camelCase (cacheRead, cacheCreation) to
  snake_case (cache_read_input_tokens, cache_creation_input_tokens) so
  Langfuse UI can categorize them as input usage types
…serId

- Calculate raw cost (multiplier=1.0) separately for Langfuse while
  keeping multiplied cost for session billing and rate limiting
- Change userId from numeric ID to user.name for readable Langfuse traces
- Key name already available in trace metadata as keyName
…r-attempt events, and timing breakdown

Enrich Langfuse traces from a flat 2-observation structure to a proper
span hierarchy that reveals where request time is spent:

- Add forwardStartTime to ProxySession to mark when guard pipeline ends
- Create guard-pipeline child span (startTime -> forwardStartTime)
- Emit provider-attempt events for each failed chain item with
  WARNING/ERROR level classification
- Compute timingBreakdown (guardPipelineMs, upstreamTotalMs,
  ttfbFromForwardMs, tokenGenerationMs, failedAttempts) with
  Math.max(0) defensive guards against clock jitter
- Set LLM generation startTime to forwardStartTime for accurate
  upstream call duration
- Update test mocks to route by observation name, add 7 new tests
…est mock

- Add LANGFUSE_MAX_IO_SIZE to Zod env schema with int/min/max/default
  validation, replacing raw Number() parse that risked NaN on invalid input
- Use lazy getEnvConfig() accessor instead of module-level constant
- Add missing recordForwardStart() to proxy-handler test mock
…d cost breakdown

- Root span input/output now shows actual forwarded request body and response
  body instead of metadata summaries (summaries moved to root span metadata)
- Add observation level on root span: DEFAULT (200), WARNING (retried), ERROR
  (non-200/499)
- Add CostBreakdown type with input/output/cache_creation/cache_read/total
  categories, passed through to Langfuse costDetails
- Capture forwardedRequestBody in ProxySession from forwarder (both Gemini and
  standard paths)
- updateTrace uses actual bodies for trace-level input/output
…servability

- Remove truncateForLangfuse, sanitizeHeaders, REDACTED_HEADERS, and
  LANGFUSE_MAX_IO_SIZE env config -- observability platforms should
  store raw data without lossy transformations
- Replace sanitizeHeaders with headersToRecord (pass-through)
- Unify generation input to use actualRequestBody (forwardedRequestBody
  when available) instead of session.request.message, ensuring the
  complete forwarded request body is recorded everywhere
@ding113 ding113 force-pushed the feat/langfuse-integration branch from 0d84d3b to a04356e on February 15, 2026 14:29
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@src/lib/langfuse/trace-proxy-request.ts`:
- Around line 246-253: messageContext?.user?.name is currently passed directly as
userId to propagateAttributes, which can persist PII such as real names or emails
to Langfuse; use a non-sensitive identifier instead (e.g., an internal user id or
username slug: messageContext?.user?.id or messageContext?.user?.username), or
irreversibly anonymize/hash the name before passing it (call an in-project hashing
function or add a util such as hashUserIdentifier(name)) and assign the result to
userId, keeping the propagateAttributes call and the related names (userId /
messageContext?.user?.name / propagateAttributes) consistent. A hashing sketch
follows this list.
- Around line 166-204: The generationMetadata object currently includes
sensitive/internal fields that should not be sent to external observability
services; update the code that builds generationMetadata (symbol:
generationMetadata in trace-proxy-request.ts) to separate or redact
internal-only fields (e.g., session.getProviderChain(), session.sessionId,
messageContext?.key?.name, headersToRecord(session.headers),
headersToRecord(ctx.responseHeaders), and any raw body summary from
buildRequestBodySummary(session)) from the external payload: create two objects
(externalMetadata and internalMetadata) or apply a whitelist/redaction based on
a config flag (e.g., sendInternalMetadata=false) and ensure only safe fields are
included in the externalMetadata sent out while keeping internalMetadata
logged/stored locally with appropriate access controls; also add a code comment
documenting this design decision and the config flag controlling exposure.
- Around line 27-33: headersToRecord currently returns all headers verbatim
which leaks credentials; update it to redact known auth-sensitive headers (e.g.,
Authorization, X-Api-Key, x-api-key, Anthropics-Api-Key, anthropic-api-key,
Api-Key, api-key, etc.) in a case-insensitive way by replacing their values with
a constant like "[REDACTED]" while preserving other headers, and ensure any
other call sites that serialize headers (the other occurrences referenced in the
review) use this redacting logic or call headersToRecord so credentials are
never sent to Langfuse; keep the header keys unchanged but normalize matching to
lower-case to detect sensitive names.
- Around line 232-243: The TypeScript assertions around startObservation (used
to create rootSpan) are compensating for incomplete type declarations in
`@langfuse/tracing` (you pass startTime and asType via `as`), so first try
updating `@langfuse/tracing` to a version that exposes these parameters; if no
updated typing exists, open an SDK typing issue/PR requesting startTime/asType
in the public types, and meanwhile keep the runtime assertions but add a concise
TODO comment near startObservation/rootSpan documenting the typing mismatch
(reference startObservation, rootSpan, startTime, asType) so future maintainers
know to remove the `as` once the SDK types are fixed.
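A minimal sketch of the hashing approach from the first item; hashUserIdentifier is the reviewer's hypothetical util name, not an existing project function:

// Sketch only: derive a stable, non-reversible identifier to use as the Langfuse userId.
import { createHash } from "node:crypto";

function hashUserIdentifier(name: string): string {
  // SHA-256 then truncate: stable across requests, not reversible to the raw name
  return createHash("sha256").update(name).digest("hex").slice(0, 16);
}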
🧹 Nitpick comments (3)
src/lib/langfuse/trace-proxy-request.ts (3)

98-119: failedAttempts is computed twice.

timingBreakdown.failedAttempts (Line 107) and the failedAttempts in the rootSpanLevel check (Lines 116-118) run exactly the same logic independently. Reuse timingBreakdown.failedAttempts:

Suggested fix
     // Compute observation level for root span
     let rootSpanLevel: ObservationLevel = "DEFAULT";
     if (statusCode < 200 || statusCode >= 300) {
       rootSpanLevel = "ERROR";
     } else {
-      const failedAttempts = session
-        .getProviderChain()
-        .filter((i) => !isSuccessReason(i.reason)).length;
-      if (failedAttempts >= 1) rootSpanLevel = "WARNING";
+      if (timingBreakdown.failedAttempts >= 1) rootSpanLevel = "WARNING";
     }

298-304: generationOutput duplicates the actualResponseBody logic (Lines 128-132) exactly.

Reuse actualResponseBody directly instead of rebuilding it.

Suggested fix
-        // Generation input/output = raw payload, no truncation
-        const generationInput = actualRequestBody;
-        const generationOutput = ctx.responseText
-          ? tryParseJsonSafe(ctx.responseText)
-          : isStreaming
-            ? { streaming: true, sseEventCount: ctx.sseEventCount }
-            : { statusCode };
+        const generationInput = actualRequestBody;
+        const generationOutput = actualResponseBody;

346-352: tryParseJsonSafe has no protection against very large response bodies.

ctx.responseText can be very large (e.g., long conversations or heavy tool output). Calling JSON.parse on it materializes the full JS object graph in memory, and the parse runs once each at Lines 129 and 301 (still at least once after the duplication fix above). For the observability path, consider truncating oversized payloads or keeping only a summary, as sketched below.
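A bounded variant as a sketch; the threshold constant is an illustrative value, not a project setting:

// Sketch only: guard JSON.parse behind a size check on the tracing path.
// MAX_PARSE_CHARS is a hypothetical illustration value.
const MAX_PARSE_CHARS = 1_000_000;

function tryParseJsonSafeBounded(text: string): unknown {
  if (text.length > MAX_PARSE_CHARS) {
    // Oversized: record a summary instead of materializing the full object graph
    return { _truncated: true, _length: text.length, _preview: text.slice(0, 2000) };
  }
  try {
    return JSON.parse(text);
  } catch {
    return text; // Not valid JSON; fall back to the raw string
  }
}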

@github-actions
Contributor

🧪 Test Results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

@greptile-apps greptile-apps bot left a comment

14 files reviewed, 1 comment


@greptile-apps

greptile-apps bot commented Feb 15, 2026

Additional Comments (1)

src/app/v1/_lib/proxy/response-handler.ts
early return prevents trace emission on missing keys

  if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;

Use isLangfuseEnabled() for consistency with other files.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/response-handler.ts
Line: 20:20

Comment:
early return prevents trace emission on missing keys

```suggestion
  if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;
```

Use `isLangfuseEnabled()` for consistency with other files.

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

session.headers = client -> CCH (user's own key, safe to log).
Upstream provider API key (outboundKey) is injected by ProxyForwarder
into a separate Headers object, never present in traced headers.
@github-actions
Contributor

🧪 Test Results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

@greptile-apps greptile-apps bot left a comment

14 files reviewed, 2 comments


const { NodeSDK: OtelNodeSDK } = await import("@opentelemetry/sdk-node");
const { LangfuseSpanProcessor: LfSpanProcessor } = await import("@langfuse/otel");

const sampleRate = Number.parseFloat(process.env.LANGFUSE_SAMPLE_RATE || "1.0");

Bypasses env schema validation

The env schema (env.schema.ts) defines LANGFUSE_SAMPLE_RATE with z.coerce.number().min(0).max(1).default(1.0), but initLangfuse re-parses the raw process.env string directly. This means the 0–1 bounds check is bypassed at init time — a value like LANGFUSE_SAMPLE_RATE=5 would pass through to TraceIdRatioBasedSampler without validation. The same applies to LANGFUSE_BASE_URL and LANGFUSE_DEBUG on lines 33, 55, 57, 60.

Consider using getEnvConfig() instead:

import { getEnvConfig } from "@/lib/config/env.schema";
// ...
const env = getEnvConfig();
const sampleRate = env.LANGFUSE_SAMPLE_RATE;
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/langfuse/index.ts
Line: 28:28

Comment:
**Bypasses env schema validation**

The env schema (`env.schema.ts`) defines `LANGFUSE_SAMPLE_RATE` with `z.coerce.number().min(0).max(1).default(1.0)`, but `initLangfuse` re-parses the raw `process.env` string directly. This means the 0–1 bounds check is bypassed at init time — a value like `LANGFUSE_SAMPLE_RATE=5` would pass through to `TraceIdRatioBasedSampler` without validation. The same applies to `LANGFUSE_BASE_URL` and `LANGFUSE_DEBUG` on lines 33, 55, 57, 60.

Consider using `getEnvConfig()` instead:

```
import { getEnvConfig } from "@/lib/config/env.schema";
// ...
const env = getEnvConfig();
const sampleRate = env.LANGFUSE_SAMPLE_RATE;
```

How can I resolve this? If you propose a fix, please make it concise.

errorMessage?: string;
}
): void {
if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;

Redundant enabled check

emitLangfuseTrace checks process.env.LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY here, and then traceProxyRequest immediately checks isLangfuseEnabled() which performs the exact same check. This is not a bug, but the outer check could use isLangfuseEnabled() for consistency, or be removed entirely since traceProxyRequest already guards against it.

Suggested change
if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;
if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/response-handler.ts
Line: 61:61

Comment:
**Redundant enabled check**

`emitLangfuseTrace` checks `process.env.LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` here, and then `traceProxyRequest` immediately checks `isLangfuseEnabled()` which performs the exact same check. This is not a bug, but the outer check could use `isLangfuseEnabled()` for consistency, or be removed entirely since `traceProxyRequest` already guards against it.

```suggestion
  if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;
```

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

@ding113 ding113 merged commit a6c20e5 into dev Feb 15, 2026
13 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in Claude Code Hub Roadmap Feb 15, 2026

Labels

area:core
dependencies (Pull requests that update a dependency file)
enhancement (New feature or request)
javascript (Pull requests that update javascript code)
size/XL (Extra Large PR, > 1000 lines)

Projects

Status: Done


1 participant