10 changes: 10 additions & 0 deletions .env.example
@@ -128,6 +128,16 @@ FETCH_HEADERS_TIMEOUT=600000
FETCH_BODY_TIMEOUT=600000
MAX_RETRY_ATTEMPTS_DEFAULT=2 # Max attempts per provider (including the initial call), range 1-10; leave empty to use the default of 2

# Langfuse Observability (optional, auto-enabled when keys are set)
# What it does: enterprise-grade LLM observability that automatically traces the full lifecycle of every proxied request
# - Automatically enabled once PUBLIC_KEY and SECRET_KEY are configured
# - Supports both Langfuse Cloud and self-hosted instances
LANGFUSE_PUBLIC_KEY= # Langfuse project public key (pk-lf-...)
LANGFUSE_SECRET_KEY= # Langfuse project secret key (sk-lf-...)
LANGFUSE_BASE_URL=https://cloud.langfuse.com # Langfuse server URL (self-hosted or cloud)
LANGFUSE_SAMPLE_RATE=1.0 # Trace sampling rate (0.0-1.0, default: 1.0 = 100%)
LANGFUSE_DEBUG=false # Enable Langfuse debug logging
Comment on lines +131 to +139
⚠️ Potential issue | 🟡 Minor

Missing documentation for the LANGFUSE_MAX_IO_SIZE setting.

The env schema defines LANGFUSE_MAX_IO_SIZE (default 100,000, range 1-10,000,000), but .env.example does not include it. Consider adding it so operators know about this tunable parameter.

Suggested addition
 LANGFUSE_SAMPLE_RATE=1.0                    # Trace sampling rate (0.0-1.0, default: 1.0 = 100%)
 LANGFUSE_DEBUG=false                        # Enable Langfuse debug logging
+LANGFUSE_MAX_IO_SIZE=100000                 # Max I/O size per trace (chars, default: 100000, max: 10000000)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Langfuse Observability (optional, auto-enabled when keys are set)
# What it does: enterprise-grade LLM observability that automatically traces the full lifecycle of every proxied request
# - Automatically enabled once PUBLIC_KEY and SECRET_KEY are configured
# - Supports both Langfuse Cloud and self-hosted instances
LANGFUSE_PUBLIC_KEY= # Langfuse project public key (pk-lf-...)
LANGFUSE_SECRET_KEY= # Langfuse project secret key (sk-lf-...)
LANGFUSE_BASE_URL=https://cloud.langfuse.com # Langfuse server URL (self-hosted or cloud)
LANGFUSE_SAMPLE_RATE=1.0 # Trace sampling rate (0.0-1.0, default: 1.0 = 100%)
LANGFUSE_DEBUG=false # Enable Langfuse debug logging
LANGFUSE_MAX_IO_SIZE=100000 # Max I/O size per trace (chars, default: 100000, max: 10000000)
🧰 Tools
🪛 dotenv-linter (4.0.0)

[warning] 135-135: [SpaceCharacter] The line has spaces around equal sign
[warning] 135-135: [ValueWithoutQuotes] This value needs to be surrounded in quotes
[warning] 136-136: [SpaceCharacter] The line has spaces around equal sign
[warning] 136-136: [ValueWithoutQuotes] This value needs to be surrounded in quotes
[warning] 137-137: [UnorderedKey] The LANGFUSE_BASE_URL key should go before the LANGFUSE_PUBLIC_KEY key
[warning] 137-137: [ValueWithoutQuotes] This value needs to be surrounded in quotes
[warning] 138-138: [UnorderedKey] The LANGFUSE_SAMPLE_RATE key should go before the LANGFUSE_SECRET_KEY key
[warning] 138-138: [ValueWithoutQuotes] This value needs to be surrounded in quotes
[warning] 139-139: [UnorderedKey] The LANGFUSE_DEBUG key should go before the LANGFUSE_PUBLIC_KEY key
[warning] 139-139: [ValueWithoutQuotes] This value needs to be surrounded in quotes

🤖 Prompt for AI Agents
In @.env.example around lines 131-139, add the missing LANGFUSE_MAX_IO_SIZE
environment variable documentation to .env.example: describe the variable name
LANGFUSE_MAX_IO_SIZE, its default value (100000), allowed range (1-10000000),
and its purpose (limits Langfuse I/O payload size) along with an example value
and note that it is optional; place it near the other LANGFUSE_* entries
(alongside LANGFUSE_SAMPLE_RATE and LANGFUSE_DEBUG) so operators can discover
and tune the setting.


@github-actions github-actions bot Feb 15, 2026

[MEDIUM] [COMMENT-INCOMPLETE] Missing LANGFUSE_MAX_IO_SIZE in .env.example

Location: .env.example:140

Why this is a problem: Tracing truncation uses LANGFUSE_MAX_IO_SIZE (src/lib/config/env.schema.ts:137, src/lib/langfuse/trace-proxy-request.ts:46), but operators will not discover/tune it via .env.example.

Suggested fix (add to the Langfuse block):

LANGFUSE_MAX_IO_SIZE=100000                 # Max chars of traced input/output payload (default: 100000)

# Smart probing configuration
# What it does: when the circuit breaker is OPEN, periodically probe the provider for faster recovery
# - ENABLE_SMART_PROBING: whether smart probing is enabled (default: false)
4 changes: 4 additions & 0 deletions package.json
@@ -44,7 +44,11 @@
"@hono/zod-openapi": "^1",
"@hookform/resolvers": "^5",
"@iarna/toml": "^2.2.5",
"@langfuse/client": "^4.6.1",
"@langfuse/otel": "^4.6.1",
"@langfuse/tracing": "^4.6.1",
"@lobehub/icons": "^2",
"@opentelemetry/sdk-node": "^0.212.0",
Comment on lines +47 to +51

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

@opentelemetry/sdk-node npm registry version 0.212.0

💡 Result:

@opentelemetry/sdk-node@0.212.0 does not appear to be published on the npm registry. The latest published npm version shown by multiple package indexes is 0.211.0. [1][2][3]

There is an upstream GitHub release tag experimental/v0.212.0 in the open-telemetry/opentelemetry-js repo, but that doesn’t necessarily mean the @opentelemetry/sdk-node npm package was published at 0.212.0. [4]

What to do

  • Install the latest available: npm i @opentelemetry/sdk-node@0.211.0 [1][2][3]
  • Or verify from your environment: npm view @opentelemetry/sdk-node versions --json | jq 'map(select(.=="0.212.0"))' (should return nothing if it’s unpublished)

Sources: [1][2][3][4]


🌐 Web query:

@langfuse/client @langfuse/otel @langfuse/tracing npm registry version 4.6.1

💡 Result:

I couldn’t find any evidence that version 4.6.1 exists on the npm registry for @langfuse/client, @langfuse/tracing, or @langfuse/otel based on publicly indexable sources.

What I could verify:

  • npm package pages (as indexed) show 4.0.0 for @langfuse/client, @langfuse/tracing, and @langfuse/otel. [1][2][3]
  • A registry mirror (Tessl) shows at least 4.2.0 for @langfuse/otel and @langfuse/tracing. [4][5]
  • A recent community post shows usage of ^4.5.1 for @langfuse/otel and @langfuse/tracing. [6]

To confirm from your environment (authoritative), run:

npm view `@langfuse/client` version
npm view `@langfuse/tracing` version
npm view `@langfuse/otel` version

npm view `@langfuse/client`@4.6.1 version
npm view `@langfuse/tracing`@4.6.1 version
npm view `@langfuse/otel`@4.6.1 version

Sources: [1] [2] [3] [4] [5] [6]


Fix the dependency version numbers: the specified versions do not exist on the npm registry.

  • @opentelemetry/sdk-node@0.212.0 has not been published to the npm registry; the latest available version is 0.211.0. Update it to ^0.211.0.
  • Version 4.6.1 of @langfuse/client, @langfuse/otel, and @langfuse/tracing could not be verified against public sources. Confirmed versions are 4.0.0, 4.2.0, and 4.5.1. Confirm which versions are actually available and update accordingly (consider verifying whether 4.5.1 meets the requirements).
🤖 Prompt for AI Agents
In `@package.json` around lines 47 - 51, fix the unavailable dependency versions: change
"@opentelemetry/sdk-node" from "0.212.0" to "^0.211.0", and update the three Langfuse packages
"@langfuse/client", "@langfuse/otel", and "@langfuse/tracing" from "4.6.1" to a version that exists on
npm (for example "^4.5.1"), or confirm and use the published version you actually need; then run npm/yarn install to verify there are no version resolution errors and update the lockfile
(package-lock.json or yarn.lock).

"@radix-ui/react-alert-dialog": "^1",
"@radix-ui/react-avatar": "^1",
"@radix-ui/react-checkbox": "^1",
5 changes: 4 additions & 1 deletion src/app/v1/_lib/proxy-handler.ts
@@ -78,9 +78,12 @@ export async function handleProxyRequest(c: Context): Promise<Response> {
});
}

session.recordForwardStart();
const response = await ProxyForwarder.send(session);
const handled = await ProxyResponseHandler.dispatch(session, response);
return await attachSessionIdToErrorResponse(session.sessionId, handled);
const finalResponse = await attachSessionIdToErrorResponse(session.sessionId, handled);

return finalResponse;
} catch (error) {
logger.error("Proxy handler error:", error);
if (session) {
2 changes: 2 additions & 0 deletions src/app/v1/_lib/proxy/forwarder.ts
@@ -1648,6 +1648,7 @@ export class ProxyForwarder {

const bodyString = JSON.stringify(bodyToSerialize);
requestBody = bodyString;
session.forwardedRequestBody = bodyString;
}

// Detect streaming requests: Gemini supports two mechanisms
@@ -1974,6 +1975,7 @@ export class ProxyForwarder {

const bodyString = JSON.stringify(messageToSend);
requestBody = bodyString;
session.forwardedRequestBody = bodyString;

try {
const parsed = JSON.parse(bodyString);
179 changes: 170 additions & 9 deletions src/app/v1/_lib/proxy/response-handler.ts
@@ -8,7 +8,8 @@ import { RateLimitService } from "@/lib/rate-limit";
import type { LeaseWindowType } from "@/lib/rate-limit/lease";
import { SessionManager } from "@/lib/session-manager";
import { SessionTracker } from "@/lib/session-tracker";
import { calculateRequestCost } from "@/lib/utils/cost-calculation";
import type { CostBreakdown } from "@/lib/utils/cost-calculation";
import { calculateRequestCost, calculateRequestCostBreakdown } from "@/lib/utils/cost-calculation";
import { hasValidPriceData } from "@/lib/utils/price-data";
import { isSSEText, parseSSEData } from "@/lib/utils/sse";
import { detectUpstreamErrorFromSseOrJsonText } from "@/lib/utils/upstream-error-detection";
@@ -39,6 +40,49 @@ export type UsageMetrics = {
output_image_tokens?: number;
};

/**
* Fire Langfuse trace asynchronously. Non-blocking, error-tolerant.
*/
function emitLangfuseTrace(
session: ProxySession,
data: {
responseHeaders: Headers;
responseText: string;
usageMetrics: UsageMetrics | null;
costUsd: string | undefined;
costBreakdown?: CostBreakdown;
statusCode: number;
durationMs: number;
isStreaming: boolean;
sseEventCount?: number;
errorMessage?: string;
}
): void {
if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;

Redundant enabled check

emitLangfuseTrace checks process.env.LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY here, and then traceProxyRequest immediately checks isLangfuseEnabled() which performs the exact same check. This is not a bug, but the outer check could use isLangfuseEnabled() for consistency, or be removed entirely since traceProxyRequest already guards against it.

Suggested change
if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;
if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/response-handler.ts
Line: 61:61

How can I resolve this? If you propose a fix, please make it concise.


void import("@/lib/langfuse/trace-proxy-request")
.then(({ traceProxyRequest }) => {
void traceProxyRequest({
session,
responseHeaders: data.responseHeaders,
durationMs: data.durationMs,
statusCode: data.statusCode,
isStreaming: data.isStreaming,
responseText: data.responseText,
usageMetrics: data.usageMetrics,
costUsd: data.costUsd,
costBreakdown: data.costBreakdown,
sseEventCount: data.sseEventCount,
errorMessage: data.errorMessage,
});
})
.catch((err) => {
logger.warn("[ResponseHandler] Langfuse trace failed", {
error: err instanceof Error ? err.message : String(err),
});
});
}
Comment on lines +46 to +84

🛠️ Refactor suggestion | 🟠 Major

emitLangfuseTrace checks process.env directly, which is inconsistent with isLangfuseEnabled().

Line 61 reads process.env.LANGFUSE_PUBLIC_KEY / process.env.LANGFUSE_SECRET_KEY directly, while trace-proxy-request.ts internally uses isLangfuseEnabled() (from @/lib/langfuse/index). If the two checks ever diverge (for example, if isLangfuseEnabled later adds a sampling-rate check), this would cause unnecessary dynamic imports or skipped traces. Consider standardizing on isLangfuseEnabled().

Suggested: standardize on isLangfuseEnabled()
+import { isLangfuseEnabled } from "@/lib/langfuse/index";
+
 function emitLangfuseTrace(
   session: ProxySession,
   data: { ... }
 ): void {
-  if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY) return;
+  if (!isLangfuseEnabled()) return;
 
   void import("@/lib/langfuse/trace-proxy-request")
🤖 Prompt for AI Agents
In `@src/app/v1/_lib/proxy/response-handler.ts` around lines 46 - 84, The
emitLangfuseTrace function currently checks process.env directly which diverges
from trace-proxy-request's isLangfuseEnabled; replace the direct env check with
a call to isLangfuseEnabled() (imported from "@/lib/langfuse" or
"@/lib/langfuse/index") so both use the same enablement logic, i.e., call
isLangfuseEnabled() at the top of emitLangfuseTrace and only perform the dynamic
import/traceProxyRequest when it returns true, preserving the existing error
logging behavior for the import/trace call.
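
For reference, a minimal sketch of what such an isLangfuseEnabled() helper could look like, assuming it gates purely on the two key variables (the real implementation in @/lib/langfuse may also consider sampling or a parsed config schema):

// Hypothetical sketch of the enablement helper in @/lib/langfuse (assumption, not from this PR).
export function isLangfuseEnabled(): boolean {
  // Enabled only when both project keys are present, mirroring the check in emitLangfuseTrace.
  return Boolean(process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY);
}

emitLangfuseTrace would then start with `if (!isLangfuseEnabled()) return;`, keeping the cheap early-exit while sharing one source of truth with traceProxyRequest.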


/**
* Strip transport-related headers from the Response headers
*
@@ -520,6 +564,18 @@ export class ProxyResponseHandler {
duration,
errorMessageForFinalize
);

emitLangfuseTrace(session, {
responseHeaders: response.headers,
responseText,
usageMetrics: parseUsageFromResponseText(responseText, provider.providerType)
.usageMetrics,
costUsd: undefined,
statusCode,
durationMs: duration,
isStreaming: false,
errorMessage: errorMessageForFinalize,
});
} catch (error) {
if (!isClientAbortError(error as Error)) {
logger.error(
@@ -687,10 +743,11 @@ export class ProxyResponseHandler {
await trackCostToRedis(session, usageMetrics);
}

// Update session usage in Redis (for real-time monitoring)
if (session.sessionId && usageMetrics) {
// Calculate cost (reuses the same logic)
let costUsdStr: string | undefined;
// Calculate cost for session tracking (with multiplier) and Langfuse (raw)
let costUsdStr: string | undefined;
let rawCostUsdStr: string | undefined;
let costBreakdown: CostBreakdown | undefined;
if (usageMetrics) {
try {
if (session.request.model) {
const priceData = await session.getCachedPriceDataByBillingSource();
Expand All @@ -704,14 +761,41 @@ export class ProxyResponseHandler {
if (cost.gt(0)) {
costUsdStr = cost.toString();
}
// Raw cost without multiplier for Langfuse
if (provider.costMultiplier !== 1) {
const rawCost = calculateRequestCost(
usageMetrics,
priceData,
1.0,
session.getContext1mApplied()
);
if (rawCost.gt(0)) {
rawCostUsdStr = rawCost.toString();
}
} else {
rawCostUsdStr = costUsdStr;
}
Comment on lines +764 to +777

medium

This logic for calculating the raw cost for Langfuse (without the provider's cost multiplier) appears to be duplicated from the handleStream method (lines 1700-1713). To improve maintainability, consider extracting this into a private helper method within the ProxyResponseHandler class.
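
As a rough illustration of that refactor, a file-local helper in response-handler.ts could compute the multiplied cost, the raw cost, and the breakdown in one place. This is only a sketch assembled from the calls already visible in this diff; the helper name computeCostsForTrace is hypothetical:

// Hypothetical file-local helper for response-handler.ts (not part of this PR).
// It folds the two duplicated blocks into one place, using only functions the
// file already imports: calculateRequestCost and calculateRequestCostBreakdown.
type PriceData = Parameters<typeof calculateRequestCost>[1];

function computeCostsForTrace(
  usage: UsageMetrics,
  priceData: PriceData,
  costMultiplier: number,
  context1mApplied: Parameters<typeof calculateRequestCost>[3]
): { costUsd?: string; rawCostUsd?: string; breakdown?: CostBreakdown } {
  const out: { costUsd?: string; rawCostUsd?: string; breakdown?: CostBreakdown } = {};

  // Cost with the provider multiplier (used for session tracking / billing).
  const cost = calculateRequestCost(usage, priceData, costMultiplier, context1mApplied);
  if (cost.gt(0)) out.costUsd = cost.toString();

  // Raw cost without the multiplier (reported to Langfuse).
  if (costMultiplier !== 1) {
    const rawCost = calculateRequestCost(usage, priceData, 1.0, context1mApplied);
    if (rawCost.gt(0)) out.rawCostUsd = rawCost.toString();
  } else {
    out.rawCostUsd = out.costUsd;
  }

  // Breakdown is best-effort and non-critical, mirroring the existing try/catch.
  try {
    out.breakdown = calculateRequestCostBreakdown(usage, priceData, context1mApplied);
  } catch {
    /* non-critical */
  }

  return out;
}

Both handleNonStream and handleStream could then call it once inside their existing try blocks and destructure costUsd, rawCostUsd, and breakdown.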

// Cost breakdown for Langfuse (raw, no multiplier)
try {
costBreakdown = calculateRequestCostBreakdown(
usageMetrics,
priceData,
session.getContext1mApplied()
);
} catch {
/* non-critical */
}
}
}
} catch (error) {
logger.error("[ResponseHandler] Failed to calculate session cost, skipping", {
error: error instanceof Error ? error.message : String(error),
});
}
}

// Update session usage in Redis (for real-time monitoring)
if (session.sessionId && usageMetrics) {
void SessionManager.updateSessionUsage(session.sessionId, {
inputTokens: usageMetrics.input_tokens,
outputTokens: usageMetrics.output_tokens,
@@ -782,6 +866,17 @@
providerName: provider.name,
statusCode,
});

emitLangfuseTrace(session, {
responseHeaders: response.headers,
responseText,
usageMetrics,
costUsd: rawCostUsdStr,
costBreakdown,
statusCode,
durationMs: Date.now() - session.startTime,
isStreaming: false,
});
} catch (error) {
// 检测 AbortError 的来源:响应超时 vs 客户端中断
const err = error as Error;
@@ -1220,6 +1315,18 @@
finalized.errorMessage ?? undefined,
finalized.providerIdForPersistence ?? undefined
);

emitLangfuseTrace(session, {
responseHeaders: response.headers,
responseText: allContent,
usageMetrics: parseUsageFromResponseText(allContent, provider.providerType)
.usageMetrics,
costUsd: undefined,
statusCode: finalized.effectiveStatusCode,
durationMs: duration,
isStreaming: true,
errorMessage: finalized.errorMessage ?? undefined,
});
} catch (error) {
const err = error instanceof Error ? error : new Error(String(error));
const clientAborted = session.clientAbortSignal?.aborted ?? false;
@@ -1588,11 +1695,13 @@
// Track spend in Redis (for rate limiting)
await trackCostToRedis(session, usageForCost);

// Update session usage in Redis (for real-time monitoring)
if (session.sessionId) {
let costUsdStr: string | undefined;
// Calculate cost for session tracking (with multiplier) and Langfuse (raw)
let costUsdStr: string | undefined;
let rawCostUsdStr: string | undefined;
let costBreakdown: CostBreakdown | undefined;
if (usageForCost) {
try {
if (usageForCost && session.request.model) {
if (session.request.model) {
const priceData = await session.getCachedPriceDataByBillingSource();
if (priceData) {
const cost = calculateRequestCost(
Expand All @@ -1604,14 +1713,41 @@ export class ProxyResponseHandler {
if (cost.gt(0)) {
costUsdStr = cost.toString();
}
// Raw cost without multiplier for Langfuse
if (provider.costMultiplier !== 1) {
const rawCost = calculateRequestCost(
usageForCost,
priceData,
1.0,
session.getContext1mApplied()
);
if (rawCost.gt(0)) {
rawCostUsdStr = rawCost.toString();
}
} else {
rawCostUsdStr = costUsdStr;
}
Comment on lines +1716 to +1729

medium

This is a duplication of the raw cost calculation logic found in handleNonStream (lines 760-773). As suggested in the other comment, refactoring this into a shared helper method would reduce code duplication.

// Cost breakdown for Langfuse (raw, no multiplier)
try {
costBreakdown = calculateRequestCostBreakdown(
usageForCost,
priceData,
session.getContext1mApplied()
);
} catch {
/* non-critical */
}
}
}
} catch (error) {
logger.error("[ResponseHandler] Failed to calculate session cost (stream), skipping", {
error: error instanceof Error ? error.message : String(error),
});
}
}

// Update session usage in Redis (for real-time monitoring)
if (session.sessionId) {
const payload: SessionUsageUpdate = {
status: effectiveStatusCode >= 200 && effectiveStatusCode < 300 ? "completed" : "error",
statusCode: effectiveStatusCode,
@@ -1650,6 +1786,19 @@
providerId: providerIdForPersistence ?? session.provider?.id, // 更新最终供应商ID(重试切换后)
context1mApplied: session.getContext1mApplied(),
});

emitLangfuseTrace(session, {
responseHeaders: response.headers,
responseText: allContent,
usageMetrics: usageForCost,
costUsd: rawCostUsdStr,
costBreakdown,
statusCode: effectiveStatusCode,
durationMs: duration,
isStreaming: true,
sseEventCount: chunks.length,
errorMessage: streamErrorMessage ?? undefined,
});
};

try {
@@ -2919,6 +3068,18 @@ async function persistRequestFailure(options: {
});
}
}

// Emit Langfuse trace for error/abort paths
emitLangfuseTrace(session, {
responseHeaders: new Headers(),
responseText: "",
usageMetrics: null,
costUsd: undefined,
statusCode,
durationMs: duration,
isStreaming: phase === "stream",
errorMessage,
});
}

/**
16 changes: 16 additions & 0 deletions src/app/v1/_lib/proxy/session.ts
@@ -67,6 +67,12 @@ export class ProxySession {
// Time To First Byte (ms). Streaming: first chunk. Non-stream: equals durationMs.
ttfbMs: number | null = null;

// Timestamp when guard pipeline finished and forwarding started (epoch ms).
forwardStartTime: number | null = null;

// Actual serialized request body sent to upstream (after all preprocessing).
forwardedRequestBody: string | null = null;

// Session ID (used for session stickiness and concurrent rate limiting)
sessionId: string | null;

@@ -313,6 +319,16 @@ export class ProxySession {
return value;
}

/**
* Record the timestamp when guard pipeline finished and upstream forwarding begins.
* Called once; subsequent calls are no-ops.
*/
recordForwardStart(): void {
if (this.forwardStartTime === null) {
this.forwardStartTime = Date.now();
}
}

/**
* Set the session ID
*/
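
This page shows the new forwardStartTime / forwardedRequestBody fields but not their consumer; as an assumed illustration (not confirmed by this PR), a trace emitter could use forwardStartTime to separate guard-pipeline time from upstream time:

// Illustrative only: splits total duration using the new ProxySession fields.
// The import path is assumed from the diff's file location.
import type { ProxySession } from "@/app/v1/_lib/proxy/session";

function splitDurations(session: ProxySession, endTime: number = Date.now()) {
  const totalMs = endTime - session.startTime;
  // Time spent in guards/auth/routing before the upstream call started.
  const preForwardMs =
    session.forwardStartTime !== null ? session.forwardStartTime - session.startTime : null;
  // Time spent waiting on the upstream provider.
  const upstreamMs =
    session.forwardStartTime !== null ? endTime - session.forwardStartTime : null;
  return { totalMs, preForwardMs, upstreamMs };
}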
19 changes: 19 additions & 0 deletions src/instrumentation.ts
@@ -140,6 +140,15 @@ function warmupApiKeyVacuumFilter(): void {
export async function register() {
// Only run on the server side
if (process.env.NEXT_RUNTIME === "nodejs") {
// Initialize Langfuse observability (no-op if env vars not set)
try {
const { initLangfuse } = await import("@/lib/langfuse");
await initLangfuse();
} catch (error) {
logger.warn("[Instrumentation] Langfuse initialization failed (non-critical)", {
error: error instanceof Error ? error.message : String(error),
});
}
// Skip initialization in CI environment (no DB connection needed)
if (process.env.CI === "true") {
logger.warn(
@@ -216,6 +225,16 @@ export async function register() {
});
}

// Flush Langfuse pending spans
try {
const { shutdownLangfuse } = await import("@/lib/langfuse");
await shutdownLangfuse();
} catch (error) {
logger.warn("[Instrumentation] Failed to shutdown Langfuse", {
error: error instanceof Error ? error.message : String(error),
});
}

// Best-effort flush of pending async batched message_request updates to the database (avoid losing trailing logs on shutdown)
try {
const { stopMessageRequestWriteBuffer } = await import(
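
For orientation, one plausible shape for the @/lib/langfuse module that register() imports here. Only the function names initLangfuse and shutdownLangfuse come from this diff; the internals below are assumptions about the Langfuse v4 OpenTelemetry integration and may not match the actual module:

// Hypothetical sketch of @/lib/langfuse (assumed, not shown in this PR).
// Assumptions: @langfuse/otel exports LangfuseSpanProcessor (which reads the
// LANGFUSE_* env vars), and @opentelemetry/sdk-node's NodeSDK accepts spanProcessors.
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { NodeSDK } from "@opentelemetry/sdk-node";

let sdk: NodeSDK | null = null;

export async function initLangfuse(): Promise<void> {
  // Same enablement check as the isLangfuseEnabled() sketch earlier on this page.
  if (!process.env.LANGFUSE_PUBLIC_KEY || !process.env.LANGFUSE_SECRET_KEY || sdk) return;
  sdk = new NodeSDK({ spanProcessors: [new LangfuseSpanProcessor()] });
  sdk.start();
}

export async function shutdownLangfuse(): Promise<void> {
  if (!sdk) return;
  await sdk.shutdown(); // flushes pending spans before the process exits
  sdk = null;
}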