perf: tune skill embedding concurrency limit #403

@bug-ops

Problem

The skill matcher calls buffer_unordered(50) with a hard-coded limit and no per-request timeout, so up to 50 embedding HTTP requests can be in flight at once.

File: crates/zeph-skills/src/matcher.rs lines 20-37

Current code:

stream::iter(skills.iter().enumerate())
    .map(|(i, skill)| { embed_fn(&skill.description) })
    .buffer_unordered(50)  // Up to 50 embedding requests in flight at once
    .filter_map(|x| async { x })
    .collect()
    .await;

Impact

  • Memory: about 175 KB peak (50 skills × 3.5 KB per embedding)
  • Network: 50 simultaneous requests may hit the embedding API's rate limits
  • No timeout: a single slow or hung embedding request delays completion of the entire batch

Solution

  1. Reduce concurrency to 20 (a more conservative starting point for network I/O; the best value depends on the embedding provider's rate limits)
  2. Add per-embedding timeout:
.map(|(i, skill)| {
    let fut = timeout(Duration::from_secs(5), embed_fn(&skill.description));
    // handle timeout
})
.buffer_unordered(20)

Priority: P1
Effort: Small (1 hour)
Related to #391

Metadata

Assignees

No one assigned

    Labels

    performance (Performance optimization), skills (SKILL.md system)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests