@waleedlatif1
Collaborator

Summary

Restrict the docs embeddings CI job to English documentation only. Embeddings were previously generated for every language, which made updates slow.

Type of Change

  • Other: performance

Testing

Tested manually.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel bot commented Sep 26, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
sim Building Building Preview Comment Sep 26, 2025 0:51am
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
docs Skipped Skipped Sep 26, 2025 0:51am

@waleedlatif1 waleedlatif1 merged commit e49cde7 into staging Sep 26, 2025
4 of 5 checks passed
Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Summary

Modified the documentation embeddings CI process to target only English documentation by changing the default docs path from /docs to /docs/en. This optimization reduces processing time by excluding German, Spanish, French, Japanese, and Chinese documentation from the embeddings generation.

  • Performance improvement: Processes only English docs instead of all 6 languages
  • Path change: Updated default docsPath from docs/ to docs/en/
  • Maintains compatibility: Still allows custom path override via options parameter
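
A minimal sketch of the default-with-override behavior described in the bullets above. The diff itself is not shown in this conversation, so the `ProcessDocsOptions` interface, the `DEFAULT_DOCS_PATH` constant, and the `resolveDocsPath` helper are hypothetical names used only for illustration; only the `docsPath` option and the `docs/en` default come from the review text.

```typescript
// Hypothetical option shape: the real script's options are not shown in this PR.
interface ProcessDocsOptions {
  docsPath?: string
}

// Assumed default after this change: the English subdirectory of the docs root.
const DEFAULT_DOCS_PATH = 'docs/en'

// Return the caller-supplied path when one is given, otherwise the English default.
function resolveDocsPath(options: ProcessDocsOptions = {}): string {
  return options.docsPath ?? DEFAULT_DOCS_PATH
}

// Default run (for example from CI): English docs only.
const ciPath = resolveDocsPath()

// Explicit override: a caller can still point the script at another directory.
const customPath = resolveDocsPath({ docsPath: 'docs/fr' })
```

Using `??` rather than `||` means only a missing `docsPath` falls back to the default; whether the real script treats an empty string the same way is not stated in the PR.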

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Single-line change that appends '/en' to the existing path. The target directory exists, and all existing functionality is preserved while performance improves
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| apps/sim/scripts/process-docs-embeddings.ts | 5/5 | Simple path modification to target only English docs; correctly appends '/en' to the existing path structure |

Sequence Diagram

sequenceDiagram
    participant CI as CI Pipeline
    participant Script as process-docs-embeddings.ts
    participant FS as File System
    participant Chunker as DocsChunker
    participant DB as Database

    Note over CI,DB: Before: Processing all languages
    CI->>Script: Execute with default config
    Script->>FS: Read docs from /docs/ (all languages)
    FS-->>Script: Return docs from en/, es/, fr/, de/, ja/, zh/
    Script->>Chunker: Process all language docs
    Chunker-->>Script: Generate embeddings for all languages
    Script->>DB: Store embeddings

    Note over CI,DB: After: Processing only English docs
    CI->>Script: Execute with modified config
    Script->>FS: Read docs from /docs/en/ (English only)
    FS-->>Script: Return docs from en/ only
    Script->>Chunker: Process English docs only
    Chunker-->>Script: Generate embeddings for English only
    Script->>DB: Store English embeddings
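
The before/after difference in the diagram can be sketched as a filter over a list of documentation files. This is an illustration only: the real script reads from the file system, while the version below works on an in-memory list, and the `filterByLocale` helper and the example file names are assumptions. The locale codes are the ones listed in the diagram.

```typescript
// Locale directories named in the diagram above.
const LOCALES = ['en', 'es', 'fr', 'de', 'ja', 'zh'] as const
type Locale = (typeof LOCALES)[number]

// Hypothetical helper: keep only files under <docsRoot>/<locale>/.
function filterByLocale(files: string[], docsRoot: string, locale: Locale): string[] {
  const root = docsRoot.replace(/\/+$/, '') // tolerate a trailing slash
  const prefix = `${root}/${locale}/`
  return files.filter((file) => file.startsWith(prefix))
}

// Hypothetical listing; actual file names in the repository may differ.
const allFiles = [
  'docs/en/introduction.mdx',
  'docs/en/tools/overview.mdx',
  'docs/es/introduction.mdx',
  'docs/fr/introduction.mdx',
  'docs/de/introduction.mdx',
  'docs/ja/introduction.mdx',
  'docs/zh/introduction.mdx',
]

// Before: every file is chunked and embedded. After: only the English subset.
const englishOnly = filterByLocale(allFiles, 'docs/', 'en')
console.log(`${englishOnly.length} of ${allFiles.length} files selected`)
```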

1 file reviewed, no comments
