@waleedlatif1
Collaborator

Summary

  • Modified the embeddings utils to index only English docs

Type of Change

  • Bug fix

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel bot commented Nov 20, 2025

Project | Deployment | Preview | Comments | Updated (UTC)
docs | Ready | Preview | Comment | Nov 20, 2025 10:04pm

@greptile-apps
Contributor

greptile-apps bot commented Nov 20, 2025

Greptile Overview

Greptile Summary

This PR restricts embeddings processing to English documentation only, removing multi-language support. The default docs path is changed from docs to docs/en, and CI workflows are optimized to only trigger embeddings jobs when English docs change.

Key Changes:

  • Modified default path to only process English docs (docs/en instead of docs)
  • Added path filtering in CI to skip embeddings job when English docs unchanged
  • Removed staging environment support from embeddings workflow
  • Made the embeddings job always run with the --clear flag to ensure database consistency
  • Updated help documentation to clarify English-only processing
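As a rough illustration of the default-path change, the sketch below parses a list of CLI arguments and falls back to docs/en when no path is given. It is not the code from process-docs.ts: the `ProcessDocsOptions` shape, the `parseArgs` helper, and the `--path` flag name are assumptions made for this example; only the `docs/en` default and the `--clear` flag come from the summary above.

```typescript
// Hypothetical option shape; the real script's options are not shown in this PR.
interface ProcessDocsOptions {
  docsPath: string
  clear: boolean
}

// Default taken from the summary above: English docs only.
const DEFAULT_DOCS_PATH = 'docs/en'

// Parse a list of CLI arguments (for example process.argv.slice(2)).
// `--path <dir>` is an assumed flag name; `--clear` is the flag named in the PR.
function parseArgs(argv: string[]): ProcessDocsOptions {
  const options: ProcessDocsOptions = { docsPath: DEFAULT_DOCS_PATH, clear: false }
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i]
    const next = argv[i + 1]
    if (arg === '--clear') {
      options.clear = true
    } else if (arg === '--path' && next !== undefined) {
      options.docsPath = next
      i++ // skip the value that was just consumed
    }
  }
  return options
}
```

With this sketch, `parseArgs(['--clear'])` yields the English-only default path with clearing enabled, while an explicit `--path` value overrides the default.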

Issues Found:

  • Critical workflow dependency bug: the process-docs job depends on check-docs-changes, which runs only on the main branch, creating an invalid dependency graph

Confidence Score: 3/5

  • This PR has a workflow dependency bug that can cause CI failures
  • In ci.yml, the process-docs job depends on check-docs-changes, which runs only on the main branch, while the condition on process-docs could in principle be met on other branches. This makes the dependency graph inconsistent. The script changes are solid, but the workflow logic needs fixing.
  • Fix the dependency logic in .github/workflows/ci.yml before merging
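To make the dependency concern concrete, here is a simplified TypeScript model of how a job that `needs` another job behaves when the needed job is skipped. It relies on the documented GitHub Actions default that a job is skipped when a job it needs was skipped, unless its condition uses a status function such as `always()`. The `Job` shape and the `evaluate` function are invented for this sketch, every job that runs is assumed to succeed, and the concrete conditions in ci.yml are not shown in this conversation, so the values below are assumptions.

```typescript
type JobResult = 'success' | 'skipped'

// Hypothetical, simplified description of a workflow job.
interface Job {
  name: string
  needs: string[]
  // Whether the job's own `if:` condition holds for this run.
  condition: boolean
  // True when the `if:` expression uses a status function such as always(),
  // which lets the job run even if a needed job did not succeed.
  ignoresNeedsStatus?: boolean
}

// Evaluate jobs in the given order (assumed to be dependency order).
// Simplification: any job that runs is treated as successful.
function evaluate(jobs: Job[]): Map<string, JobResult> {
  const results = new Map<string, JobResult>()
  for (const job of jobs) {
    const needsOk = job.needs.every((n) => results.get(n) === 'success')
    const runs = job.condition && (needsOk || job.ignoresNeedsStatus === true)
    results.set(job.name, runs ? 'success' : 'skipped')
  }
  return results
}

// Assumed scenario: a push to a non-main branch, where check-docs-changes
// does not run but the condition on process-docs would otherwise hold.
const offMain = evaluate([
  { name: 'check-docs-changes', needs: [], condition: false },
  { name: 'process-docs', needs: ['check-docs-changes'], condition: true },
])
```

In this model `process-docs` ends up skipped off main even though its own condition holds, which is one way to read the mismatch the review describes. Aligning the two conditions would remove the ambiguity.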

Important Files Changed

File Analysis

Filename | Score | Overview
.github/workflows/ci.yml | 4/5 | Added path filter to only trigger docs embeddings when English docs change, improving CI efficiency
.github/workflows/docs-embeddings.yml | 3/5 | Removed staging environment support and simplified to only run on main branch with --clear flag
apps/sim/scripts/process-docs.ts | 4/5 | Updated default path to process only English docs (docs/en), with improved help text documenting the change

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub (main branch)
    participant CI as CI Workflow
    participant Filter as Path Filter
    participant Embed as Docs Embeddings Job
    participant Script as process-docs.ts
    participant DB as Database

    Dev->>GH: Push to main branch
    GH->>CI: Trigger CI workflow
    CI->>CI: Run tests & build AMD64 images
    
    par Check for docs changes
        CI->>Filter: Check if docs changed
        Filter->>Filter: Check paths:<br/>- apps/docs/content/docs/en/**<br/>- process-docs.ts<br/>- chunkers/**
        Filter-->>CI: docs_changed = true/false
    end
    
    alt docs_changed == true
        CI->>Embed: Trigger docs-embeddings workflow
        Embed->>Script: Run process-docs.ts --clear
        Script->>Script: Read from docs/en only
        Script->>Script: Generate embeddings for English docs
        Script->>DB: Clear existing embeddings
        Script->>DB: Insert new English-only embeddings
        DB-->>Embed: Success
    else docs_changed == false
        CI->>CI: Skip docs processing
    end
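The path-filter step in the diagram can be approximated in TypeScript as a check over the list of changed files. This is only a sketch: the actual filter lives in the workflow YAML, which is not reproduced here. The diagram abbreviates two of the three patterns (`process-docs.ts` and `chunkers/**`), so the sketch matches them by file name and by directory name rather than guessing their full paths. The `isDocsChange` and `docsChanged` helpers are hypothetical names.

```typescript
// Return true when a single changed file should trigger the embeddings job.
function isDocsChange(file: string): boolean {
  const segments = file.split('/')
  const fileName = segments[segments.length - 1]
  // Full prefix as written in the diagram: apps/docs/content/docs/en/**
  const inEnglishDocs = file.startsWith('apps/docs/content/docs/en/')
  // Abbreviated in the diagram, so match on the file name only (assumption).
  const isScript = fileName === 'process-docs.ts'
  // Abbreviated in the diagram, so match any `chunkers` directory (assumption).
  const inChunkers = segments.slice(0, -1).includes('chunkers')
  return inEnglishDocs || isScript || inChunkers
}

// Equivalent of the diagram's `docs_changed` output for a set of changed files.
function docsChanged(changedFiles: string[]): boolean {
  return changedFiles.some(isDocsChange)
}
```

Under these assumptions, a change to a file under `apps/docs/content/docs/en/` sets `docs_changed` to true, while a change confined to another language directory under `apps/docs/content/docs/` leaves it false and the embeddings job is skipped.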

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 1 comment

@waleedlatif1
Collaborator Author

@greptile

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 1 comment

@waleedlatif1 waleedlatif1 merged commit 4a0450d into staging Nov 20, 2025
4 of 5 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/embeddings branch November 20, 2025 22:03