feat(kb-tags): natural language pre-filter tag system for knowledge base searches #800
Greptile Summary
This PR implements a comprehensive natural language tagging system for knowledge base documents that enables pre-filtering before vector search operations. The system allows users to create custom, human-readable tag names (like 'Department' or 'Priority') instead of generic numbered labels, while maintaining efficient search performance through actual database columns.
The core architecture maps natural language tag names to fixed database columns (tag1-tag7) through a new knowledge_base_tag_definitions table. This design choice provides good performance for filtering operations while capping the system at 7 tags per knowledge base. The implementation includes:
- Database Layer: New migration adds `knowledge_base_tag_definitions` table and indexes tag columns on both `document` and `embedding` tables for efficient filtering
- API Layer: New endpoints for managing tag definitions at the knowledge base level, with proper CRUD operations and cleanup utilities
- Frontend Components: New React components (`KnowledgeTagFilters`, `DocumentTagEntry`, `KnowledgeTagFilter`) that integrate with the existing sub-block workflow system
- Block Integration: Updated Knowledge block to use the new tag system, replacing individual tag1-tag7 inputs with dynamic tag management
- Search Enhancement: Modified search functionality to support OR logic within tag groups and restrict to single knowledge base searches
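The "OR logic within tag groups" behavior from the list above can be illustrated with a small in-memory sketch. This is not the PR's implementation: the type names and the `matchesTagFilters` and `preFilter` helpers are hypothetical, and combining different tag groups with AND is an assumption, since the summary only states that values within a group are ORed.

```typescript
// Hypothetical shapes for illustration only; none of these names come from the PR.
type TagSlot = 'tag1' | 'tag2' | 'tag3' | 'tag4' | 'tag5' | 'tag6' | 'tag7'
type TaggedRow = Partial<Record<TagSlot, string | null>>
// One group per tag slot: the list holds the values accepted for that slot.
type TagFilters = Partial<Record<TagSlot, string[]>>

function matchesTagFilters(row: TaggedRow, filters: TagFilters): boolean {
  for (const slot of Object.keys(filters) as TagSlot[]) {
    const accepted = filters[slot] ?? []
    if (accepted.length === 0) continue // an empty group imposes no constraint

    const value = row[slot]
    // OR within the group: any one accepted value is enough.
    const hit = value != null && accepted.includes(value)
    // Assumption: groups are combined with AND, so one failed group rejects the row.
    if (!hit) return false
  }
  return true
}

// Narrow the candidate set before any (expensive) vector similarity step.
function preFilter<T extends TaggedRow>(rows: T[], filters: TagFilters): T[] {
  return rows.filter((row) => matchesTagFilters(row, filters))
}
```

In a real deployment the same predicate would more likely be expressed as a SQL `WHERE` clause over the indexed tag columns, so that filtering happens in the database rather than in application memory.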
The system maintains backward compatibility with existing tag workflows while providing a more intuitive user experience. Users can now define meaningful tag categories and filter knowledge base content using natural language terms before expensive vector similarity searches.
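As a rough illustration of how natural language tag names could map onto the fixed tag1-tag7 columns, the sketch below assigns each new name the first free slot and refuses an eighth definition. The `TAG_SLOTS` constant, the `TagDefinition` shape, and both helper functions are hypothetical names chosen for this example; the actual table schema and allocation logic in the PR may differ.

```typescript
// Hypothetical constant and types; the PR's real names and schema are not shown in this summary.
const TAG_SLOTS = ['tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6', 'tag7'] as const
type TagSlot = (typeof TAG_SLOTS)[number]

interface TagDefinition {
  displayName: string // human-readable name, e.g. 'Department'
  tagSlot: TagSlot // fixed database column the name is bound to
}

// Return the existing definition for a name, or bind the name to the first free slot.
function assignSlot(existing: TagDefinition[], displayName: string): TagDefinition {
  const found = existing.find((d) => d.displayName === displayName)
  if (found) return found

  const used = new Set(existing.map((d) => d.tagSlot))
  const free = TAG_SLOTS.find((slot) => !used.has(slot))
  if (!free) {
    throw new Error(`All ${TAG_SLOTS.length} tag slots are already in use`)
  }
  return { displayName, tagSlot: free }
}

// Translate { 'Department': 'Engineering' } into { tag1: 'Engineering' }.
function toColumnValues(
  defs: TagDefinition[],
  tags: Record<string, string>
): Partial<Record<TagSlot, string>> {
  const columns: Partial<Record<TagSlot, string>> = {}
  for (const [name, value] of Object.entries(tags)) {
    const def = defs.find((d) => d.displayName === name)
    if (!def) throw new Error(`Unknown tag name: ${name}`)
    columns[def.tagSlot] = value
  }
  return columns
}
```

Keeping the mapping in a definitions table means the document and embedding rows only ever store values in fixed columns, which is what lets ordinary column indexes serve the pre-filter.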
Confidence score: 4/5
- This PR introduces significant new functionality with proper architectural design and maintains backward compatibility
- The implementation follows established patterns and includes comprehensive error handling and validation
- Potential concerns include some type assertions, magic number usage for ID generation, and the restriction to single knowledge base searches which may impact existing workflows
27 files reviewed, 15 comments
...rkflow-block/components/sub-block/components/knowledge-tag-filters/knowledge-tag-filters.tsx
...nts/workflow-block/components/sub-block/components/document-tag-entry/document-tag-entry.tsx
apps/sim/app/api/knowledge/[id]/documents/[documentId]/route.ts
apps/sim/app/api/knowledge/[id]/documents/[documentId]/tag-definitions/route.ts
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…m into feat/kb-tags-natural-desc
…ase searches (simstudioai#800)
* fix lint
* checkpoint
* works
* simplify
* checkpoint
* works
* fix lint
* checkpoint - create doc ui
* working block
* fix import conflicts
* fix tests
* add blockers to going past max tag slots
* remove console logs
* forgot a few
* Update apps/sim/tools/knowledge/search.ts Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* remove console.warn
* Update apps/sim/hooks/use-tag-definitions.ts Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* use tag slots consts in more places
* remove duplicate title
---------
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Description
Adds natural language tag names for knowledge base records, which can be used to pre-filter results before vector search.
System design:
Type of Change
How Has This Been Tested?
Tested adding and editing tags in the KB Block [Create Document Tool], and searching with tag filters in the search tool.
Checklist:
bun run test)
Security Considerations: