
[FEATURE] Enhanced adaptive token pruning for neural sparse search #989

Open
martin-gaievski opened this issue Nov 16, 2024 · 1 comment

@martin-gaievski (Member)

Enhance the basic token pruning mechanism with adaptive capabilities to optimize storage efficiency while preserving search quality.

The basic token pruning (covered in #946) uses fixed thresholds and limits. This enhancement proposes adaptive mechanisms that automatically adjust pruning parameters based on content characteristics and quality metrics.

Proposed Functionality

1. Dynamic Threshold Adjustment

PUT _neural/sparse_model/config
{
  "name": "adaptive_pruning_config",
  "pruning": {
    "mode": "adaptive",
    "quality_target": 0.95,
    "token_budget": {
      "min": 50,
      "max": 500
    },
    "weight_threshold": {
      "base": 0.001,
      "adaptive": true
    }
  }
}
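
To make the "adaptive" mode concrete, here is a minimal sketch of how an effective weight threshold could interact with the token budget. The class and method names are hypothetical and only illustrate the idea; they are not part of the plugin.

import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the plugin API: prune a sparse vector (token -> weight)
// against a base weight threshold while clamping the number of kept tokens into
// the configured [min, max] budget.
public final class AdaptivePruner {

    public static Map<String, Float> prune(Map<String, Float> tokens,
                                           float baseThreshold,
                                           int minTokens,
                                           int maxTokens) {
        // Highest-weight tokens first, so the budget limits keep the most important ones.
        List<Map.Entry<String, Float>> sorted = tokens.entrySet().stream()
                .sorted(Map.Entry.<String, Float>comparingByValue(Comparator.reverseOrder()))
                .toList();

        Map<String, Float> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Float> entry : sorted) {
            if (kept.size() >= maxTokens) {
                break;                                      // never exceed the max budget
            }
            boolean aboveThreshold = entry.getValue() >= baseThreshold;
            boolean underMin = kept.size() < minTokens;     // always fill up to the min budget
            if (aboveThreshold || underMin) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Float> sparseVector = Map.of(
                "neural", 2.1f, "sparse", 1.7f, "search", 1.2f,
                "the", 0.0005f, "a", 0.0002f);
        // With base threshold 0.001 and budget [2, 4], only the informative tokens survive.
        System.out.println(prune(sparseVector, 0.001f, 2, 4));
    }
}

The budget acts as a hard floor and ceiling, so very short and very verbose documents both end up inside the configured range regardless of where the threshold happens to sit.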

2. Quality Preservation

  • Monitor quality metrics during pruning
  • Adjust parameters to maintain specified quality target
  • Support different quality metrics (precision, recall, MRR); a rough sketch of the monitoring loop follows this list
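
A minimal sketch of the monitoring loop referenced above, assuming recall@k against the unpruned index as the quality metric. All names here are hypothetical, not plugin APIs.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the quality-preservation feedback loop, not actual plugin code.
// It compares top-k results from the unpruned baseline against results from the pruned
// index and nudges the weight threshold to keep the chosen metric near the target.
public final class QualityGuard {

    // Recall@k of the pruned result list measured against the unpruned baseline.
    static double recallAtK(List<String> baselineIds, List<String> prunedIds, int k) {
        Set<String> baseline = new HashSet<>(baselineIds.subList(0, Math.min(k, baselineIds.size())));
        long hits = prunedIds.stream().limit(k).filter(baseline::contains).count();
        return baseline.isEmpty() ? 1.0 : (double) hits / baseline.size();
    }

    // One adjustment step: loosen pruning when quality drops below target,
    // tighten it (to save storage) when there is comfortable headroom.
    static float adjustThreshold(float threshold, double observedQuality, double qualityTarget) {
        if (observedQuality < qualityTarget) {
            return threshold * 0.5f;          // prune less aggressively
        } else if (observedQuality > qualityTarget + 0.02) {
            return threshold * 1.5f;          // prune more aggressively
        }
        return threshold;
    }

    public static void main(String[] args) {
        List<String> baseline = List.of("d1", "d2", "d3", "d4", "d5");
        List<String> pruned = List.of("d1", "d3", "d2", "d9", "d5");
        double recall = recallAtK(baseline, pruned, 5);
        System.out.println("recall@5 = " + recall + ", new threshold = "
                + adjustThreshold(0.001f, recall, 0.95));
    }
}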

3. Token Importance Analysis

  • Analyze semantic importance of tokens
  • Consider token relationships
  • Preserve critical tokens for search quality (rough scoring sketch below)
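
One possible way to score token importance, shown as a hypothetical sketch that blends the model-produced weight with a BM25-style IDF signal so rare but discriminative tokens survive pruning even when their raw weight is modest. None of these names exist in the plugin.

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of token importance analysis, not actual plugin code.
public final class TokenImportance {

    // BM25-style inverse document frequency: higher for rarer tokens.
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    static Map<String, Double> importanceScores(Map<String, Float> tokenWeights,
                                                Map<String, Long> docFreqs,
                                                long docCount) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, Float> e : tokenWeights.entrySet()) {
            long df = docFreqs.getOrDefault(e.getKey(), 1L);
            // Importance = model weight scaled by corpus rarity.
            scores.put(e.getKey(), e.getValue() * idf(docCount, df));
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Float> weights = Map.of("opensearch", 0.4f, "the", 0.4f);
        Map<String, Long> docFreqs = Map.of("opensearch", 1_000L, "the", 9_000_000L);
        // Same raw weight, very different importance once rarity is considered.
        System.out.println(importanceScores(weights, docFreqs, 10_000_000L));
    }
}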

If implemented, this solution promises the following benefits:

  • Improved storage efficiency
  • Better search quality preservation
  • Automatic adaptation to content
  • Reduced manual configuration

As of now I see the following dependencies:

  • requires basic token pruning ([Enhancement] Implement pruning for neural sparse search #988)
  • neural sparse search functionality
  • metrics collection framework: stats collection for pruning metrics is a new component; it can leverage OpenSearch core stats functionality, but pruning-specific metrics collection will need to be added (rough sketch below)
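
For the metrics dependency, a rough sketch of what a pruning-specific stats component could track. Plain counters are used here for illustration; in practice this would be exposed through the usual OpenSearch stats endpoints rather than printed.

import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of a pruning stats component, not actual plugin code.
public final class PruningStats {

    private final LongAdder tokensSeen = new LongAdder();
    private final LongAdder tokensPruned = new LongAdder();
    private final LongAdder documentsProcessed = new LongAdder();

    // Called once per document after pruning has been applied.
    public void onDocumentPruned(int originalTokens, int keptTokens) {
        documentsProcessed.increment();
        tokensSeen.add(originalTokens);
        tokensPruned.add(originalTokens - keptTokens);
    }

    public Map<String, Long> asMap() {
        return Map.of(
                "documents_processed", documentsProcessed.sum(),
                "tokens_seen", tokensSeen.sum(),
                "tokens_pruned", tokensPruned.sum());
    }

    public static void main(String[] args) {
        PruningStats stats = new PruningStats();
        stats.onDocumentPruned(300, 120);
        stats.onDocumentPruned(250, 100);
        System.out.println(stats.asMap());   // pruned ratio can be derived downstream
    }
}
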
@martin-gaievski martin-gaievski changed the title [FEATURE] Enhanced Adaptive Token Pruning for Neural Sparse Search [FEATURE] Enhanced adaptive token pruning for neural sparse search Nov 16, 2024
@dblock dblock removed the untriaged label Dec 9, 2024
@dblock (Member) commented Dec 9, 2024

[Catch All Triage - 1, 2, 3, 4]
