Remove Legacy WatsonX Implementation #219

@manavgup

Problem

The codebase currently has two WatsonX implementations:

  1. Legacy: backend/vectordbs/utils/watsonx.py - Standalone utility functions
  2. Modern: backend/rag_solution/generation/providers/watsonx.py - Factory pattern implementation

Analysis

The modern version (rag_solution/generation/providers/watsonx.py) is actively used in core services:

  • PipelineService uses LLMProviderFactory.get_provider()
  • QuestionService uses the provider factory pattern
  • SystemInitializationService initializes WatsonX models

The legacy version is still referenced in at least 14 files (listed below), but the core services above do not depend on it.

Files Using Legacy Implementation

Vector Store Implementations (5 files):

  • backend/vectordbs/weaviate_store.py
  • backend/vectordbs/chroma_store.py
  • backend/vectordbs/milvus_store.py
  • backend/vectordbs/elasticsearch_store.py
  • backend/vectordbs/pinecone_store.py

Data Processing (3 files):

  • backend/rag_solution/data_ingestion/pdf_processor.py
  • backend/rag_solution/data_ingestion/chunking.py
  • backend/rag_solution/doc_utils.py

Evaluation & Analysis (3 files):

  • backend/rag_solution/evaluation/evaluator.py
  • backend/rag_solution/evaluation/llm_as_judge_evals.py
  • backend/rag_solution/query_rewriting/query_rewriter.py

Tests & Documentation (3 files):

  • backend/tests/unit/test_settings_dependency_injection.py
  • backend/SETTINGS_MIGRATION_PLAN.md
  • docs/fixes/TEST_ISOLATION.md
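A list like the one above could be regenerated, and later re-checked against the acceptance criteria, with a small scan. This is a minimal sketch, not part of the repository: the function name `find_legacy_references` is hypothetical, and it assumes references use the dotted module path `vectordbs.utils.watsonx` (relative imports such as `from .utils.watsonx import ...` would be missed).

```python
from pathlib import Path

# Dotted path of the legacy module, as named in this issue.
LEGACY_MODULE = "vectordbs.utils.watsonx"


def find_legacy_references(root: Path, suffixes=(".py", ".md")) -> list[Path]:
    """Return files under `root` whose text mentions the legacy module path.

    Hypothetical helper: a plain substring match, so it also catches
    mentions in comments and documentation, not only import statements.
    """
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if LEGACY_MODULE in text:
                hits.append(path)
    return hits


# Example usage (run from the repository root):
#   for path in find_legacy_references(Path(".")):
#       print(path)
```

Once the migration is done, the scan should return no files.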

Solution

  1. Update all imports from vectordbs.utils.watsonx to use LLMProviderFactory
  2. Replace direct function calls with provider pattern:
    • get_embeddings() → provider.get_embeddings()
    • generate_text() → provider.generate_text()
  3. Update vector stores to use provider factory for embeddings
  4. Update data processing to use provider pattern for text generation
  5. Remove legacy file backend/vectordbs/utils/watsonx.py
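The steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: `LLMProviderFactory`, `get_provider()`, `get_embeddings()` and `generate_text()` are named in this issue, but their real signatures are not shown here, so the classes below are runnable stand-ins. `StubWatsonXProvider` and `ExampleVectorStore` are hypothetical names.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Assumed provider interface; method names come from this issue."""

    def get_embeddings(self, texts: list[str]) -> list[list[float]]: ...

    def generate_text(self, prompt: str) -> str: ...


class StubWatsonXProvider:
    """Hypothetical stand-in so the sketch runs without WatsonX credentials."""

    def get_embeddings(self, texts: list[str]) -> list[list[float]]:
        # Fake one-dimensional "embedding": the length of each text.
        return [[float(len(t))] for t in texts]

    def generate_text(self, prompt: str) -> str:
        return f"echo: {prompt}"


class LLMProviderFactory:
    """Stand-in for the real factory; its real arguments may differ."""

    def get_provider(self, name: str) -> LLMProvider:
        if name != "watsonx":
            raise ValueError(f"unknown provider: {name}")
        return StubWatsonXProvider()


# Before (legacy, module-level function):
#   from vectordbs.utils.watsonx import get_embeddings
#   vectors = get_embeddings(texts)


class ExampleVectorStore:
    """Hypothetical vector store that receives its provider by injection."""

    def __init__(self, provider: LLMProvider) -> None:
        self._provider = provider

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # After: the call goes through the provider instead of the legacy module.
        return self._provider.get_embeddings(texts)


provider = LLMProviderFactory().get_provider("watsonx")
store = ExampleVectorStore(provider)
vectors = store.embed_documents(["hello", "hi"])
summary = provider.generate_text("Summarize chunk 1")
```

Passing the provider into the constructor, rather than importing a function at module level, is one way to keep the vector stores free of a direct dependency on any single LLM backend.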

Benefits

  • Consistent architecture - all LLM interactions use the same provider pattern
  • Better testability - providers can be easily mocked via factory
  • Database-driven configuration - no hardcoded settings dependencies
  • Resource management - proper client lifecycle management
  • Error handling - standardized LLM provider error handling
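The testability point can be illustrated with a short sketch using the standard library's `unittest.mock`. The function `rewrite_query`, the prompt text, and the `"watsonx"` argument to `get_provider()` are assumptions made for the example; the real factory signature may differ.

```python
from unittest.mock import MagicMock


def rewrite_query(query: str, factory) -> str:
    """Hypothetical consumer that obtains its provider from a factory."""
    provider = factory.get_provider("watsonx")
    return provider.generate_text(f"Rewrite this search query: {query}").strip()


# Replace both the factory and the provider with mocks: no network, no credentials.
mock_provider = MagicMock()
mock_provider.generate_text.return_value = "  improved query  "
mock_factory = MagicMock()
mock_factory.get_provider.return_value = mock_provider

result = rewrite_query("original", mock_factory)

# Verify the interaction with the factory and the provider.
mock_factory.get_provider.assert_called_once_with("watsonx")
mock_provider.generate_text.assert_called_once_with(
    "Rewrite this search query: original"
)
```

With the legacy module-level functions, the same test would need to patch the import path in every consuming module.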

Priority

High - This cleanup will improve code consistency and maintainability.

Acceptance Criteria

  • All 14 files updated to use modern provider pattern
  • Legacy vectordbs/utils/watsonx.py file removed
  • All tests pass
  • No breaking changes to existing functionality
  • Documentation updated to reflect new patterns

Metadata

Labels

backend (Backend/API related), good first issue (Good for newcomers), llm (LLM providers and integration)
