This folder contains all the test files for the MLentory ETL pipeline. Each test module validates one component of the pipeline.
The tests are grouped by pipeline stage (extract, transform, load):
tests/unit/hf/extractors/test_HFExtractor.py
- Tests for the HuggingFace model extraction
tests/unit/hf/extractors/test_ModelCardQAParser.py
- Tests for parsing model card information
tests/unit/hf/transform/test_FieldProcessorHF.py
- Tests for field processing and transformations
tests/unit/hf/load/test_Elasticsearch.py
- Tests for Elasticsearch integration
tests/unit/hf/load/test_GraphHandler.py
- Tests for graph data handling
tests/unit/hf/load/test_IndexHandler.py
- Tests for index management
tests/unit/hf/load/test_SQLHandler.py
- Tests for SQL database operations
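To see how the files on disk map onto this layout, a small helper can list the test modules per stage directory. This is only a sketch: the function name `group_tests_by_stage` is hypothetical and not part of the repository, and it assumes test modules follow the `test_*.py` naming shown above.

```python
from collections import defaultdict
from pathlib import Path


def group_tests_by_stage(root):
    """Map each stage directory under `root` to the test modules it contains.

    Hypothetical helper for illustration; not part of the MLentory codebase.
    """
    root = Path(root)
    groups = defaultdict(list)
    for path in sorted(root.rglob("test_*.py")):
        parts = path.relative_to(root).parts
        # The stage is the first directory under `root` (e.g. "extractors");
        # files sitting directly in `root` are grouped under ".".
        stage = parts[0] if len(parts) > 1 else "."
        groups[stage].append(path.name)
    return dict(groups)
```

Calling `group_tests_by_stage("tests/unit/hf")` from the repository root should return one entry per stage directory, such as `extractors`, `transform`, and `load`.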
- Build the test containers:
docker-compose --profile local build
- Start the test environment:
docker-compose --profile local up
- Access the test container:
docker ps # Find the test container name or ID
docker exec -it <test_container_name> /bin/bash
- Run the tests:
pytest # Run all tests
pytest tests/unit/hf/extractors/ # Run a specific test directory
pytest tests/unit/hf/extractors/test_HFExtractor.py # Run a specific test file
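The same invocations can be scripted, for example in a CI step inside the test container. The sketch below wraps the commands above in Python; `build_pytest_command` and `run_tests` are hypothetical names introduced for illustration, and the code assumes pytest is installed in the interpreter that runs the script.

```python
import subprocess
import sys


def build_pytest_command(target=None, extra_args=()):
    """Return the argv list for a pytest run.

    `target` is an optional directory or file path; `extra_args` holds
    additional pytest flags such as "-x" (stop at the first failure).
    """
    command = [sys.executable, "-m", "pytest"]
    if target is not None:
        command.append(target)
    command.extend(extra_args)
    return command


def run_tests(target=None, extra_args=()):
    """Run pytest in a subprocess and return its exit code (0 means success)."""
    return subprocess.run(build_pytest_command(target, extra_args)).returncode
```

For example, `run_tests("tests/unit/hf/extractors/", ["-x"])` runs only the extractor tests and stops at the first failure.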
If you have WSL2 or are on a Unix-based system:
- Navigate to the tests directory
- Run:
bash scripts/validate_tests.sh
The test environment requires several services:
- PostgreSQL for SQL database testing
- Elasticsearch for search functionality
- Virtuoso for graph database operations
Docker Compose configures these services automatically.
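If tests fail with connection errors, it can help to confirm that the three services are reachable before running pytest. The following is a minimal sketch, not part of the repository: the hostnames in `SERVICES` are assumptions and should be replaced with the service names from your Docker Compose file, and the ports are each product's usual defaults, which the Compose file may remap.

```python
import socket

# Hostnames are assumptions; ports are the usual defaults for each service.
SERVICES = {
    "PostgreSQL": ("postgres", 5432),
    "Elasticsearch": ("elastic", 9200),
    "Virtuoso": ("virtuoso", 1111),
}


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hostnames.
        return False


def check_services(services=SERVICES):
    """Return a dict mapping each service name to its reachability."""
    return {name: is_reachable(host, port) for name, (host, port) in services.items()}
```

A successful TCP connection only shows that the port is open; it does not verify credentials or that the service has finished starting up.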
For more detailed information about the ETL pipeline components and their interactions, refer to the main documentation.