Closed
Description
Model description
TEI currently does not seem to support Nomic's latest embedding model: nomic-ai/nomic-embed-text-v2-moe. When I try to load the model using the container image ghcr.io/huggingface/text-embeddings-inference:1.6.0, I get this error:
tei | 2025-02-21T12:03:12.972121Z INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "nom**-**/*****-*****-****-**-moe", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "379cf6f93689", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
tei | 2025-02-21T12:03:12.972326Z INFO hf_hub: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
tei | 2025-02-21T12:03:13.061604Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
tei | 2025-02-21T12:03:13.061624Z INFO download_artifacts:download_pool_config: text_embeddings_core::download: core/src/download.rs:53: Downloading `1_Pooling/config.json`
tei | 2025-02-21T12:03:13.061695Z INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
tei | 2025-02-21T12:03:13.061735Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
tei | 2025-02-21T12:03:13.061771Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
tei | 2025-02-21T12:03:13.061798Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:47: Model artifacts downloaded in 195.324µs
tei | 2025-02-21T12:03:13.529301Z INFO text_embeddings_router: router/src/lib.rs:188: Maximum number of tokens per request: 512
tei | 2025-02-21T12:03:13.529350Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 4 tokenization workers
tei | 2025-02-21T12:03:14.590727Z INFO text_embeddings_router: router/src/lib.rs:230: Starting model backend
tei | 2025-02-21T12:03:14.590785Z INFO text_embeddings_backend: backends/src/lib.rs:360: Downloading `model.safetensors`
tei | 2025-02-21T12:03:14.590856Z INFO text_embeddings_backend: backends/src/lib.rs:244: Model weights downloaded in 71.202µs
tei | 2025-02-21T12:03:15.057394Z INFO text_embeddings_backend_candle: backends/candle/src/lib.rs:332: Starting FlashNomicBert model on Cuda(CudaDevice(DeviceId(1)))
tei | 2025-02-21T12:03:15.057766Z ERROR text_embeddings_backend: backends/src/lib.rs:255: Could not start Candle backend: Could not start backend: config is not supported
tei | Error: Could not create backend
tei |
tei | Caused by:
tei | Could not start backend: Could not start a suitable backend
tei exited with code 1
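For anyone triaging similar output, here is a minimal sketch of how the failing line could be pulled out of a captured log like the one above. It assumes the line layout seen in this log (`service | timestamp LEVEL ... file.rs:line: message`). The names `LogRecord`, `parse_line`, and `error_records` are hypothetical helpers for illustration and are not part of TEI.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class LogRecord(NamedTuple):
    service: str
    timestamp: str
    level: str
    source_file: str
    source_line: int
    message: str


# Assumed layout: "<service> | <RFC 3339 timestamp> <LEVEL> <rest>"
_PREFIX = re.compile(
    r"^(?P<service>\S+)\s*\|\s*"
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\S+Z)\s+"
    r"(?P<level>[A-Z]+)\s+(?P<rest>.*)$"
)
# Within <rest>, the first "<path>.rs:<line>: " marks the source location.
_LOCATION = re.compile(r"(?P<file>\S+\.rs):(?P<line>\d+): (?P<msg>.*)$")


def parse_line(line: str) -> Optional[LogRecord]:
    """Parse one log line; return None for lines without a timestamped prefix."""
    prefix = _PREFIX.match(line.strip())
    if prefix is None:
        return None
    location = _LOCATION.search(prefix["rest"])
    if location is None:
        return None
    return LogRecord(
        service=prefix["service"],
        timestamp=prefix["ts"],
        level=prefix["level"],
        source_file=location["file"],
        source_line=int(location["line"]),
        message=location["msg"],
    )


def error_records(lines: Iterable[str]) -> List[LogRecord]:
    """Keep only the records logged at ERROR level."""
    parsed = (parse_line(line) for line in lines)
    return [record for record in parsed if record and record.level == "ERROR"]
```

Run over the log above, this would surface the single ERROR record from `backends/src/lib.rs:255`, whose message ends in `config is not supported`. Untimestamped lines such as `tei | Error: Could not create backend` are skipped.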
Open source status
- The model implementation is available
- The model weights are available
Provide useful links for the implementation