
[Bug] TEI Gaudi 2 image is failing to launch #815

Open
2 of 6 tasks
ezelanza opened this issue Sep 15, 2024 · 2 comments
ezelanza commented Sep 15, 2024

Priority

Undecided

OS type

Ubuntu

Hardware type

Gaudi2

Installation method

  • Pull docker images from hub.docker.com
  • Build docker images from source

Deploy method

  • Docker compose
  • Docker
  • Kubernetes
  • Helm

Running nodes

Single Node

What's the version?

latest

Description

The TEI Gaudi image is not launching due to errors.

Reproduce steps

After running `docker compose`, the "opea/tei-gaudi:latest" container starts but fails shortly after launch.

Raw log

/ChatQnA/docker_compose/intel/hpu/gaudi$ docker logs 349bc3685e97
2024-09-15T19:24:20.318465Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "BAA*/***-****-**-v1.5", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: true, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "349bc3685e97", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-09-15T19:24:20.318939Z  INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"    
2024-09-15T19:24:20.445084Z  INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:45: Downloading `1_Pooling/config.json`
2024-09-15T19:24:21.565035Z  INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:108: Downloading `config_sentence_transformers.json`
2024-09-15T19:24:21.693734Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2024-09-15T19:24:21.693774Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:22: Downloading `config.json`
2024-09-15T19:24:21.823411Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:25: Downloading `tokenizer.json`
2024-09-15T19:24:22.137401Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:52: Downloading `model.safetensors`
2024-09-15T19:24:43.732689Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:39: Model artifacts downloaded in 22.038954482s
2024-09-15T19:24:44.030320Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-09-15T19:24:44.049828Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:26: Starting 152 tokenization workers
2024-09-15T19:24:44.374782Z  INFO text_embeddings_router: router/src/lib.rs:250: Starting model backend
2024-09-15T19:24:44.375230Z  INFO text_embeddings_backend_python::management: backends/python/src/management.rs:58: Starting Python backend
2024-09-15T19:24:48.899061Z  WARN python-backend: text_embeddings_backend_python::logging: backends/python/src/logging.rs:39: Could not import Flash Attention enabled models: No module named 'dropout_layer_norm'

2024-09-15T19:24:50.018977Z ERROR python-backend: text_embeddings_backend_python::logging: backends/python/src/logging.rs:40: Error when initializing model
Traceback (most recent call last):
  File "/usr/local/bin/python-text-embeddings-server", line 8, in <module>
    sys.exit(app())
  File "/usr/local/lib/python3.10/dist-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/typer/core.py", line 716, in main
    return _main(
  File "/usr/local/lib/python3.10/dist-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/usr/src/backends/python/server/text_embeddings_server/cli.py", line 51, in serve
    server.serve(model_path, dtype, uds_path)
  File "/usr/src/backends/python/server/text_embeddings_server/server.py", line 88, in serve
    asyncio.run(serve_inner(model_path, dtype))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    handle._run()
  File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
> File "/usr/src/backends/python/server/text_embeddings_server/server.py", line 57, in serve_inner
    model = get_model(model_path, dtype)
  File "/usr/src/backends/python/server/text_embeddings_server/models/__init__.py", line 56, in get_model
    raise ValueError("CPU device only supports float32 dtype")
ValueError: CPU device only supports float32 dtype

Error: Could not create backend

Caused by:
Could not start backend: Python backend failed to start
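The traceback ends in `models/__init__.py` raising `ValueError("CPU device only supports float32 dtype")`, which suggests the Python backend fell back to the CPU device while a non-float32 dtype was in effect. A minimal sketch of that guard, condensed from the traceback (the function signature and variable names here are hypothetical simplifications, not the actual TEI source):

```python
# Simplified, hypothetical sketch of the device/dtype guard that produces
# the error above. If the Habana (HPU) device is not visible inside the
# container -- e.g. it was started without the habana runtime -- the
# backend falls back to CPU, and the CPU path only accepts float32.
def get_model(hpu_available: bool, dtype: str) -> str:
    device = "hpu" if hpu_available else "cpu"
    if device == "cpu" and dtype != "float32":
        # The exact check that fails in the log above:
        raise ValueError("CPU device only supports float32 dtype")
    return f"model on {device} ({dtype})"
```

If this reading is right, the error is less about the dtype itself and more a symptom that the container could not see the Gaudi device, so checking the runtime/device configuration is a reasonable first step.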
lvliang-intel (Collaborator) commented Sep 16, 2024
@ezelanza
Please share the command you used to launch the service. I just verified the image opea/tei-gaudi:latest; it works well on my Gaudi2 server.

```
docker run -p 9780:80 -v $volume:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e MAX_WARMUP_SEQUENCE_LENGTH=512 --cap-add=sys_nice --ipc=host opea/tei-gaudi:latest --model-id $model --pooling cls
```

[Two screenshots attached showing the service running on the collaborator's Gaudi2 server.]

ezelanza (Author) replied

It works on my Gaudi 2 environment when run separately, but I'm still getting that error when running the docker compose file: `cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi/` and `docker compose up -d`.
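Since the standalone `docker run` command above works but the compose deployment does not, one plausible cause is that the compose service definition does not pass the same Habana runtime settings, so the backend falls back to CPU and hits the float32 error. A sketch of the relevant compose settings, mirroring the flags from the working `docker run` command (the service name `tei-embedding-service` is assumed, not taken from the actual compose file):

```
services:
  tei-embedding-service:
    image: opea/tei-gaudi:latest
    runtime: habana          # equivalent of --runtime=habana
    cap_add:
      - sys_nice             # equivalent of --cap-add=sys_nice
    ipc: host                # equivalent of --ipc=host
    environment:
      - HABANA_VISIBLE_DEVICES=all
      - OMPI_MCA_btl_vader_single_copy_mechanism=none
      - MAX_WARMUP_SEQUENCE_LENGTH=512
```

Comparing the compose file in `GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi/` against these settings would confirm or rule out this hypothesis.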
