
[Bug] TEI Gaudi 2 image is failing to launch #815

Open
2 of 6 tasks
ezelanza opened this issue Sep 15, 2024 · 2 comments
ezelanza commented Sep 15, 2024

Priority

Undecided

OS type

Ubuntu

Hardware type

Gaudi2

Installation method

  • Pull docker images from hub.docker.com
  • Build docker images from source

Deploy method

  • Docker compose
  • Docker
  • Kubernetes
  • Helm

Running nodes

Single Node

What's the version?

latest

Description

The TEI Gaudi image is not launching due to errors.

Reproduce steps

After running `docker compose`, the "opea/tei-gaudi:latest" container starts but fails shortly after launch.

Raw log

/ChatQnA/docker_compose/intel/hpu/gaudi$ docker logs 349bc3685e97
2024-09-15T19:24:20.318465Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "BAA*/***-****-**-v1.5", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: true, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "349bc3685e97", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-09-15T19:24:20.318939Z  INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"    
2024-09-15T19:24:20.445084Z  INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:45: Downloading `1_Pooling/config.json`
2024-09-15T19:24:21.565035Z  INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:108: Downloading `config_sentence_transformers.json`
2024-09-15T19:24:21.693734Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2024-09-15T19:24:21.693774Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:22: Downloading `config.json`
2024-09-15T19:24:21.823411Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:25: Downloading `tokenizer.json`
2024-09-15T19:24:22.137401Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:52: Downloading `model.safetensors`
2024-09-15T19:24:43.732689Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:39: Model artifacts downloaded in 22.038954482s
2024-09-15T19:24:44.030320Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-09-15T19:24:44.049828Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:26: Starting 152 tokenization workers
2024-09-15T19:24:44.374782Z  INFO text_embeddings_router: router/src/lib.rs:250: Starting model backend
2024-09-15T19:24:44.375230Z  INFO text_embeddings_backend_python::management: backends/python/src/management.rs:58: Starting Python backend
2024-09-15T19:24:48.899061Z  WARN python-backend: text_embeddings_backend_python::logging: backends/python/src/logging.rs:39: Could not import Flash Attention enabled models: No module named 'dropout_layer_norm'

2024-09-15T19:24:50.018977Z ERROR python-backend: text_embeddings_backend_python::logging: backends/python/src/logging.rs:40: Error when initializing model
Traceback (most recent call last):
  File "/usr/local/bin/python-text-embeddings-server", line 8, in <module>
    sys.exit(app())
  File "/usr/local/lib/python3.10/dist-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/typer/core.py", line 716, in main
    return _main(
  File "/usr/local/lib/python3.10/dist-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/usr/src/backends/python/server/text_embeddings_server/cli.py", line 51, in serve
    server.serve(model_path, dtype, uds_path)
  File "/usr/src/backends/python/server/text_embeddings_server/server.py", line 88, in serve
    asyncio.run(serve_inner(model_path, dtype))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    handle._run()
  File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
> File "/usr/src/backends/python/server/text_embeddings_server/server.py", line 57, in serve_inner
    model = get_model(model_path, dtype)
  File "/usr/src/backends/python/server/text_embeddings_server/models/__init__.py", line 56, in get_model
    raise ValueError("CPU device only supports float32 dtype")
ValueError: CPU device only supports float32 dtype

Error: Could not create backend

Caused by:
Could not start backend: Python backend failed to start
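The traceback ends in `models/__init__.py` raising `ValueError("CPU device only supports float32 dtype")`, which suggests the Python backend fell back to the CPU device while a non-float32 dtype was in effect. A minimal sketch of that guard, condensed from the traceback (the function signature and variable names here are hypothetical simplifications, not the actual TEI source):

```python
# Simplified, hypothetical sketch of the device/dtype guard that produces
# the error above. If the Habana (HPU) device is not visible inside the
# container -- e.g. it was started without the habana runtime -- the
# backend falls back to CPU, and the CPU path only accepts float32.
def get_model(hpu_available: bool, dtype: str) -> str:
    device = "hpu" if hpu_available else "cpu"
    if device == "cpu" and dtype != "float32":
        # The exact check that fails in the log above:
        raise ValueError("CPU device only supports float32 dtype")
    return f"model on {device} ({dtype})"
```

If this reading is right, the error is less about the dtype itself and more a symptom that the container could not see the Gaudi device, so checking the runtime/device configuration is a reasonable first step.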
lvliang-intel (Collaborator) commented Sep 16, 2024
@ezelanza
Please share the command you used to launch the service. I just verified the image opea/tei-gaudi:latest; it works well on my Gaudi2 server.

```
docker run -p 9780:80 -v $volume:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e MAX_WARMUP_SEQUENCE_LENGTH=512 --cap-add=sys_nice --ipc=host opea/tei-gaudi:latest --model-id $model --pooling cls
```

[Two screenshots attached showing the service running on the collaborator's Gaudi2 server.]

ezelanza (Author) replied

It works on my Gaudi 2 environment when run separately, but I'm still getting that error when running the docker compose file: `cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi/` and `docker compose up -d`.
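Since the standalone `docker run` command above works but the compose deployment does not, one plausible cause is that the compose service definition does not pass the same Habana runtime settings, so the backend falls back to CPU and hits the float32 error. A sketch of the relevant compose settings, mirroring the flags from the working `docker run` command (the service name `tei-embedding-service` is assumed, not taken from the actual compose file):

```
services:
  tei-embedding-service:
    image: opea/tei-gaudi:latest
    runtime: habana          # equivalent of --runtime=habana
    cap_add:
      - sys_nice             # equivalent of --cap-add=sys_nice
    ipc: host                # equivalent of --ipc=host
    environment:
      - HABANA_VISIBLE_DEVICES=all
      - OMPI_MCA_btl_vader_single_copy_mechanism=none
      - MAX_WARMUP_SEQUENCE_LENGTH=512
```

Comparing the compose file in `GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi/` against these settings would confirm or rule out this hypothesis.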
