This repository has been archived by the owner on Oct 16, 2023. It is now read-only.

Cannot start the Bloom server #191

Open
SAI990323 opened this issue Feb 18, 2023 · 3 comments

Comments

@SAI990323

SAI990323 commented Feb 18, 2023

Information
V100
CUDA 11.3
transformers==4.23.1
torch==1.12.0
colossalai==0.2.5
energonai==0.0.1+torch1.12cu11.3
running for bloom-560m & bloom-7b1
Question
When I try to start the Bloom server using the examples in this link, it stops at the point shown below.
[screenshot of where startup stalls]
I don't see any errors, and I cannot send requests to http://[ip]:[port]/generation.
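For reference, a quick way to check whether the endpoint is actually up is a small client like the sketch below. This assumes a JSON POST API; the host, port, and field names ("prompt", "max_tokens") are placeholders and assumptions, not taken from the example's server.py.

```python
# Minimal probe for the generation endpoint -- a sketch, not the example's client code.
import requests

url = "http://127.0.0.1:7070/generation"  # replace with your actual [ip]:[port]
payload = {"prompt": "Hello, world", "max_tokens": 16}  # field names are assumptions

try:
    resp = requests.post(url, json=payload, timeout=30)
    print(resp.status_code, resp.text)
except requests.exceptions.ConnectionError:
    # If startup is still hanging, the connection will simply be refused.
    print("Server not reachable -- it probably has not finished starting.")
```

If this still reports a connection error long after launch, the server process is stuck before binding the HTTP port.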

@cauyxy

cauyxy commented Feb 23, 2023

Is there any other information? A normal startup should look like the figure below:
[screenshot of a normal startup]

@baibaiw5

baibaiw5 commented Feb 28, 2023

I have met the same problem. I started the Bloom server with the Docker image hpcaitech/energon-ai:latest.
Information
4090
CUDA 11.3
transformers 4.24.0
colossalai 0.2.0+torch1.12cu11.3
energonai 0.0.1+torch1.12cu11.3
torch 1.12.1

running for bloom-560m & bloom-7b1
The application hangs. No other logs are printed.
[screenshots showing where the startup hangs]

@baibaiw5

Commenting out random_init in run.sh fixed it; now the server can be started:
python server.py --tp ${GPU_NUM} --name ${DATASET} --dtype "int8" --max_batch_size 4 --random_model_size "560m" #--random_init False
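A possible explanation for why dropping the flag helps (an assumption on my part; the thread does not show how server.py parses its arguments): if --random_init is declared with argparse's type=bool, then passing the string "False" still evaluates to True, since any non-empty string is truthy. A minimal illustration of that pitfall:

```python
# Sketch of the common argparse pitfall with boolean flags -- not the actual server.py code.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--random_init", type=bool, default=False)

args = parser.parse_args(["--random_init", "False"])
print(args.random_init)  # True -- bool("False") is truthy, so the flag is effectively enabled
```

Removing the flag leaves the default in place, which avoids the mis-parsed value.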
