Docker and WebUI #38

Open

juliancoy wants to merge 5 commits into main
Conversation

@juliancoy commented Jan 27, 2025

This Docker implementation makes it easy for people to run the server and a helpful web UI with a single command:

docker run -it --rm \
  -p 8000:8000 \
  -d \
  -v huggingface:/root/.cache/huggingface \
  -w /app \
  --gpus all \
  --name janus \
  -e MODEL_NAME=deepseek-ai/Janus-1.3B \
  julianfl0w/janus:latest

You can make sure it's running by navigating to
http://localhost:8000/webui

or,
docker logs janus
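If you prefer a scripted check, polling the web UI route works too (a rough sketch; the loop is just a local convenience, not part of the image):

until curl -sf http://localhost:8000/webui > /dev/null; do
  sleep 2   # retry until the server finishes loading the model
done
echo "Janus server is up"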

NOTE: You will need the NVIDIA Container Runtime or equivalent.
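On Debian/Ubuntu, the toolkit setup looks roughly like this (a sketch, assuming NVIDIA's apt repository is already configured; see NVIDIA's docs for other distros):

# install the container toolkit, register it with Docker, restart the daemon
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker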

@rmcc3 commented Jan 28, 2025

Does not work. Results in a 404 Not Found.

@jiangwei-yu-pony

http://localhost:8000/docs#/

Does not work. Results in a 404 Not Found.

@juliancoy (Author)

http://localhost:8000/docs#/ is working for me. What do the logs say?

docker logs janus

@GeekyAnt

Thanks for sharing the Docker script. It sets up OK, but now I need to interface with it. The test page at http://localhost:8000/ after installing the Docker container returns:
{"detail":"Not Found"}

The output from the janus logs is:

Python version is above 3.10, patching the collections module.
/opt/conda/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py:590: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use `slow_image_processor_class`, or `fast_image_processor_class` instead
  warnings.warn(
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Some kwargs in processor config are unused and will not have any effect: add_special_token, sft_format, ignore_id, mask_prompt, image_tag, num_image_tokens.
Add image tag = <image_placeholder> to the tokenizer
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO:     172.17.0.1:36728 - "GET / HTTP/1.1" 404 Not Found

@juliancoy (Author)

> The test page at http://localhost:8000/ returns: {"detail":"Not Found"} (full log output quoted above)

You should be good to go, as that means the server is up. Try http://localhost:8000/docs

@rmcc3 commented Jan 28, 2025

> You should be good to go, as that means the server is up. Try http://localhost:8000/docs

http://localhost:8000/docs works and http://localhost:8000 results in a 404, so there is no usable UI.

@GeekyAnt

> http://localhost:8000/docs works and http://localhost:8000 results in a 404, so there is no usable UI.

You are right, thank you. Going to /docs gives me the FastAPI docs page, which even I can use to get results. Thank you!
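Since it's FastAPI, the raw schema is also served at /openapi.json, so the available routes can be listed from the shell as well (a sketch, assuming the app keeps FastAPI's default schema URL):

# print every route path advertised by the running server
curl -s http://localhost:8000/openapi.json | python3 -c 'import json,sys; print("\n".join(json.load(sys.stdin)["paths"]))'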

@GeekyAnt

OK, super impressed so far, and thanks for making it this easy to use. This is going to sound like a stupid question, but which size model is the Docker image using?

@juliancoy (Author) commented Jan 28, 2025

> Which size model is the Docker image using?

Good question, as the current quality isn't very high. I'll look into the possibility of running the Pro model. Currently it's Janus-1.3B. Perhaps I'll make this settable as an environment variable.

@GeekyAnt

> Currently it's Janus-1.3B. Perhaps I'll make this settable as an environment variable.

The low quality has been great to explore, but I'm hoping the Pro one will give the cats fewer legs. An environment variable would be great.

@juliancoy (Author)

Pull the latest image, and try

docker run -it --rm \
  -p 8000:8000 \
  -d \
  -v huggingface:/root/.cache/huggingface \
  -w /app \
  --gpus all \
  --name janus \
  -e MODEL_NAME=deepseek-ai/Janus-Pro-7B \
  julianfl0w/janus:latest

And let me know how it goes. The model is too big for my humble RTX 3060.
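If you want to check whether your card can hold the 7B weights before pulling them, nvidia-smi reports free VRAM (a sketch; the exact memory requirement for Janus-Pro-7B isn't stated here):

# show each GPU's name, total memory, and currently free memory
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv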

@evilaprojects

The output is text instead of an image, any idea how to fix that?
[screenshot: sshot]

@juliancoy (Author)

Looks like raw PNG data. Either FastAPI needs to handle the generated image better, or you need a better web UI.

@juliancoy (Author)

> The output is text instead of an image, any idea how to fix that?

Pull the latest image, run it, and navigate to:
http://localhost:8000/webui

It should look like this:
[screenshot: janus_webui]

@juliancoy changed the title adds Dockerfile to Docker and WebUI Jan 29, 2025
@evilaprojects

> Pull the latest image, run it, and navigate to: http://localhost:8000/webui

It works.

@mskyttner

Does it need a lot from the graphics card? Is running with an NVIDIA T550 possible?

I got this at startup of the container:

Python version is above 3.10, patching the collections module.
/opt/conda/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py:590: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use `slow_image_processor_class`, or `fast_image_processor_class` instead
  warnings.warn(
Traceback (most recent call last):
  File "/app/demo/fastapi_app.py", line 27, in <module>
    vl_gpt = vl_gpt.to(torch.bfloat16).cuda()
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3070, in cuda
    return super().cuda(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 911, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 911, in <lambda>
    return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 MiB. GPU 0 has a total capacity of 3.81 GiB of which 57.44 MiB is free. Process 165489 has 3.76 GiB memory in use. Of the allocated memory 3.51 GiB is allocated by PyTorch, and 199.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

@juliancoy (Author) commented Jan 29, 2025

> Does it need a lot from the graphics card? Is running with an NVIDIA T550 possible? (full traceback quoted above)

Not enough memory on your GPU. I get this error when running the Pro-7B model on my NVIDIA RTX 3060.

I can run Janus-1.3B and Janus-Pro-1B.
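For smaller cards like that, the same run command with the 1B Pro weights may work (a sketch; identical to the command above except for the MODEL_NAME value):

docker run -it --rm \
  -p 8000:8000 \
  -d \
  -v huggingface:/root/.cache/huggingface \
  -w /app \
  --gpus all \
  --name janus \
  -e MODEL_NAME=deepseek-ai/Janus-Pro-1B \
  julianfl0w/janus:latest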

@mskyttner

When thinking about running it locally with a single command on common hardware, I wonder if there is a llamafile for it out there somewhere?

@thiswillbeyourgithub

Hi, does anyone know of a project with an OpenAI-like API to replace DALL-E with Janus Pro? It would pair nicely with open-webui.
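For reference, an OpenAI-compatible image endpoint would need to accept requests shaped like this, which an adapter would then translate into Janus calls (a sketch of the OpenAI request format only; no such adapter for Janus is confirmed in this thread, and the host/port here are hypothetical):

# OpenAI images API request shape, pointed at a hypothetical local adapter
curl http://localhost:8000/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"model": "janus-pro-7b", "prompt": "a cat", "n": 1, "size": "1024x1024"}'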
