
Want to provide a Docker version #1

Open
zsinba opened this issue May 22, 2024 · 21 comments

Comments

zsinba commented May 22, 2024

Could you use Docker to package the environment and expose the API as a service? That would make it easier to verify and deploy. Thanks for your contribution!

@BUJIDAOVS

+1 plz

@adithya-s-k
Owner

Coming soon !!

zsinba commented May 25, 2024

Great

@adithya-s-k
Owner

@zsinba @BUJIDAOVS I have added Docker support, and along with that, SkyPilot support as well.
Enjoy!

zsinba commented May 31, 2024

@zsinba @BUJIDAOVS I have added Docker support, and along with that, SkyPilot support as well. Enjoy!

Great. I'll try it right away

zsinba commented May 31, 2024

@BUJIDAOVS
INFO:     192.168.1.222:55630 - "POST /convert HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 411, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/app/server.py", line 74, in convert_pdf_to_markdown
    markdown_text, metadata, image_data = parse_pdf_and_return_markdown(await pdf_file.read(), extract_images=extract_images)
  File "/app/server.py", line 17, in parse_pdf_and_return_markdown
    full_text, images, out_meta = convert_single_pdf(pdf_file, model_list)
  File "/app/marker/convert.py", line 65, in convert_single_pdf
    pages, toc = get_text_blocks(
  File "/app/marker/pdf/extract_text.py", line 85, in get_text_blocks
    char_blocks = dictionary_output(doc, page_range=page_range, keep_chars=True)
  File "/usr/local/lib/python3.10/dist-packages/pdftext/extraction.py", line 75, in dictionary_output
    pages = _get_pages(pdf_path, model, page_range, workers=workers)
  File "/usr/local/lib/python3.10/dist-packages/pdftext/extraction.py", line 26, in _get_pages
    pdf_doc = pdfium.PdfDocument(pdf_path)
  File "/usr/local/lib/python3.10/dist-packages/pypdfium2/_helpers/document.py", line 78, in __init__
    self.raw, to_hold, to_close = _open_pdf(self._input, self._password, self._autoclose)
  File "/usr/local/lib/python3.10/dist-packages/pypdfium2/_helpers/document.py", line 674, in _open_pdf
    raise TypeError(f"Invalid input type '{type(input_data).__name__}'")
TypeError: Invalid input type 'PdfDocument'

Sorry, I didn't get a result.

zsinba commented May 31, 2024

if pdf_file:

[screenshot of the surrounding server.py code]
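
For what it's worth, the traceback suggests server.py opens the upload as a pypdfium2 PdfDocument before handing it to marker, which then tries to open it again inside pdftext. Below is a minimal sketch of an endpoint that sidesteps this by giving marker a plain file path instead; the /convert route, the pdf_file parameter, and the convert_single_pdf call come from the traceback, while load_all_models and the response shape are assumptions:

# Hypothetical sketch, not the repository's actual server.py. It writes the
# upload to a temporary file so marker/pdftext receive a plain path rather
# than an already-opened pypdfium2 PdfDocument (the cause of the TypeError).
import tempfile

from fastapi import FastAPI, File, UploadFile
from marker.convert import convert_single_pdf
from marker.models import load_all_models  # assumed model loader

app = FastAPI()
model_list = load_all_models()

@app.post("/convert")
async def convert_pdf_to_markdown(pdf_file: UploadFile = File(...)):
    with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(await pdf_file.read())
        tmp.flush()
        full_text, images, out_meta = convert_single_pdf(tmp.name, model_list)
    return {"markdown": full_text, "metadata": out_meta}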

@adithya-s-k
Owner

Will test it out and get back to you.

adithya-s-k reopened this May 31, 2024
@adithya-s-k
Owner

I have updated the Docker image.

docker pull savatar101/marker-api:0.2
# If you are running on a GPU
docker run --gpus all -p 8000:8000 savatar101/marker-api:0.2
# Otherwise
docker run -p 8000:8000 savatar101/marker-api:0.2

Let me know if everything works properly.

In the next update, I will make the server more concurrent to handle multiple API requests simultaneously.
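
For anyone verifying the container, here is a quick client-side check from Python; the pdf_file form field name is an assumption based on the parameter visible in the traceback above:

# Quick test of the /convert endpoint. The "pdf_file" field name is an
# assumption; adjust it if the server expects a different form field.
import requests

with open("sample.pdf", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/convert",
        files={"pdf_file": ("sample.pdf", f, "application/pdf")},
    )
resp.raise_for_status()
print(resp.json())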

zsinba commented Jun 1, 2024

I have updated the Docker image.

docker pull savatar101/marker-api:0.2
# If you are running on a GPU
docker run --gpus all -p 8000:8000 savatar101/marker-api:0.2
# Otherwise
docker run -p 8000:8000 savatar101/marker-api:0.2

Let me know if everything works properly.

In the next update, I will make the server more concurrent to handle multiple API requests simultaneously.

Thanks a lot. I will try it.

zsinba commented Jun 2, 2024

It's OK now.
[screenshot]

Thank you for sharing.

zsinba commented Jun 2, 2024

I used a 4090 GPU to convert a 17-page PDF (plain text), and it took about 27.37 seconds. Is this time within the normal range?

@BUJIDAOVS

@adithya-s-k
It seems that the model needs to be downloaded from Hugging Face after starting from Docker. Is it possible to directly include the model in the Docker image? Chinese users are unable to download the model from Hugging Face directly.

zsinba commented Jun 2, 2024

@BUJIDAOVS

True, but the current image is already 16 GB, and bundling the models would make it even larger, which is not convenient for later updates.

adithya-s-k commented Jun 2, 2024

I used a 4090 GPU to convert a 17-page PDF (plain text), and it took about 27.37 seconds. Is this time within the normal range?

Yep, it takes about that much time to parse it.
I will soon be adding support for optimised inference to speed up the whole process.

Currently working on it.

@adithya-s-k
Owner

@adithya-s-k It seems that the model needs to be downloaded from Hugging Face after starting from Docker. Is it possible to directly include the model in the Docker image? Chinese users are unable to download the model from Hugging Face directly.

I will create another Docker image with the weights already present as an alternative, but it might be around 20 to 25 GB in size.

zsinba commented Jun 2, 2024

Thanks a lot.

@BUJIDAOVS

@adithya-s-k It seems that the model needs to be downloaded from Hugging Face after starting from Docker. Is it possible to directly include the model in the Docker image? Chinese users are unable to download the model from Hugging Face directly.

I will create another Docker image with the weights already present as an alternative, but it might be around 20 to 25 GB in size.

Thank you, but I have already found a way to avoid re-downloading the models when the container restarts.

version: '3'

services:
  marker-api:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
    image: savatar101/marker-api:0.2
    volumes:
      - /home/user/Documents/Projects/hf-download/pdf2md/huggingface:/root/.cache/huggingface
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - HF_ENDPOINT=https://hf-mirror.com
    ports:
      - "17915:8000"

This requires copying the downloaded models out of the container (e.g. with docker cp) to the host machine after the first successful startup.
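
A related option is to pre-populate that host cache directory before the first run, so the container never needs to reach Hugging Face at all. Here is a sketch using huggingface_hub, where the repo id is only a placeholder for whatever models marker actually downloads:

# Sketch: pre-download weights into the host directory that the compose file
# mounts at /root/.cache/huggingface, going through the hf-mirror endpoint.
import os

os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"  # set before importing huggingface_hub

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="some-org/some-model",  # hypothetical placeholder
    cache_dir="/home/user/Documents/Projects/hf-download/pdf2md/huggingface",
)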

@liwenlong520

How do I run the Docker container on a Mac M1?

@liwenlong520

Detecting bboxes:   0%|          | 0/4 [00:00<?, ?it/s]
[W NNPACK.cpp:61] Could not initialize NNPACK! Reason: Unsupported hardware.

@yemoutao

(quoting the Hugging Face model-download discussion and the docker-compose workaround above)

You can also bind the hostnames this way:

extra_hosts:
  - "huggingface.co:13.33.174.80"
  - "cdn-lfs.huggingface.co:13.33.174.80"
  - "www.huggingface.co:13.33.174.80"
