Issue when installing tesseract-ocr, poppler-utils, libmagic-dev, libgl1 #215

Open
shriharshan opened this issue Nov 21, 2024 · 1 comment

Comments

@shriharshan

I am trying to Dockerize my application, and the library I am using requires system dependencies such as tesseract-ocr, poppler-utils, libmagic-dev, and libgl1. I tried installing these with dnf and microdnf, but every time I build the image I get errors saying that some of these packages cannot be found. Could anyone point me in the right direction for installing them? I am using the Lambda Python 3.12 base image.

FROM public.ecr.aws/lambda/python:3.12

RUN microdnf install -y dnf-plugins-core
RUN dnf config-manager --add-repo https://download.opensuse.org/repositories/home:Alexander_Pozdnyakov/Fedora_32/home:Alexander_Pozdnyakov.repo
RUN dnf update -y && dnf install -y \
    libmagic-dev \
    poppler-utils \
    tesseract-ocr \
    libgl1
WORKDIR /app

COPY requirements.txt .

# Install dependencies
RUN pip install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
    
COPY app/main.py "${LAMBDA_TASK_ROOT}/main.py"
COPY app/middleware.py "${LAMBDA_TASK_ROOT}/middleware.py"

CMD ["main.handler"]

This is my Dockerfile.

@leandrodamascena

Hi @shriharshan! I'm not sure you can install these packages on the Lambda base image for Python 3.12, for a few reasons:

1 - The python3.12 Lambda base image uses a minimal Linux container that ships microdnf instead of dnf, so even if you install dnf-plugins-core you probably won't be able to run dnf config-manager --add-repo to add this new repository.

2 - This repository looks wrong: it doesn't support Fedora 32. I could only see support for CentOS and RHEL, and even using those on an amazonlinux2023 image I wasn't able to install these packages because of dependency conflicts.

3 - You can try downloading the RPMs for these libraries and installing them directly, but I think resolving the dependency conflicts will be a nightmare.

4 - If I were you, I would build this image from the python:3.12 image and then install those dependencies plus awslambdaric (the Lambda runtime interface client). Check the documentation here on how to build a custom image.

5 - The upstream python:3.12 image is built on top of Debian, so you can use this Dockerfile as a starting point for building your own image:

# Define custom function directory
ARG FUNCTION_DIR="/function"

FROM python:3.12 AS build-image

# Include global arg in this stage of the build
ARG FUNCTION_DIR

# Install necessary packages
RUN apt-get update && apt-get install -y \
    libmagic-dev \
    poppler-utils \
    tesseract-ocr \
    libgl1 \
    && rm -rf /var/lib/apt/lists/*

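# Copy function code and make the function directory the working directory,
# so that awslambdaric and the handler module can be imported at runtime
RUN mkdir -p ${FUNCTION_DIR}
COPY . ${FUNCTION_DIR}
WORKDIR ${FUNCTION_DIR}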

# Install the function's dependencies
RUN pip install \
    --target ${FUNCTION_DIR} \
        awslambdaric


COPY requirements.txt .

# Install dependencies
RUN pip install -r requirements.txt --target "${FUNCTION_DIR}"
    
COPY app/main.py "${FUNCTION_DIR}/main.py"
COPY app/middleware.py "${FUNCTION_DIR}/middleware.py"

# Set runtime interface client as default command for the container runtime
ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]
# Pass the name of the function handler as an argument to the runtime
CMD [ "main.handler" ]
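
Once the image builds, it can help to confirm that the system packages are actually visible to Python before wiring in the real OCR code. Here is a minimal sketch of a handler module for that, not your actual application code. The executable names (tesseract from tesseract-ocr, pdftoppm from poppler-utils) and the library names (libmagic from libmagic-dev, libGL from libgl1) are my assumptions about what your library needs, and the helper names are purely illustrative:

```python
"""Hypothetical handler module that reports missing system dependencies."""
import ctypes.util
import shutil

# Assumed executables: tesseract-ocr -> tesseract, poppler-utils -> pdftoppm.
# Adjust the list if your library shells out to other tools.
REQUIRED_BINARIES = ["tesseract", "pdftoppm"]
# Assumed shared libraries: libmagic-dev -> libmagic, libgl1 -> libGL.
REQUIRED_LIBRARIES = ["magic", "GL"]


def missing_binaries(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_libraries(names):
    """Return the shared libraries from `names` that cannot be located."""
    return [name for name in names if ctypes.util.find_library(name) is None]


def handler(event, context):
    """Lambda entry point: returns a small report instead of doing real work."""
    missing = missing_binaries(REQUIRED_BINARIES) + missing_libraries(REQUIRED_LIBRARIES)
    return {"ok": not missing, "missing": missing}
```

If you temporarily point the image's handler at this module and invoke it, an empty "missing" list suggests the apt packages were installed where Python can find them. Anything listed there is a package to recheck in the apt-get step.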

I hope this helps.
