Automatically crawl website content using the GPT-4o vision API.
!!! Due to performance limitations, the crawled result may differ from the original website content. !!!
Note: due to a change in OpenAI's policy, models released after "gpt-4o-2024-05-13" cannot extract the text content of images.
- create a ".env" file; it must define "OPENAI_API_KEY".
cd autocrawling_agent
vi .env
...
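If you would rather not open an editor, the file can also be written non-interactively. The following is only a minimal sketch and not part of the project: `write_env_file` is a hypothetical helper name, and `sk-your-key-here` is a placeholder to replace with your own key.

```shell
# Hypothetical helper (not part of this repository): write a .env file
# that defines OPENAI_API_KEY.
#   $1 = directory that should contain the .env file
#   $2 = the API key value
write_env_file() {
  # Run in a subshell so the stricter umask does not leak into the caller;
  # umask 077 makes a newly created file readable only by its owner.
  (
    umask 077
    printf 'OPENAI_API_KEY=%s\n' "$2" > "$1/.env"
  )
}

# Example (placeholder key; replace with your own):
# write_env_file autocrawling_agent "sk-your-key-here"
```

Passing the key as an argument keeps it out of the script itself, although it may still be recorded in your shell history.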
- build docker image
docker build -t autocrawling_agent_image .
- run docker container
docker run -itd -p 8000:8000 --name autocrawling_agent_api_container autocrawling_agent_image
- open a shell in the running container
docker exec -it autocrawling_agent_api_container bash
- install playwright
cd /workspace
playwright install
- start uvicorn server
uvicorn src.api.main:app --port 8000
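Once the server is starting, you may want to confirm that it accepts connections. The sketch below is one possible way to do that, assuming `curl` is available where you run it (it is not among the steps above). `wait_until` is a hypothetical retry helper, not part of the project. Because uvicorn binds to 127.0.0.1 unless `--host` is given, the probe is most likely to succeed from a second shell inside the container.

```shell
# Hypothetical helper: retry a command until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: curl exits with status 0 as soon as any HTTP response arrives
# wait_until 10 1 curl -s -o /dev/null http://127.0.0.1:8000/
```

The probe targets the root path because the routes exposed by the app are not documented here. Without `-f`, any HTTP response, including an error status, counts as the server being up.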