Skip to content

JminJ/autocrawling_agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

30 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Autocrawling Agent

Auto crawling content of website using GPT-4O vision api.

!!! Due to limitation of performance, crawling result can be different compared to original website content. !!!
+@) the openai change of policy, after "gpt-4o-2024-05-13" model, can't be crawling text content of image. 

How to use

  1. make ".env" file. ".env" file must have "OPENAI_API_KEY" information.
    cd autocrawling_agent
    vi .env
    ...    
  2. build docker image
    docker build -t autocrawling_agent_image .
  3. run docker container
    docker run -itd -p 8000:8000 --name autocrawling_agent_api_container autocrawling_agent_image
  4. docker exec to container
    docker exec -it autocrawling_agent_api_container bash
  5. install playwright
    cd /workspace
    playwright install
  6. start uvicorn server
    uvicorn src.api.main:app --port 8000

About

Auto crawling website using GPT-4O vision api.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published