Image Scraper

This scraper is intended to serve as part of the backend for the Object Detection project. It takes two arguments: a keyword and the number of images to fetch. It searches Google Images for the keyword, scrapes the requested number of images, converts them to an array of blobs, and returns that array.

The Source Code

The source code for this project is found in the app directory (see app.py); the actual scraping happens in scraper.

Dockerize the app and run locally

Go to the Dockerfile's directory in the terminal and issue these commands:

docker build -t image-scraper-fargate-container .

and

docker run -p 9000:8080 image-scraper-fargate-container

'image-scraper-fargate-container' is the name given to the built image and can be replaced with any other name. The -p 9000:8080 option maps port 9000 on the host to port 8080 inside the container.
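The two commands can be wrapped in a small script so that the image name and host port are easy to change. This is a minimal sketch, not part of the repository: the variable names, the port_mapping and build_and_run helpers, and the --rm flag (which removes the container when it stops) are assumptions added for illustration.

```shell
# Sketch only: names below are illustrative, not taken from the repository.
IMAGE_NAME="${IMAGE_NAME:-image-scraper-fargate-container}"
HOST_PORT="${HOST_PORT:-9000}"   # port exposed on your machine
CONTAINER_PORT=8080              # container port used in the README command

# Compose the HOST:CONTAINER argument expected by `docker run -p`.
port_mapping() {
  printf '%s:%s' "$1" "$2"
}

# Build the image from the Dockerfile in the current directory, then run it.
build_and_run() {
  docker build -t "$IMAGE_NAME" . &&
    docker run --rm -p "$(port_mapping "$HOST_PORT" "$CONTAINER_PORT")" "$IMAGE_NAME"
}

# Uncomment to execute from the Dockerfile's directory:
# build_and_run
```

Setting HOST_PORT=9001 before calling build_and_run would publish the same container port on a different local port.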

Note

Make sure to run

pip3 install -r requirements.txt

to install the necessary modules.
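If you prefer to keep these modules out of your system Python, the install can be done inside a virtual environment. The sketch below is an assumption, not something the repository prescribes: the .venv directory name and the install_requirements helper are hypothetical.

```shell
# Sketch only: the helper name and the .venv directory are illustrative.
install_requirements() {
  req_file="${1:-requirements.txt}"

  # Fail early with a clear message if the requirements file is missing.
  if [ ! -f "$req_file" ]; then
    echo "missing $req_file" >&2
    return 1
  fi

  # Create and activate an isolated environment, then install into it.
  python3 -m venv .venv &&
    . .venv/bin/activate &&
    pip3 install -r "$req_file"
}

# Uncomment to execute from the directory that contains requirements.txt:
# install_requirements
```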

Push the docker image to ECR

Authenticate the Docker client with AWS ECR (be sure to replace the region and AWS account ID with your own):

aws ecr get-login-password --region ap-northeast-2 | docker login --username AWS --password-stdin 190047048560.dkr.ecr.ap-northeast-2.amazonaws.com

Obtain the ID of the image that you are trying to push:

docker images

Tag the image using its ID (replace d8a3b74c72ca with the ID of your own image):

docker tag d8a3b74c72ca 190047048560.dkr.ecr.ap-northeast-2.amazonaws.com/image-scraping-repo:latest

and finally push it to ECR:

docker push 190047048560.dkr.ecr.ap-northeast-2.amazonaws.com/image-scraping-repo:latest
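The login, tag, and push steps all repeat the same registry address, so it can help to build that address once from the account ID and region. The following is a minimal sketch under stated assumptions: the ecr_registry, ecr_image_ref, and push_to_ecr helpers are hypothetical names, and the image is tagged by its local name rather than by ID, which docker tag also accepts.

```shell
# Sketch only: helper names are illustrative, not part of the repository.

# Build the ECR registry host name from an account ID and a region.
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com' "$1" "$2"
}

# Build the full image reference: registry/repository:tag.
ecr_image_ref() {
  printf '%s/%s:%s' "$(ecr_registry "$1" "$2")" "$3" "$4"
}

# Log in to ECR, tag the local image, and push it.
# Arguments: account ID, region, repository, local image, optional tag.
push_to_ecr() {
  account_id="$1"; region="$2"; repo="$3"; local_image="$4"; tag="${5:-latest}"
  registry="$(ecr_registry "$account_id" "$region")"
  ref="$(ecr_image_ref "$account_id" "$region" "$repo" "$tag")"

  aws ecr get-login-password --region "$region" |
    docker login --username AWS --password-stdin "$registry" &&
    docker tag "$local_image" "$ref" &&
    docker push "$ref"
}

# Uncomment and substitute your own account ID and region:
# push_to_ecr 190047048560 ap-northeast-2 image-scraping-repo image-scraper-fargate-container
```

Tagging by name skips the docker images lookup, at the cost of pushing whichever local image currently carries that name.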

