GitHub - rxhxn30/Web-Crawling: Web Crawler using Scrapy

A simple spider built with the Scrapy framework in Python. The spider scrapes the first 10 pages of the 'English Books' section on Amazon.in and stores each book's name and author in a MySQL table.
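As a minimal sketch of how a crawl could be capped at the first 10 listing pages, the helper below builds one URL per page. It assumes the listing is paginated through a `page` query parameter; the function name `page_urls`, the parameter name, and the example.com base URL are hypothetical placeholders and are not taken from this project.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MAX_PAGES = 10  # the spider is described as covering the first 10 pages


def page_urls(base_url, max_pages=MAX_PAGES, param="page"):
    """Return one URL per listing page, numbered 1..max_pages.

    `param` is an assumed name for the pagination query parameter.
    """
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))  # keep any existing query parameters
    urls = []
    for page_number in range(1, max_pages + 1):
        query[param] = str(page_number)
        urls.append(urlunsplit(parts._replace(query=urlencode(query))))
    return urls


# Hypothetical base URL, used only to show the shape of the output.
urls = page_urls("https://example.com/books?lang=en")
```

In a Scrapy spider, a list like this could be assigned to the `start_urls` attribute so that each page is requested once.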

Setup

1] Open the project folder in PyCharm or any other IDE.
2] Set up MySQL if it is not already installed.
3] Open a terminal and run:

>> cd <project directory>
>> scrapy crawl spider1

4] Restart MySQL. The scraped book names and authors should be visible in a newly created table.
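The storage step can be sketched as a Scrapy-style item pipeline, which is a plain class exposing `open_spider`, `process_item`, and `close_spider`. This is an illustration under stated assumptions, not the project's actual code: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a database server, and the class name `BookStoragePipeline`, the table name `books`, and the `name`/`author` fields are hypothetical.

```python
import sqlite3


class BookStoragePipeline:
    """Stores scraped items in a table (sqlite3 stand-in for MySQL)."""

    def __init__(self, db_path=":memory:"):
        self.db_path = db_path  # assumed default: an in-memory database
        self.conn = None

    def open_spider(self, spider):
        # Called once when the crawl starts: connect and create the table.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS books (name TEXT, author TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every scraped item: insert one row, then pass it on.
        self.conn.execute(
            "INSERT INTO books (name, author) VALUES (?, ?)",
            (item["name"], item["author"]),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Called once when the crawl ends: release the connection.
        self.conn.close()


# Usage outside Scrapy, with a made-up item and no real spider object.
pipeline = BookStoragePipeline()
pipeline.open_spider(spider=None)
pipeline.process_item({"name": "Example Title", "author": "Example Author"}, spider=None)
rows = pipeline.conn.execute("SELECT name, author FROM books").fetchall()
```

With a MySQL driver such as PyMySQL or mysql-connector-python, the connection call would differ and the placeholders would be written as `%s` rather than `?`.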
