This spider crawls https://www.class-central.com and generates a dataset containing information about the courses listed there. It collects 15 data points per course. For more information, see dataset.csv in the spider's root directory.
- Python 3.x
- Scrapy
pip install scrapy
conda install -c anaconda scrapy
You can download the repository directly, or clone it by running the command below in your terminal:
git clone https://github.com/darshan-majithiya/Class-Central-Spider.git
Go to the spider's root folder and run the command:
scrapy crawl ClassCentral
You can also pass a specific domain as an argument to get course information for that domain only. The available domains are listed at https://www.class-central.com/subjects.
scrapy crawl ClassCentral -a domain="Data Science"
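A domain such as "Data Science" contains a space, so it has to be quoted in the shell. When the crawl is started from a script instead, passing the arguments as a list avoids the quoting problem. The following is a minimal sketch, assuming `scrapy` is on your PATH and the script is run from the spider's root folder; `build_crawl_command` and `run_crawl` are hypothetical helper names and are not part of this repository.

```python
import subprocess


def build_crawl_command(domain=None):
    """Build the argument list for `scrapy crawl ClassCentral`."""
    cmd = ["scrapy", "crawl", "ClassCentral"]
    if domain:
        # Same as typing: -a domain="Data Science"
        # No extra quotes are needed because each list item is one argument.
        cmd += ["-a", f"domain={domain}"]
    return cmd


def run_crawl(domain=None):
    """Start the crawl (hypothetical helper; requires Scrapy to be installed)."""
    # check=True raises CalledProcessError if Scrapy exits with an error.
    subprocess.run(build_crawl_command(domain), check=True)
```

Calling `run_crawl("Data Science")` should then start the same crawl as the command above.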
Finally, you can write the scraped information to the data format of your choice (CSV, JSON, or XML):
scrapy crawl ClassCentral -o dataset.csv
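Once the crawl has finished, the CSV file can be inspected with Python's standard library. The following is a minimal sketch, assuming the file starts with a header row (Scrapy's CSV exporter writes one by default); `load_courses` is a hypothetical helper, and no particular column names are assumed.

```python
import csv


def load_courses(path="dataset.csv"):
    """Return the column names and one dict per course row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), rows
```

For example, `fields, courses = load_courses()` gives the column names in `fields`, and `len(courses)` is the number of scraped courses.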