#Data analysis of Lagou

LagouIcon

###Main Functions

  1. Scrape job data from Lagou to track the latest information on Internet careers.

  2. Analyze and visualize the data.

  3. Crawl job detail pages and generate a word cloud as a "Job Impression".

###Note

Because Lagou's back-end API has changed, this repository may no longer work correctly.

I will try to fix these problems and publish V2.0 in the near future.

Thanks for your stars and for watching!

I will do my best to make it better and more robust, with more new features as well.

Sorry for any inconvenience this may cause!

V2.0_ALPHA is under development.

###Install Prerequisites

  1. Python version >= 3.4
  2. Third-party libraries:

    pip install requests
    pip install beautifulsoup4
    pip install jieba
    pip install openpyxl
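Before running the scripts, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of this repository: the helper names `missing_modules` and `python_is_supported` are hypothetical, and the module list assumes the usual import names for the packages (`beautifulsoup4` is imported as `bs4`).

```python
import importlib.util
import sys

# Import names for the packages listed above.
# Assumption: beautifulsoup4 is imported under the name "bs4".
REQUIRED_MODULES = ["requests", "bs4", "jieba", "openpyxl"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 4)):
    """Return True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.4 or newer is required.")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All third-party libraries were found.")
```

Running the file directly prints any module that still needs to be installed with pip.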

###Basic Usage

  1. Clone this project from GitHub.

  2. Set the path to job.xml in the readconfig() call in lagouspider.py: configmap = toolkit.readconfig(YourLocalPath)

  3. Run lagouspider.py to fetch job data as JSON.

  4. Run excelhelper.py to generate an Excel file for each job.

  5. Run jobdetailspider.py to fetch the job posting details (updated in V1.3).

  6. Run analyser.py to segment the text into words and return the top 20 hot words (updated in V1.3).
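The last step, counting the most frequent words across job descriptions, can be sketched as follows. This is an illustration under stated assumptions, not the code in analyser.py: `top_hot_words` and `simple_tokenize` are hypothetical names, and the stand-in tokenizer only splits Latin text. For Chinese job descriptions, a segmenter such as `jieba.lcut` could be passed as the `tokenize` argument instead.

```python
import re
from collections import Counter


def simple_tokenize(text):
    """Stand-in tokenizer: lowercase runs of letters, digits, '+' and '#'."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def top_hot_words(descriptions, tokenize=simple_tokenize, stop_words=frozenset(), n=20):
    """Count words across all descriptions and return the n most common.

    descriptions: iterable of job description strings
    tokenize:     callable that turns one string into a list of words
    stop_words:   words to leave out of the count
    """
    counts = Counter()
    for text in descriptions:
        counts.update(word for word in tokenize(text) if word not in stop_words)
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    samples = [
        "Python and SQL required",
        "Python preferred",
        "SQL and Python",
    ]
    for word, count in top_hot_words(samples, stop_words={"and"}, n=20):
        print(word, count)
```

Keeping the tokenizer as a parameter separates word segmentation from counting, so the same counting logic works for any language for which a tokenizer is available.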

###Analysis Results

Image1 Image2 Image3 Image4 Image5

For more information, please see my answer on Zhihu.