Sparklist scripts (python mostly) for automating workflow, streamlining progress and directing energy to new ventures: new campaigns, outreach and more!

ml-lubich/cal-club-scrapper

CalClubScraper

Scrapes all the contact emails of UC Berkeley's (Cal) student organizations (clubs) and outputs a .csv listing each club with its contact email(s).

Note: If a club has more than one email, the entry is returned as a list of emails. Powered by Selenium and BeautifulSoup4.
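
The output shape described above can be sketched with Python's standard csv module. This is only an illustration under assumptions, not the code in src/scraper.py: the column names club and emails, the helper names format_emails and write_clubs_csv, and the sample clubs are all hypothetical.

```python
import csv
import io


def format_emails(emails):
    """Return a lone email as a plain string, several emails as a list literal."""
    unique = list(dict.fromkeys(emails))  # drop duplicates, keep first-seen order
    if len(unique) == 1:
        return unique[0]
    return str(unique)


def write_clubs_csv(clubs, out):
    """Write (club name, emails) pairs to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["club", "emails"])  # hypothetical header names
    for name, emails in clubs:
        writer.writerow([name, format_emails(emails)])


# Made-up sample data, not real scraper output.
sample = [
    ("Example Chess Club", ["chess@example.org"]),
    ("Example Hiking Club", ["hike@example.org", "trips@example.org"]),
]

buffer = io.StringIO()
write_clubs_csv(sample, buffer)
print(buffer.getvalue())
```

Because a multi-email entry contains commas, the csv writer quotes that field, so the file still parses back into two columns per club.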


Runtime Environment

CalClubScraper depends on pip3 packages and requires Python 3.6+. chromedriver may be flagged by the security system of platforms like macOS, causing the program to crash. If that happens, go to Security Preferences and click "Allow" to open the chromedriver executable. Note: chromedriver is installed and deleted automatically by the program at runtime.


Installation Steps

  1. If you have not already, install Python 3.6+
  2. Install all required pip3 packages by running pip3 install -r requirements.txt on the command line.
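
Before installing the packages, it may help to confirm that the interpreter meets the Python 3.6+ requirement from step 1. The check below is a minimal sketch; the name python_is_supported is hypothetical and is not part of the repository.

```python
import sys

MIN_VERSION = (3, 6)  # minimum version stated in the installation steps


def python_is_supported(version_info=sys.version_info):
    """Return True when the (major, minor) version is at least MIN_VERSION."""
    return tuple(version_info[:2]) >= MIN_VERSION


if not python_is_supported():
    raise SystemExit("CalClubScraper needs Python 3.6 or newer.")
```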

Running the Scraper

To run the scraper, run the scraper.py script by typing python3 src/scraper.py in your terminal (from the root directory). Follow the status messages! It may take a while.

You can change the endpoint URL being scraped. Simply go to https://callink.berkeley.edu/organizations and select an option (in the dropdown to the left) to filter the clubs you would like to scrape.
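
A filtered endpoint is the base URL plus a query string, so one way to picture the change is the sketch below. It is an assumption-heavy illustration: the query parameter name categories and its value are hypothetical placeholders, and with_filters is not a function from the repository. The safest approach is probably to copy the exact URL from the browser's address bar after choosing a filter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE_URL = "https://callink.berkeley.edu/organizations"


def with_filters(url, **filters):
    """Return url with the given query parameters added or overwritten."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))  # keep any parameters already present
    query.update({key: str(value) for key, value in filters.items()})
    return urlunsplit(parts._replace(query=urlencode(query)))


# "categories" and 1234 are placeholders, not confirmed CalLink parameters.
print(with_filters(BASE_URL, categories=1234))
# prints https://callink.berkeley.edu/organizations?categories=1234
```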

