davidteather/everything-web-scraping

Everything Web Scraping

Learn everything about web scraping, by David Teather. Find the video series on YouTube.

Table Of Contents

  1. Course Catalogue
  2. How To Start The Mock Websites

Please consider giving Course Feedback

Welcome!

Glad you're here! If it's your first time, check out the introduction. If not, welcome back!

Consider sponsoring me on GitHub to make work like this possible

Supporting The Project

  • Star the repo 😎
    • Maybe share it with some people new to web-scraping?
  • Consider sponsoring me on GitHub
  • Send me an email or a LinkedIn message telling me what you enjoy in the course (and maybe what else you want to see in the future)
  • Submit PRs for suggestions/issues :)

Course Catalogue

  1. Introduction To The Course
  2. Introduction To Forging API Requests
  3. Proxies
  4. Beautiful Soup Scraping With Static and Server Side Rendered Sites

How To Start The Mock Websites

Video Walkthrough

With GitHub Codespaces (Recommended)

If you don't want to deal with installing and configuring software, I've set up this repository so that a GitHub Codespace can do all of that for you.

Note: A free GitHub account comes with 60 free hours of Codespaces each month. If you're a student, you can get 90 free hours each month with GitHub Pro through the GitHub Student Developer Pack (source).

Creating A Codespace

If you want to save your solutions, fork the repository and create a Codespace from your fork. You'll then be able to use git to save your changes as normal.

Create a Codespace using the instructions below or here

Select Code -> Codespaces Tab -> The + Icon -> New With Options

Or click here

Select the configuration of the lesson you're on and hit create. A Codespace with the VS Code editor will open in the browser and start all the programs needed for the activity!

Cleaning Up

After finishing each lesson you can visit the GitHub Codespaces menu and delete the Codespace so you don't get charged while you're not using it.

Delete a Codespace with the 3 dots -> Delete

This will delete any changes you've made that you haven't pushed with git.

Note: If you enjoy GitHub Codespaces, consider checking out my ~30 minute LinkedIn Learning course on Codespaces. You can get free 24h access through my LinkedIn post, and feel free to send a connection request while you're over there 🤠

With Docker

Run docker-compose up while in a lesson directory. When it says the development server has started, open localhost:3000 in your browser to check that it's working properly.
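If you'd rather check from a script than from the browser, here is a minimal sketch that polls the mock website until it answers. The function name wait_for_site, the timeout, and the retry interval are illustrative assumptions; only the address localhost:3000 comes from the instructions above.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_site(url: str = "http://localhost:3000",
                  timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass.

    wait_for_site is a hypothetical helper, not part of the course code.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True  # the server answered with a success status
        except urllib.error.HTTPError:
            return True  # the server is up, it just answered with an error status
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            time.sleep(interval)  # not accepting connections yet, so retry
    return False
```

Calling wait_for_site() with no arguments checks http://localhost:3000 for up to a minute and returns True as soon as the site responds, which can be handy at the top of a scraping script.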

When you're done with the lesson, press Control + C to shut down your Docker containers.

Cleaning Up

With Docker Desktop

  1. Navigate to the containers tab on the side, find the lesson you want to delete, and click the trash can icon to remove it.
  2. Navigate to the images tab on the side, find the images starting with the course name that you want to delete, and click the trash can icon.

With The Command Line

  1. To remove containers, run docker rm $(docker ps -a -q --filter name=XXX), where XXX is the lesson number you want removed (ex: 001).
  2. To remove images, run docker rmi $(docker images --filter label=lesson.number=X -a -q), where X is the lesson number you want removed (ex: 1 or 10).
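The two command-line steps above can also be wrapped in a small script. This is only a sketch: the helper names are hypothetical, and it assumes (based on the 001 example) that container names contain the lesson number zero-padded to three digits, while the image label uses the plain number. Unlike the one-liners, it skips the removal when nothing matches instead of calling docker rm or docker rmi with no arguments.

```python
import shutil
import subprocess


def container_ids_command(lesson: int) -> list[str]:
    """Build the command that lists IDs of containers for a lesson."""
    # Assumption: names embed the lesson number padded to three digits (ex: 001).
    return ["docker", "ps", "-a", "-q", "--filter", f"name={lesson:03d}"]


def image_ids_command(lesson: int) -> list[str]:
    """Build the command that lists IDs of images labelled with a lesson number."""
    return ["docker", "images", "--filter", f"label=lesson.number={lesson}", "-a", "-q"]


def remove_command(kind: str, ids: list[str]) -> list[str]:
    """Build `docker rm <ids>` (containers) or `docker rmi <ids>` (images)."""
    if kind not in ("rm", "rmi"):
        raise ValueError("kind must be 'rm' or 'rmi'")
    return ["docker", kind, *ids]


def clean_up(lesson: int) -> None:
    """Remove a lesson's containers first, then its images."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker was not found on PATH")
    steps = ((container_ids_command(lesson), "rm"), (image_ids_command(lesson), "rmi"))
    for list_cmd, kind in steps:
        out = subprocess.run(list_cmd, capture_output=True, text=True, check=True).stdout
        ids = list(dict.fromkeys(out.split()))  # drop duplicate IDs, keep order
        if ids:  # nothing matched means nothing to remove
            subprocess.run(remove_command(kind, ids), check=True)
```

For example, clean_up(1) would look for containers matching name=001 and images labelled lesson.number=1. Containers are removed before images because Docker refuses to delete an image that a container still references.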