Commit

Add support for scraping products from Newegg (#154)
* Add support for scraping products from Newegg

* Add Newegg.com and Newegg.ca to supported websites in README
Crinibus authored Apr 9, 2022
1 parent b5a95e8 commit 43e96ee
Showing 3 changed files with 14 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -128,6 +128,7 @@ This scraper can (so far) scrape prices on products from:
- [MM-Vision.dk](https://www.mm-vision.dk/)
- [Coolshop.dk](https://www.coolshop.dk/)
- [Sharkgaming.dk](https://www.sharkgaming.dk/)
- [Newegg.com](https://www.newegg.com/) & [Newegg.ca](https://www.newegg.ca/)

****OBS these Amazon domains should work: [.com](https://www.amazon.com/), [.ca](https://www.amazon.ca/), [.es](https://www.amazon.es/), [.fr](https://www.amazon.fr/), [.de](https://www.amazon.de/) and [.it](https://www.amazon.it/)<br/>
The listed Amazon domains is from my quick testing with one or two products from each domain.<br/>
12 changes: 12 additions & 0 deletions scraper/domains.py
@@ -215,6 +215,17 @@ def sharkgaming(soup: BeautifulSoup) -> Info:
    return Info(product_user_name, price, currency, id)


def newegg(soup: BeautifulSoup) -> Info:
    script_data_raw = soup.find_all("script", type="application/ld+json")[2].text
    product_data = json.loads(script_data_raw)
    name = product_data.get("name")
    product_user_name = Format.get_user_product_name(name)
    price = float(product_data.get("offers").get("price"))
    currency = product_data.get("offers").get("priceCurrency")
    id = product_data.get("sku")
    return Info(product_user_name, price, currency, id)


domains = {
    "komplett": komplett,
    "proshop": proshop,
@@ -229,4 +240,5 @@ def sharkgaming(soup: BeautifulSoup) -> Info:
    "mm-vision": mmvision,
    "coolshop": coolshop,
    "sharkgaming": sharkgaming,
    "newegg": newegg,
}
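
Reviewer note: `newegg()` picks the third `application/ld+json` script by position (`[2]`), which would break if Newegg adds, removes, or reorders those blocks. A minimal sketch of a less position-dependent lookup follows. It assumes the product block carries `"@type": "Product"`, which this commit does not confirm. The helper name `find_product_data` and the sample strings are hypothetical. In the scraper, the raw strings would come from the `.text` of each element returned by `soup.find_all("script", type="application/ld+json")`.

```python
from __future__ import annotations

import json
from typing import Any


def find_product_data(raw_scripts: list[str]) -> dict[str, Any] | None:
    """Return the first JSON-LD object whose "@type" is "Product", or None."""
    for raw in raw_scripts:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip script blocks that are not valid JSON
        # A JSON-LD block may hold a single object or a list of objects.
        candidates = data if isinstance(data, list) else [data]
        for candidate in candidates:
            if isinstance(candidate, dict) and candidate.get("@type") == "Product":
                return candidate
    return None


# Hypothetical sample input standing in for the text of each
# <script type="application/ld+json"> element on a product page.
sample_scripts = [
    '{"@type": "WebSite", "name": "Example shop"}',
    "not json at all",
    '{"@type": "Product", "name": "Example GPU", "sku": "ABC123", '
    '"offers": {"price": "199.99", "priceCurrency": "USD"}}',
]

product = find_product_data(sample_scripts)
assert product is not None  # holds for the sample data above
price = float(product["offers"]["price"])
print(product["name"], price, product["offers"]["priceCurrency"], product["sku"])
```

Returning `None` instead of raising lets the caller decide how to report a page without product data. The committed code would raise an `IndexError` in that case.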
1 change: 1 addition & 0 deletions scraper/format.py
@@ -30,6 +30,7 @@ def shorten_url(website_name: str, url: str, info: Info) -> str:
        "mmvision": url,
        "coolshop": f'https://www.coolshop.dk/produkt/{url.split("/")[-2]}/',
        "sharkgaming": url,
        "newegg": f"https://www.newegg.com/p/{info.id}",
    }

    if website_name == "ebay":
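
Reviewer note: the `"newegg"` entry always builds a `newegg.com` URL, so a product scraped from Newegg.ca would be shortened to the `.com` site. The sketch below shows one way to keep the original host. It assumes Newegg.ca serves the same `/p/<id>` path as Newegg.com, which this commit does not confirm. The function name `shorten_newegg_url` is hypothetical and not part of the repository.

```python
from urllib.parse import urlparse


def shorten_newegg_url(url: str, product_id: str) -> str:
    """Build a short Newegg product URL on the same host as the original URL."""
    # Fall back to the .com host when the input has no recognisable host.
    host = urlparse(url).netloc or "www.newegg.com"
    return f"https://{host}/p/{product_id}"


# Hypothetical product URL and id, used only for illustration.
short_url = shorten_newegg_url(
    "https://www.newegg.ca/example-product/p/N82E0000000000?Item=N82E0000000000",
    "N82E0000000000",
)
print(short_url)
```

Parsing the host with `urlparse` avoids hard-coding one domain per country and leaves the query string out of the shortened URL.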
