Eshop Scraper is a powerful npm package designed for web scraping e-commerce websites.
To install the package, use one of the following commands:
npm install eshop-scraper
pnpm add eshop-scraper
yarn add eshop-scraper
This package extracts product data such as price, currency, and name from well-known e-commerce websites, including Amazon, Steam, eBay, and others.
The package requires the following Node.js and npm versions:
{
"node": ">=20.11.0",
"npm": ">=10.2.4"
}
First, create an instance of the EshopScraper class. Configure it with optional parameters as needed:
import { EshopScraper, ResultData } from 'eshop-scraper';
const scraper: EshopScraper = new EshopScraper({
timeout: 15, // Timeout for requests in seconds
// Additional configuration options
});
Call the getData method to scrape data from the provided URL:
import { EshopScraper, ResultData } from 'eshop-scraper';
const scraper = new EshopScraper({
timeout: 15,
});
(async () => {
try {
const result: ResultData = await scraper.getData('https://example.com/product-page');
if (result.isError) {
console.error('Error:', result.errorMsg);
} else {
console.log('Product Data:', result);
}
} catch (error) {
console.error('Unexpected Error:', error);
}
})();
The getData method scrapes data from a website based on the provided configuration.
- link (string): The absolute URI of the item you want to scrape.
- timeoutAmount (number, optional): Timeout amount for the request in seconds.
await scraper.getData(uri);
It returns a Promise that resolves to an object with the following structure:
{
price?: number; // The price of the product
currency?: string; // The currency of the price
name?: string; // The name of the product
site?: string; // The source website's name
link?: string; // The link to the product page
isError?: boolean; // Whether an error occurred
errorMsg?: string; // The error message, if any
}
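Because failures are reported through isError as well as through thrown exceptions, callers that scrape several pages usually need to separate usable results from failed ones. The following is a minimal sketch of that pattern and is not part of the package: ScrapeResult is a local mirror of the structure above, and splitResults, scrapeAll, and the injected fetchOne parameter are hypothetical names. In real use, fetchOne would be something like (url) => scraper.getData(url).

```typescript
// Local mirror of the documented result structure (hypothetical name).
interface ScrapeResult {
  price?: number;
  currency?: string;
  name?: string;
  site?: string;
  link?: string;
  isError?: boolean;
  errorMsg?: string;
}

// Separate successful results from failed ones.
function splitResults(results: ScrapeResult[]): { ok: ScrapeResult[]; failed: ScrapeResult[] } {
  const ok: ScrapeResult[] = [];
  const failed: ScrapeResult[] = [];
  for (const r of results) {
    (r.isError ? failed : ok).push(r);
  }
  return { ok, failed };
}

// Scrape several URLs concurrently. `fetchOne` is injected so the helper
// works with any function shaped like `scraper.getData`.
async function scrapeAll(
  urls: string[],
  fetchOne: (url: string) => Promise<ScrapeResult>,
): Promise<{ ok: ScrapeResult[]; failed: ScrapeResult[] }> {
  const results = await Promise.all(
    urls.map((url) =>
      // Convert unexpected rejections into the same error shape.
      fetchOne(url).catch((err: unknown): ScrapeResult => ({
        link: url,
        isError: true,
        errorMsg: err instanceof Error ? err.message : String(err),
      })),
    ),
  );
  return splitResults(results);
}
```

Injecting the fetch function keeps the helper independent of a live scraper instance, which also makes it easy to exercise with a stub.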
The updateCurrencyMap method updates entries in the _currencyMap.
- key (string[][] | string[]): The key(s) to be updated.
- value (string[] | string): The value(s) to be assigned.
scraper.updateCurrencyMap([['$', 'usd']], 'USD');
scraper.updateCurrencyMap(['$', 'usd'], 'USD');
The deleteCurrencyMap method deletes entries from the _currencyMap.
- key (string[][] | string[]): The key(s) to be deleted.
scraper.deleteCurrencyMap([['$', 'usd']]);
scraper.deleteCurrencyMap(['$', 'usd']);
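The currency map is keyed by arrays of aliases, and a JavaScript Map compares array keys by reference, so calling get with a freshly created array does not find an existing entry. The sketch below shows one way a lookup over a map of this shape could work. resolveCurrency is a hypothetical helper written for illustration; it is not claimed to be how the package resolves currencies internally.

```typescript
// A currency map in the same shape as the examples: alias list -> currency code.
const currencyMap = new Map<string[], string>([
  [['$', 'usd'], 'USD'],
  [['euro', '€'], 'EUR'],
]);

// Hypothetical helper: find the code whose alias list contains the token.
function resolveCurrency(map: Map<string[], string>, token: string): string | undefined {
  const needle = token.trim().toLowerCase();
  for (const [aliases, code] of Array.from(map.entries())) {
    if (aliases.some((alias) => alias.toLowerCase() === needle)) {
      return code;
    }
  }
  return undefined;
}

// A new array is a different key object, so this lookup misses.
const direct = currencyMap.get(['$', 'usd']);
// Scanning the alias lists finds the entry regardless of key identity.
const viaScan = resolveCurrency(currencyMap, ' USD ');
```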
The updateWebProps method updates entries in the _webProps.
- site (string | string[]): The site(s) to be updated.
- properties ({ site: string; selector: { price: string[]; name: string[] } } | { site: string; selector: { price: string[]; name: string[] } }[]): The properties to be assigned.
scraper.updateWebProps('exampleSite', { site: 'exampleSite', selector: { price: ['priceSelector'], name: ['nameSelector'] } });
scraper.updateWebProps(['site1.com', 'site2.com'], [
{ site: 'site1', selector: { price: ['priceSelector1'], name: ['nameSelector1'] } },
{ site: 'site2', selector: { price: ['priceSelector2'], name: ['nameSelector2'] } }
]);
The deleteWebProps method deletes entries from the _webProps.
- site (string | string[]): The site(s) to be deleted.
scraper.deleteWebProps('exampleSite');
scraper.deleteWebProps(['site1', 'site2']);
The updateReplaceObj method updates entries in the _replaceObj.
- key (string | string[]): The key(s) to be updated.
- value (string | string[]): The value(s) to be assigned.
scraper.updateReplaceObj('oldString', 'newString');
scraper.updateReplaceObj(['oldString1', 'oldString2'], ['newString1', 'newString2']);
The deleteReplaceObj method deletes entries from the _replaceObj.
- key (string | string[]): The key(s) to be deleted.
scraper.deleteReplaceObj('oldString');
scraper.deleteReplaceObj(['oldString1', 'oldString2']);
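To make the effect of a replace object concrete, here is a minimal sketch of applying key/value replacements to a scraped string. applyReplacements is a hypothetical helper, and lowercasing the input before matching is an assumption made for this example; the package's own replacement logic may differ.

```typescript
// Hypothetical helper: apply each replacement pair to a lowercased string.
function applyReplacements(text: string, replacements: Record<string, string>): string {
  let out = text.toLowerCase();
  for (const from of Object.keys(replacements)) {
    if (from === '') continue; // An empty key would split the text into characters.
    const to = replacements[from] ?? '';
    // split/join replaces every occurrence without regex escaping concerns.
    out = out.split(from).join(to);
  }
  // Collapse the whitespace left behind by removed fragments.
  return out.replace(/\s+/g, ' ').trim();
}

const cleaned = applyReplacements('Price is: 23.45 USD now', {
  'price is:': '',
  now: '',
  usd: '$',
});
// cleaned is '23.45 $'
```

Plain substring replacement also matches inside longer words (for example, "now" inside "snow"), so keys should be chosen with that in mind.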
You can customize the scraper by providing additional configurations.
Add new website configurations to the scraper:
import { EshopScraper } from 'eshop-scraper';
const propsList = new Map([
['test.com', {
site: 'Test',
selectors: {
priceSelector: ['span[itemprop="price"]'],
nameSelector: ['h1[itemprop="name"]'],
},
}],
]);
const scraper = new EshopScraper({
webProps: propsList,
});
Modify or exclude certain strings in the scraped data:
import { EshopScraper } from 'eshop-scraper';
const replaceObj = {
'price is:': '',
now: '',
usd: '$',
};
const scraper = new EshopScraper({
replaceObj: replaceObj,
});
Map additional currencies for accurate conversion:
import { EshopScraper } from 'eshop-scraper';
const currencyList = new Map([
[['$'], 'USD'],
[['euro', '€'], 'EUR'],
]);
const scraper = new EshopScraper({
currencyMap: currencyList,
});
Provide custom headers to mimic realistic browser requests:
import { EshopScraper } from 'eshop-scraper';
const newHeaders = [
{
Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36',
},
{
Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0',
},
];
const scraper = new EshopScraper({
headersArr: newHeaders,
});
Configure the request timeout:
import { EshopScraper } from 'eshop-scraper';
const scraper = new EshopScraper({
timeout: 10, // Timeout in seconds
});
Configure the number of retry attempts for failed requests:
import { EshopScraper } from 'eshop-scraper';
const scraper = new EshopScraper({
retry: 3, // Number of retry attempts
});
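The retry option is handled inside the package. For code that wraps other calls returning a result with an isError flag, a similar policy can be sketched at the call site. withRetry is a hypothetical helper, and the assumption that the retry count excludes the initial attempt is this example's choice, not a statement about the package's internal behavior.

```typescript
// Minimal result shape needed for the retry decision.
interface MaybeError {
  isError?: boolean;
}

// Hypothetical helper: call `fetchOnce` up to 1 + retries times, stopping
// at the first result that is not flagged as an error.
async function withRetry<T extends MaybeError>(
  fetchOnce: () => Promise<T>,
  retries: number,
): Promise<T> {
  let result = await fetchOnce();
  for (let attempt = 0; attempt < retries && result.isError; attempt++) {
    result = await fetchOnce();
  }
  return result;
}
```

The helper retries immediately; a real caller would likely add a delay between attempts to avoid hammering the target site.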
Use this script to inspect default values for supported websites, replaced strings, headers, and more:
import { EshopScraper } from 'eshop-scraper';
const scraper = new EshopScraper();
(async () => {
console.log('Supported websites:', scraper._webProps);
console.log('Replaced strings:', scraper._replaceObj);
console.log('Headers:', scraper._headers);
console.log('Currency map:', scraper._currencyMap);
console.log('Timeout amount:', scraper._timeoutAmount);
console.log('Retry attempts:', scraper._retry);
process.exit(0);
})();
The eshop-scraper package supports the following 8 website domains by default. Additional websites can be added through configuration.
- Steam (store.steampowered.com)
- Amazon (amazon.com, amazon.in)
- Crutchfield (crutchfield.com)
- Playstation (store.playstation.com, gear.playstation.com)
- Ebay (ebay.com)
- Bikroy (bikroy.com)
- Static vs. Dynamic Websites: This scraper is designed for static websites. It does not support dynamic or Single Page Applications (SPAs) at this time. Future versions may include support for dynamic content.
- Price Format Issues: Some websites might display prices in an unexpected format. For instance, prices may initially appear without a decimal point or use a comma instead of a dot. The scraper cannot execute JavaScript, so it cannot dynamically convert these formats. As a result, prices may be shown incorrectly (e.g., "2345" instead of "23.45").
- Language and Currency: The scraper processes prices in English. If a website displays prices in a local language or script, the scraper might not interpret them correctly. Ensure that the price format is in English for accurate results.
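For the comma-versus-dot part of the price format limitation, one possible workaround is to post-process raw price text yourself. The sketch below is a hypothetical helper, normalizePrice, that is not part of the package. It relies on a heuristic: one or two digits after the last separator are treated as decimals, and anything else as a thousands group. It cannot recover a decimal point that is missing entirely, so "2345" still parses as 2345.

```typescript
// Hypothetical post-processing helper: parse a price string that may use
// a comma or a dot as the decimal separator.
function normalizePrice(raw: string): number | undefined {
  // Keep only digits and the two separator characters.
  const cleaned = raw.replace(/[^0-9.,]/g, '');
  const lastSep = Math.max(cleaned.lastIndexOf(','), cleaned.lastIndexOf('.'));
  let normalized: string;
  if (lastSep === -1) {
    normalized = cleaned;
  } else {
    const intPart = cleaned.slice(0, lastSep).replace(/[.,]/g, '');
    const fracPart = cleaned.slice(lastSep + 1);
    // Heuristic: one or two trailing digits mean a decimal separator;
    // otherwise the separator is treated as a thousands separator.
    normalized =
      fracPart.length >= 1 && fracPart.length <= 2
        ? intPart + '.' + fracPart
        : intPart + fracPart;
  }
  const value = parseFloat(normalized);
  return isNaN(value) ? undefined : value;
}
```

For example, normalizePrice('23,45') yields 23.45, while normalizePrice('1,234') yields 1234 because three trailing digits are read as a thousands group.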
We welcome contributions to the eshop-scraper project! To contribute, please open a pull request on GitHub. Your input helps improve the scraper for everyone.