openGraphScraper

A simple Node.js module (with TypeScript declarations) for scraping Open Graph, Twitter Card, and other metadata from a site.

Note: open-graph-scraper doesn't support browser usage at this time, but you can use open-graph-scraper-lite if you already have the HTML and can't use Node's Fetch API.

Installation

npm install open-graph-scraper --save

Usage

const ogs = require('open-graph-scraper');
const options = { url: 'http://ogp.me/' };
ogs(options)
  .then((data) => {
    const { error, html, result, response } = data;
    console.log('error:', error); // true if there was an error, otherwise false. The error details are inside the result object.
    console.log('html:', html); // This contains the HTML of the page
    console.log('result:', result); // This contains all of the Open Graph results
    console.log('response:', response); // This contains response from the Fetch API
  })
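The error flag is easier to work with when both outcomes are handled in one place. The following is a minimal sketch, not part of open-graph-scraper: `safeScrape` is a hypothetical name, the scraper is passed in as an argument so the helper can be tried without network access, and it assumes a failure arrives as an object shaped like `{ error: true, result }`, whether the promise resolves or rejects.

```javascript
// Hypothetical helper, not part of open-graph-scraper.
// `scrape` is any function with the same call shape as ogs(options).
async function safeScrape(scrape, options) {
  try {
    const data = await scrape(options);
    // A resolved value may still carry error: true.
    return { ok: !data.error, result: data.result };
  } catch (err) {
    // Assumption: a rejection carries the same { error, result } shape.
    // Anything else (for example a thrown Error) is reported as a string.
    const result = err && err.result
      ? err.result
      : { success: false, error: String(err) };
    return { ok: false, result };
  }
}
```

Inside an async function this could be called as `const { ok, result } = await safeScrape(ogs, { url: 'http://ogp.me/' });`, with `ogs` being the module required above.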

Results JSON

Check the returned result for a success flag. If success is true, the url input was valid; otherwise it is false. The example above returns something like:

{
  ogTitle: 'Open Graph protocol',
  ogType: 'website',
  ogUrl: 'https://ogp.me/',
  ogDescription: 'The Open Graph protocol enables any web page to become a rich object in a social graph.',
  ogImage: [
    {
      height: '300',
      type: 'image/png',
      url: 'https://ogp.me/logo.png',
      width: '300'
    }
  ],
  charset: 'utf-8',
  requestUrl: 'http://ogp.me/',
  success: true
}
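A result object like the one above usually gets reduced to a few fields before it is displayed. The sketch below shows one way to do that, under the assumption that the result has the shape shown in this example; `toPreview` and the field names it returns are hypothetical and not part of open-graph-scraper.

```javascript
// Hypothetical helper: reduce a result object to a small link preview.
function toPreview(result) {
  // Only trust the other fields when the success flag is set.
  if (!result || result.success !== true) {
    return null;
  }
  // ogImage is an array of image objects in the example above; take the first.
  const firstImage = Array.isArray(result.ogImage) ? result.ogImage[0] : undefined;
  return {
    title: result.ogTitle || '',
    description: result.ogDescription || '',
    // Prefer og:url, falling back to the URL that was requested.
    url: result.ogUrl || result.requestUrl || '',
    image: firstImage ? firstImage.url : null,
  };
}
```

Returning `null` for an unsuccessful result keeps the caller from rendering a preview built from missing fields.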

Options

Name	Info	Default Value	Required
url	URL of the site.		x
html	You can pass in an HTML string to run ogs on it. (use without options.url)
fetchOptions	Options that are used by the Fetch API	{}
timeout	Request timeout for Fetch (Default is 10 seconds)	10
blacklist	Pass in an array of sites you don't want ogs to run on.	[]
onlyGetOpenGraphInfo	Only fetch open graph info and don't fall back on anything else.	false
customMetaTags	Here you can define custom meta tags you want to scrape.	[]
urlValidatorSettings	Sets the options used by validator.js for testing the URL	(see the library source for the defaults)

Note: open-graph-scraper uses the Fetch API for requests, so most of Fetch's options should work when passed in fetchOptions.
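To show how several of these options fit together, here is a small sketch that assembles an options object from the table above. `buildOptions` and its parameter names (`timeoutSeconds`, `headers`, `openGraphOnly`) are hypothetical; only the keys of the returned object (`url`, `timeout`, `onlyGetOpenGraphInfo`, `fetchOptions`) come from the table.

```javascript
// Hypothetical helper: build an options object using keys from the options table.
function buildOptions(url, { timeoutSeconds = 10, headers = {}, openGraphOnly = false } = {}) {
  return {
    url,
    // The table gives the timeout in seconds, with a default of 10.
    timeout: timeoutSeconds,
    // When true, only Open Graph info is fetched, with no fallbacks.
    onlyGetOpenGraphInfo: openGraphOnly,
    // Passed through to the Fetch API.
    fetchOptions: { headers },
  };
}
```

The returned object can be passed straight to `ogs(...)`, for example `ogs(buildOptions('http://ogp.me/', { timeoutSeconds: 5 }))`.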

Custom Meta Tag Example

const ogs = require('open-graph-scraper');
const options = {
  url: 'https://github.com/jshemas/openGraphScraper',
  customMetaTags: [{
    multiple: false, // is there more than one of these tags on a page (normally this is false)
    property: 'hostname', // meta tag name/property attribute
    fieldName: 'hostnameMetaTag', // name of the result variable
  }],
};
ogs(options)
  .then((data) => {
    const { result } = data;
    console.log('hostnameMetaTag:', result.customMetaTags.hostnameMetaTag); // hostnameMetaTag: github.com
  })

HTML Example

const ogs = require('open-graph-scraper');
const options = {
  html: `<html><head>
  <link rel="icon" type="image/png" href="https://bar.com/foo.png" />
  <meta charset="utf-8" />
  <meta property="og:description" name="og:description" content="html description example" />
  <meta property="og:image" name="og:image" content="https://www.foo.com/bar.jpg" />
  <meta property="og:title" name="og:title" content="foobar" />
  <meta property="og:type" name="og:type" content="website" />
  </head></html>`
};
ogs(options)
  .then((data) => {
    const { result } = data;
    console.log('result:', result);
    // result: {
    //   ogDescription: 'html description example',
    //   ogTitle: 'foobar',
    //   ogType: 'website',
    //   ogImage: [ { url: 'https://www.foo.com/bar.jpg', type: 'jpg' } ],
    //   favicon: 'https://bar.com/foo.png',
    //   charset: 'utf-8',
    //   success: true
    // }
  })

User Agent Example

The user-agent request header is set to undici by default. Some sites might block this, and changing the userAgent might work. If not, you can try using a proxy for the request and then pass the HTML into open-graph-scraper.

const ogs = require('open-graph-scraper');
const userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36';
ogs({ url: 'https://www.wikipedia.org/', fetchOptions: { headers: { 'user-agent': userAgent } } })
  .then((data) => {
    const { error, html, result, response } = data;
    console.log('error:', error); // true if there was an error, otherwise false. The error details are inside the result object.
    console.log('html:', html); // This contains the HTML of the page
    console.log('result:', result); // This contains all of the Open Graph results
    console.log('response:', response); // This contains response from the Fetch API
  })
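If changing the user agent is not enough, the fallback mentioned above is to fetch the HTML yourself and hand it to open-graph-scraper through the html option. The sketch below illustrates that flow under a few assumptions: `scrapeFromFetchedHtml` is a hypothetical name, the scraper and the fetch function are passed in as arguments, and the default relies on the global `fetch` available in Node 18 and later. Routing the request through a proxy is left to whatever fetch function you supply.

```javascript
// Hypothetical helper, not part of open-graph-scraper.
// `scrape` has the call shape of ogs(options); `fetchImpl` can be any
// fetch-compatible function, such as one configured to use a proxy.
async function scrapeFromFetchedHtml(scrape, url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const html = await response.text();
  // Pass html on its own, without url, as the options table describes.
  return scrape({ html });
}
```

Inside an async function this could be called as `await scrapeFromFetchedHtml(ogs, 'https://www.wikipedia.org/')`, with `ogs` being the module required above.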
