Skip to content

Latest commit

 

History

History
66 lines (45 loc) · 2.12 KB

.verb.md

File metadata and controls

66 lines (45 loc) · 2.12 KB

Heads up!

See the release notes for information about changes made in v2.0.

Usage

This library recursively calls needle's .get method as long as the user-provided next() function returns a string (the next url to get). See an example.

Example

const request = require('{%= name %}');

request(url, options, next)
  .then(acc => console.log(acc.pages.length))
  .catch(console.error);

Params

  • url {string} - (required) the initial url to get
  • options {object} - (optional) options object to pass to [needle][]
  • next {function} - (required) function that returns the next url to get, a promise or undefined.

next function params

  • url {string} - the original (base) user-provided url
  • resp {object} - [needle][] response object
  • acc {object} - accumulator object with the following properties:
    • options {object} - user-provided options object
    • pages {array} - array of responses
    • urls {array} - array of requested urls

The next function should return a string (the next url to get), promise or undefined.

Example

The following example shows how to loop over pages of CSS posts on smashingmagazine.com (an arbitrary example, but they have great content!).

const request = require('{%= name %}');

async function next(url, resp, acc) {
  // do stuff to check response first if necessary
  const regex = /href="\/.*?\/(\d+)\/"/;
  const num = (regex.exec(resp.data) || [])[1];

  if (num && /^[0-9]+$/.test(num) && +num <= n) {
    // use the "original" url to avoid having to reparse
    // and recreate the url each time
    return `${acc.orig}/page/${num}/`;
  }
}

request('https://www.smashingmagazine.com/category/css', {}, next)
  .then(acc => console.log(acc.pages.length))
  .catch(console.error);

Release notes

v2.0

  • renamed .hrefs to .urls in response object
  • now using [axios][] instead of [needle][]. Please see the axios documentation for API information.