Redux middleware for Cheerio
Cheerio works under the hood to parse the HTML or XML document returned in an HTTP response. Cheerio uses a very simple, consistent DOM model. As a result, parsing, manipulating, and rendering are very efficient.
npm install redux-cheerio --save
Insert redux-cheerio into the middleware chain like you would with any other piece of middleware.
import { createStore, applyMiddleware, combineReducers } from 'redux';
import cheerioMiddleware from 'redux-cheerio';
import reducers from './reducers';
const reducer = combineReducers(reducers);
const createStoreWithMiddleware = applyMiddleware(cheerioMiddleware)(createStore);
function configureStore(initialState) {
return createStoreWithMiddleware(reducer, initialState);
}
const initialState = {}; // replace with your app's initial state
const store = configureStore(initialState);
See the end of this README for a full, self-contained example of using redux-cheerio.
Dispatch actions to your Redux store that have a type of CHEERIO_TASK and a payload consisting of a url to make the request to and a task function whose job is to parse the HTML using jQuery selectors.
To use the middleware, dispatch an action that takes the following form:
const whateverNameWeLike = {
type: 'CHEERIO_TASK',
payload: {
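url: 'http://www.example.com', // a string: the URL to request
task: function ($) { return $('body') } // a function: receives the loaded document as $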
}
};
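The shape above can be wrapped in a small action creator so that the type string is written only once and malformed payloads fail early. This is a minimal sketch: `createCheerioTask` is a hypothetical helper name, not something redux-cheerio is known to export.

```javascript
// Hypothetical helper (not part of redux-cheerio): builds an action in the
// CHEERIO_TASK shape described above and rejects obviously malformed input.
const CHEERIO_TASK = 'CHEERIO_TASK';

function createCheerioTask(url, task) {
  if (typeof url !== 'string') {
    throw new TypeError('url must be a string');
  }
  if (typeof task !== 'function') {
    throw new TypeError('task must be a function');
  }
  return { type: CHEERIO_TASK, payload: { url, task } };
}

// Usage: the task receives the loaded document as `$`.
const getBody = createCheerioTask('http://www.example.com', ($) => $('body'));
```

The resulting object can be passed to `store.dispatch` like any hand-written action.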
# Example
Here is an example action that returns the HTML body of a response as a JSON object.
Define the first Cheerio action, which makes the request:
const getBodyOfHTML = {
type: 'CHEERIO_TASK',
payload: { // the payload properties are customised by you
url: 'http://www.example.com',
task: function yourCheerioFuncHere($){ return $('body') }
}
};
Note that in this example we set the following custom payload properties:
url: where the HTTP request will be made to.
task: the function that parses the response. Under the hood, our middleware takes the HTML from the response and calls the cheerio.load function like so:
let $ = cheerio.load(response)
Now we can use jQuery selectors to extract the data we want from the HTML. This is especially useful for web scraping. We must remember to return the result of this selection.
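To make this flow concrete, here is a rough sketch of how a middleware of this kind could be structured. It is not redux-cheerio's actual source: `createSketchMiddleware`, `fetchHtml`, and `load` are hypothetical names, and the HTTP client and the loader are passed in as parameters so the sketch runs without any dependencies. In real use, `load` would be `cheerio.load`. Placing the task result directly in `payload` and rejecting the returned promise on failure are also assumptions.

```javascript
// Hypothetical sketch, NOT the real redux-cheerio implementation.
// fetchHtml(url) -> Promise<string>; load(html) -> $ (for example cheerio.load).
const createSketchMiddleware = ({ fetchHtml, load }) => (store) => (next) => (action) => {
  if (action.type !== 'CHEERIO_TASK') {
    return next(action); // not ours: pass it down the chain untouched
  }
  const { url, task } = action.payload;
  return fetchHtml(url)
    .then((html) => task(load(html))) // run the user's task against the loaded document
    .then(
      (result) => {
        // Assumption: the task's return value becomes the payload as-is.
        store.dispatch({ type: 'CHEERIO_TASK_FULFILLED', payload: result });
        return result;
      },
      (err) => {
        store.dispatch({ type: 'CHEERIO_TASK_REJECTED', payload: { err } });
        throw err; // keep the returned promise rejected so callers can .catch()
      }
    );
};
```

Injecting `fetchHtml` and `load` also makes a middleware like this easy to exercise in tests with canned HTML.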
Let's dispatch the action to make the request and parse the response with the task function in our action.
// note that a promise is returned
store.dispatch(getBodyOfHTML)
Now watch as redux-cheerio handles the dispatching of one further action depending on the success of the HTTP request.
When our task function returns something, a CHEERIO_TASK_FULFILLED action will be dispatched, as long as no errors have occurred. The payload of this new action will consist of whatever the task function in the original CHEERIO_TASK action returned.
{
type: 'CHEERIO_TASK_FULFILLED',
payload: {
// whatever our task function returned, e.g. the result of a $('div').text() jQuery selector
}
}
If there was an error during the request, such as a timeout or a 404 status code, the redux-cheerio middleware will dispatch a rejected action instead of a fulfilled one.
{
type: 'CHEERIO_TASK_REJECTED',
payload: {
err: { /* error defined here */ }
}
}
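A reducer can react to these two follow-up actions to keep the scraped result, or the error, in the store. The sketch below is one possible shape: the name `scrapeReducer` and the state fields `data` and `error` are assumptions made for illustration, as is treating the fulfilled payload as the task's return value and reading the error from `payload.err`.

```javascript
// Hypothetical reducer: names and state fields are illustrative only.
const initialScrapeState = { data: null, error: null };

function scrapeReducer(state = initialScrapeState, action) {
  switch (action.type) {
    case 'CHEERIO_TASK_FULFILLED':
      // Store whatever the task function returned and clear any old error.
      return { data: action.payload, error: null };
    case 'CHEERIO_TASK_REJECTED':
      // Keep the last good data, record the failure.
      return { ...state, error: action.payload.err };
    default:
      return state;
  }
}
```

If the result is going to live in the store, it is usually easier to have the task return plain data (for example the string from `.text()` or `.html()`) rather than a selection object.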
# Full example
import { createStore, applyMiddleware } from 'redux';
import cheerioMiddleware from 'redux-cheerio';
// simplest reducer possible - returns the current state unchanged
// (the default value keeps Redux from rejecting an undefined initial state)
const reducer = (state = {}, action) => state;
const middleware = [cheerioMiddleware];
const store = createStore(reducer, applyMiddleware(...middleware));
// Create an action that follows the redux-cheerio signature
const cheerioAction = {
type: 'CHEERIO_TASK',
payload: { // the payload properties are customised by you
url: 'http://www.example.com',
task: function yourCheerioFuncHere($){ return $('body') }
}
};
// dispatch our action which returns a promise that we can chain with other logic
store.dispatch(cheerioAction).then(() => {
console.log('request successful')
}).catch(() => {
console.log('request unsuccessful')
})