Skip to content
This repository has been archived by the owner on Oct 23, 2023. It is now read-only.

Adds parseUser option to control user parsing behavior #274

Merged
merged 1 commit into from
Feb 23, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions docs/config.rst
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,29 @@ Those configuration options are documented below:
extra: {planet: {name: 'Earth'}}
}

.. describe:: parseUser

Controls how Raven tries to parse user context when parsing a request object.

An array of strings will serve as a whitelist for fields to grab from ``req.user``.
``true`` will collect all keys from ``req.user``. ``false`` will collect nothing.

Defaults to ``['id', 'username', 'email']``.

Alternatively, a function can be provided for fully custom parsing:

.. code-block:: javascript

{
parseUser: function (req) {
// custom user parsing logic
return {
username: req.specialUserField.username,
id: req.specialUserField.getId()
};
}
}

.. describe:: dataCallback

A function that allows mutation of the data payload right before being
Expand Down
3 changes: 2 additions & 1 deletion lib/client.js
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@ extend(Raven.prototype, {
this.loggerName = options.logger || '';
this.dataCallback = options.dataCallback;
this.shouldSendCallback = options.shouldSendCallback;
this.parseUser = options.parseUser;

if (!this.dsn) {
utils.consoleAlert('no DSN provided, error reporting disabled');
Expand Down Expand Up @@ -407,7 +408,7 @@ extend(Raven.prototype, {
// skip anything not marked as an internal server error
if (status < 500) return next(err);

var kwargs = parsers.parseRequest(req);
var kwargs = parsers.parseRequest(req, self.parseUser);
var eventId = self.captureException(err, kwargs);
res.sentry = eventId;
return next(err);
Expand Down
102 changes: 42 additions & 60 deletions lib/parsers.js
Original file line number Diff line number Diff line change
Expand Up @@ -53,123 +53,105 @@ module.exports.parseError = function parseError(err, kwargs, cb) {
});
};

module.exports.parseRequest = function parseRequest(req, kwargs) {
kwargs = kwargs || {};
module.exports.parseRequest = function parseRequest(req, parseUser) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is parseRequest a public/documented API method? If so, might want to consider how to make this signature more extendable going forward.

e.g. I'd almost prefer if we kept kwargs and you did:

parseRequest(req, {
  parseUser: ['username']
});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not public; only callsite is currently in errorHandler middleware, soon we'll add another in requestHandler and in process but all internal. The kwargs param was unused so I got rid of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k

var kwargs = {};

// headers:
//
// node: req.headers
// express: req.headers
// node, express: req.headers
// koa: req.header
//
var headers = req.headers || req.header || {};

// method:
//
// node: req.method
// express: req.method
// koa: req.method
//
// node, express, koa: req.method
var method = req.method;

// host:
//
// node: req.headers.host
// express: req.hostname in > 4 and req.host in < 4
// koa: req.host
//
// node: req.headers.host
var host = req.hostname || req.host || headers.host || '<no host>';

// protocol:
//
// node: <n/a>
// express: req.protocol
// koa: req.protocol
//
// express, koa: req.protocol
var protocol = req.protocol === 'https' || req.secure || (req.socket || {}).encrypted ? 'https' : 'http';

// url (including path and query string):
//
// node: req.originalUrl
// express: req.originalUrl
// node, express: req.originalUrl
// koa: req.url
//
var originalUrl = req.originalUrl || req.url;

// absolute url
var url = protocol + '://' + host + originalUrl;
var absoluteUrl = protocol + '://' + host + originalUrl;

// query string
//
// query string:
// node: req.url (raw)
// express: req.query
// koa: req.query
//
// express, koa: req.query
var query = req.query || urlParser.parse(originalUrl || '', true).query;

// cookies:
//
// node: req.headers.cookie
// express: req.headers.cookie
// koa: req.headers.cookie
//
// node, express, koa: req.headers.cookie
var cookies = cookie.parse(headers.cookie || '');

// body data:
//
// node: req.body
// express: req.body
// koa: req.body
//
// node, express, koa: req.body
var data = req.body;
if (['GET', 'HEAD'].indexOf(method) === -1) {
if (typeof data === 'undefined') {
data = '<unavailable>';
}
}

if (data && {}.toString.call(data) !== '[object String]') {
if (data && typeof data !== 'string' && {}.toString.call(data) !== '[object String]') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The {}.toString.call check is really checking two cases without being obvious about it, and I sort of prefer using typeof as much as possible and avoiding checking object toString types when we don't need to be. This better delineates/escalates through the cases we care about IMO (and typeof is a cheaper check in the common case, not that we care about a trivial perf diff in a once-per-error check).

// Make sure the request body is a string
data = stringify(data);
}

// client ip:
//
// node: req.connection.remoteAddress
// express: req.ip
// koa: req.ip
//
var ip = req.ip || (req.connection || {}).remoteAddress;

// http interface
var http = {
method: method,
query_string: query,
headers: headers,
cookies: cookies,
data: data,
url: url
url: absoluteUrl
};

// expose http interface
kwargs.request = http;

// user
//
// typically found on req.user according to Express and Passport

var user = {};
if (!kwargs.user) {
if (req.user) {
// shallow copy is okay because we are only modifying top-level
// object (req.user)
for (var key in req.user) {
if ({}.hasOwnProperty.call(req.user, key)) {
user[key] = req.user[key];
// user: typically found on req.user in express/passport patterns
// five cases for parseUser value:
// absent: grab only id, username, email from req.user
// false: capture nothing
// true: capture all keys from req.user
// array: provided whitelisted keys to grab from req.user
// function :: req -> user: custom parsing function
if (parseUser == null) parseUser = ['id', 'username', 'email'];
if (parseUser) {
var user = {};
if (typeof parseUser === 'function') {
user = parseUser(req);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we pass req to parseUser and not just strictly the user?

(I mean, this is probably more flexible, but want to get your take.)

Copy link
Contributor Author

@LewisJEllis LewisJEllis Feb 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind the function option exists primarily as a catch-all for "well what if it's not at req.user...?"; other main case would be if we want to pull in other stuff off the req object (like maybe we want stuff from req.session), so either way I pass the entire req object for max flexibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

} else if (req.user) {
if (parseUser === true) {
for (var key in req.user) {
if ({}.hasOwnProperty.call(req.user, key)) {
user[key] = req.user[key];
}
}
} else {
parseUser.forEach(function (fieldName) {
if ({}.hasOwnProperty.call(req.user, fieldName)) {
user[fieldName] = req.user[fieldName];
}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks good to me but this is a big gnarly block of logic. consider labeling some of the control flow paths?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ehh...normally id agree but i did it that way at first and it felt like borderline overcommenting since each branch has a pretty straightforward condition check, so i made sure to describe the 5 cases together above instead

}
}

// client ip:
// node: req.connection.remoteAddress
// express, koa: req.ip
var ip = req.ip || req.connection && req.connection.remoteAddress;
if (ip) {
user.ip_address = ip;
}
Expand Down
62 changes: 52 additions & 10 deletions test/raven.parsers.js
Original file line number Diff line number Diff line change
Expand Up @@ -416,38 +416,80 @@ describe('raven.parsers', function () {
});
});

it('should NOT assign req.user if already present in kwargs', function () {
it('should add ip address to user if available', function () {
var mockReq = {
method: 'POST',
hostname: 'example.org',
url: '/some/path?key=value',
ip: '127.0.0.1',
user: {
username: 'janedoe',
email: 'hello@janedoe.com'
}
};

var parsed = raven.parsers.parseRequest(mockReq, { user: {} });
parsed.should.have.property('user', {});
var parsed = raven.parsers.parseRequest(mockReq);
parsed.should.have.property('user', {
username: 'janedoe',
email: 'hello@janedoe.com',
ip_address: '127.0.0.1'
});
});

it('should add ip address to user if available', function () {
describe('parseUser option', function () {
var mockReq = {
method: 'POST',
hostname: 'example.org',
url: '/some/path?key=value',
ip: '127.0.0.1',
user: {
username: 'janedoe',
email: 'hello@janedoe.com'
email: 'hello@janedoe.com',
random: 'random'
}
};

var parsed = raven.parsers.parseRequest(mockReq);
parsed.should.have.property('user', {
username: 'janedoe',
email: 'hello@janedoe.com',
ip_address: '127.0.0.1'
it('should limit to default whitelisted fields', function () {
var parsed = raven.parsers.parseRequest(mockReq);
parsed.should.have.property('user', {
username: 'janedoe',
email: 'hello@janedoe.com',
ip_address: '127.0.0.1'
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this check that random is -not- set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it checks that the user property has that exact value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

});

it('should limit to provided whitelisted fields array', function () {
var parsed = raven.parsers.parseRequest(mockReq, ['username', 'random']);
parsed.should.have.property('user', {
username: 'janedoe',
random: 'random',
ip_address: '127.0.0.1'
});
});

it('should parse all fields when passed true', function () {
var parsed = raven.parsers.parseRequest(mockReq, true);
parsed.should.have.property('user', {
username: 'janedoe',
email: 'hello@janedoe.com',
random: 'random',
ip_address: '127.0.0.1'
});
});

it('should parse nothing when passed false', function () {
var parsed = raven.parsers.parseRequest(mockReq, false);
parsed.should.not.have.property('user');
});

it('should take a custom parsing function', function () {
var parsed = raven.parsers.parseRequest(mockReq, function (req) {
return { user: req.user.username };
});
parsed.should.have.property('user', {
user: 'janedoe',
ip_address: '127.0.0.1'
});
});
});
});
Expand Down