Replace ua-parser-js with a reduced in-house version #3720

Open: wants to merge 1 commit into master

davidje13

As discussed in #3715

This replaces the external library ua-parser-js with an in-house version which has a significantly reduced scope.

The parser has been written from scratch with a focus on passing the existing tests, but should be easy to extend with more browsers and OSes if required (though I would recommend keeping it minimal and maybe even removing some entries). The method used is a bit more robust than the wildcard searches used in the other project, and should be less prone to accidental catastrophic backtracking issues, but the main point of this is to reduce the breadth of dependencies to help avoid upstream attacks in the future.

Currently it is not possible for users to plug in their own function; the recommendation would be for people who need "friendly" names for a wide selection of browsers to use the fullName property instead and do their own parsing (e.g. via ua-parser-js if they wish).

@google-cla

google-cla bot commented Oct 30, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


@davidje13
Author

@googlebot I signed it

@jginsburgn
Member

Nice CL!

Comment on lines +8 to 132
    }
  }
  if (!result.firstEntry) {
    // ensure firstEntry is always set no matter what (null object pattern)
    result.firstEntry = {
      version: null,
      details: [],
      hasDetail: () => false,
      getDetail: () => []
    }
  }
  return result
}

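// Run every check against the parsed parts and return the first truthy
// [name, version] pair, or an empty array if no check matches.
const extractFromParts = (parts, checks) => checks
  .map((check) => check(parts))
  .filter((result) => result)[0] || []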

const extractMacVersion = (m) => m[1].replace(/_/g, '.')

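// Maps the "Windows NT x.y" version token to the marketing name.
// '6.0' and '10.0' must be quoted: as numeric literals they would become
// the keys '6' and '10' and never match the captured version string.
const WINDOWS_NT_VERSION_MAP = {
  5.1: 'XP',
  5.2: 'XP',
  '6.0': 'Vista',
  6.1: '7',
  6.2: '8',
  6.3: '8.1',
  6.4: '10',
  '10.0': '10'
}
const extractWindowsVersion = (m) => WINDOWS_NT_VERSION_MAP[m[1]]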

const UA_BROWSERS = [
  ({ phantomjs }) => phantomjs && // also contains Safari
    ['PhantomJS', phantomjs.version],

  ({ headlesschrome }) => headlesschrome && // also contains Safari
    ['Chrome Headless', headlesschrome.version],

  ({ opera, version }) => opera &&
    ['Opera', version && version.version],

  ({ firefox }) => firefox &&
    ['Firefox', firefox.version],

  ({ edg }) => edg && // also contains Chrome, Safari
    ['Edge', edg.version],

  ({ chrome }) => chrome && // also contains Safari
    ['Chrome', chrome.version],

  ({ firstEntry, version }) => firstEntry.hasDetail(/^iphone/i) && // also contains Safari
    ['Mobile Safari', version && version.version],

  ({ firstEntry, version }) => firstEntry.hasDetail(/^android/i) && // also contains Safari
    ['Android Browser', version && version.version],

  ({ safari, version }) => safari &&
    ['Safari', version && version.version],

  ({ firstEntry }) => firstEntry.hasDetail(/^msie/i) &&
    ['IE', firstEntry.getDetail(/^msie ([\d.]+)/i)[1]]
]

const UA_SYSTEMS = [
  ({ firstEntry }) => firstEntry.hasDetail(/^android/i) &&
    ['Android', firstEntry.getDetail(/^android ([\d.]+)/i)[1]],

  ({ firstEntry }) => firstEntry.hasDetail(/^iphone/i) &&
    ['iOS', firstEntry.getDetail(/iphone os ([\d._]+)/i, extractMacVersion)],

  ({ ubuntu }) => ubuntu &&
    ['Ubuntu', ubuntu.version],

  ({ firstEntry }) => firstEntry.hasDetail(/^freebsd/i) &&
    ['FreeBSD', null],

  ({ firstEntry }) => firstEntry.hasDetail(/^linux/i) &&
    ['Linux', firstEntry.getDetail(/^linux (.+)/i)[1]],

  ({ firstEntry }) => firstEntry.hasDetail(/mac os/i) &&
    ['Mac OS', firstEntry.getDetail(/mac os(?: x)? ([\d._]+)/i, extractMacVersion)],

  ({ firstEntry }) => firstEntry.hasDetail(/^windows/i) &&
    ['Windows', firstEntry.getDetail(/windows nt ([\d.]+)/i, extractWindowsVersion)]
]

  exports.browserFullNameToShort = (fullName) => {
-   const ua = useragent(fullName)
-   if (!ua.browser.name && !ua.browser.version && !ua.os.name && !ua.os.version) {
-     return fullName
+   const parts = extractUAParts(fullName)
+   const [browserName, browserVersion] = extractFromParts(parts, UA_BROWSERS)
+   const [osName, osVersion] = extractFromParts(parts, UA_SYSTEMS)
+   if (browserName || osName) {
+     return `${browserName || 'unknown'} ${browserVersion || '0.0.0'} (${osName || 'unknown'} ${osVersion || '0.0.0'})`
    }
-   return `${ua.browser.name} ${ua.browser.version || '0.0.0'} (${ua.os.name} ${ua.os.version || '0.0.0'})`
+   return fullName || 'unknown'
  }
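
The order of the checks in the snippet above matters, because `extractFromParts` returns the first truthy result and many user agents carry tokens for more than one browser (hence the "also contains Safari" comments). A minimal sketch of that behaviour follows. The `parts` objects are hypothetical stand-ins, since the real shape produced by `extractUAParts` is only partly visible in this diff, and the check list is a two-entry excerpt of `UA_BROWSERS`.

```javascript
// Hypothetical null-object stand-in for firstEntry (assumption, for illustration).
const noDetails = { hasDetail: () => false, getDetail: () => [] }

// Same first-truthy-result helper as in the diff.
const extractFromParts = (parts, checks) => checks
  .map((check) => check(parts))
  .filter((result) => result)[0] || []

// Two checks in the same relative order as UA_BROWSERS: Chrome before Safari.
const checks = [
  ({ chrome }) => chrome && ['Chrome', chrome.version],
  ({ safari, version }) => safari && ['Safari', version && version.version]
]

// A Chrome user agent also carries a Safari token, so both checks match.
const chromeParts = {
  firstEntry: noDetails,
  chrome: { version: '95.0.4638.54' },
  safari: { version: '537.36' }
}
// A Safari user agent reports its browser version in a separate Version token.
const safariParts = {
  firstEntry: noDetails,
  version: { version: '15.0' },
  safari: { version: '605.1.15' }
}

const chromeResult = extractFromParts(chromeParts, checks)
const safariResult = extractFromParts(safariParts, checks)
const emptyResult = extractFromParts({ firstEntry: noDetails }, checks)

console.log(chromeResult) // [ 'Chrome', '95.0.4638.54' ]
console.log(safariResult) // [ 'Safari', '15.0' ]
console.log(emptyResult) // []
```

If the two checks were swapped, the Chrome user agent would be reported as Safari, which is why the more specific browsers have to come first in the list.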
Member

I wonder if we should move this logic to its own file. @devoto13 WDYT?

const mm = require('minimatch')

const extractUAParts = (ua) => {
Member

Can we add some unit tests for this? @davidje13

Comment on lines +57 to +123

Member

Where did you get all this info? Probably from the implementation of ua-parser-js? Can we add a reference?

Member

Also, can you please add comments to the functions?

Author

The source of this is actually the test user-agent strings that were already in this project. The code and regular expressions here have no relation to ua-parser-js's code.

Author

As for documentation: if the team is happy with this approach, I can extract the code into its own file, add finer-grained tests, and document it. I wanted to get a quick first pass submitted without all that, so that I wouldn't waste too much time if the team was opposed to the whole idea.

Member

I am happy with your approach. It is very elegant! :)
I'd like to hear from @devoto13 as well.

Collaborator

Approach overall LGTM!
I'll try to look more into details in the coming days.

@jginsburgn
Member

@davidje13 also please take a look at the checks as there are some failures :)

@davidje13
Author

> @davidje13 also please take a look at the checks as there are some failures :)

It seems Node 10 doesn't have `matchAll`. That's easy to replace; I'll update it when I get time.
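
For reference, one common way to get `matchAll`-like results on Node 10 is a `RegExp.prototype.exec` loop. The sketch below is only an illustration under that assumption: `findAllMatches` is a hypothetical helper name and the pattern is made up for the example. The replacement actually used in the PR is not shown in this thread.

```javascript
// Collect every match of a regex without String.prototype.matchAll,
// which is not available on Node 10. Hypothetical helper, not the PR's code.
const findAllMatches = (text, pattern) => {
  // Copy the pattern, forcing the `g` flag, so exec() advances lastIndex
  // and the caller's regex object keeps its own state untouched.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g'
  const regex = new RegExp(pattern.source, flags)
  const matches = []
  let match
  while ((match = regex.exec(text)) !== null) {
    matches.push(match)
    // Step forward manually on zero-length matches to avoid an infinite loop.
    if (match[0] === '') {
      regex.lastIndex++
    }
  }
  return matches
}

// Illustrative pattern only: name/version tokens in a user-agent string.
const tokens = findAllMatches(
  'Mozilla/5.0 (X11; Linux x86_64) Chrome/95.0 Safari/537.36',
  /([a-z]+)\/([\d.]+)/gi
).map((m) => [m[1], m[2]])

console.log(tokens)
```

Unlike `matchAll`, which throws a `TypeError` for a non-global regex, this sketch silently adds the `g` flag; whether that difference matters depends on how the caller builds its patterns.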

@jginsburgn
Member

> > @davidje13 also please take a look at the checks as there are some failures :)
>
> seems Node 10 doesn't have matchAll. Easy to replace. I'll update it when I get time.

Oh yeah! That is minor. :)

@devoto13
Collaborator

devoto13 commented Nov 2, 2021

> seems Node 10 doesn't have matchAll. Easy to replace. I'll update it when I get time.

I would be more comfortable landing this change in a major release, so it's probably okay to keep using matchAll as we're going to drop Node 10 in the next major.

@jginsburgn
Member

jginsburgn commented Nov 2, 2021

> > seems Node 10 doesn't have matchAll. Easy to replace. I'll update it when I get time.
>
> I would be more comfortable landing this change in a major release, so it's probably okay to keep using matchAll as we're going to drop Node 10 in the next major.

Or, can we add a TODO where appropriate to replace the Node 10-compatible equivalent with `matchAll` when we release the next major (#3503)?

Maybe something along the lines of:

// TODO(#3503): Replace the following logic with `matchAll`.
...
