Skip to content
This repository has been archived by the owner on Dec 11, 2019. It is now read-only.

Don't break up as URL for autocomplete if location bar text isn't a URL #13693

Merged
merged 3 commits into from
Apr 23, 2018

Conversation

rvagg
Copy link

@rvagg rvagg commented Apr 3, 2018

fix #13718

Some urls that pass the isUrl() test don't parse properly into useful components by the muon URL parser which I believe is here.

foo and foo. are fine, foo.b and foo.bar is not, foo.bar. is fine again. What happens is that isUrl() says it can be parsed but muon.url.parse() returns an empty object so nothing gets tokenized.

Instead of changing the muon behaviour, this change makes it more robust no matter which URL parser is used.

Unfortunately, the Node.js url.parse() function is much more liberal than what muon is using, it'll throw any old string into path and href if it can't work it out. So the unittest suite lies about it working.

I attempted a conversion to the new Node.js WHATWG-compatible URL parser url.URL but that's even more different to what muon uses and breaks unittests for most other uses of urlParse.js so gave up on that idea. I've probably not ended up with the most elegant approach but it's minimal-touch to fix a bug.

Test plan

  1. open browser, visit some github URLs like github.com/brave
  2. slowly type in github.com to the urlbar
  3. none of the autocomplete results should disappear once you hit github.c

@codecov-io
Copy link

codecov-io commented Apr 3, 2018

Codecov Report

Merging #13693 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #13693      +/-   ##
==========================================
- Coverage   56.47%   56.45%   -0.03%     
==========================================
  Files         283      283              
  Lines       28982    28983       +1     
  Branches     4807     4808       +1     
==========================================
- Hits        16369    16361       -8     
- Misses      12613    12622       +9
Flag Coverage Δ
#unittest 56.45% <100%> (-0.03%) ⬇️
Impacted Files Coverage Δ
app/common/lib/siteSuggestions.js 88.73% <100%> (+0.07%) ⬆️
js/stores/appStoreRenderer.js 91.66% <0%> (-8.34%) ⬇️
app/renderer/components/reduxComponent.js 57.75% <0%> (-3.45%) ⬇️
js/stores/windowStore.js 28.46% <0%> (-0.3%) ⬇️

const parsedUrl = urlParse(url.toLowerCase())
const parsedUrl = isUrl(url) && urlParse(url.toLowerCase())

if (parsedUrl && (parsedUrl.hash || parsedUrl.host || parsedUrl.pathname || parsedUrl.query || parsedUrl.query || parsedUrl.protocol)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parsedUrl.query is repeated twice

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed and force pushed

@diracdeltas
Copy link
Member

this looks useful, thanks! do you have a specific issue / STR that this fixes?

@rvagg rvagg force-pushed the rvagg/search-bar-url-fix branch from 06bd846 to 8d181ea Compare April 4, 2018 02:28
@rvagg
Copy link
Author

rvagg commented Apr 4, 2018

@diracdeltas no, sorry, do I need to open one? I saw a note about needing an issue but felt kind of silly opening both an issue and a PR saying the same thing.

@rvagg
Copy link
Author

rvagg commented Apr 4, 2018

I did search through the issue list fwiw, lots of urlbar autocomplete reports but nothing that seemed to be specifically about this (which is surprising because this has an annoyance for me for a while)

@diracdeltas
Copy link
Member

@rvagg yes please open an issue so that QA can check it. if this fixes some existing URL autocomplete issues, you can just link to those instead.

@rvagg
Copy link
Author

rvagg commented Apr 4, 2018

Issue @ #13718

@rvagg rvagg force-pushed the rvagg/search-bar-url-fix branch from 8d181ea to d308fd5 Compare April 4, 2018 14:00
@diracdeltas
Copy link
Member

would you mind adding a unit test for this case in the tokenizeInput test in test/unit/app/common/lib/siteSuggestionsTest.js? npm run unittest runs unittests

@diracdeltas
Copy link
Member

this works great, btw

@rvagg
Copy link
Author

rvagg commented Apr 5, 2018

@diracdeltas well, there already are tests in there that cover this case. For instance:

    it('periods get tokenized', function () {
      assert.deepEqual(tokenizeInput('brad.hates.primes'), ['brad', 'hates', 'primes'])
    })

This passes now, and without this patch because it uses Node's url.parse() as per app/common/urlParse.js when running the unit tests. url.parse() is very liberal and will tokenize just about anything.

$ node -p 'require("url").parse("brad.hates.primes")'
Url {
  protocol: null,
  slashes: null,
  auth: null,
  host: null,
  port: null,
  hostname: null,
  hash: null,
  search: null,
  query: null,
  pathname: 'brad.hates.primes',
  path: 'brad.hates.primes',
  href: 'brad.hates.primes' }

When you're running as a full browser, however, app/common/urlParse.js switches to muon.url.parse which is uses GURL and even though the isURL() check passes prior to this, GURL decides it's not a URL and doesn't give anything back for any of the URL components—you get an empty object {}, which is why this new check works. So unfortunately I can't think of a unit test that is going to hit this change in path and is actually useful.

Ideally we could have access to muon.url.parse for unit tests, or ditch that entirely and use Node's new url.URL for all parsing. It's WHATWG compatible and passes all the right tests so it's the same that's in Chrome and Firefox (and Edge I guess). As I said, though, there's too much currently depending on app/common/urlParse.js to make that a trivial switch.

@rvagg
Copy link
Author

rvagg commented Apr 5, 2018

I will say, however, the fact that there is a lot depending on app/common/urlParse.js and changing to a url.URL shim breaks so much stuff is a little concerning since all those other dependencies could be having similar problems with the mismatch between url.parse() and muon.url.parse, making the unit tests that touch these less than helpful. So the switch is probably worth the change.

fwiw this is a shim that I tried in app/common/urlParse.js that replaces require('url').parse for the unit tests, it could be expanded to remove muon.url.parse entirely as well:

-  // TODO: move to the new node URL API: https://nodejs.org/api/url.html#url_url
-  urlParse = require('url').parse
+  const url = require('url')
+  // modern Node.js with WHATWG URL parser
+  if (typeof url.URL === 'function') {
+    const urlProps = [ 'href', 'origin', 'protocol', 'username', 'password', 'host', 'hostname', 'port', 'pathname', 'search', 'searchParams', 'hash' ]
+    urlParse = (str) => {
+      try {
+        const parsedUrl = new url.URL(str)
+        // parsedUrl is not copyable via Object.assign(), so make a plain object
+        return urlProps.reduce((acc, curr) => {
+          acc[curr] = parsedUrl[curr]
+          return acc
+        }, {})
+      } catch (e) {
+        return null
+      }
+    }
+  } else {
+    // old version of Node.js
+    urlParse = url.parse
+  }

(it's a bit ugly because the URL object returned by url.URL is not a pojo, it uses getters and setters around a URLContext object so Object.assign({}, url) to make a copy doesn't work properly—this is the same as in the browser).

@diracdeltas
Copy link
Member

Oops, I forgot this bug only exists with muon.url.parse and so won't be encountered in unit tests by default

Ideally we could have access to muon.url.parse for unit tests

Actually I did create a way to run URL parser test using muon.url.parse in test/misc-components/muonTest.js (the tests themselves are in js/muonTest.entry.js). you can run these with npm run test -- --grep='muon tests'

@diracdeltas
Copy link
Member

I will say, however, the fact that there is a lot depending on app/common/urlParse.js and changing to a url.URL shim breaks so much stuff is a little concerning since all those other dependencies could be having similar problems with the mismatch between url.parse() and muon.url.parse, making the unit tests that touch these less than helpful. So the switch is probably worth the change.

@rvagg i completely agree. It would be a priority for us to fix in the current codebase if we weren't migrating to a new codebase (https://github.com/brave/brave-browser) that doesn't have node integration at all, in which we'll only be using Chromium's URL parser.

@rvagg
Copy link
Author

rvagg commented Apr 5, 2018

I've got three different variations of "before all" hook errors trying to get it to run unfortunately.

$ npm run test -- --grep='muon tests'

> brave@0.22.0 test /Users/rvagg/git/browser-laptop
> cross-env NODE_ENV=test mocha "test/**/*Test.js" "--grep=muon tests"



  muon tests
    1) "before all" hook


  0 passing (8s)
  1 failing

  1) muon tests "before all" hook:
     Error: Promise was rejected with the following reason: timeout
      at getUrl() - brave.js:282:40

Another was a timeout "at windowHandles() - brave.js:278:32" and another was a selenium runtime error "unknown error: devTools is not defined".

@diracdeltas
Copy link
Member

@rvagg forgot to mention, you have to do npm run watch-test in a separate terminal before running the test

@diracdeltas diracdeltas self-assigned this Apr 17, 2018
@rvagg rvagg force-pushed the rvagg/search-bar-url-fix branch from d308fd5 to 1ef653b Compare April 21, 2018 12:33
@rvagg rvagg force-pushed the rvagg/search-bar-url-fix branch from 1ef653b to 762a1b7 Compare April 21, 2018 12:34
* muon unit test runner cleanup
* muon unit test runner able to do async
* muon unit test runner add before/affter

Auditors:

Test Plan:
@rvagg rvagg force-pushed the rvagg/search-bar-url-fix branch from 762a1b7 to 26455ad Compare April 21, 2018 13:09
@rvagg
Copy link
Author

rvagg commented Apr 21, 2018

689c890 & 26455ad are not strictly part of this fix, you can take them or leave them. They extend your muon test runner to be able to handle tests that can be written for both mocha and within muon which reports back to mocha.

The muon unit test runner now implements a simple test runner that does enough of what mocha does to allow tests to be mostly unmodified. Unfortunately they have to be split out into separate modules and exported as a nested object, where the keys are the test names and objects are the test functions. These are then either pulled into the muon test runner and use a wrapper around assert to do the HTML printing and reporting back for the command line mocha instance, or when run natively in Node there's a test/unit/runMuonCompatibleTests.js that takes the nested object full of tests and just writes it out as a mocha test. It's slightly clunky but it means that tests can be run in both environments rather than having to assume that tests run in node reflect what happens when they're run in muon, which they clearly aren't right now.

If you take out 370f8f6 and run the muon unit tests you'll see a couple of the siteSuggestion tests fail on tokenizing, which is the original bug. They don't fail with 370f8f6 in.

Currently there are 3 tests failing in this current config, I haven't fixed them because they're not what I was trying to address and I think they may just be bad assumptions within the tests themselves so I figured I'd leave it to you to decide what to do with them. They point to the same problem we're addressing with this PR that the Node environment is artificial when there's places like urlParse.js that have a conditional switch between muon-native and node-native.

  1. 'muon tests urlutilTest isFileType relative file' in test/unit/lib/urlutilTestComponents.js: test.equal(urlUtil().isFileType('/file/abc/test.pdf', 'pdf'), true) is false because it needs to be file:/// in muon.
  2. 'muon tests urlutilTest getOrigin gets URL origin for url with port' in test/unit/lib/urlutilTestComponents.js: test.strictEqual(urlUtil().getOrigin('https://bing.com:443/?test=1#abc'), 'https://bing.com:443') fails because in muon it'll strip out the :443 for https (the next test confirms that it won't strip :443 when you have http).
  3. 'muon tests urlutilTest getOrigin gets URL origin for slashless protocol URL' in test/unit/lib/urlutilTestComponents.js: test.strictEqual(urlUtil().getOrigin('about:test/foo'), 'about:test') fails, I don't know whether this is a critical thing that should be passing or not.

I know this may not fit what you're trying to do with the muon unit test runner, and I won't be upset if you decide not to take these commits, over to you. I enjoyed the spelunking anyway. I really just want the url suggest bug fixed, I'm running my own fork right now cause it annoys me so much to have it broken.

As an aside:

  • You're using a very old version of sinon, it doesn't webpack as it is cause it has the old crappy AMD detection stuff in it that webpack borks at. I tried updating to the latest but too much has changed and I decided I didn't want to fix every use of it. Instead I went for swapping out sinon for lolex for the fakeTimer stuff, sinon uses lolex to do it so it's no real change and I put the same version in package.json as is being used by the version of sinon you're using so there's no duplication. If you want to go any further will pulling tests into the frontend then this is going to have to be addressed.
  • 'sessionStore unit tests loadAppState when checking data.lastAppVersion when it does NOT match app.getVersion calls cleanAppData' is failing for me in test/unit/app/sessionStoreTest.js now that I've rebased to current master. I don't think this has anything to do with my changes so I'm ignoring it for now.

@diracdeltas diracdeltas added this to the 0.24.x (Nightly Channel) milestone Apr 23, 2018
Copy link
Member

@diracdeltas diracdeltas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diracdeltas
Copy link
Member

@rvagg Thanks for the significant test improvements! I am really glad this points out some more issues with Node/Muon parsing compatibility.

WRT asides,

You're using a very old version of sinon, it doesn't webpack as it is cause it has the old crappy AMD detection stuff in it that webpack borks at. I tried updating to the latest but too much has changed and I decided I didn't want to fix every use of it.

cc @bbondy - can we address this in the new brave-browser repo?

'sessionStore unit tests loadAppState when checking data.lastAppVersion when it does NOT match app.getVersion calls cleanAppData'

It also fails for me locally, but passes on Travis somehow. I have opened #13907

@diracdeltas diracdeltas merged commit 2e63d0d into brave:master Apr 23, 2018
@diracdeltas
Copy link
Member

master / 0.24.x: 2e63d0d

@bsclifton
Copy link
Member

Working on uplifting this to 0.23.x 😄 Code is not there yet... hang tight

@diracdeltas
Copy link
Member

@bsclifton do you still need help uplifting? i can do it if so

@bsclifton
Copy link
Member

@diracdeltas yes- that would be awesome 😄

diracdeltas added a commit that referenced this pull request Aug 3, 2018
Don't break up as URL for autocomplete if location bar text isn't a URL
@diracdeltas
Copy link
Member

0.23.x: 7928737

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URL-like strings can cause empty autocomplete results
4 participants