Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new_audit(dobetterweb): add new audit for doctype #5274

Merged
merged 4 commits into from
Jul 9, 2018
Merged

new_audit(dobetterweb): add new audit for doctype #5274

merged 4 commits into from
Jul 9, 2018

Conversation

schalkneethling
Copy link

[WIP] Adds audit for Doctype

Closes #5213

afterPass(passContext) {
const driver = passContext.driver;

return driver.getDocument().then(node => node);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am hoping this will return document as it does not look like I can call driver.document as what I am really after is document and then from that document.name

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than using the protocol's DOM methods, let's just grab the doctype via javascript.

define a function declaration in the gatherer.. something like

function getDoctype() {
  const {name, publicId, systemId} = document.doctype;
  return {name, publicId, systemId};
}

( i think we should grab the name, publicId, and systemId )

then, on L18 you can do

driver.evaluateAsync(`(${getDoctype.toString()}())`)

see the cache-contents gatherer for a model.

Copy link
Member

@paulirish paulirish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice start!

You'll want to add this audit to default-config.js so you can actually try it out. ;)

};
}

return {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

over here we'll assert that .name === 'html'

link up https://html.spec.whatwg.org/multipage/parsing.html#the-initial-insertion-mode in a comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and i guess that publicId and systemId are blank (if i'm reading the spec right)

afterPass(passContext) {
const driver = passContext.driver;

return driver.getDocument().then(node => node);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than using the protocol's DOM methods, let's just grab the doctype via javascript.

define a function declaration in the gatherer.. something like

function getDoctype() {
  const {name, publicId, systemId} = document.doctype;
  return {name, publicId, systemId};
}

( i think we should grab the name, publicId, and systemId )

then, on L18 you can do

driver.evaluateAsync(`(${getDoctype.toString()}())`)

see the cache-contents gatherer for a model.

return {
name: 'doctype',
description: 'Document uses HTML5 doctype',
failureDescription: 'Document does not use HTML5 doctype or no doctype found',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Page is missing the HTML doctype

static get meta() {
return {
name: 'doctype',
description: 'Document uses HTML5 doctype',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Page has the HTML doctype

@schalkneethling
Copy link
Author

Great stuff, thanks for the review and feedback @paulirish - I will work on this next round and update the PR.

@paulirish
Copy link
Member

Cool! btw here's how to add to the config (and temporarily only run this one audit, skipping the rest)

diff --git a/lighthouse-core/config/constants.js b/lighthouse-core/config/constants.js
index a8995d63..e9808423 100644
--- a/lighthouse-core/config/constants.js
+++ b/lighthouse-core/config/constants.js
@@ -42,7 +42,9 @@ const defaultSettings = {
   blockedUrlPatterns: null,
   additionalTraceCategories: null,
   extraHeaders: null,
-  onlyAudits: null,
+  onlyAudits: [
+    'doctype',
+  ],
   onlyCategories: null,
   skipAudits: null,
 };
diff --git a/lighthouse-core/config/default-config.js b/lighthouse-core/config/default-config.js
index f06ab1f5..cbb0b6b1 100644
--- a/lighthouse-core/config/default-config.js
+++ b/lighthouse-core/config/default-config.js
@@ -32,6 +32,7 @@ module.exports = {
       'dobetterweb/all-event-listeners',
       'dobetterweb/anchors-with-no-rel-noopener',
       'dobetterweb/appcache',
+      'dobetterweb/doctype',
       'dobetterweb/domstats',
       'dobetterweb/js-libraries',
       'dobetterweb/optimized-images',
@@ -162,6 +163,7 @@ module.exports = {
     'byte-efficiency/uses-responsive-images',
     'byte-efficiency/efficient-animated-content',
     'dobetterweb/appcache-manifest',
+    'dobetterweb/doctype',
     'dobetterweb/dom-size',
     'dobetterweb/external-anchors-use-rel-noopener',
     'dobetterweb/geolocation-on-start',
@@ -371,6 +373,7 @@ module.exports = {
         {id: 'no-document-write', weight: 1},
         {id: 'external-anchors-use-rel-noopener', weight: 1},
         {id: 'geolocation-on-start', weight: 1},
+        {id: 'doctype', weight: 1},
         {id: 'no-vulnerable-libraries', weight: 1},
         {id: 'notification-on-start', weight: 1},
         {id: 'deprecations', weight: 1},

and to then iterate quickly you can do

# just run gatherers
lighthouse http://example.com -G

# just run audits from previously gathered artifacts
lighthouse http://example.com -A

# regular complete run but save artifacts to disk
lighthouse http://example.com -GA

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the type checking bug, you'll need to add your new Doctype artifact to the artifacts type definition so that other files know that it's provided: artifacts.d.ts

@schalkneethling
Copy link
Author

Thanks @brendankenny

@schalkneethling
Copy link
Author

Sorry folks, have been down with a side of the swine flu. Update coming soon.

@schalkneethling
Copy link
Author

Alrighty, finally got around to this again. Thanks so much for all of the feedback and guidance. It seems to me to be working so yay! \o/

screen shot 2018-06-17 at 12 59 59

With that said, running yarn test throws up a bunch of errors which I am finding a hard time correlating to the work I have done here. Let me know once you have had a chance to look over the changes. Thanks again.

Copy link
Contributor

@evenstensberg evenstensberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments / opinions. Nice work!

* @return {LH.Audit.Product}
*/
static audit(artifacts) {
if (artifacts.Doctype === null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nitpick here, you can do if(!artifacts.Doctype), and maybe the naming should be lower case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and maybe the naming should be lower case?

It's uppercase for historical reasons (it automatically uses the the gatherer class name), and we have wanted to change that at some point, but not starting with this PR :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ev1stensberg I totally could but, I tend to want to be explicit. And as the compiler/minifier will optimise the code in the end, it seems ok. Thoughts?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to add your suggestion in the end ;)

/* Note that the value for name is case sensitive,
and must be the string `html`. For details see:
https://html.spec.whatwg.org/multipage/parsing.html#the-initial-insertion-mode */
if (artifacts.Doctype.name.trim() === 'html') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleaner to store this in a variable?

};
}

/* Note that the value for name is case sensitive,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just mention it's case sensitive and a link like: case sensitive: https://foo.bar

} else {
return {
rawValue: false,
explanation: 'Doctype name must be the lowercase string html',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discard the?

afterPass(passContext) {
const driver = passContext.driver;
/** @type {Object<string>} */
return driver.evaluateAsync(`(${getDoctype.toString()}())`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe extract this to a var? A lot of encapsulation

describe('DOBETTERWEB: doctype audit', () => {
it('fails when HTML does not contain a doctype', () => {
const auditResult = Audit.audit({
Doctype: null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See lowercase suggestion

package.json Outdated
@@ -23,9 +23,7 @@
"build-viewer": "cd ./lighthouse-viewer && yarn build",
"clean": "rimraf *.report.html *.report.dom.html *.report.json *.screenshots.html *.screenshots.json *.devtoolslog.json *.trace.json || true",
"lint": "[ \"$CI\" = true ] && eslint --quiet -f codeframe . || eslint .",

"smoke": "node lighthouse-cli/test/smokehouse/run-smoke.js",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know if these whitespace trims are needed

Copy link
Member

@brendankenny brendankenny Jun 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, these are (annoyingly) trimmed automatically when running one of several different yarn/npm commands, but since no functional changes to the file can just remove the whole file from the PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was required because there was a conflict reported in package.json by Github.

@brendankenny
Copy link
Member

It's kind of buried in the end of the yarn diff:sample-json test output, but you need to run yarn update:sample-json to update the "golden lhr" expectations file since there will be a new audit output in there.

The Viewer test uses that file for expected output as well, so running that command will fix that test too.

@brendankenny brendankenny changed the title new_audit(dobetterweb): add new audit for doctype. new_audit(dobetterweb): add new audit for doctype Jun 17, 2018
Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for staying with this @schalkneethling!

There's so many interlocking pieces to update now, I think we need to update the contributing guide and hopefully get you to review it after this :)

/**
* Get and return `name`, `publicId`, `systemId` from
* `document.doctype`
* @return {Object<string>}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe for the typescript compiler, Object needs the key and value types, so it's Object<string, string>, though it's only complaining for one of these, so I'm not positive.

Regardless, it would probably be better to do the more specific @return {{name: string, publicId: string, systemId: string}} anyways.

*/
afterPass(passContext) {
const driver = passContext.driver;
/** @type {Object<string>} */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@type {{name: string, publicId: string, systemId: string}}

@@ -0,0 +1,11762 @@
{
Copy link
Member

@brendankenny brendankenny Jun 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove file from PR?

package.json Outdated
@@ -23,9 +23,7 @@
"build-viewer": "cd ./lighthouse-viewer && yarn build",
"clean": "rimraf *.report.html *.report.dom.html *.report.json *.screenshots.html *.screenshots.json *.devtoolslog.json *.trace.json || true",
"lint": "[ \"$CI\" = true ] && eslint --quiet -f codeframe . || eslint .",

"smoke": "node lighthouse-cli/test/smokehouse/run-smoke.js",
Copy link
Member

@brendankenny brendankenny Jun 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, these are (annoyingly) trimmed automatically when running one of several different yarn/npm commands, but since no functional changes to the file can just remove the whole file from the PR

explanation: 'Doctype name must be the lowercase string html',
};
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @paulirish also wanted to give audit feedback on the "publicId and systemId should be empty string cases" to get full coverage of the spec? Though is that only xhtml or how does that even happen :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulirish You want me to add checks for those additional properties in this audit? Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup!

this applies to HTML pages as well. just make assert those params are empty string.

Doctype: null,
});
assert.equal(auditResult.rawValue, false);
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also want to test the empty string, non-"html", and "html" success cases (as well as publicId and systemId if added?)

@@ -53,6 +53,8 @@ declare global {
CrawlableLinks: {href: string, text: string}[];
/** CSS coverage information for styles used by page's final state. */
CSSUsage: {rules: Crdp.CSS.RuleUsage[], stylesheets: Artifacts.CSSStyleSheetInfo[]};
/** Information on the document's Doctype, specifically the name, publicId, and systemId */
Copy link
Member

@brendankenny brendankenny Jun 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also "Properties will be the empty string if not present on the doctype." or something in the comment? I was expecting undefined or whatever but at least publicId and systemId are specced to be the empty string if not present.

Also probably something like /** Information on the document's doctype (or null if not present), specifically...

@@ -53,6 +53,8 @@ declare global {
CrawlableLinks: {href: string, text: string}[];
/** CSS coverage information for styles used by page's final state. */
CSSUsage: {rules: Crdp.CSS.RuleUsage[], stylesheets: Artifacts.CSSStyleSheetInfo[]};
/** Information on the document's Doctype, specifically the name, publicId, and systemId */
Doctype: Artifacts.Doctype;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doctype: Artifacts.Doctype | null;

* @return {Object<string>}
*/
function getDoctype() {
const {name, publicId, systemId} = document.doctype; // eslint-disable-line no-undef
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will throw if there's no doctype, so I guess have an early return as well?

if (!document.doctype) {
  return null;
}
...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much better, thx

@schalkneethling
Copy link
Author

Thank you for the review everyone. Updated.

@schalkneethling
Copy link
Author

Oh I see one failing test when running yarn test but seems unrelated:

screen shot 2018-06-17 at 19 54 09

@schalkneethling
Copy link
Author

@paulirish @brendankenny Ok, I added tests for systemId and publicId.

I am seeing a new TS error now:

lighthouse-core/gather/gatherers/dobetterweb/doctype.js(17,5): error TS2322: Type 'undefined' is not assignable to type '{ name: string; publicId: string; systemId: string; }'.

I am guessing it is happening because the function could return undefined here:
https://github.com/GoogleChrome/lighthouse/pull/5274/files#diff-5bd8ddd9e30281c7e7ec5103a145d002R17

@paulirish
Copy link
Member

 /**
  * Get and return `name`, `publicId`, `systemId` from
  * `document.doctype`
- * @return {{name: string, publicId: string, systemId: string}}
+ * @return {{name: string, publicId: string, systemId: string}=}
  */
 function getDoctype() {
   if (!document.doctype) { // eslint-disable-line no-undef

@schalkneethling
Copy link
Author

Thanks @paulirish updated.

@schalkneethling
Copy link
Author

Also resolved the conflict, let me know and I should rebase if required.

@schalkneethling
Copy link
Author

@paulirish @brendankenny Are there still some items outstanding in the PR? Thanks!

@brendankenny
Copy link
Member

Not sure why appveyor keeps failing the smoke tests for this PR but Travis does not. We'll have to investigate.

@brendankenny
Copy link
Member

brendankenny commented Jun 25, 2018

Ah, I thought I saw a different failure before, but the failure now is also on master. Could be a Canary (appveyor, local) vs Stable (travis) issue? (we're investigating)

*/
static get meta() {
return {
name: 'doctype',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id

static get meta() {
return {
name: 'doctype',
description: 'Page has the HTML doctype',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

title

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these naming changes are something you'll notice when you rebase to master ;)

return {
name: 'doctype',
description: 'Page has the HTML doctype',
failureDescription: 'Page is missing the HTML doctype',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

failureTitle

name: 'doctype',
description: 'Page has the HTML doctype',
failureDescription: 'Page is missing the HTML doctype',
helpText: 'Specifying a doctype prevents the browser from switching to quirks-mode.' +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

description

};
}

if (doctypeName.length === 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you can nuke this and let the L72 case handle it.

} else {
return {
rawValue: false,
explanation: 'Doctype name must be the lowercase string html',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd put html in backticks, though.

*/
function getDoctype() {
if (!document.doctype) { // eslint-disable-line no-undef
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's return null instead

/**
* Get and return `name`, `publicId`, `systemId` from
* `document.doctype`
* @return {{name: string, publicId: string, systemId: string}=}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{{name: string, publicId: string, systemId: string} | null}

*/
afterPass(passContext) {
const driver = passContext.driver;
/** @type {{name: string, publicId: string, systemId: string}} */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can delete this, you're covered by @return {Promise<LH.Artifacts['Doctype']>}

* @return {{name: string, publicId: string, systemId: string}=}
*/
function getDoctype() {
if (!document.doctype) { // eslint-disable-line no-undef
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw wanna add a comment saying that https://www.warnerbros.com/archive/spacejam/movie/jam.htm is one such site with a null doctype? :)

@schalkneethling
Copy link
Author

@paulirish @brendankenny updated. Apologies for how long this has taken.

Copy link
Contributor

@evenstensberg evenstensberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments 👍

@@ -0,0 +1,92 @@
/**
* @license Copyright 2017 Google Inc. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

@@ -0,0 +1,36 @@
/**
* @license Copyright 2017 Google Inc. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

@@ -0,0 +1,73 @@
/**
* @license Copyright 2017 Google Inc. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

@@ -3287,7 +3300,7 @@
}
],
"id": "best-practices",
"score": 0
"score": null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this null?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this file is generated by update:sample-json so, not something I specifically did in this PR. Perhaps I am wrong? @paulirish @brendankenny

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right @schalkneethling, but it's null because your audit is erroring when run on our fixtures
because you added a new artifact. Not ideal that we're adding an erroring audit to our golden, do you think you could manually add an expected Doctype artifact to https://github.com/GoogleChrome/lighthouse/blob/master/lighthouse-core/test/results/artifacts/artifacts.json?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback @patrickhulce. I am happy to add it but, I probably need some guidance. Thanks!

@paulirish
Copy link
Member

paulirish commented Jul 6, 2018 via email

@schalkneethling
Copy link
Author

@paulirish Updates. I believe I did it correctly. The value is no longer null

@schalkneethling
Copy link
Author

Fixed the failing tests. Let's see if we can get an all green build 😃

@paulirish paulirish merged commit 9e0341a into GoogleChrome:master Jul 9, 2018
@paulirish
Copy link
Member

Looks great!

Nice job @schalkneethling and thanks for working with us on this one. :)

We're happy to have your contribution and happy to have this audit.

@schalkneethling
Copy link
Author

Thanks for all the support on this one everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants