-
Notifications
You must be signed in to change notification settings - Fork 13
HTMLFileContent and component for extracting IDL fragments #46
Conversation
}); | ||
} else { | ||
// Currently not doing anything for: | ||
// - Self closing tags (item.type.name === OPEN_CLOSE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OPEN_CLOSE that aren't pre
|
||
if (foam.core.FObject.isInstance(item)) { | ||
var top = tags[tags.length - 1]; | ||
if (top === undefined || item.type.name === OPEN) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need the top defined check. Only open tags should be pushed, so if you somehow ended up with an empty stack and either an extra close tag (or plain text, if parseString returns text), not pushing is probably the right thing to do.
}); | ||
|
||
it('should parse a pre tag with no content', function() { | ||
var content = '<pre class="idl"></pre>'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing the parse step here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, this was actually one of the old test that I am no longer using. Will likely modify this test or remove in next commit
// Determine if node is of class IDL | ||
var isIDL = false; | ||
item.attributes.forEach(function(attr) { | ||
if (attr.name === 'class' && attr.value.split(' ').includes('idl')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't there something about having to scan back up the stack for a parent tag with the right class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this for a little and decided to implement it slightly differently.
Scanning the stack every time we have a potential tag we are interested in seemed like it could be very costly (e.g. when there is deep nesting of tags occurring). So I am using a variable (skipStack
) to keep track of the level in the stack where these excluded tags are included. It itself behaves as a stack as we may potentially have nesting of excluded tags.
Please let me know if you see any flaws / problems with this approach.
var expectedContent = fs.readFileSync(`${testDirectory}/${filename}`).toString(); | ||
expect(preBlocks[testNum].content.trim()).toBe(expectedContent.trim()); | ||
} else if (filename !== 'spec.html') { | ||
console.warn(`${filename} was not used in ${testName} spec test`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably fail the test.
@@ -0,0 +1,552 @@ | |||
typedef ([AllowShared] Uint32Array or sequence<GLuint>) Uint32List; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is currently causing problems in parsing. The W3C Grammar does seem to have a rule for [ExtendedAttributes]
proceeding a type in a typedef
definition. It is present in HeyCam's Grammar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the approach taken so far has been to extend the IDL parser for each source of IDL content if something is encountered that doesn't follow the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment in Parser.js
seems to imply that it was designed around HeyCam's Grammar. Perhaps there were changes to the spec / grammar between the time the Parser was written and the current specs. I am currently working on adding this change. Hoping to have a PR before the end of the day today (hopefully).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved by #47
Refactor is complete. The test will not pass until foam-framework/foam2#415 and foam-framework/foam2#423 are merged into the |
Codecov Report
@@ Coverage Diff @@
## master #46 +/- ##
==========================================
+ Coverage 94.32% 94.92% +0.59%
==========================================
Files 81 84 +3
Lines 564 630 +66
==========================================
+ Hits 532 598 +66
Misses 32 32
Continue to review full report at Codecov.
|
|
||
if (!tagMatching) { | ||
// Ignoring all tags. Only extracting text within pre tags. | ||
if (isTag && item.nodeName === 'pre') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to handle nested pre tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hard to say at this point whether nested pre
tags affect the information we care about. From my current observations, there hasn't been any IDL fragments within nest pre
tags (they seem to be mostly used for formatting). Thus, it seems like it currently is sufficient for our purposes (at least I hope so)
We could attempt to put the content through another round of processing or implement a proper HTML parser (which was my first attempt at this problem, but was scrapped since it did a lot more than it needed to and likely had other issues too).
tagStack.push(item); | ||
} else if (top && item.type.name === CLOSE && top.nodeName === item.nodeName) { | ||
var parentCls = extractAttr(top, 'class'); | ||
if (isExcluded(parentCls)) exclude = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for there to be an excluded tag inside an excluded tag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not yet observed an instance of an exclude tag nested within another, but there is a relatively simple fix to this issue (using a stack to track excluded tags instead of a bool), so that has been implemented. Will be in next set of changes.
All requested changes should be made by now. @arobins please let me know if you have any feedback the latest set of changes and comments. |
LGTM. @mdittmer should probably look over it as well, because I'm definitely not a FOAM expert. |
class: 'String', | ||
name: 'url', | ||
required: true, | ||
final: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add trailing comma
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm comfortable with this design once comments are addressed. Future alternative also proposed at #53
class: 'String', | ||
name: 'url', | ||
required: true, | ||
final: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and elsewhere: May not be able to get away with "final: true" anymore; DatastoreDAO
(which we will use eventually) instantiates objects first, then sets properties. I think "final: true" will create setters that silently fail.
}, | ||
{ | ||
class: 'String', | ||
name: 'content', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTMLFileContents.contents
would be a more consistent name.
package: 'org.chromium.webidl', | ||
name: 'HTMLFileContents', | ||
|
||
documentation: 'An HTML file that stores it contents.', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More docs: Is the HTMLFileContents.contents
pre-processed in any way? (E.g., &foo;
-escaped?) or is it the raw request body?
foam.CLASS({ | ||
package: 'org.chromium.webidl', | ||
name: 'IDLFragmentExtractor', | ||
documentation: 'extracts IDL Fragments from HTML files', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Full sentence.
(with capital and period)
var lexer = self.HTMLLexer.create(); | ||
var OPEN = lexer.TagType.OPEN.name; | ||
var CLOSE = lexer.TagType.CLOSE.name; | ||
var extractAttr = function(node, attrName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's legitimate to have:
<node-name attr="value1
value2" attr="value3">
to yield {attr: ['value1, 'value2', 'value3']}
I assume HTMLLexer doesn't collapse whitespace, so I think we need to revise this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made minor changes to the extract code which allows for this. It will be part of the next set of changes.
// As of this writing, there has not been any IDL fragments | ||
// that has been found within nested pre tags. | ||
if (!tagMatching) { | ||
// Ignoring all tags. Only extracting text within pre tags. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this comment. Can we get a comment at the top of each if
branch in this method? The logic is pretty complex.
} | ||
tagStack.push(item); | ||
} else if (top && item.type.name === CLOSE && top.nodeName === item.nodeName) { | ||
var parentCls = extractAttr(top, 'class'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't parent
, right? It's openTag
? Maybe the code would be easier to read if this branch started with:
var openTag = top;
var closeTag = item
and open
and close
prefixes are used to refer to tag-related things.
test/any/htmlFileClasses-test.js
Outdated
expect(file.content).toBe(content); | ||
}); | ||
|
||
it('should fail to set HTMLFileContent props after creation', function() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably nix this due to how DatastoreDAO
works.
test/any/htmlFileClasses-test.js
Outdated
@@ -0,0 +1,51 @@ | |||
// Copyright 2017 The Chromium Authors. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file name: we just test HTMLFileContents
, yes? Let's name this file after that: HTMLFileContents-test.js
.
var IDLFragmentExtractor; | ||
var Parser; | ||
|
||
function cmpTest(testName, testDirectory, expectedIDL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expectedIDL
is a count/length, right? numExpectedIDLFragments
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor things, and I think we can drop the test.
class: 'Array', | ||
of: 'String', | ||
name: 'references', | ||
factory: function() { return []; }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the default factory for Array
; you can leave it out.
}, | ||
{ | ||
class: 'Array', | ||
of: 'String', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits: usually in order: class, of, documentation, name. Please uncomment documentation.
} | ||
}); | ||
return retVal; | ||
}; | ||
|
||
var results = lexer.parseString(self.file.content).value; | ||
var results = lexer.parseString(this.file.contents).value; | ||
if (!results) throw "IDL Parse was not successful."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: throw new Error(<msg>)
.
test/any/HTMLFileContents-test.js
Outdated
@@ -0,0 +1,28 @@ | |||
// Copyright 2017 The Chromium Authors. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test doesn't seem worthwhile. It amounts to checking that FOAM's create()
implementation is correct.
If you intend to guard against mistakenly final
ing something that may be set, we could switch the test to do foo.create(); foo.bar = 'bar'; expect(foo.bar).toBe('bar')
, but if that's not what you meant to test, then I'd say we can drop this test entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please merge after addressing last nits.
This work is towards #41. This PR is dependent on foam-framework/foam2#415 and foam-framework/foam2#423