html5lib-tests
Squashed commit of the following:

commit ce0f083
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Wed Jan 17 16:19:08 2024 +0100

    Add skip for known bug - all tests passing or skipped

commit dcdc710
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Wed Jan 17 16:15:05 2024 +0100

    Rename class and test function

commit 6a087d2
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Wed Jan 17 16:11:42 2024 +0100

    Fixing more lints

commit 60d1738
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 22:44:52 2024 +0100

    Clean up and refactor test document parsing

commit 2871f31
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 22:12:18 2024 +0100

    Add attributes to html5lib tests

commit 429c506
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 15:08:26 2024 +0100

    Fix lint

commit 36c94f7
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 14:37:06 2024 +0100

    Skip head tests

commit 994b9d0
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 14:31:22 2024 +0100

    Fix some comments

commit e1fbeb4
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 14:13:11 2024 +0100

    Fix strlen paren bug

commit 84a61b1
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 14:09:38 2024 +0100

    Fix lints

commit 315d5cd
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 14:09:31 2024 +0100

    Mark unsupported markup tests as incomplete, not skipped

commit 6035fd7
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 14:06:10 2024 +0100

    Skip incomplete token tests

commit 3772db6
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Jan 16 14:01:29 2024 +0100

    Update ignores

commit f1b23b6
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Jan 15 21:35:50 2024 +0100

    Fix HTML input processing

commit 1e01888
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Jan 15 19:05:22 2024 +0100

    Use padded line number

    Allows filtering like line0001, rather than line1 also matching line10, line11…

commit 48eb7b9
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Jan 15 18:31:27 2024 +0100

    Use line numbers for test IDs

    Line numbers are stable even if we skip tests

commit 91c6f73
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Fri Dec 22 17:38:19 2023 +0100

    Avoid running tests that expect anything in <head>

commit e570196
Author: Dennis Snell <dennis.snell@automattic.com>
Date:   Wed Dec 20 10:49:50 2023 -0600

    Add extra skipped tests

commit d642bfb
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Wed Dec 20 13:22:07 2023 +0100

    Fix expect/actual ordering, add test message

commit 12dcc69
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 20:20:26 2023 +0100

    Move test data to test data dir

commit 138d21f
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 18:32:55 2023 +0100

    Add ignores for formatting elements

commit 47d2f36
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 18:25:56 2023 +0100

    Fix lint

commit 4c6a28b
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 18:16:09 2023 +0100

    Add files crediting html5lib-tests project

commit f5afccb
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 18:06:37 2023 +0100

    Add skipping of certain tests

commit 123ae09
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 15:30:03 2023 +0100

    Remove space from test identifier, easier copy/paste filtering

commit d995e44
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 15:20:47 2023 +0100

    Better tag finding

commit db5e68f
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 14:07:30 2023 +0100

    Print nicer tests names

commit 257b871
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 13:59:21 2023 +0100

    Skip doctype and comments in test dom tree

commit b31c921
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Tue Dec 19 13:58:57 2023 +0100

    1-index test case numbering

commit 10dc753
Author: Dennis Snell <dennis.snell@automattic.com>
Date:   Mon Dec 18 16:20:41 2023 -0600

    WPCS Nags

commit 087e7bb
Author: Dennis Snell <dennis.snell@automattic.com>
Date:   Mon Dec 18 15:57:53 2023 -0600

    Add line number to test case label

commit 4f5ca93
Author: Dennis Snell <dennis.snell@automattic.com>
Date:   Mon Dec 18 15:23:47 2023 -0600

    Avoid WPCS lint nags; skip tests for unsupported input or fragment context.

commit 9faf91b
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Dec 18 21:38:12 2023 +0100

    Skip unhandled tests

commit 077e082
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Dec 18 21:22:51 2023 +0100

    fix lints

commit bf618ac
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Dec 18 21:18:04 2023 +0100

    Move html5lib tests to new class

commit 1f89fbd
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Dec 18 21:13:49 2023 +0100

    Remove git files from html5lib

commit ed2f784
Author: Jon Surrell <sirreal@users.noreply.github.com>
Date:   Mon Dec 18 20:26:12 2023 +0100

    Add test cases from html5lib-tests tree-construction
sirreal committed Jan 19, 2024
1 parent d17afcc commit 083f678
Showing 61 changed files with 24,562 additions and 0 deletions.
34 changes: 34 additions & 0 deletions tests/phpunit/data/html5lib-tests/AUTHORS.rst
@@ -0,0 +1,34 @@
Credits
=======

The ``html5lib`` test data is maintained by:

- James Graham
- Geoffrey Sneddon


Contributors
------------

- Adam Barth
- Andi Sidwell
- Anne van Kesteren
- David Flanagan
- Edward Z. Yang
- Geoffrey Sneddon
- Henri Sivonen
- Ian Hickson
- Jacques Distler
- James Graham
- Lachlan Hunt
- lantis63
- Mark Pilgrim
- Mats Palmgren
- Ms2ger
- Nolan Waite
- Philip Taylor
- Rafael Weinstein
- Ryan King
- Sam Ruby
- Simon Pieters
- Thomas Broyer
21 changes: 21 additions & 0 deletions tests/phpunit/data/html5lib-tests/LICENSE
@@ -0,0 +1,21 @@
Copyright (c) 2006-2013 James Graham, Geoffrey Sneddon, and
other contributors

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
8 changes: 8 additions & 0 deletions tests/phpunit/data/html5lib-tests/README.md
@@ -0,0 +1,8 @@
html5lib-tests
==============

This test data was taken from:

https://github.com/html5lib/html5lib-tests

The sha was `a9f44960a9fedf265093d22b2aa3c7ca123727b9`.
108 changes: 108 additions & 0 deletions tests/phpunit/data/html5lib-tests/tree-construction/README.md
@@ -0,0 +1,108 @@
Tree Construction Tests
=======================

Each file containing tree construction tests consists of any number of
tests separated by two newlines (LF) and a single newline before the end
of the file. For instance:

[TEST]LF
LF
[TEST]LF
LF
[TEST]LF

Where [TEST] is the following format:

Each test must begin with a string "\#data" followed by a newline (LF).
All subsequent lines until a line that says "\#errors" are the test data
and must be passed to the system being tested unchanged, except with the
final newline (on the last line) removed.

Then there must be a line that says "\#errors". It must be followed by
one line per parse error that a conformant checker would return. It
doesn't matter what those lines are, although they can't be
"\#new-errors", "\#document-fragment", "\#document", "\#script-off",
"\#script-on", or empty; the only thing that matters is that there be
the right number of parse errors.

Then there \*may\* be a line that says "\#new-errors", which works like
the "\#errors" section adding more errors to the expected number of
errors.
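
As a rough illustration of the "\#data", "\#errors" and "\#new-errors" rules above, the following Python sketch extracts the test input and the expected number of parse errors from the text of a single test. The function name `split_data_and_errors` and the constant `STOP_HEADERS` are hypothetical and not part of html5lib-tests or of this commit.

```python
# Headers that end the error listing. "#new-errors" is handled separately
# because the lines that follow it also count toward the expected total.
STOP_HEADERS = {"#document-fragment", "#script-off", "#script-on", "#document"}


def split_data_and_errors(test_text):
    """Return (data, expected_error_count) for the text of one test.

    Hypothetical helper: assumes test_text holds exactly one test,
    with lines separated by LF.
    """
    lines = test_text.split("\n")
    if lines[0] != "#data":
        raise ValueError('a test must begin with "#data"')

    # Everything between "#data" and the first "#errors" line is the input.
    # Joining with "\n" leaves no trailing newline, as the format requires.
    errors_at = lines.index("#errors", 1)
    data = "\n".join(lines[1:errors_at])

    # Count one parse error per line; the text of each line is irrelevant.
    error_count = 0
    for line in lines[errors_at + 1:]:
        if line in STOP_HEADERS or line == "":
            break
        if line != "#new-errors":
            error_count += 1
    return data, error_count


example = "\n".join([
    "#data",
    "<p>One<p>Two",
    "#errors",
    "3: Missing document type declaration",
    "#document",
    "| <html>",
])
print(split_data_and_errors(example))
```

Scanning line by line, rather than splitting the file on blank lines first, matters because the test data itself may contain empty lines.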

Then there \*may\* be a line that says "\#document-fragment", which must
be followed by a newline (LF), followed by a string of characters that
indicates the context element, followed by a newline (LF). If the string
of characters starts with "svg ", the context element is in the SVG
namespace and the substring after "svg " is the local name. If the
string of characters starts with "math ", the context element is in the
MathML namespace and the substring after "math " is the local name.
Otherwise, the context element is in the HTML namespace and the string
is the local name. If this line is present the "\#data" must be parsed
using the HTML fragment parsing algorithm with the context element as
context.
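
The context-element rule can be sketched in a few lines of Python. The function name `parse_fragment_context` and the namespace labels `"svg"`, `"mathml"` and `"html"` are illustrative assumptions, not names defined by the test format.

```python
def parse_fragment_context(context):
    """Map a "#document-fragment" context string to (namespace, local_name).

    Hypothetical helper; the namespace labels are arbitrary tags chosen
    for this sketch.
    """
    for prefix, namespace in (("svg ", "svg"), ("math ", "mathml")):
        if context.startswith(prefix):
            # The local name is whatever follows the designator and its space.
            return namespace, context[len(prefix):]
    # No designator: the context element is in the HTML namespace.
    return "html", context


print(parse_fragment_context("svg path"))
print(parse_fragment_context("td"))
```

Note that the designator includes the trailing space, so a bare "svg" is treated as an HTML-namespace element named "svg".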

Then there \*may\* be a line that says "\#script-off" or
"\#script-on". If a line that says "\#script-off" is present, the
parser must set the scripting flag to disabled. If a line that says
"\#script-on" is present, it must set it to enabled. Otherwise, the
test should be run in both modes.
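
One possible way to decide which scripting modes a test should run in is sketched below in Python. The helper `scripting_modes` is hypothetical; it only inspects the lines between "\#errors" and "\#document" so that matching text inside the test data or the tree dump is not mistaken for a header.

```python
def scripting_modes(test_text):
    """Return the scripting-flag values a test should be run with.

    Hypothetical helper: True means scripting enabled, False disabled.
    """
    lines = test_text.split("\n")
    start = lines.index("#errors", 1)
    end = lines.index("#document", start)
    headers = lines[start:end]

    if "#script-off" in headers:
        return [False]
    if "#script-on" in headers:
        return [True]
    # Neither header present: the test should be run in both modes.
    return [False, True]


with_flag = "#data\n<noscript>x\n#errors\n#script-on\n#document\n| <html>"
without_flag = "#data\n<p>x\n#errors\n#document\n| <html>"
print(scripting_modes(with_flag))
print(scripting_modes(without_flag))
```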

Then there must be a line that says "\#document", which must be followed
by a dump of the tree of the parsed DOM. Each node must be represented
by a single line. Each line must start with "| ", followed by two spaces
per parent node that the node has before the root document node.

- Element nodes must be represented by a "`<`" then the *tag name
string* "`>`", and all the attributes must be given, sorted
lexicographically by UTF-16 code unit according to their *attribute
name string*, on subsequent lines, as if they were children of the
element node.
- Attribute nodes must have the *attribute name string*, then an "="
sign, then the attribute value in double quotes (").
- Text nodes must be the string, in double quotes. Newlines aren't
escaped.
- Comments must be "`<`" then "`!-- `" then the data then "` -->`".
- DOCTYPEs must be "`<!DOCTYPE `" then the name then if either of the
system id or public id is non-empty a space, public id in
double-quotes, another space and the system id in double-quotes, and
then in any case "`>`".
- Processing instructions must be "`<?`", then the target, then a
space, then the data and then "`>`". (The HTML parser cannot emit
processing instructions, but scripts can, and the WebVTT to DOM
rules can emit them.)
- Template contents are represented by the string "content" with the
children below it.
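
To make the serialization rules concrete, here is a minimal Python sketch that dumps a small tree in this format. The `Element`, `Text` and `Comment` classes and the `dump` function are hypothetical stand-ins for a real DOM; DOCTYPEs, processing instructions and template contents are left out for brevity. Attribute names are sorted by their UTF-16 big-endian encoding, whose byte order matches UTF-16 code unit order.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str                      # tag name string, e.g. "p" or "svg path"
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class Text:
    data: str


@dataclass
class Comment:
    data: str


def dump(nodes, depth=0):
    """Serialize nodes into "#document" lines (illustrative sketch only)."""
    pad = "| " + "  " * depth
    child_pad = "| " + "  " * (depth + 1)
    lines = []
    for node in nodes:
        if isinstance(node, Element):
            lines.append(f"{pad}<{node.name}>")
            # Attributes are listed as if they were children of the element,
            # sorted by UTF-16 code unit.
            for name in sorted(node.attrs, key=lambda n: n.encode("utf-16-be")):
                lines.append(f'{child_pad}{name}="{node.attrs[name]}"')
            lines.extend(dump(node.children, depth + 1))
        elif isinstance(node, Text):
            lines.append(f'{pad}"{node.data}"')
        elif isinstance(node, Comment):
            lines.append(f"{pad}<!-- {node.data} -->")
    return lines


tree = [Element("html", children=[
    Element("head"),
    Element("body", children=[
        Element("p", {"id": "x", "class": "y"}, [Text("One")]),
    ]),
])]
print("\n".join(dump(tree)))
```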

The *tag name string* is the local name prefixed by a namespace
designator. For the HTML namespace, the namespace designator is the
empty string, i.e. there's no prefix. For the SVG namespace, the
namespace designator is "svg ". For the MathML namespace, the namespace
designator is "math ".

The *attribute name string* is the local name prefixed by a namespace
designator. For no namespace, the namespace designator is the empty
string, i.e. there's no prefix. For the XLink namespace, the namespace
designator is "xlink ". For the XML namespace, the namespace designator
is "xml ". For the XMLNS namespace, the namespace designator is "xmlns
". Note the difference between "xlink:href" which is an attribute in no
namespace with the local name "xlink:href" and "xlink href" which is an
attribute in the xlink namespace with the local name "href".
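
The two naming rules reduce to a lookup of the namespace designator followed by the local name. The Python sketch below assumes hypothetical namespace keys (`"html"`, `"svg"`, `"mathml"`, `None`, `"xlink"`, `"xml"`, `"xmlns"`); only the designator strings themselves come from the text above.

```python
# Namespace designators as described above; the dictionary keys are
# arbitrary labels chosen for this sketch.
TAG_DESIGNATORS = {"html": "", "svg": "svg ", "mathml": "math "}
ATTR_DESIGNATORS = {None: "", "xlink": "xlink ", "xml": "xml ", "xmlns": "xmlns "}


def tag_name_string(namespace, local_name):
    """Build the tag name string for an element (hypothetical helper)."""
    return TAG_DESIGNATORS[namespace] + local_name


def attribute_name_string(namespace, local_name):
    """Build the attribute name string for an attribute (hypothetical helper)."""
    return ATTR_DESIGNATORS[namespace] + local_name


# An attribute in no namespace whose local name happens to contain a colon:
print(attribute_name_string(None, "xlink:href"))
# An attribute in the XLink namespace with the local name "href":
print(attribute_name_string("xlink", "href"))
```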

If there is also a "\#document-fragment" the bit following "\#document"
must be a representation of the HTML fragment serialization for the
context element given by "\#document-fragment".

For example:

#data
<p>One<p>Two
#errors
3: Missing document type declaration
#document
| <html>
|   <head>
|   <body>
|     <p>
|       "One"
|     <p>
|       "Two"
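
Going the other way, a test harness reading a "\#document" dump needs to recover each node's depth from its indentation. A minimal Python sketch, assuming a hypothetical helper named `node_depth` and one node per line:

```python
def node_depth(line):
    """Return (depth, content) for one line of a "#document" dump.

    Hypothetical helper: depth is the number of ancestors below the
    document node, encoded as two spaces each after the "| " prefix.
    """
    if not line.startswith("| "):
        raise ValueError('dump lines must start with "| "')
    body = line[2:]
    indent = len(body) - len(body.lstrip(" "))
    return indent // 2, body[indent:]


dump_lines = [
    "| <html>",
    "| " + "  " * 1 + "<body>",
    "| " + "  " * 2 + "<p>",
    "| " + "  " * 3 + '"One"',
]
for dump_line in dump_lines:
    print(node_depth(dump_line))
```

Text nodes are quoted, so leading spaces inside a text node do not affect the computed depth. A text node that contains a newline spans several lines, which this sketch does not handle.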