Anchor duplication - block-level anchor issue? #167

Closed
jacksonp opened this issue Feb 27, 2015 · 10 comments

@jacksonp

Input:

<!DOCTYPE html>
<title>Test</title>
<a id="Section" href="#Section"><h1>Section</h1></a>

Output of tidy5 -indent:

<!DOCTYPE html>

<html>
<head>
  <meta name="generator" content=
  "HTML Tidy for HTML5 for Linux version 4.9.17">

  <title>Test</title>
</head>

<body>
  <a id="Section" href="#Section">
  <h1><a id="Section" href="#Section">Section</a></h1></a>
</body>
</html>

Note the two anchor tags with id "Section" in the output.
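
One way to confirm the duplication without reading the output by eye is to count `id` values in it. The following is a minimal sketch using only Python's standard library; `IdCollector` and `duplicate_ids` are names made up for this illustration and are not part of Tidy.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the value of every id attribute seen on a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids.append(value)


def duplicate_ids(html):
    """Return the sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)


# Body of the 4.9.17 output quoted in the report above.
TIDY_4_9_17_BODY = """
<body>
  <a id="Section" href="#Section">
  <h1><a id="Section" href="#Section">Section</a></h1></a>
</body>
"""

print(duplicate_ids(TIDY_4_9_17_BODY))  # prints ['Section']
```

Run against the original input line, the same function returns an empty list, so it can serve as a quick regression check when re-testing a new build.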

geoffmcl pushed a commit that referenced this issue Feb 28, 2015
geoffmcl pushed a commit that referenced this issue Feb 28, 2015
@geoffmcl
Contributor

@jacksonp yes, somehow the anchor tag got block and mixed added to its content model. Putting it back to just inline gives the previous output...

@geoffmcl
Contributor

@jacksonp please pull version 4.9.18 if you get a chance, and re-test... thanks...

@geoffmcl geoffmcl added the Bug label Feb 28, 2015
@geoffmcl geoffmcl added this to the 5.0.0 milestone Feb 28, 2015
@geoffmcl geoffmcl self-assigned this Feb 28, 2015
@jacksonp
Author

@geoffmcl Thanks, that does stop the duplication issue, I now get this output:

<!DOCTYPE html>

<html>
<head>
  <meta name="generator" content=
  "HTML Tidy for HTML5 for Linux version 4.9.18">

  <title>Test</title>
</head>

<body>
  <h1><a id="Section" href="#Section">Section</a></h1>
</body>
</html>

But anchor tags can go around block-level elements in html5, so I guess I should have said I was expecting the output to be the same as the input, so:

<a id="Section" href="#Section"><h1>Section</h1></a>

@geoffmcl
Contributor

@jacksonp yes I reverted the behaviour because the duplication of the anchor looked very bad! Will now look for a way to allow the anchor to be around the block if html5... thanks for the quick testing and report...

geoffmcl pushed a commit that referenced this issue Mar 6, 2015
Revert TidyTag_A to HTML5 mode, but allow the table to be modified if the
DOCTYPE given is found to NOT be HTML5, through a service TY_(AdjustTags).
Care is taken to clear any previous hash cached tags.

At present this only affects the anchor tag, but could be applied to
others that need to change their parsing due to an identified DOCTYPE.
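
The idea in this commit message can be sketched roughly as follows. This is a toy model in Python, not Tidy's C code: the `CM` flags, the `TagTable` class, and the `adjust_tags` method are all hypothetical stand-ins, and only the behaviour described above (HTML5 default, narrowing to inline for other doctypes, clearing cached lookups) is taken from the thread.

```python
from enum import Flag, auto


class CM(Flag):
    """Hypothetical content-model flags, loosely modelled on the thread's wording."""
    INLINE = auto()
    BLOCK = auto()
    MIXED = auto()


class TagTable:
    """Toy tag table with a lookup cache (all names are assumptions)."""

    def __init__(self):
        # Default is HTML5 mode: the anchor may wrap block-level content.
        self.model = {"a": CM.INLINE | CM.BLOCK | CM.MIXED}
        self.cache = {}

    def lookup(self, name):
        # Cache the resolved entry, standing in for a hash cache of tags.
        if name not in self.cache:
            self.cache[name] = self.model[name]
        return self.cache[name]

    def adjust_tags(self, is_html5):
        """Narrow the anchor to inline-only when the doctype is not HTML5."""
        if is_html5:
            return
        self.model["a"] = CM.INLINE
        # Without this, an earlier lookup would keep returning the HTML5 entry.
        self.cache.clear()


table = TagTable()
print(CM.BLOCK in table.lookup("a"))   # prints True (HTML5 default)
table.adjust_tags(is_html5=False)
print(CM.BLOCK in table.lookup("a"))   # prints False (inline only)
```

The cache clear is the step the commit message singles out: once a tag has been looked up, changing the table alone would not change what later lookups return.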
@geoffmcl
Contributor

geoffmcl commented Mar 6, 2015

@jacksonp see issue #169 for hopefully a solution to this...

geoffmcl pushed a commit that referenced this issue Mar 6, 2015
@jacksonp
Author

jacksonp commented Mar 6, 2015

@geoffmcl With that fix I get this output:

<body>
  <a id="Section" href="#Section" name="Section">
  <h1>Section</h1></a>
</body>

It seems to now add a name attribute, which isn't supported in html5.
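
A small check for this regression could look like the sketch below. It only lists `<a>` start tags that carry a `name` attribute; `NamedAnchorFinder` and `anchors_with_name` are illustrative names and do not come from Tidy.

```python
from html.parser import HTMLParser


class NamedAnchorFinder(HTMLParser):
    """Record the name attribute of every <a> start tag that has one."""

    def __init__(self):
        super().__init__()
        self.named = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and "name" in attr_map:
            self.named.append(attr_map["name"])


def anchors_with_name(html):
    """Return the name values found on anchors, in document order."""
    finder = NamedAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.named


# Body of the output quoted just above.
OUTPUT_BODY = """
<body>
  <a id="Section" href="#Section" name="Section">
  <h1>Section</h1></a>
</body>
"""

print(anchors_with_name(OUTPUT_BODY))  # prints ['Section']
```

An empty list for HTML5 input would indicate that no `name` attribute was added.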

geoffmcl pushed a commit that referenced this issue Mar 6, 2015
geoffmcl pushed a commit that referenced this issue Mar 6, 2015
@geoffmcl
Contributor

geoffmcl commented Mar 6, 2015

@jacksonp some further fixes on doctype and version... hope for some good news...

@jacksonp
Author

jacksonp commented Mar 6, 2015

@geoffmcl That's great, the tags are now perfect. The indentation seems a bit off tho, here's the output I get with
tidy5 -indent -wrap 400 test.html:

<body>
  <a id="Section" href="#Section">
  <h1>Section</h1></a>
</body>

I think the anchor block should all be one line in this case (same as the input)?

@geoffmcl
Contributor

geoffmcl commented Mar 6, 2015

@jacksonp just glad the tags are right!

To me there is a LOT wrong with the pprint.c module, especially around when to, and when not to, add a new line... there is even a feature request to try and keep the current indents, lines, etc... very difficult...

But for now I am only concentrating on html tag bugs... Would appreciate you opening a new issue for the pprint output, which I think will be attacked after we have 5.0.0 out the door... probably a 5.1 target, unless you, or others, can dig into it and offer an easy PR or diffs...

Thanks for the quick testing and report...

@jacksonp
Author

jacksonp commented Mar 6, 2015

@geoffmcl I've added a new issue for the pprint output (#179).

I hit this issue straight away when trying tidy-html5 for the first time, haven't had a chance to see what it can do yet!
