Crash with malformed <meta> tag #739

Closed
@jengelh

Input:

#include <tidy.h>
int main(void)
{
        TidyDoc tdoc = tidyCreate();
        tidyParseString(tdoc, "<!DOCTYPE HTML PUBLIC \"\"-//W3C//DTD HTML 4.0 Transitional//EN\"\"><html><head><meta content=\"\"text/html; charset=utf-8\"\" http-equiv=Content-Type>");
        tidyCleanAndRepair(tdoc);
        return 0;
}

Output:

GNU gdb (GDB; openSUSE Tumbleweed) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...done.
(gdb) r
Starting program: /home/jengelh/work/kc/librosie/a.out 
line 1 column 77 - Warning: <meta> attribute "text/html;" lacks value
line 1 column 77 - Info: value for attribute "charset" missing quote marks
line 1 column 77 - Info: value for attribute "http-equiv" missing quote marks
line 1 column 71 - Warning: inserting missing 'title' element
line 1 column 77 - Warning: discarding unexpected <meta>

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b31418 in prvTidyTidyMetaCharset (doc=0x602260) at /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c:2215
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)
(gdb) b prvTidyTidyMetaCharset
Breakpoint 1 at 0x7ffff7b30d2a: file /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c, line 2178.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/jengelh/work/kc/librosie/a.out 
line 1 column 77 - Warning: <meta> attribute "text/html;" lacks value
line 1 column 77 - Info: value for attribute "charset" missing quote marks
line 1 column 77 - Info: value for attribute "http-equiv" missing quote marks
line 1 column 71 - Warning: inserting missing 'title' element

Breakpoint 1, prvTidyTidyMetaCharset (doc=0x602260) at /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c:2178
2178    {
(gdb) n
2182        Bool charsetFound = no;
(gdb) 
2183        uint outenc = cfg(doc, TidyOutCharEncoding);
(gdb) 
2184        ctmbstr enc = TY_(GetEncodingNameFromTidyId)(outenc);
(gdb) 
2186        Node *head = TY_(FindHEAD)(doc);
(gdb) 
2194        Bool add_meta = cfgBool(doc, TidyMetaCharset);
(gdb) 
2197        if (!head || !enc || !TY_(tmbstrlen)(enc))
(gdb) 
2199        if (outenc == RAW)
(gdb) 
2202        if (outenc == ISO2022)
(gdb) 
2205        if (cfgAutoBool(doc, TidyBodyOnly) == TidyYesState)
(gdb) 
2208        tidyBufInit(&charsetString);
(gdb) 
2210        tidyBufClear(&charsetString);
(gdb) 
2211        tidyBufAppend(&charsetString, "charset=", 8);
(gdb) 
2212        tidyBufAppend(&charsetString, (char*)enc, TY_(tmbstrlen)(enc));
(gdb) 
2213        tidyBufAppend(&charsetString, "\0", 1); /* zero terminate the buffer */
(gdb) 
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)
(gdb) 
2217            if (!nodeIsMETA(currentNode))
(gdb) p currentNode
$1 = (Node *) 0x60b5f0
(gdb) p currentNode->prev
$2 = (Node *) 0x0
(gdb) n
2219            charsetAttr = attrGetCHARSET(currentNode);
(gdb) 
2220            httpEquivAttr = attrGetHTTP_EQUIV(currentNode);
(gdb) 
2221            if (!charsetAttr && !httpEquivAttr)
(gdb) 
2227            if (charsetAttr && !httpEquivAttr)
(gdb) 
2262            if (httpEquivAttr && !charsetAttr)
(gdb) 
2329            if (httpEquivAttr && charsetAttr)
(gdb) 
2332                prevNode = currentNode->prev;
(gdb) 
2333                TY_(Report)(doc, head, currentNode, DISCARDING_UNEXPECTED);
(gdb) 
line 1 column 77 - Warning: discarding unexpected <meta>
2334                TY_(DiscardElement)(doc, currentNode);
(gdb) 
2335                currentNode = prevNode;
(gdb) 
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)
(gdb) 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b31418 in prvTidyTidyMetaCharset (doc=0x602260) at /usr/src/debug/tidy-5.6.0-0.x86_64/src/clean.c:2215
2215        for (currentNode = head->content; currentNode; currentNode = currentNode->next)

More info:
libtidy 5.6.0.
Regression from 5.4.0, where the same input appeared to work fine (no crash).
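
Reading the trace above: the discarded <meta> is the first child of <head> (`currentNode->prev` is 0x0), so after line 2335 `currentNode` is NULL, and the loop increment on line 2215 (`currentNode = currentNode->next`) dereferences it. Below is a minimal standalone sketch of one way to discard nodes while iterating without hitting that case. It assumes a hypothetical `DemoNode`/`DemoParent` list and `demo_*` helpers, none of which are libtidy's own types or functions, and it is not meant to represent the fix that libtidy itself applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parse-tree node; NOT libtidy's Node type. */
typedef struct DemoNode {
    struct DemoNode *prev;
    struct DemoNode *next;
    int is_bad_meta;            /* nonzero if this child should be discarded */
} DemoNode;

/* Hypothetical stand-in for the <head> element: owns a list of children. */
typedef struct DemoParent {
    DemoNode *content;          /* first child, or NULL if there are none */
} DemoParent;

/* Append a new child and return it (NULL on allocation failure). */
DemoNode *demo_append(DemoParent *parent, int is_bad_meta)
{
    DemoNode *node = calloc(1, sizeof *node);
    if (!node)
        return NULL;
    node->is_bad_meta = is_bad_meta;
    if (!parent->content) {
        parent->content = node;
    } else {
        DemoNode *last = parent->content;
        while (last->next)
            last = last->next;
        last->next = node;
        node->prev = last;
    }
    return node;
}

/* Unlink a child from its parent and free it. */
void demo_discard(DemoParent *parent, DemoNode *node)
{
    assert(parent && node);
    if (node->prev)
        node->prev->next = node->next;
    else
        parent->content = node->next;   /* node was the first child */
    if (node->next)
        node->next->prev = node->prev;
    free(node);
}

/* Discard every flagged child and return how many were removed.
 * The successor is saved BEFORE the node is freed, so the loop never
 * steps through a NULL or dangling pointer, even when the first child
 * (whose prev is NULL) is the one being discarded. */
int demo_discard_bad_metas(DemoParent *parent)
{
    int removed = 0;
    DemoNode *node = parent->content;
    while (node) {
        DemoNode *next = node->next;
        if (node->is_bad_meta) {
            demo_discard(parent, node);
            removed++;
        }
        node = next;
    }
    return removed;
}

/* Count the children that remain under the parent. */
int demo_count(const DemoParent *parent)
{
    int n = 0;
    for (const DemoNode *node = parent->content; node; node = node->next)
        n++;
    return n;
}
```

The pattern in the trace, stepping back to `prev` and letting the `for` increment move forward again, only works when `prev` is non-NULL. Saving `next` up front avoids the special case for the first child.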
