Skip to content

Commit

Permalink
encoding/xml: fixes to enforce XML namespace standard
Browse files Browse the repository at this point in the history
All issues of golang#13400 which are not new functionalities have fixes.
There are minor incompatibilities between them due to the handling of
prefixes. Duplicating a prefix or an namespace is invalid XML.
This is now avoided. XML produced is always valid.

Tests have been added for each fix and example and previous tests
fixed as output is already more compact is some cases.

encoding/xml: fix duplication of namespace tags by encoder (golang#7535)

A tag prefix identifies the name space of the tag and not the default name
space like xmlns="...". Writing the prefix is incorrect when it is bound
to a name space using the standard xmlns:prefix="..." attribute.
This fix skips this print.

Consequences are
 - duplication is avoided in line with name space standard in reference.
 - the _xmlns declaration does not appear anymore. To keep the previous
behaviour, the prefix is printed in all other cases. Token now always
produces well-formed XML except when the explicit name space collides
with the prefix.

Made prefix handling "xmlns="space" xmlns:_xmlns="xmlns" _..." has
be removed in all wants of tests. In some cases, useless declarations like
xmlns:x="x" are still added in line with previous behavior.

encoding/xml: fix unexpected behavior of encoder.Indent("", "") (golang#13185, golang#11431)

MarshalIndent and Marshal share code. When prefix and indent are
empty, the behavior is like Marshal when it should have a minimal
indent, i.e. new line as in documentation.

A boolean is added to the local printer struct which defaults to false.
It is set to true only when MarshalIndent is used and prefix and indent
are empty.

encoding/xml: fix overriding by empty namespace (golang#7113)

The namespace defined by xmlns="value" can be overridden in
every included tag including by the empty namespace xmlns="".
Empty namespace is not authorized with a prefix (xmlns:ns="").

Method to calculate indent of XML handles depth of tag and
its associated namespace. The fix leaves the method active even
when no indent is required.

An XMLName field in a struct means that namespace must be
enforced even if empty. This occurs only on an inner tag as an
overwrite of any non-empty namespace of its outer tag.

To obtain the xmlns="" required, an attribute is added.

encoding/xml: fix panic on embedded unexported XMLName (golang#10538)

By laws of reflection, unexported fields are unreachable by .Value.
XMLName are allowed at any level of an inner struct but the struct
may not be reachable. If XMLName field is found without name,
it cannot be exported and the value is discarded like other fields.
Some comments have been to underline where the issue arises.
Another XMLName test was incorrectly set up and is fixed.
Various cases added in a specific test.

encoding/xml: fix panic of unmarshaling of anonymous structs (golang#16497)

Encoding/xml is using type "typinfo" which provides its own "value" method
using the reflection package. It fails to check for the documented possible
panics of the reflection methods. It is impossible to fix the method as the
parameter is also using reflection.

The fix is to discard anonymous structs which have no value anyway.
Encoder/xml documentation already mentions that fields are accessed using
the reflection package. A relevant test is added.

encoding/xml: fix closing tag failure (golang#20685)

Push/pop of elements name must be done using the eventual prefix together
with the tag name. The operation is moved before the translation of prefix
into its URI. One end element of a test is fixed as expecting the last used
namespace is incorrect.
After closing a tag using a namespace, the valid namespace must be taken
from the opening tag.

encoding/xml: add check of namespaces to detect field names conflicts (golang#8535, golang#11724)

Comparing namespaces of fields was missing and false conflicts were detected.

encoding/xml: fix invalid empty namespace without prefix (golang#8068)

Empty namespace is allowed only if it has no prefix. An error message is now
returned if a prefix exists. A similar case when no prefix is provided,
thus with syntax xmlns:="..." is also rejected.

encoding/xml: fix normalization of attributes values (golang#20614)

The attribute value was read as text. The existing attribute reader logic is
fixed as an attribute may have a namespace or only a prefix. Other
possibilities have been removed.

To keep the behavior of raw token which allows many faults in attributes list,
error handling is heavily using the Strict parameter of the decoder.
Specific tests have been added including list of attributes.

To keep the behavior or unmarshal, escaped characters handling has been
added but it is not symmetrical to Marshal for quotes but follows XML
specification.

encoding/xml: fix absence of detection of another : in qualified names (golang#20396)

The occurrence of second : in space and tag name is rejected.

Fixes: golang#7113, golang#7535, golang#8068, golang#8535, golang#10538, golang#11431, golang#13185, golang#16497, golang#20396, golang#20614, golang#20685

Change-Id: Ib4a60347a47d23ff59b63307cebb83b71c7c9165

Commit originally authored by: Constantin Konstantinidis <constantinkonstantinidis@gmail.com>

encoding/xml: fix printing of namespace prefix in tag names

Prefix displays in XML when defined in tag names.
The URL of the name space is also returned for an End Token as
documentation requires to improve idempotency of Marshal/Unmarshal.

Translating the prefix is popping the NS which was unavailable
for translation. Translate for an End Token has been moved inside
the pop element part.

Fixes golang#9519

Change-Id: Id7a14d07c106a76a487b5c4e28d9d563fe061c60

encoding/xml: restore changes lost in merge

See golang#43168.

encoding/xml: gofmt

encoding/xml: minimalIndent -> minIndent to shrink diff

encoding/xml: gofmt

encoding/xml: revert changes to (*Decoder).attrval from master

encoding/xml: restore (*printer).writeIndent to master

encoding/xml: remove comments that don’t change functionality of tests

encoding/xml: use t.Errorf instead of fmt.Errorf

encoding/xml: fix test case for golang#11496
  • Loading branch information
ydnar committed Oct 12, 2021
1 parent 8e36ab0 commit 9795ff1
Show file tree
Hide file tree
Showing 6 changed files with 915 additions and 77 deletions.
133 changes: 109 additions & 24 deletions src/encoding/xml/marshal.go
Original file line number Diff line number Diff line change
Expand Up @@ -304,8 +304,8 @@ type printer struct {
*bufio.Writer
encoder *Encoder
seq int
indent string
prefix string
indent string // line identation
prefix string // line prefix
depth int
indentedIn bool
putNewline bool
Expand Down Expand Up @@ -364,7 +364,7 @@ func (p *printer) createAttrPrefix(url string) string {

p.attrPrefix[url] = prefix
p.attrNS[prefix] = url

/* prints a prefix definition for the URL which had no prefix */
p.WriteString(`xmlns:`)
p.WriteString(prefix)
p.WriteString(`="`)
Expand All @@ -382,6 +382,8 @@ func (p *printer) deleteAttrPrefix(prefix string) {
delete(p.attrNS, prefix)
}

// p.prefixes contains prefixes separated by the empty string between tags
// as several prefixes can be defined at any level
func (p *printer) markPrefix() {
p.prefixes = append(p.prefixes, "")
}
Expand All @@ -391,12 +393,36 @@ func (p *printer) popPrefix() {
prefix := p.prefixes[len(p.prefixes)-1]
p.prefixes = p.prefixes[:len(p.prefixes)-1]
if prefix == "" {
break
break // end of tag is reached
}
p.deleteAttrPrefix(prefix)
}
}

// No prefix is returned if the first prefix of the list for this tag is not assigned to a namespace
func (p *printer) tagPrefix() string {
prefix := p.prefixes[len(p.prefixes)-1] // last prefix relates to current tag
i := len(p.prefixes) - 1
for i >= 0 && i < len(p.prefixes) {
if prefix == "" { // end of list of prefixes for this tag is reached
// check that previous prefix is the one of the tag
if i+1 < len(p.prefixes) { // list is not empty
if p.attrNS[p.prefixes[i+1]] != "" {
// prefix has been created, i.e. no prefix for the tag
return ""
} else {
return p.prefixes[i+1]
}
} else {
return ""
}
}
i--
prefix = p.prefixes[i]
}
return ""
}

var (
marshalerType = reflect.TypeOf((*Marshaler)(nil)).Elem()
marshalerAttrType = reflect.TypeOf((*MarshalerAttr)(nil)).Elem()
Expand Down Expand Up @@ -482,17 +508,23 @@ func (p *printer) marshalValue(val reflect.Value, finfo *fieldInfo, startTemplat
xmlname := tinfo.xmlname
if xmlname.name != "" {
start.Name.Space, start.Name.Local = xmlname.xmlns, xmlname.name
// .Space is equivalent to xmlns=".Space" so adding the attribute
if start.Name.Space != "" {
start.Attr = append(start.Attr, Attr{Name{"", xmlnsPrefix}, start.Name.Space})
}
} else {
fv := xmlname.value(val, dontInitNilPointers)
if v, ok := fv.Interface().(Name); ok && v.Local != "" {
start.Name = v
}
}
} else {
// No enforced namespace, i.e. the outer tag namespace remains valid
}
if start.Name.Local == "" && finfo != nil {
if start.Name.Local == "" && finfo != nil { // XMLName overrides tag name - anonymous struct
start.Name.Space, start.Name.Local = finfo.xmlns, finfo.name
}
if start.Name.Local == "" {
if start.Name.Local == "" { // No or empty XMLName and still no tag name
name := typ.Name()
if i := strings.IndexByte(name, '['); i >= 0 {
// Truncate generic instantiation name. See issue 48318.
Expand Down Expand Up @@ -526,6 +558,13 @@ func (p *printer) marshalValue(val reflect.Value, finfo *fieldInfo, startTemplat
}
}

/* If an xmlname was found, namespace must be overridden */
if tinfo.xmlname != nil && start.Name.Space == "" &&
len(p.tags) != 0 && p.tags[len(p.tags)-1].Space != "" {
//add attr xmlns="" to override the outer tag namespace
start.Attr = append(start.Attr, Attr{Name{"", xmlnsPrefix}, ""})
}
/* */
if err := p.writeStart(&start); err != nil {
return err
}
Expand Down Expand Up @@ -698,31 +737,70 @@ func (p *printer) writeStart(start *StartElement) error {
return fmt.Errorf("xml: start tag with no name")
}

// Pushes the value (url) of the namespace and not the eventual prefix
p.tags = append(p.tags, start.Name)
p.markPrefix()
p.markPrefix() // pushes an empty prefix to allow pop to locate the end of any prefix

p.writeIndent(1)
p.writeIndent(1) // handling relative depth of a tag
p.WriteByte('<')
p.WriteString(start.Name.Local)

if start.Name.Space != "" {
p.WriteString(` xmlns="`)
p.EscapeString(start.Name.Space)
p.WriteByte('"')
/* The attribute was not added if no XMLName field existed. */
var tagSpaceAttr Attr // the attribute will not be printed to avoid repetition
if start.Name.Space != "" { // tag starts with <.Space:.Local
// Locate an eventual prefix. If none, the XML is <tag name and no short notation is used.
prefixToPrint := ""
for _, attr := range start.Attr {
// tag .Space only contains a namespace URL and the prefix is unavailable
// (xmlns:(unavailable)=.Space)
// Attributes values which are namespaces are searched to avoid reprinting the domain
// The first attribute with a value == to start tag token will be the prefix used
// Print the space definition if the prefix is used
if start.Name.Space == attr.Value && attr.Name.Space == xmlnsPrefix && attr.Name.Local != "" {
// this is not enough if you return End tag with url in space and not the prefix
prefixToPrint = attr.Name.Local // if you skip printing, the value of the prefix does not display
p.prefixes = append(p.prefixes, prefixToPrint) // xmlns(.Space):(.Local)=(.Value)
break // the first prefix found is used
}
}
if prefixToPrint == "" { // no prefix was found for the space of the tag name
// the tag name Space is then considered as the default xmlns=".Space"
for _, attr := range start.Attr {
// attributes values which are namespaces are searched to avoid reprinting the default
if start.Name.Space == attr.Value && attr.Name.Space == "" && attr.Name.Local == xmlnsPrefix {
tagSpaceAttr = attr
}
break // the first attribute which is a default declaration that matches is kept
}
}
if prefixToPrint != "" {
// print the prefix and not the namespace value
p.WriteString(prefixToPrint)
p.WriteByte(':')
p.WriteString(start.Name.Local) // tag name
} else {
p.WriteString(start.Name.Local) // no prefix found
p.WriteString(` xmlns="`) // printing the namespace taken as default without prefix
p.EscapeString(start.Name.Space)
p.WriteByte('"')
}
} else {
p.WriteString(start.Name.Local) // tag name
}

// Attributes
for _, attr := range start.Attr {
name := attr.Name
if name.Local == "" {
continue
if attr.Name.Local == "" || attr == tagSpaceAttr {
continue // attribute already printed
}
p.WriteByte(' ')
if name.Space != "" {
p.WriteString(p.createAttrPrefix(name.Space))
if attr.Name.Space == xmlnsPrefix { // printing prefix name.Local xmlns:{.Local}={.Value}
p.WriteString(xmlnsPrefix)
p.WriteByte(':')
} else if attr.Name.Space != "" { // not a name space {.Space}:{.Local}={.Value} and .Local is not xmlns
p.WriteString(p.createAttrPrefix(attr.Name.Space)) // name.Space is not a prefix but nothing should be done
p.WriteByte(':') // about it if another one exists
}
p.WriteString(name.Local)
// When space is empty, only writing .Local=.Value which will also be xmlns=".Value"
p.WriteString(attr.Name.Local)
p.WriteString(`="`)
p.EscapeString(attr.Value)
p.WriteByte('"')
Expand All @@ -738,17 +816,24 @@ func (p *printer) writeEnd(name Name) error {
if len(p.tags) == 0 || p.tags[len(p.tags)-1].Local == "" {
return fmt.Errorf("xml: end tag </%s> without start tag", name.Local)
}
if top := p.tags[len(p.tags)-1]; top != name {
if top.Local != name.Local {
if top := p.tags[len(p.tags)-1]; top != name { // the end tag must check the prefix value when there is one
if top.Local != name.Local { // Tag names do not match
return fmt.Errorf("xml: end tag </%s> does not match start tag <%s>", name.Local, top.Local)
} // Namespaces do not match
if top.Space != name.Space { // tag prefixes do not match
return fmt.Errorf("xml: end space </%s> does not match start space <%s>", name.Space, top.Space)
}
return fmt.Errorf("xml: end tag </%s> in namespace %s does not match start tag <%s> in namespace %s", name.Local, name.Space, top.Local, top.Space)
}
p.tags = p.tags[:len(p.tags)-1]

p.writeIndent(-1)
p.WriteByte('<')
p.WriteByte('/')
endPrefix := p.tagPrefix()
if name.Space != "" && endPrefix != "" { // a prefix is available
p.WriteString(endPrefix) // print the prefix and not its value
p.WriteByte(':')
} // otherwise, xmlns=".Space" has no prefix and nothing should be printed
p.tags = p.tags[:len(p.tags)-1]
p.WriteString(name.Local)
p.WriteByte('>')
p.popPrefix()
Expand Down
41 changes: 24 additions & 17 deletions src/encoding/xml/marshal_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -629,6 +629,11 @@ var marshalTests = []struct {
{Value: &Port{Type: "ssl", Number: "443"}, ExpectXML: `<port type="ssl">443</port>`},
{Value: &Port{Number: "443"}, ExpectXML: `<port>443</port>`},
{Value: &Port{Type: "<unix>"}, ExpectXML: `<port type="&lt;unix&gt;"></port>`},
// Marshal is not symetric to Unmarshal for these oddities because &apos is written as &#39
{Value: &Port{Type: "<un'ix>"}, ExpectXML: `<port type="&lt;un&apos;ix&gt;"></port>`, UnmarshalOnly: true},
{Value: &Port{Type: "<un\"ix>"}, ExpectXML: `<port type="&lt;un&quot;ix&gt;"></port>`, UnmarshalOnly: true},
{Value: &Port{Type: "<un&ix>"}, ExpectXML: `<port type="&lt;un&amp;ix&gt;"></port>`},
{Value: &Port{Type: "<unix>"}, ExpectXML: `<port type="&lt;unix&gt;"></port>`, UnmarshalOnly: true},
{Value: &Port{Number: "443", Comment: "https"}, ExpectXML: `<port><!--https-->443</port>`},
{Value: &Port{Number: "443", Comment: "add space-"}, ExpectXML: `<port><!--add space- -->443</port>`, MarshalOnly: true},
{Value: &Domain{Name: []byte("google.com&friends")}, ExpectXML: `<domain>google.com&amp;friends</domain>`},
Expand Down Expand Up @@ -1107,10 +1112,10 @@ var marshalTests = []struct {
Value: &AnyTest{Nested: "known",
AnyField: AnyHolder{
XML: "<unknown/>",
XMLName: Name{Local: "AnyField"},
XMLName: Name{Local: "other"}, // Overriding the field name is the purpose of the test
},
},
ExpectXML: `<a><nested><value>known</value></nested><AnyField><unknown/></AnyField></a>`,
ExpectXML: `<a><nested><value>known</value></nested><other><unknown/></other></a>`,
},
{
ExpectXML: `<a><nested><value>b</value></nested></a>`,
Expand Down Expand Up @@ -2058,7 +2063,7 @@ var encodeTokenTests = []struct {
StartElement{Name{"space", "foo"}, nil},
EndElement{Name{"another", "foo"}},
},
err: "xml: end tag </foo> in namespace another does not match start tag <foo> in namespace space",
err: "xml: end space </another> does not match start space <space>",
want: `<foo xmlns="space">`,
}, {
desc: "start element with explicit namespace",
Expand All @@ -2068,7 +2073,7 @@ var encodeTokenTests = []struct {
{Name{"space", "foo"}, "value"},
}},
},
want: `<local xmlns="space" xmlns:_xmlns="xmlns" _xmlns:x="space" xmlns:space="space" space:foo="value">`,
want: `<x:local xmlns:x="space" xmlns:space="space" space:foo="value">`,
}, {
desc: "start element with explicit namespace and colliding prefix",
toks: []Token{
Expand All @@ -2078,7 +2083,8 @@ var encodeTokenTests = []struct {
{Name{"x", "bar"}, "other"},
}},
},
want: `<local xmlns="space" xmlns:_xmlns="xmlns" _xmlns:x="space" xmlns:space="space" space:foo="value" xmlns:x="x" x:bar="other">`,
// #17 Removed version was not well-formed as x is bound to "space" and to "x"
want: `<x:local xmlns:x="space" xmlns:space="space" space:foo="value" xmlns:x="x" x:bar="other">`,
}, {
desc: "start element using previously defined namespace",
toks: []Token{
Expand All @@ -2089,7 +2095,8 @@ var encodeTokenTests = []struct {
{Name{"space", "x"}, "y"},
}},
},
want: `<local xmlns:_xmlns="xmlns" _xmlns:x="space"><foo xmlns="space" xmlns:space="space" space:x="y">`,
// #18 The well-formed prefix is the only one appearing and the prefix is not in the tag as .Space is empty
want: `<local xmlns:x="space"><foo xmlns="space" xmlns:space="space" space:x="y">`,
}, {
desc: "nested name space with same prefix",
toks: []Token{
Expand All @@ -2110,7 +2117,7 @@ var encodeTokenTests = []struct {
{Name{"space2", "b"}, "space2 value"},
}},
},
want: `<foo xmlns:_xmlns="xmlns" _xmlns:x="space1"><foo _xmlns:x="space2"><foo xmlns:space1="space1" space1:a="space1 value" xmlns:space2="space2" space2:b="space2 value"></foo></foo><foo xmlns:space1="space1" space1:a="space1 value" xmlns:space2="space2" space2:b="space2 value">`,
want: `<foo xmlns:x="space1"><foo xmlns:x="space2"><foo xmlns:space1="space1" space1:a="space1 value" xmlns:space2="space2" space2:b="space2 value"></foo></foo><foo xmlns:space1="space1" space1:a="space1 value" xmlns:space2="space2" space2:b="space2 value">`,
}, {
desc: "start element defining several prefixes for the same name space",
toks: []Token{
Expand All @@ -2120,7 +2127,7 @@ var encodeTokenTests = []struct {
{Name{"space", "x"}, "value"},
}},
},
want: `<foo xmlns="space" xmlns:_xmlns="xmlns" _xmlns:a="space" _xmlns:b="space" xmlns:space="space" space:x="value">`,
want: `<a:foo xmlns:a="space" xmlns:b="space" xmlns:space="space" space:x="value">`,
}, {
desc: "nested element redefines name space",
toks: []Token{
Expand All @@ -2132,7 +2139,7 @@ var encodeTokenTests = []struct {
{Name{"space", "a"}, "value"},
}},
},
want: `<foo xmlns:_xmlns="xmlns" _xmlns:x="space"><foo xmlns="space" _xmlns:y="space" xmlns:space="space" space:a="value">`,
want: `<foo xmlns:x="space"><y:foo xmlns:y="space" xmlns:space="space" space:a="value">`,
}, {
desc: "nested element creates alias for default name space",
toks: []Token{
Expand All @@ -2144,7 +2151,7 @@ var encodeTokenTests = []struct {
{Name{"space", "a"}, "value"},
}},
},
want: `<foo xmlns="space" xmlns="space"><foo xmlns="space" xmlns:_xmlns="xmlns" _xmlns:y="space" xmlns:space="space" space:a="value">`,
want: `<foo xmlns="space"><y:foo xmlns:y="space" xmlns:space="space" space:a="value">`,
}, {
desc: "nested element defines default name space with existing prefix",
toks: []Token{
Expand All @@ -2156,7 +2163,7 @@ var encodeTokenTests = []struct {
{Name{"space", "a"}, "value"},
}},
},
want: `<foo xmlns:_xmlns="xmlns" _xmlns:x="space"><foo xmlns="space" xmlns="space" xmlns:space="space" space:a="value">`,
want: `<foo xmlns:x="space"><foo xmlns="space" xmlns:space="space" space:a="value">`,
}, {
desc: "nested element uses empty attribute name space when default ns defined",
toks: []Token{
Expand All @@ -2167,7 +2174,7 @@ var encodeTokenTests = []struct {
{Name{"", "attr"}, "value"},
}},
},
want: `<foo xmlns="space" xmlns="space"><foo xmlns="space" attr="value">`,
want: `<foo xmlns="space"><foo xmlns="space" attr="value">`,
}, {
desc: "redefine xmlns",
toks: []Token{
Expand Down Expand Up @@ -2199,7 +2206,7 @@ var encodeTokenTests = []struct {
{Name{"xmlns", "foo"}, ""},
}},
},
want: `<foo xmlns:_xmlns="xmlns" _xmlns:foo="">`,
want: `<foo xmlns:foo="">`,
}, {
desc: "attribute with no name is ignored",
toks: []Token{
Expand Down Expand Up @@ -2228,7 +2235,7 @@ var encodeTokenTests = []struct {
{Name{"space", "x"}, "value"},
}},
},
want: `<foo xmlns="space" xmlns="space"><foo xmlns="" x="value" xmlns:space="space" space:x="value">`,
want: `<foo xmlns="space"><foo xmlns="" x="value" xmlns:space="space" space:x="value">`,
}, {
desc: "nested element requires empty default name space",
toks: []Token{
Expand All @@ -2237,7 +2244,7 @@ var encodeTokenTests = []struct {
}},
StartElement{Name{"", "foo"}, nil},
},
want: `<foo xmlns="space" xmlns="space"><foo>`,
want: `<foo xmlns="space"><foo>`,
}, {
desc: "attribute uses name space from xmlns",
toks: []Token{
Expand All @@ -2259,7 +2266,7 @@ var encodeTokenTests = []struct {
EndElement{Name{"space", "baz"}},
EndElement{Name{"space", "foo"}},
},
want: `<foo xmlns="space" xmlns="space" xmlns:_xmlns="xmlns" _xmlns:bar="space" xmlns:space="space" space:baz="foo"><baz xmlns="space"></baz></foo>`,
want: `<bar:foo xmlns="space" xmlns:bar="space" xmlns:space="space" space:baz="foo"><baz xmlns="space"></baz></bar:foo>`,
}, {
desc: "default name space not used by attributes, not explicitly defined",
toks: []Token{
Expand All @@ -2271,7 +2278,7 @@ var encodeTokenTests = []struct {
EndElement{Name{"space", "baz"}},
EndElement{Name{"space", "foo"}},
},
want: `<foo xmlns="space" xmlns="space" xmlns:space="space" space:baz="foo"><baz xmlns="space"></baz></foo>`,
want: `<foo xmlns="space" xmlns:space="space" space:baz="foo"><baz xmlns="space"></baz></foo>`,
}, {
desc: "impossible xmlns declaration",
toks: []Token{
Expand Down
Loading

0 comments on commit 9795ff1

Please sign in to comment.