feedparser depends on order of description and content #59

Closed
mosasiru opened this issue Mar 24, 2016 · 1 comment · Fixed by #260
Comments

@mosasiru

The parser's output depends on the order of description and content:encoded within an item.
A test case is below.

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
    <title>title</title>
    <link>http://www.example.com/</link>
    <item>
        <title>title2</title>
        <description>hoge</description>
        <content:encoded><![CDATA[
                fuga
        ]]></content:encoded>
        <link>http://example.com/2.html</link>
    </item>
    <item>
        <title>title1</title>
        <content:encoded><![CDATA[
                fuga
        ]]></content:encoded>
        <description>hoge</description>
        <link>http://example.com/1.html</link>
    </item>
</channel>
</rss>

The two entries above have identical description and content:encoded values; only the order of the elements differs.
However, the parsed results are not the same.

In [4]: a.entries[0].content
Out[4]: [{'base': '', 'language': None, 'type': 'text/html', 'value': 'fuga'}]
In [6]: a.entries[1].content
Out[6]:
[{'base': '', 'language': None, 'type': 'text/html', 'value': 'fuga'},
 {'base': '', 'language': None, 'type': 'text/plain', 'value': 'hoge'}]

In [5]: a.entries[0].description
Out[5]: 'hoge'
In [7]: a.entries[1].description
Out[7]: 'fuga'
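
The claim that the two items differ only in element order can be checked independently of feedparser. The following is a minimal sketch using only the standard library's `xml.etree.ElementTree`; the helper names `values` and `child_order` are made up for illustration, and the feed is trimmed to the two items from the test case.

```python
import xml.etree.ElementTree as ET

# The two <item> elements from the test case, wrapped in a minimal feed.
FEED = b"""<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
    <item>
        <title>title2</title>
        <description>hoge</description>
        <content:encoded><![CDATA[
                fuga
        ]]></content:encoded>
        <link>http://example.com/2.html</link>
    </item>
    <item>
        <title>title1</title>
        <content:encoded><![CDATA[
                fuga
        ]]></content:encoded>
        <description>hoge</description>
        <link>http://example.com/1.html</link>
    </item>
</channel>
</rss>"""

CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def values(item):
    """Return the two fields of interest, ignoring element order."""
    return {
        "description": item.findtext("description").strip(),
        "content:encoded": item.findtext(CONTENT_NS + "encoded").strip(),
    }


def child_order(item):
    """Return the child tag names in document order."""
    return [child.tag.replace(CONTENT_NS, "content:") for child in item]


first, second = ET.fromstring(FEED).iter("item")

print(values(first) == values(second))            # values compared without regard to order
print(child_order(first) == child_order(second))  # element order compared
```

The first comparison looks only at the field values and the second only at their order, so any difference in the parsed entries has to come from ordering.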

This seems to happen because:
(1) content is copied to summary
https://github.com/kurtmckee/feedparser/blob/develop/feedparser/namespaces/_base.py#L482-L483
(2) summary is set to content
https://github.com/kurtmckee/feedparser/blob/develop/feedparser/namespaces/_base.py#L428-L430
Because this behavior depends on element order, it seems strange to me.
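
To make the interaction of (1) and (2) concrete, here is a toy model of the two rules. It is not feedparser's actual code: `parse_item` and the `context` dict are hypothetical, and the model assumes that a description element populates the summary and that content is copied to summary only when no summary exists yet.

```python
def parse_item(elements):
    """Process (tag, value) pairs in document order (toy model, not feedparser)."""
    context = {"content": []}
    for tag, value in elements:
        if tag == "content:encoded":
            context["content"].append({"type": "text/html", "value": value})
            # Rule (1): content is copied to summary, assuming this only
            # happens when no summary has been stored yet.
            context.setdefault("summary", value)
        elif tag == "description":
            if "summary" in context:
                # Rule (2): a summary already exists, so the description
                # is stored as an additional content entry instead.
                context["content"].append({"type": "text/plain", "value": value})
            else:
                context["summary"] = value
    return context


# Same values, different element order, as in the test case.
first = parse_item([("description", "hoge"), ("content:encoded", "fuga")])
second = parse_item([("content:encoded", "fuga"), ("description", "hoge")])

print(first["summary"], [c["value"] for c in first["content"]])
# hoge ['fuga']
print(second["summary"], [c["value"] for c in second["content"]])
# fuga ['fuga', 'hoge']
```

Under these assumptions the toy model reproduces the asymmetry reported above: whichever element arrives first claims the summary, and a description that arrives second ends up in the content list.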

@mosasiru
Author

The copyToSummary behavior seems strange to me.
But if you think the behavior should be kept, a new context variable may be needed.
