URLs with parenthesis are broken #46

Closed
fiatjaf opened this issue Mar 24, 2015 · 6 comments

@fiatjaf

fiatjaf commented Mar 24, 2015

>>> import mistune
>>> import urllib
>>> import urlparse
>>> 
>>> url = 'https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_(2).jpg'
>>> mistune.markdown("here's a [broken URL](%s)" % url)
'<p>here\'s a <a href="https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_(2">broken URL</a>.jpg)</p>\n'
>>> urlp = urlparse.urlparse(url)
>>> mistune.markdown("now [it](%s) is not broken anymore" % (urlp.scheme + '://' + urlp.netloc + urllib.quote(urlp.path)))
'<p>now <a href="https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_%282%29.jpg">it</a> is not broken anymore</p>\n'
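
For readers on Python 3, where `urlparse` and `urllib.quote` have moved into `urllib.parse`, the percent-encoding workaround above might look roughly like this. It is only a sketch: `quote_url_path` is a made-up helper name, the example.com URL is a placeholder, and it assumes that escaping the path alone is enough for your URLs.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_url_path(url):
    """Percent-encode the path of `url` (hypothetical helper, not part of mistune)."""
    parts = urlsplit(url)
    # safe="/%" leaves path separators and already-escaped sequences untouched,
    # while "(" and ")" are percent-encoded.
    escaped_path = quote(parts.path, safe="/%")
    return urlunsplit(parts._replace(path=escaped_path))


print(quote_url_path("https://example.com/500x600/download_(2).jpg"))
```

The escaped result can then be interpolated into the Markdown link in the same way as in the session above.
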
@lepture
Owner

lepture commented Mar 25, 2015

I see. You have ( and ) in the URL, and those characters are not safe there.

lepture added the bug label Mar 25, 2015
@tonyseek

It seems the URL string needs to be normalized. Using werkzeug.urls may be better than concatenating scheme + '://' + netloc by hand.

import mistune
from werkzeug.urls import url_parse, url_quote


def safe_url(url):
    parsed = url_parse(url)
    path = url_quote(parsed.path, safe='/%')
    query = url_quote(parsed.query, safe='?=&')
    return parsed.replace(path=path, query=query).to_url()


url = 'https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_(2).jpg'
print(mistune.markdown("here's a [URL](%s)" % safe_url(url)))

There is a similar solution in douban/brownant.

lepture added a commit that referenced this issue May 29, 2015
In case that you need to render links with ")" like #46, you need to
wrap it in `<` and `>`.
@lepture
Owner

lepture commented May 29, 2015

It is not easy to write a proper regex for this situation. I've fixed it for this case:

[foo](<http://foo.bar.(2).jpg>)

You need to wrap the link with < and >.
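
If the links are generated programmatically, the wrapping could be done in one place. A minimal sketch, assuming a mistune version that includes the commit referenced above; `md_link` is a hypothetical helper, and it does not try to handle link text containing `]` or URLs containing `>`.

```python
def md_link(text, url):
    """Build an inline Markdown link with the destination wrapped in < and >."""
    return "[%s](<%s>)" % (text, url)


print(md_link("foo", "http://foo.bar.(2).jpg"))
```

The resulting string can be passed to `mistune.markdown` like any other Markdown source.
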

@lepture
Owner

lepture commented Jun 17, 2015

Not really solved, but a workaround.

lepture closed this as completed Jun 17, 2015
@fiatjaf
Author

fiatjaf commented Jun 17, 2015

Can I wrap every link, not only the ones with )?

@lepture
Owner

lepture commented Jun 17, 2015

@fiatjaf yes.
