allow a wrap_width value of None for unlimited line lengths #169

Open: 1 commit to merge into base branch develop

chrispy-snps
Collaborator

Fixes #168.

The convert_p() function is updated to bypass the line-wrapping heuristics when wrap_width is set to None, thus allowing unlimited line lengths.

For large volumes of text content, bypassing the wrapping heuristics is about 4x faster than executing the wrapping heuristics with an arbitrarily large numerical value.

The unit tests are updated to include this case, and the README file is updated to mention this feature.

Signed-off-by: chrispy <chrispy@synopsys.com>
@AlexVonB
Collaborator

AlexVonB commented Jan 2, 2025

Could you just set wrap=False and therefore not wrap at all?

@chrispy-snps
Collaborator Author

@AlexVonB - when wrap=False, content newlines are left as-is (which is reasonable behavior):

import markdownify

def md(html, **options):
    return markdownify.MarkdownConverter(**options).convert(html)

html = """
<p>This is
a bunch
of text.</p>
""".strip()

print(md(html, wrap=False))
# This is
# a bunch
# of text.

But with this code change, wrapping can be used to reflow the text:

print(md(html, wrap=True, wrap_width=None))
# This is a bunch of text.

Effectively, it is a form of "wrapping" but with unlimited line lengths. The value of None explicitly bypasses the wrapping loop for better performance when doing this.
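As a rough illustration of that bypass, here is a minimal sketch that uses only the standard library. The function name `reflow_paragraph` is hypothetical and stands in for the wrap=True branch of convert_p(); it is not the actual markdownify code, and it ignores the handling of newlines that come from <br> tags.

```python
import re
from textwrap import fill

def reflow_paragraph(text, wrap_width=80):
    """Hypothetical stand-in for the wrap=True branch of convert_p()."""
    # Content newlines are collapsed into spaces first, so the paragraph
    # becomes a single long line.
    text = re.sub(r'\s+', ' ', text).strip()
    if wrap_width is not None:
        # Only a numeric width runs the wrapping step.
        text = fill(text,
                    width=wrap_width,
                    break_long_words=False,
                    break_on_hyphens=False)
    return text

sample = "This is\na bunch\nof text."
print(reflow_paragraph(sample, wrap_width=None))
# This is a bunch of text.
print(reflow_paragraph(sample, wrap_width=10))
# This is a
# bunch of
# text.
```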

We were using a large numeric value in our content pipeline as a workaround, but this approach would be faster.
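To get a feel for the difference between the two approaches, a small timing sketch along these lines could be used. It calls textwrap.fill directly rather than markdownify, the helper names and the synthetic paragraph are made up for illustration, and the ratio it prints will not necessarily match the roughly 4x figure measured on a full conversion.

```python
import re
import timeit
from textwrap import fill

# Synthetic paragraph; the size is arbitrary.
paragraph = " ".join(["lorem ipsum dolor sit amet"] * 2000)

def collapse(text):
    # Both variants start from whitespace-collapsed text.
    return re.sub(r'\s+', ' ', text).strip()

def with_large_width(text, width=10**9):
    # Workaround: run the wrapper with a width no line will ever reach.
    return fill(collapse(text), width=width,
                break_long_words=False, break_on_hyphens=False)

def with_bypass(text):
    # Analogue of wrap_width=None: skip the wrapper entirely.
    return collapse(text)

# The two variants should produce the same single-line output.
assert with_large_width(paragraph) == with_bypass(paragraph)

for fn in (with_large_width, with_bypass):
    seconds = timeit.timeit(lambda: fn(paragraph), number=20)
    print(f"{fn.__name__}: {seconds:.4f}s")
```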

break_on_hyphens=False)
new_lines.append(line + trailing)
text = '\n'.join(new_lines)
if self.options['wrap_width'] is not None:
Collaborator

Would it make sense to merge this with the preceding if? Disregard this comment if it does not.

Collaborator Author

chrispy-snps commented Jan 2, 2025

@AlexVonB - sharp eyes! Originally I had it on the higher-level if, but I decided to move it down here to reinforce the comment about newlines being replaced by spaces when wrap==True. It added a level of indent and made the diff bigger, but runtime is practically identical and I felt the code was a bit clearer this way.
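
For readers following along, the two shapes under discussion can be sketched like this. These are simplified, hypothetical functions (`nested_check`, `merged_check`, and `_wrap` are not names from markdownify), assuming the preceding if is the check on the wrap option. They behave identically and differ only in where the comment about newlines can sit.

```python
from textwrap import fill

def _wrap(text, width):
    return fill(text, width=width,
                break_long_words=False, break_on_hyphens=False)

def nested_check(text, wrap, wrap_width):
    # Shape used in this PR: one extra level of indent.
    if wrap:
        # Newlines in the input have already been replaced by spaces.
        if wrap_width is not None:
            text = _wrap(text, wrap_width)
    return text

def merged_check(text, wrap, wrap_width):
    # Alternative: fold both conditions into a single if.
    if wrap and wrap_width is not None:
        text = _wrap(text, wrap_width)
    return text

text = "This is a bunch of text."
for wrap in (True, False):
    for width in (None, 10, 80):
        assert nested_check(text, wrap, width) == merged_check(text, wrap, width)
```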

Development

Successfully merging this pull request may close these issues.

Support a wrap width value of None that explicitly indicates no width limit (unlimited line length)