Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of lines containing only indentation #162

Merged
merged 2 commits into from
Feb 4, 2020
Merged

Improve handling of lines containing only indentation #162

merged 2 commits into from
Feb 4, 2020

Conversation

alandipert
Copy link
Contributor

@alandipert alandipert commented Jan 8, 2020

Previously, after a '\n' was encountered while copying, min_indent number of characters are trimmed from the beginning of lines by seeking i characters ahead in the input string.

However, after setting i += min_indent, i was incremented again right before the while loop continued via str[j++] = xx[i++]. This caused i to point to the character after the '\n' that terminated the line, and so leading indentation on the following line was not recognized as such, and so was improperly copied to the output string.

While working on this, I also ran into a few other possible bugs:

  1. Lines containing only indentation might be shorter than min_indent and so could cause the succeeding line to contain spurious leading indentation. Currently, trim() only works properly if there's either no whitespace on the blank line, or if there is an amount of whitespace of at least min_indent.
  2. Tabs are not converted to spaces, and so output may appear to be improperly indented even after trimming if different lines contain different distributions of leading tab and space characters.

Let me know if it would be helpful for me to create issues for those possible bugs. I didn't create issues because maybe these are features 😄

It also occurred to @wch and I while working on this that we could probably be smarter about the size of the output string we allocate. Since we know that min_indent will removed from every line after the first, in the min_indent calculation pass we could tally newlines and multiply that tally by the size of min_indent to determine how much smaller the output string needs to be.

TODO

  • Update NEWS

@jimhester
Copy link
Collaborator

jimhester commented Jan 8, 2020

Thanks!

  1. seems to be a bug.
  2. Is by design, it is assumed that people will use tabs or space consistently, converting tabs to spaces would require glue to know how many spaces a tab is worth, which I didn't want to do.

@alandipert
Copy link
Contributor Author

OK I updated the NEWS and created #163, which I should be to make another PR for today. Please let me know if there's anything else I can do on this one 👍

@jimhester
Copy link
Collaborator

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants