Skip to content

Conversation

pwaehnert
Copy link

If Geany runs on GNU/Linux but code-format tries to reformat a file with
Windows line endings, code-format will erase the first line of the source
file.

This happens since clang-format prepends the cursor line with a \n line
ending but extract_cursor() searches in the first try for \r\n line endings.

@codebrainz
Copy link
Owner

I'm not sure I understand how curly braces come into play here, can you elaborate on how they are related to the newline issue?

(Apologies, it's been a long time since I wrote/looked at this code).

@pwaehnert
Copy link
Author

No problem. As I've observed by debugging the code-format plugin, the output of clang-format contains a single line with additional information for the IDE in front of the actual reformatted code.
Those additional information contain for example the new cursor position after the reformatting took place. For example:

{ "Cursor": 35, "IncompleteFormat": false }
#include <iostream>
class test {};

The purpose of extract_cursor() now is, to extract the new cursor position from this meta information and to get rid of this extra line, since it's only meta information meant for the IDE.

The original implementation got the end of this first line by searching the first line break. But it had to consider all the different kinds of line endings. At first it searched for \r\n and if those can't be found, it tried \r and finally \n.

This algorithm works well if the output of clang-format and the actual source file use the same kind of line endings. No other kind of line ending would be found.

But I'm using code-format on GNU/Linux but have to edit and reformat files with Windows line endings. The output of clang-format would look like that:

{ "Cursor": 35, "IncompleteFormat": false }\n
#include <iostream>\r\n
class test {};\r\n

On GNU/Linux compiled, clang-format of course would prepend the meta information with a \n. But the the remaining line breaks are \r\n. The pointer nl_chars in extract_cursor() would point to the end of the second line instead of the first line and finally g_string_erase() would remove the first two lines instead of the first line.

I got around this issue by searching for the first closing brace instead of the first line break. I'm well aware of the fact, that this JSON fragment could possibly contain more than one closing brace. But in the end there isn't much documentation about this --output-replacements-xml feature. And the source of clang-format isn't very instructive either.

@codebrainz
Copy link
Owner

Thanks for the explanation, I understand now.

Since clang-format seems to unconditionally always append a single \n, regardless of platform, then couldn't we just remove the code in extract_cursor() which tries to handle \r\n and \r?

@pwaehnert
Copy link
Author

Unfortunately \n is a rather special thing and its meaning depends on the platform, see here. On Windows the "character" \n in the source code of clang-format actually means \r\n.

@codebrainz
Copy link
Owner

Unfortunately \n is a rather special thing and its meaning depends on the platform

Well yeah, that's why I tried to handle it in the current code, but as I said, clang-format unconditionally appends a single \n (byte 0xA) irrespective of platform, so that code I wrote to handle \r\n (win) and \r (old mac) doesn't seem to be needed, and as you discovered actually causes a bug.

Or are you saying that in clang-format whatever kind of stream object outs() is actually translates the \n behind the scenes to \r\n on win32? I'm not familiar with clang-format source, but I would expect it doesn't do that.

@codebrainz
Copy link
Owner

To be clear, I mean like this (untested):

diff --git a/format.c b/format.c
index 971d519..385d590 100644
--- a/format.c
+++ b/format.c
@@ -63,23 +63,11 @@ static size_t extract_cursor(GString *str)
 {
   char *first_nl, *first_line = NULL, *it;
   char num_buf[64] = { 0 };
-  const char *nl_chars;
   size_t first_len, cnt, cursor_pos;
 
-  nl_chars = "\r\n";
-  first_nl = strstr(str->str, "\r\n");
+  first_nl = strchr(str->str, '\n');
   if (!first_nl)
-  {
-    nl_chars = "\n";
-    first_nl = strchr(str->str, '\n');
-    if (!first_nl)
-    {
-      nl_chars = "\r";
-      first_nl = strchr(str->str, '\r');
-      if (!first_nl)
-        return INVALID_CURSOR;
-    }
-  }
+    return INVALID_CURSOR;
 
   first_len = first_nl - str->str;
   first_line = g_strndup(str->str, first_len);
@@ -87,7 +75,7 @@ static size_t extract_cursor(GString *str)
     return INVALID_CURSOR;
 
   // Remove the first line containing the cursor position
-  g_string_erase(str, 0, first_len + strlen(nl_chars));
+  g_string_erase(str, 0, first_len + 1);
 
   // Sample: { "Cursor": 4 }
   it = strstr(first_line, "\"Cursor\":");

If Geany runs on GNU/Linux but code-format tries to reformat a file with
Windows line endings, code-format will erase the first line of the source
file.

This happens since clang-format prepends the cursor line with a \n line
ending but extract_cursor() searches in the first try for \r\n line endings.
@pwaehnert pwaehnert force-pushed the fix-extract-cursor branch from ecf6f0f to 72f27d4 Compare May 20, 2020 07:31
@pwaehnert
Copy link
Author

As far as I understand, the special meaning of \n comes into play if used in text mode I/O, see here. So, yeah, it could be that \n is magically converted behind the scenes to \r\n. I guess that's the difference between fopen(..., "r") and fopen(..., "rb").

But you're absolutely right, that's not the case in clang-format and its out() stream. I've tested clang-format from here on Windows and got a \n line ending, regardless of the line endings in the remaining file.

Interestingly enough, reformatted lines respect the official line endings of the source file. So if clang-format thinks a new line is needed to format the source more nicely, it uses the line endings of the surrounding lines. It seems to me, that the reformatting engine in clang-format uses a different approach to add new lines than the part which prepends this meta information line.

Anyway, your approach is the right one: code-format has to search for the first \n character to determine the end of the meta information line prepended by clang-format. I've tested your proposed changes successfully and updated the fix-extract-cursor branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants