Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only pass sketch sources to ctags #156

Merged
merged 5 commits into from
Jun 20, 2016

Conversation

matthijskooijman
Copy link
Collaborator

Previously, the .ino files from the sketch were concatenated,
preprocessed and then the entire preprocessing result was passed through
ctags. Since ctags isn't a perfect C/C++ parser, this would cause it to
sometimes break on some complex constructs from library or system header
files, preventing the entire sketch from compiling.

This commit filters the result of preprocessing to only include lines
that originate from the original sketch files and filter out any lines
from included system or library header files. This filtering happens
based on the line markers that the gcc preprocessor emits.

Because of this, the CollectCTagsFromSketchFiles pass can be removed.
This pass used to apply the same filtering, but on the tags generated by
ctags. Since now only sketch lines are fed into ctags, the resulting
ctags will obviously all originate from sketch files, so no need to
further filter them.

With this change, all sketches from the test suite still compile as
expected. However, there is a fair chance that with this change, ctags
will fail to parse some sketches. In general, to parse a C or C++ file,
you need to keep track of the type environment and symbol table, since
identifiers can either be variables or types, depending on the context.
When lines from included files are cut off, this information is not
available, and ctags will have to guess (though it might be doing that
anyway, not sure if it normally does completely proper parsing).

This PR achieves the same as #131, but has a more robust implmentation. Only the last commit really belongs to this PR, the first few commits come from #150 and #155. This PR should be merged carefully, as it might break sketches. It is probably best to merge this after 1.6.10, but be prepared to revert it turns out to cause regressions.

In some cases, backslashes in a filenames were already escaped when
emitting a #line directive, but sometimes no escaping happened. In any
case, quotes in a filename should also be escaped.

This commit introduces a helper function `QuoteCppString()` to handle
quoting and escaping strings in a generic way. This new helper function
is used in the code itself, but not in testcases yet.

Note that ctags does not quite properly handle special characters and
escaping in filenames. When parsing line markers, it just ignores
escapes, so any escapes from the #line marker will be present in the
ctags output as well. Before this commit, they would be copied
unmodified into the #line directives generated from the ctags filenames.  Now that these line directives have escaping applied, the filenames read from ctags need to be unescaped, to prevent double escapes. This
unescaping already happened in CollectCTagsFromSketchFiles, but is now
moved to parseTag where it is more appropriate and applies to all users.

Since ctags does not handle escapes in line markers, it cuts the
filename off at the first double quote. If a filename contains a double
quote, the filename as output by ctags will be cut off before it,
leaving the escaping \ at the end. Before this commit, this trailing \
would be output directly into the generated #line marker, which confused
gcc and made the build fail. With this commit, this trailing \ is
escaped itself, making the output syntactically valid, so the build now
works with double quotes in a filename. However, due to this filename
cut off, arduino-builder will no longer generate prototypes for
functions in files that have a double quote in them (since it can no
longer detect them as sketch functions), so double quotes (and probably
tabs too) in filenames are still not completely supported yet.

Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
In some testcases, the generated output is checked, which contains #line
directives with filenames. Previously, this compared against the
filename with just backslashes escaped, but this would be incorrect when
there are quotes in the filename. Since the previous commit, the
arduino-builder code itself uses a new `utils.QuoteCppString` function to
apply escaping, so this applies that same escaping.

Part of the changes modify the test code directly, part of the changes
replace the `EscapeBackSlashes` "function" that was used in
interpolating some text files with a call to QuoteCppString instead.

Note that QuoteCppString already adds quotes around its result, so those
are removed from the places where this function is called.

Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This allows properly parsing and unescaping a string literal. This
function is not currently used (it was written to prepare for better
parsing of #line directives, which might or might not be added later),
but it might be relevant for other things in the future as well.

Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Previously, the .ino files from the sketch were concatenated,
preprocessed and then the entire preprocessing result was passed through
ctags. Since ctags isn't a perfect C/C++ parser, this would cause it to
sometimes break on some complex constructs from library or system header
files, preventing the entire sketch from compiling.

This commit filters the result of preprocessing to only include lines
that originate from the original sketch files and filter out any lines
from included system or library header files. This filtering happens
based on the line markers that the gcc preprocessor emits.

Because of this, the CollectCTagsFromSketchFiles pass can be removed.
This pass used to apply the same filtering, but on the tags generated by
ctags. Since now only sketch lines are fed into ctags, the resulting
ctags will obviously all originate from sketch files, so no need to
further filter them.

With this change, all sketches from the test suite still compile as
expected. However, there is a fair chance that with this change, ctags
will fail to parse some sketches. In general, to parse a C or C++ file,
you need to keep track of the type environment and symbol table, since
identifiers can either be variables or types, depending on the context.
When lines from included files are cut off, this information is not
available, and ctags will have to guess (though it might be doing that
anyway, not sure if it normally does completely proper parsing).

Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
@ArduinoBot
Copy link
Contributor

✅ Build completed.

⬇️ Build URL: http://downloads.arduino.cc/PR/arduino-builder/arduino-builder-156.zip

ℹ️ To test this build:

  1. Replace arduino-builder binary (you can find it where you installed the IDE) with the provided one

@igrr
Copy link

igrr commented Jun 16, 2016

I have compiled all the test cases in ESP8266 Arduino core project with arduino-builder from this pull request and can confirm that nothing got broken.

This pull request also fixes the long-standing issue with the following sketch, both for SAMD core and ESP8266 core:

#undef min
#undef max
#include <vector>
#include <memory>
#include <functional>
void setup() {
  test();
}
void loop() {}
void test() {}

Previously this did not compile because ctags failed to parse something inside STL headers.

So huge 👍 from me.

@cmaglie cmaglie merged commit 81eadf3 into arduino:master Jun 20, 2016
@cmaglie cmaglie added this to the 1.3.19 milestone Jun 20, 2016
@matthijskooijman matthijskooijman deleted the ctags-sketch-only branch June 21, 2016 08:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants