Python docstring counted as code #185
Comments
Thank you for this issue! Python docstrings are considered code because syntactically they are strings, and it requires parsing the code into an Abstract Syntax Tree in order to correctly determine whether the given |
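For context, here is a minimal sketch of what AST-based docstring detection looks like when done with Python's own standard `ast` module. It is not how tokei works (tokei is not written in Python), and the function name `docstring_line_spans` is hypothetical, used only for illustration. It assumes Python 3.8 or later, where string nodes are `ast.Constant` and carry `end_lineno`.

```python
import ast

# Node types whose first statement may be a docstring.
DOCSTRING_OWNERS = (ast.Module, ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def docstring_line_spans(source):
    """Return sorted (first_line, last_line) pairs, one per real docstring.

    Hypothetical helper: a string only counts as a docstring when it is the
    first statement of a module, function, or class body.
    """
    spans = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, DOCSTRING_OWNERS):
            continue
        body = node.body
        if not body or not isinstance(body[0], ast.Expr):
            continue
        value = body[0].value
        # A bare string expression in first position is a docstring.
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            spans.append((value.lineno, value.end_lineno))
    return sorted(spans)
```

A line counter could then treat every line inside these spans as a comment and leave all other strings, including multiline literals assigned to variables, counted as code. The cost is that the file must be valid Python for `ast.parse` to succeed.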
If docstrings are not counted, then tokei is very misleading for Python files: docstrings are ubiquitous and generally make up most of the comments in a file. It is true that triple-quoted strings are used both as docstrings and as multiline string literals in code. But docstrings are very common, while multiline string literals are rare. I think the default should be to count triple-quoted strings as comments. If you want to be more precise, parsing the AST would be overkill. A good heuristic is to consider a docstring to be a triple-quoted string that appears at the start of a line (ignoring leading whitespace). That would exclude most usages of multiline string literals, like this one:
text = """hello
world
"""
If you really don't want to fix this, I would advocate for at least printing a warning in the output when a Python file is found. |
@olivren You can now set an option in your |
Thanks for your answer. I missed this configuration flag, and once activated it indeed gives a correct count. I am still not convinced by the current default, but at least there is a workaround. |
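The heuristic proposed earlier in this thread (treat a triple-quoted string as a comment only when it starts a line, ignoring leading whitespace) can be sketched in a few lines of Python. This is an illustration only, not tokei's implementation; the names `count_lines` and `_unclosed_delimiter` are hypothetical. It deliberately ignores edge cases such as triple quotes inside `#` comments or inside single-quoted strings.

```python
TRIPLE_QUOTES = ('"""', "'''")


def _unclosed_delimiter(text):
    """Return the triple-quote delimiter left open at the end of text, if any."""
    pos = 0
    while True:
        starts = [(text.find(q, pos), q) for q in TRIPLE_QUOTES]
        starts = [(i, q) for i, q in starts if i != -1]
        if not starts:
            return None
        start, quote = min(starts)
        end = text.find(quote, start + 3)
        if end == -1:
            return quote  # opened on this line, never closed
        pos = end + 3


def count_lines(source):
    """Classify each line as blank, comment, or code using the heuristic."""
    counts = {"blank": 0, "comment": 0, "code": 0}
    open_delim = None  # delimiter of the multi-line string we are inside
    open_kind = None   # "comment" for docstrings, "code" for literals

    for line in source.splitlines():
        stripped = line.strip()

        if open_delim is not None:
            # Continuation lines inherit the kind of the line that opened them.
            counts[open_kind] += 1
            if open_delim in stripped:
                open_delim = None
            continue

        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith("#"):
            counts["comment"] += 1
        else:
            kind = "comment" if stripped.startswith(TRIPLE_QUOTES) else "code"
            counts[kind] += 1
            delim = _unclosed_delimiter(stripped)
            if delim is not None:
                open_delim, open_kind = delim, kind

    return counts
```

Tracking strings opened in the middle of a line matters: without it, the closing `"""` of a literal such as `text = """hello` would sit alone at the start of a later line and be mistaken for the opening of a docstring.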
For this piece of simple Python code:
sloccount generates the correct result:
tokei gives:
For more information about Python docstrings, check out:
https://www.python.org/dev/peps/pep-0257/