Skip to content
This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

D403 has a very simplistic definition of word #167

Closed
jayvdb opened this issue Dec 14, 2015 · 11 comments
Closed

D403 has a very simplistic definition of word #167

jayvdb opened this issue Dec 14, 2015 · 11 comments

Comments

@jayvdb
Copy link
Member

jayvdb commented Dec 14, 2015

I've put the Pywikibot code through the new D403 rule from #164 (love it!), and it has turned up a lot of errors for cases where we have been liberal about the pep257 phrase/command first-liner.
Results can be seen at https://phabricator.wikimedia.org/T121365

Many great errors for us to improve, and two where pep257 is probably being too strict/simplistic are:

"""
Create/Edit a source.

...
"""

-> D403: First word of the first line should be properly capitalized ('Create/edit', not 'Create/Edit')

"""(Un)protect a wiki page. Requires administrator status.

...
"""

-> D403: First word of the first line should be properly capitalized ('(un)protect', not '(Un)protect')

(This error message is interesting...)

We can and will improve our docstrings due to this rule, and can live with the current simplistic split and capitalize approach by amending the above docstrings and others to be simpler.

However if there is interest in pep257 handling corner cases like those, we should collect them and determine which are able to be reasonably resolved without undue complexity.

@sigmavirus24
Copy link
Member

I think the first case is worth breaking.

Create or edit a source

(I'd also argue that functions should have a single responsibility but whatever. ;P)

The other could also be rewritten with or or written as Toggle protection for a wiki page. I'm less opinionated about that one.

I think this rule will make your docstrings better if they're all like this. If you have other examples, I'd be interested in seeing those.

@jayvdb
Copy link
Member Author

jayvdb commented Dec 14, 2015

As I indicated in the opening issue description, I am not disputing the docstrings can be better, and this change has highlighted some cases which should be improved, especially to benefit people with ESL.

I raised this to ensure we're happy the current code is implementing PEP 257 and not unintentionally over-reaching. A lot is being read into the PEP 257 phrase:

The docstring is a phrase ending in a period.  It prescribes the function or method's effect as a command ..

I can't see anything else in PEP 257 which relates to this new rule. Did I miss something?

You can see the full list of our errors on the linked task: https://phabricator.wikimedia.org/T121365 .

Another interesting one is:

    def setUpClass(cls):
        """SetUp tests."""

-> D403: First word of the first line should be properly capitalized ('Setup', not 'SetUp')

Again I repeat, I'm not opposing this recommendation; switching to 'Setup' is certainly doable, but I can see people arguing that it has slightly more meaning when the capitalisation is 'SetUp'.

@sigmavirus24 , fwiw, in your own github3.py, there is a very similar case. In fact, it is the most common error that this change will cause, with the following occurring eight (8) times:

D403: First word of the first line should be properly capitalized ('Get', not 'GET')

Again I would call this a corner case. The all-caps GET gives the reader more meaning. I expect you also would be happier with better docstrings for those. Pywikibot has lots of docstrings starting with DEPRECATED: .... because we want to drill home that the method is deprecated in every way possible, and prefixing the 'one-liner' was the most blunt way to do this in the docstring (deprecation is also reported many other ways). Those are also D403 errors now, and we need to re-evaluate abusing the one-line in this way (or use flake8-putty to whitelist these lines).

The question to consider is whether those docstrings are sufficiently legal that they shouldn't be an error. As with any new rules, there will be people complaining, and they will complain more fervently after a release so it is better to assess corner cases before the release to ensure the released solution is optimal and defensible.

Is there a list of large projects which use pep257 rather strictly, that we can use to assess the impact of new rules?

@sigmavirus24
Copy link
Member

Eh, github3.py already fails with pep257. I don't disagree that there will be corner cases, I just don't know if they'll be worth the hassle of adding work-arounds for them beyond allowing people to silence D403.

Another large project using pep257 (by way of flake8-docstrings) is keystone and other OpenStack projects are beginning to adopt that flake8 plugin as well.

@jayvdb
Copy link
Member Author

jayvdb commented Dec 14, 2015

keystone has two normal errors, and one interesting one

./keystone/common/ldap/core.py:570 in private function `_common_ldap_initialization`:
        D403: First word of the first line should be properly capitalized ('Ldap', not 'LDAP')

   570: def _common_ldap_initialization(url, use_tls=False, tls_cacertfile=None,
   571:                                 tls_cacertdir=None, tls_req_cert=None,
   572:                                 debug_level=None):
   573:     """LDAP initialization for PythonLDAPHandler and PooledLDAPHandler."""
   574:     LOG.debug("LDAP init: url=%s", url)
   575:     LOG.debug('LDAP init: use_tls=%s tls_cacertfile=%s tls_cacertdir=%s '
   576:               'tls_req_cert=%s tls_avail=%s',
        ...

That 'should' be something like "Do LDAP initialisation..." , but that change wouldnt aid understanding at all; it would be a change purely to bypass the pep257 rule.


git-review has one error, and it is quite interesting as "SSH" is IMO a very appropriate action word to begin the docstring.

./git_review/tests/__init__.py:117 in private method `_run_gerrit_cli`:
        D403: First word of the first line should be properly capitalized ('Ssh', not 'SSH')

   117:     def _run_gerrit_cli(self, command, *args):
   118:         """SSH to gerrit Gerrit server and run command there."""
   119:         return utils.run_cmd('ssh', '-p', str(self.gerrit_port),
   120:                              'test_user@' + self.gerrit_host, 'gerrit',
   121:                              command, *args)

It could be changed to something like: "Run command on the Gerrit server over a new SSH connection."

@sigmavirus24
Copy link
Member

"SSH" is IMO a very appropriate action word to begin the docstring.

So I will both agree and disagree. I agree because as a large and broad technical community (of which I've only experienced the English language portion) we absolutely do use SSH as a verb. This has everything to do with ssh being the command we use. I disagree because we should strive to help those who speak English as a second language and using SSH as a verb is probably not very conducive to being friendly to those.

So I agree, to a tool that only cares about English grammar (in spite of the fact that the English language has no such verb), SSH is a verb. How many other similar cases can we come up with to make this generalizable enough to hack around in pep257.

@Nurdok
Copy link
Member

Nurdok commented Dec 14, 2015

How about making D403 only applicable when the first "word" (i.e., everything up to the first whitespace) contains only letters and is not all-caps? I think that's reasonable as 99% of the problematic cases (and I've just been a part of a major cleanup of a large project to accommodate pep257) for capitalization are cases where the first word is all lowercase.

@Nurdok
Copy link
Member

Nurdok commented Dec 14, 2015

Also, regarding the interpretation of PEP 257, I got an answer from Guido by mail:

Well, of course that sentence should be capitalized.

@jayvdb
Copy link
Member Author

jayvdb commented Dec 15, 2015

Just some more analysis on a range of projects I have local checkouts of, not necessarily all sync'd up.

I notice that django and python-future (not users of pep257...) is also abusing the docstring, prefixing it with RemovedInDjango20Warning: and DEPRECATED: respectively.

I'm seeing quite a few cases, especially in docutils and epydoc, of the one-liner being :Parameters: or :Return:, etc., which means they have omitted the one-liner.
sage uses TEST::, TESTS::, INPUT:, EXAMPLES::

gcloud uses a API call: prefix extensively, such as API call: delete the project via a``DELETE``request.

Only one error in coveragepy:

./coverage/html.py:420 in public function `escape`:
        D403: First word of the first line should be properly capitalized ('Html-escape', not 'HTML-escape')

   420: def escape(t):
   421:     """HTML-escape the text in `t`."""
   422:     return (
   423:         t
   424:         # Convert HTML special chars into HTML entities.
   425:         .replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
   426:         .replace("'", "&#39;").replace('"', "&quot;")
        ...

One error in cyordereddict

./python2/cyordereddict/test/_test_support.py:255 in public function `forget`:
        D403: First word of the first line should be properly capitalized ('"forget"', not '"Forget"')

   255: def forget(modname):
   256:     '''"Forget" a module was ever imported by removing it from sys.modules and
   257:     deleting any .pyc and .pyo files.'''
   258:     unload(modname)
   259:     for dirname in sys.path:
   260:         unlink(os.path.join(dirname, modname + os.extsep + 'pyc'))
   261:         # Deleting the .pyo file cannot be within the 'try' for the .pyc since
        ...

oauth2client has four errors:

./tests/http_mock.py:25 in public method `__init__`:
        D403: First word of the first line should be properly capitalized ('Httpmock', not 'HttpMock')

    25:     def __init__(self, headers=None):
    26:         """HttpMock constructor.
    27: 
    28:         Args:
    29:             headers: dict, header to return with response
    30:         """
    31:         if headers is None:
        ...


./tests/http_mock.py:76 in public method `__init__`:
        D403: First word of the first line should be properly capitalized ('Httpmocksequence', not 'HttpMockSequence')

    76:     def __init__(self, iterable):
    77:         """HttpMockSequence constructor.
    78: 
    79:         Args:
    80:             iterable: iterable, a sequence of pairs of (headers, body)
    81:         """
    82:         self._iterable = iterable
        ...


./oauth2client/appengine.py:868 in public method `callback_handler`:
        D403: First word of the first line should be properly capitalized ('Requesthandler', not 'RequestHandler')

   868:     def callback_handler(self):
   869:         """RequestHandler for the OAuth 2.0 redirect callback.
   870: 
   871:         Usage::
   872: 
   873:             app = webapp.WSGIApplication([
   874:                 ('/index', MyIndexHandler),
        ...


./oauth2client/appengine.py:919 in public method `callback_application`:
        D403: First word of the first line should be properly capitalized ('Wsgi', not 'WSGI')

   919:     def callback_application(self):
   920:         """WSGI application for handling the OAuth 2.0 redirect callback.
   921: 
   922:         If you need finer grained control use `callback_handler` which returns
   923:         just the webapp.RequestHandler.
   924: 
   925:         Returns:
        ...

oauth-dropins mentions the project name as the first word.

./oauth_dropins/handlers.py:104 in public method `redirect_url`:
        D403: First word of the first line should be properly capitalized ('Oauth-dropin', not 'oauth-dropin')

   104:   def redirect_url(self, state=None):
   105:     """oauth-dropin subclasses must implement this.
   106:     """
   107:     raise NotImplementedError()

python-mcollective has four errors:

./pymco/rpc.py:36 in public method `get_target`:
        D403: First word of the first line should be properly capitalized ('Mcollective', not 'MCollective')

    36:     def get_target(self):
    37:         """MCollective RPC call target.
    38: 
    39:         :return: middleware target for the request.
    40:         """
    41:         return self.connector.get_target(collective=self.collective,
    42:                                          agent=self.agent)
        ...


./pymco/rpc.py:44 in public method `get_reply_target`:
        D403: First word of the first line should be properly capitalized ('Mcollective', not 'MCollective')

    44:     def get_reply_target(self):
    45:         """MCollective RPC call reply target.
    46: 
    47:         This should build the subscription target required for listening replies
    48:         to this RPC call.
    49: 
    50:         :return: middleware target for the response.
        ...


./pymco/serializers/yaml.py:20 in public function `ruby_object_constructor`:
        D403: First word of the first line should be properly capitalized ('Yaml', not 'YAML')

    20: def ruby_object_constructor(loader, suffix, node):
    21:     """YAML constructor for Ruby objects.
    22: 
    23:     This constructor may be registered with '!ruby/object:' tag as multi
    24:     constructor supporting Ruby objects. This will handle give objects as maps,
    25:     so any non mapping based object may produce some issue.
    26:     """
        ...


./pymco/serializers/yaml.py:30 in public function `symbol_constructor`:
        D403: First word of the first line should be properly capitalized ('Yaml', not 'YAML')

    30: def symbol_constructor(loader, node):
    31:     """YAML constructor for Ruby symbols.
    32: 
    33:     This constructor may be registered with '!ruby/sym' tag in order to
    34:     support Ruby symbols serialization (you can use
    35:     :py:meth:`register_constructors` for that), so it just need return the
    36:     string scalar representation of the key (including the leading colon).
        ...

@Nurdok
Copy link
Member

Nurdok commented Dec 25, 2015

Please see #170.

@SamiHiltunen
Copy link

#170 improves the situation a lot but there are still some cases that are problematic. For example,
"""OpenID implementation.""" or """OAuth login function.""" still produce D403 errors. Both are typically written in this form and I believe should be preferred to any alternative form like 'Openid'. Minor issues for sure but it would still be great to some how be able to ignore specific errors coming from a specific line.

@jayvdb
Copy link
Member Author

jayvdb commented Feb 2, 2016

@SamiHiltunen, I suspect those should probably be rewritten with an action word as the first word. e.g. """OAuth login function.""" could be """Perform OAuth login.""". I'm not sure how """OpenID implementation.""" should be updated to start with an action word.

However if you want to exclude specific errors on specific lines, https://github.com/jayvdb/flake8-putty might be a solution.

Rather than ignoring this error for specific docstrings, it might be useful to have a configuration variable of whitelisted words that can be used at the beginning of a docstring.

One case that needs rethinking (possibly improvement to the PEP) is the docstring of properties. It is very common for the docstring of the getter to describe the value and not start with "Return ...".

fwiw, I am happy for this issue to be closed now that D403 has been released with #170, as more specific issues will be created for remaining corner cases.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants