-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
default isort && default black #3322
Conversation
Hi, thanks. I'm not opposed in principle, seems like a good idea (although TBH I'm surprised seeing This PR doesn't seem to match the coding style we find practical though. E.g. I see no end trailing comma at the end of lists. Or the > 80 char line length. Is it possible to fix this? We run automated |
The trend towards opinionated auto-formatting, and But, there's a bit of up-front cost in adjusting habits, and the big reformat diffs in the history that might make other changes hard to follow (like any big change-of-style transition). The biggest factor is how comfortable the existing devs are to the new practices. Like @piskvorky, the default-to-double-quotes is a bit different than I'd pick, & initially disorienting, but I understand the reasons for it & it's far from a deal-breaker. Line-lengths are actually one of the areas that, despite So as a matter of going-with-the-overall-open-source-Python-community-practices, I'd be mildly in favor of moving towards It'd be best to start right after a fairly major, & stable, release – so that there's more time for devs to adjust to the altered-style before any other fixes/release are urgent. While there's a good case to "get it all over with" in one leap, we could also consider "dipping our toes in" with an incremental, module-by-module switchover - perhaps starting with modules that have historically received fewer updates, to iron out any unexpected interactions with flake/CI/doc-comments/etc. Then, with comfort, set a target for a full (& applied-on-commit, enforced-by-CI) switchover. Some of the edge-case auto-reformatting might be better with a human touch, or review, that still reaches ...a comment that was just barely over the 88-char limit broke things onto 3 lines, with extra parens - whereas changing the line-ending comment to a line-before comment, sparing the parens, keeping the code on 1 line would strike me as a better style. Still, that's pretty minor (and most such cases might disappear if we alloed a longer line-length), so might not justify the extra effort of human review before going forward. |
That's a good point. The strict line length discourages inline comments. Especially if No big deal IMO. So if the But seeing how much time we all "have" lately (I had another kid), I'd probably be in favour of a single full PR switch = task done. I doubt any of us will have the capacity / attention span for incremental tinkering and checking and re-checking, even if that would be ideal. |
I understand all of your points and agree with you. I am glad that my PR has started an internal discussion. Yes, the main point is.
Also, I opened this PR and raised this discussion because:
I guess the latter could be something useful for, say, at least:
|
Let's see what @mpenkov thinks but I believe the discussion has reached its conclusion :-) @dariocurr If you can make Flake8 I'm not sure about, I think we have a few exceptions like "Yes this return value is unused but we still assign it and give it a name, because that makes the code easier to read / understand". But that's on a case-by-case basis, I don't know that we have any general "flake8 exceptions". |
This discussion about code formatting has come up multiple times before, like here. Personally, I dislike automatic code formatters like black and never use them, for two reasons. First, the cost-to-benefit ratio is not worth it. Sure, with black, your code looks consistent, but it consistently looks like crap (sure, you can tweak some parameters, but I still don't like the output). To me, the main benefit of black seems to end flame wars about code style, but gensim never suffered from that particular problem. Like @piskvorky, it's also difficult for me to see how cosmetic issues made the code so difficult to work with that it prompted this PR. If the code is that crap, then it's unlikely that formatting alone will fix it. So, I don't see the practical benefit of reformatting our entire code base, especially at this stage in the project's lifecycle. Second, I typically put effort into formatting my code to make it readable, and I don't appreciate some program stomping over that it reformatting it due to some arbitrary rules. That said, the vast majority of the gensim code wasn't written by me, so this is more of a personal point than an argument against applying black to gensim. In the end, it's @piskvorky's call. If you do go ahead with this, my recommendation would be to try it out on a reasonably-sized module like keyedvectors.py first, and then see if you like the results before applying it to the entire repo. Like @gojomo said, definitely do this after the next major release, because there'll be havoc and it'll take time to bring things back to equilibrium. On top of that, you probably want to merge existing PRs first, because it may be difficult to do once the develop branch has been reformatted. |
I don't know what to say. This is certainly your call. I raised this discussion because since @piskvorky black obviously supports custom line length, I should look into the issue of trailing commas instead. I could try to meet your needs, but I guess you should decide what the standards are first. By the way, what about import sorting? Anyway, thank you for your time and attention |
Sure, that makes sense. I think we're currently splitting the imports into three sections:
and sorting each section, but this isn't enforced anywhere by linters or otherwise. |
Hi, I am Dario, a developer at BuildNN.
I tried to give a look at the code to solve #3321 but i found the code hard to read. I just gave a default run of isort and black to ease contributions
I would like to fix the plenty of flake8 inconsistencies as well but I don't know if it's appreciated
I guess it's easier and safer to run them by yourself instead of checking thousands of lines, but if you would like to solve all flake8's inconsistencies just say it