Incorporate changes from Python Improved? #51

Closed
MattDMo opened this issue Jul 3, 2015 · 26 comments

Comments

@MattDMo
Contributor

MattDMo commented Jul 3, 2015

For those who aren't aware, I've been maintaining Python Improved for the past few years (with a lot of help and discussion from @FichteFoll and @facelessuser, among many others), and it's become pretty popular on Package Control. It started out as a side project to fix a couple of glaring issues with the syntax that ships with Sublime, but as I became more familiar with it and began rearranging and improving/consolidating/expanding things I found more and more issues to fix and features to add, and now it has definitely taken on a life of its own. Its development has been in parallel with my Neon Color Scheme, which I started soon after I discovered Sublime in the first place and dove deep into syntax highlighting and customization, as well as the Python language itself. From the start I've always entertained the thought that Sublime's packages would be open-sourced and I could merge my improvements back in to help everyone, not just those that have found my package.

At any rate, I was wondering if there was any interest in either merging Python Improved with the default Python syntax hosted here, or really overhauling the current Python definition, taking work from PI as well as some other alternate Python syntaxes, @petervaro's Python 3 being first in my mind. (On a side note, Py3 is licensed under the GPL, while PI is under the MIT License, so that may be an issue.) The designs are different - I've tried to keep PI in as similar a vein as the original as far as naming scopes goes, whilst Py3 uses an entirely different naming scheme. PI has been designed as a standalone file, written as a .YAML-tmLanguage file and then converted to an old-fashioned .tmLanguage PLIST using AAAPackageDev, but Peter has built Py3 and his Cython syntax using Python itself - the various scopes and regexes are defined in dicts, then he puts the whole thing together with a build script. And, of course, PI works equally well with both versions 2 and 3 of the language, whilst Py3 is 3-specific, if you couldn't tell. 😄 Different strokes for different folks.

I'm preparing for a new release of PI that, among other things, should support Unicode identifiers (source is in the unicode branch). I haven't exhaustively tested it yet, but I'm writing a script to choose random characters within and outside the ranges specified in PEP 3131 (taken from this resource), put them into varying-length "words", then write them to a file for testing in Sublime. It seems to work well enough with the characters I've picked so far from Windows' CharMap. I think this will be a real ground-breaking improvement that I'd love the default Python syntax to have as well. There are a bunch of other improvements I'd like to contribute as well, including support for function annotations, binary literals, expanded matching of all *Exception and *Error identifiers, lots of changes to function and class definitions, rearrangement of keywords, consistent highlighting of built-in functions and types, proper highlighting of escape sequences in strings, removal of SQL keyword matching (it's incomplete, random, distracting, and buggy), and more. There may be some stuff in PI that possibly doesn't need to be in the main Python def, like support for highlighting the IPython In and Out fields in SublimeREPL, and maybe the expanded comment highlighting for keywords like BUG, FIXME, XXX, and TODO, but that can be debated elsewhere. Like I mentioned, PI is currently written in AAAPackageDev's version of YAML, but I've been meaning to start playing around with the .sublime-syntax format, so here's my chance! Somebody just needs to write a syntax definition for it so I can highlight it properly...
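For anyone who wants to build a similar test file, here is a minimal sketch of the kind of generator described above. It is not the script mentioned in this comment: the function names (`random_words`, `split_by_validity`, `render_test_file`) are hypothetical, and instead of hand-transcribed PEP 3131 ranges it swaps in Python's own `str.isidentifier()` as the judge of what counts as an identifier, so the result depends on the Unicode tables of the interpreter that runs it.

```python
import random


def random_words(count, seed=0, max_len=8):
    """Return `count` random words built from printable BMP characters."""
    rng = random.Random(seed)
    words = []
    while len(words) < count:
        length = rng.randint(1, max_len)
        chars = []
        while len(chars) < length:
            ch = chr(rng.randrange(0x21, 0xFFFF))
            # isprintable() rejects surrogates, control characters,
            # separators and unassigned code points.
            if ch.isprintable():
                chars.append(ch)
        words.append("".join(chars))
    return words


def split_by_validity(words):
    """Split words into (valid identifiers, everything else)."""
    valid = [w for w in words if w.isidentifier()]
    invalid = [w for w in words if not w.isidentifier()]
    return valid, invalid


def render_test_file(valid, invalid):
    """Build the text of a file to open in the editor and inspect by eye."""
    lines = ["# Should be highlighted as identifiers:"]
    lines += valid
    lines.append("# Should NOT be highlighted as identifiers:")
    lines += invalid
    return "\n".join(lines) + "\n"
```

A possible use is `valid, invalid = split_by_validity(random_words(200))`, then writing `render_test_file(valid, invalid)` to a UTF-8 encoded `.py` file and checking the highlighting of both groups.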

So here's what I'm looking for: first and foremost, feedback on whether or not you think this is a good idea, and if you do, ideas on implementing it. Do we build a more-or-less "finished product" behind the scenes (as a branch of this repo, a new repo under @sublimehq, a new repo under someone else (I'll volunteer), a branch of PI, or something else altogether) and then submit it all as one gigantic PR, or piecemeal submit PRs to this repo and have Jon (or anybody else with commit privileges here - who exactly is that, anyway?) apply them live?

Personally, I think the best option would be to have a Python-specific branch of this repo or a new repo altogether, so the entire history can be tracked. I'm completely willing to host one, but I may not be the most objective person to decide on the quality and necessity of the PRs (although I'm a scientist IRL, so I try to be as objective as possible). More importantly, since this is a hobby and I'm not a full-time programmer, there may be times when work/family/life interferes and I'm not able to contribute for a bit. I'd certainly support having it here if the committers are willing to stay ... umm ... committed and not let things sit around for days and weeks - IMO FichteFoll has done an excellent job with the ST Issues trackers, so if s/he is interested I'd definitely nominate him/her.

So, thoughts? Suggestions? Issues?

@FichteFoll
Collaborator

This is quite a lengthy post (I haven't read all of it yet), but I'll let you in on my thoughts on a new Python syntax definition.

Basically, I intend to re-write a .sublime-syntax file from scratch, with some inspiration from PythonImproved, most notably what I did in MattDMo/PythonImproved#17. I just haven't had the time to do that lately because life has been rather busy. (Don't let my frequent postings on GitHub or the forum deceive you - I do that from mobile while commuting mostly.)

Anyway, here is my roadmap for this "project":

  1. Write a nice and good YAML syntax definition, based on the YAML standard, and try to be as accurate as possible.
  2. Write a syntax definition that extends on that YAML def by including/pushing it and using with_prototype. I will have to take a deeper look at this because I only roughly know how I would tackle this.
  3. With that outta the way, start a new syntax def for Python.

By the way, the Python definition should be kept at an absolute minimum imo and only highlight based on Python's syntax, not based on comment annotations like "TODO" or other frameworks like Django or the IPython REPL. These should be implemented separately and then include the Python definition.

@jrappen
Contributor

jrappen commented Jul 4, 2015

👍 on implemented separately and then included

@1st1

1st1 commented Aug 11, 2015

I think this is a great idea. @sublimehq please consider this, current Python 3 support in ST is in a sorry state.

@NotSqrt

NotSqrt commented Sep 14, 2015

There are also some new things in Python 3.5 syntax (async def, async for, await, ...)
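As a reference for what a highlighter has to cope with, here is a small sketch that exercises the three constructs named above. The class and function names (`Countdown`, `collect`) are made up for illustration, and `asyncio.run` needs Python 3.7 or later even though the syntax itself dates from 3.5.

```python
import asyncio


class Countdown:
    """Asynchronous iterator that yields n-1 down to 0."""

    def __init__(self, n):
        self.n = n

    def __aiter__(self):
        return self

    async def __anext__(self):      # `async def` on a method
        if self.n <= 0:
            raise StopAsyncIteration
        self.n -= 1
        await asyncio.sleep(0)      # `await` expression
        return self.n


async def collect(n):               # `async def` on a function
    items = []
    async for value in Countdown(n):  # `async for` loop
        items.append(value)
    return items


if __name__ == "__main__":
    print(asyncio.run(collect(3)))
```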

@MattDMo
Contributor Author

MattDMo commented Sep 14, 2015

@NotSqrt A new version of Python Improved will be released in the next few days that, among other things, will include support for the new Python 3.5 syntax.

@mandx

mandx commented Oct 17, 2015

There's also MagicPython.

@wbond
Member

wbond commented Mar 2, 2016

Not to discount all of the work and effort that you've put into PythonImproved, however I think after our experience merging in a revamped JavaScript syntax (part of build 3103), I'd be more inclined to iteratively fix/tweak the existing Python definition.

With JavaScript there were some fundamental scope changes that ended up drastically changing how color schemes applied. In general, most users did not appreciate the changes, even though there were many fixes along the way. Quite a number of users downgraded to a very old version of Sublime Text just to get back to the highlighting they were familiar with.

It obviously isn't going to be possible, or desirable, to maintain backwards compatibility 100% of the time. However, I'm not currently comfortable with a "big rewrite."

Obviously small, focused bug reports tend to be the easiest to quickly process. It makes it possible to add syntax tests and fixes. Small PRs that fix a single issue with relevant tests are also pretty easy to accept. A large changeset with many changes is the hardest since it can be easy to miss the effects of a small tweak.

Over the past few weeks we've made huge strides with the JavaScript syntax, and I think one of the biggest reasons was to have a few users who took it upon themselves to review changes and tweaks I was making and point out edge case bugs they were seeing. Granted, I imagine the JavaScript changes required were probably larger scope than Python due to all of the new ES6/ES2015 syntax changes.

So, I'd love to improve Python. However, I'd like to try by fielding issues that identify places it is broken right now. I'd greatly appreciate bug reports or small PRs that fix issues. With these changes we can build up the test suite for Python and hopefully make it easier to improve in the future. Additionally, it would be super helpful for some heavy Python users to comment on tweaks that are being made.

@MattDMo
Contributor Author

MattDMo commented Mar 2, 2016

Like I said in my other comment in 221, I wholeheartedly agree. I don't think the scope changes in PI are anywhere near as drastic as what happened in JS, as I/we tried to build onto existing scopes as much as possible or rewrite current scopes, but there may be some changes. We'll just take them as we go...

@1st1

1st1 commented Mar 2, 2016

@wbond FWIW you should also take a look at https://github.com/MagicStack/MagicPython.

It's now used by GitHub to highlight all Python files, it's available for ST, Atom, and VSCode, and it's modelled closely after the default Python syntax in ST3. On top of that, we have hundreds of unit tests, to make sure everything is in order.

@MattDMo
Contributor Author

MattDMo commented Mar 2, 2016

@1st1 I've been looking at MagicPython for some enhancement ideas for PI, but haven't implemented anything yet. If you want, perhaps we could coordinate on which code we'd like to push into this repo, making it the Ultimate Python ❗ 💥

@1st1

1st1 commented Mar 2, 2016

@MattDMo Sure, we're open for collaboration ;)

@facelessuser

I've been using MagicPython recently, and I generally like the approach that has been taken there.

@MattDMo
Contributor Author

MattDMo commented Mar 2, 2016

In what ways? The modularity?

@facelessuser

String handling for one. Python Improved has always struggled with regex inside strings. And what I mean is terminating regex syntax.

For example, below you will see the raw r in front of every string is highlighted properly, but Python Improved can't seem to terminate regex highlighting at the end of the first string. It won't stop until it hits the closing round bracket in the last string:

            (r'(?<![.$])(for(\s+(parallel|series))?|in|of|while|until|'
             r'break|return|continue|'
             r'when|if|unless|else|otherwise|except\s+when|'
             r'throw|raise|fail\s+with|try|catch|finally|new|delete|'
             r'typeof|instanceof|super|run\s+in\s+parallel|'
             r'inherits\s+from)\b', Keyword),

MagicPython doesn't have this problem, which is really nice. I have had syntax highlighting corrupted with PythonImproved due to its weakness in this area. I feel like MagicPython wrote this from scratch differently than the standard Python that Python Improved was originally derived from (this is my impression, not necessarily fact).
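To make the snippet above concrete: Python joins adjacent string literals at compile time, so the pieces form one regex at runtime even though each literal ends at its own closing quote, and that closing quote is where a highlighter has to stop applying regex scopes. The sketch below uses a shortened, hypothetical version of the keyword list (the name `PATTERN` is not from the original code) to show the concatenation and the effect of the `(?<![.$])` lookbehind.

```python
import re

# Two adjacent raw string literals; Python joins them into a single pattern.
PATTERN = (r'(?<![.$])(for|in|of|while|until|'
           r'break|return|continue)\b')

# The same pattern written as one literal.
SINGLE = r'(?<![.$])(for|in|of|while|until|break|return|continue)\b'

if __name__ == "__main__":
    print(PATTERN == SINGLE)
    # A keyword at the start of the text matches ...
    print(re.search(PATTERN, "return x"))
    # ... but the negative lookbehind rejects one preceded by "." or "$".
    print(re.search(PATTERN, "obj.return"))
```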

Another great thing is how it can detect docstrings. It always annoyed me a little how raw docstrings would be highlighted as regex. MagicPython detects that the string is not assigned anywhere and avoids treating it as regex. Those are just a couple of examples.

I feel like MagicPython, instead of patching a python that was okay, went about implementing a highlighter from the ground up to avoid the pitfalls of the default one.

In general I just feel like it was thoughtfully put together. I don't think it handles Unicode function names and things like that, but I don't use Unicode in my function names.

@wbond
Member

wbond commented Mar 2, 2016

@facelessuser I took a very brief look at MagicPython. The tests look very comprehensive. It also has a good overall structure in terms of the statements/expressions split.

For syntaxes in the default packages we are now using .sublime-syntax, which is a superset of .tmLanguage functionality. We'd have to fork MagicPython since we'd be using constructs that I don't believe have a corollary in the TM/Atom world. Additionally, a bunch of the context sensitive functionality is implemented via lookbehinds (there are 3x as many in MagicPython as the default Python). We will be removing those to improve performance since they are not compatible with the new regex engine.
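As a rough illustration of the difference between lookbehind-based and stateful matching, here is a toy comparison in plain Python. It is not `.sublime-syntax` code and says nothing about how the Sublime Text engine works internally; the function names and the `"expect-name"` context label are invented for the example. Both functions find the names that follow `def`: the first lets a regex look backwards from the match, the second remembers what it has already seen by pushing a context onto a stack.

```python
import re

# Lookbehind style: the pattern inspects the text before the match.
LOOKBEHIND = re.compile(r'(?<=\bdef )[A-Za-z_]\w*')


def names_via_lookbehind(source):
    return LOOKBEHIND.findall(source)


def names_via_state(source):
    """Stateful style: seeing `def` pushes a context that expects a name."""
    names = []
    stack = ["main"]
    for token in re.findall(r'[A-Za-z_]\w*|\S', source):
        if stack[-1] == "expect-name":
            stack.pop()  # the context ends after one token either way
            if re.fullmatch(r'[A-Za-z_]\w*', token):
                names.append(token)
        elif token == "def":
            stack.append("expect-name")
    return names


SAMPLE = "def foo(x):\n    return undefined\ndef bar(): pass\n"
```

On `SAMPLE` both approaches report `foo` and `bar`; the stateful version never needs to re-read earlier text to decide what the current token means.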

I'm most comfortable moving forward by stripping the default Python back and building up a more stateful syntax (like we've done with JavaScript) and adding features from there. Starting with something like MagicPython or Python Improved means there is more to strip back before we start moving forward again.

If the MagicStack team would be willing to contribute the MagicPython test files towards the default syntax, I think we'd be lucky to have them. We would need to convert them to the Sublime Text syntax test format.

@facelessuser

@wbond I wasn't suggesting that Sublime use MagicPython as default. I was just giving it a shout out as I have been fairly pleased with it and it came up in the discussion.

@wbond
Member

wbond commented Mar 2, 2016

@facelessuser Sorry if I misread your comment. I thought you were suggesting such when you said:

> I feel like MagicPython, instead of patching a python that was okay, went about implementing a highlighter from the ground up to avoid the pitfalls of the default one.

I also wanted to say that I definitely appreciate everyone's feedback and thoughts on this matter. I'm sure many of you have more experience than I do in writing these syntaxes.

@vpetrovykh

@facelessuser Basically as long as the underlying libraries are built with proper Unicode support, MagicPython has no problem with Unicode.

E.g. this is how Unicode is highlighted when the regexp libraries have been compiled correctly.
[Screenshot: screenshot_20160302_162014]

Here's the full relevant test file from the repository. What we've seen is that depending on the system this is running on there may be problems with correctly handling Unicode in regular expressions, probably due to how character classes such as [:alpha:] and [:word:] are actually handled.

@wbond You are certainly welcome to use the test files from MagicPython.

@facelessuser

Nah. I was just saying that PythonImproved is based off the default, and I think the reason why MagicPython works better for me is because I don't think they tried to base it off the default.

I was more trying to illustrate why I think it works better. FWIW, I am happy using MagicPython currently and the default Python decision doesn't affect me. I just wanted to point out a good package and why I liked it.

I know people are looking for better syntax highlighters for Python and it is good to know of good alternatives. Sure, it would be nice to have the default as good or better than alternative options, but until that time, it is good to know there are other options.

@facelessuser

> @facelessuser Basically as long as the underlying libraries are built with proper Unicode support, MagicPython has no problem with Unicode.

Cool, I am obviously not an expert on what MagicPython can do or what has recently been added. It may be that it wasn't in when I first started using it. Or maybe I was mistaken entirely.

@wbond
Member

wbond commented Mar 2, 2016

Our intention is to continue to invest in the syntaxes in the default packages to improve editor/indexer performance and accuracy. Ideally we want users to be able to open Sublime Text and start using most mainstream languages without running into any significant errors or having to install third-party packages.

@facelessuser

> Our intention is to continue to invest in the syntaxes in the default packages to improve editor/indexer performance and accuracy. Ideally we want users to be able to open Sublime Text and start using most mainstream languages without running into any significant errors or having to install third-party packages.

I completely agree. From a SublimeHQ and new user perspective that is definitely the way Sublime needs to go. I am obviously not a new user and am not going anywhere anytime soon :).

@vpetrovykh

> For syntaxes in the default packages we are now using .sublime-syntax, which is a superset of .tmLanguage functionality. We'd have to fork MagicPython since we'd be using constructs that I don't believe have a corollary in the TM/Atom world. Additionally, a bunch of the context sensitive functionality is implemented via lookbehinds (there are 3x as many in MagicPython as the default Python). We will be removing those to improve performance since they are not compatible with the new regex engine.

@wbond Speaking as a Sublime Text user and a co-author of MagicPython I would dearly love it if this plug-in didn't just stop working for me. Is .tmLanguage format with lookbehinds still going to be supported at all?

@wbond
Member

wbond commented Mar 2, 2016

@vpetrovykh I apologize for any confusion. We are not removing support for .tmLanguage files or the oniguruma regex engine. We are just trying to further improve the syntaxes we ship by default.

Considering there are over 300 custom syntaxes on Package Control, it would be very unwise to remove support for the format almost all of them use. 😄

@FichteFoll
Collaborator

Especially considering that tmLanguage files can be converted to sublime-syntax losslessly.

@wbond
Member

wbond commented Apr 28, 2016

I've got a refactoring of Python that will be landing soon so we can address new syntax from Python 3.x.

Please open specific issues if you know of anything that needs to be fixed or improved.

@wbond closed this as completed Apr 28, 2016