Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port cylc to Python 3 #1874

Closed
6 tasks done
hjoliver opened this issue May 31, 2016 · 40 comments
Closed
6 tasks done

Port cylc to Python 3 #1874

hjoliver opened this issue May 31, 2016 · 40 comments
Assignees
Milestone

Comments

@hjoliver
Copy link
Member

hjoliver commented May 31, 2016

This will eventually be required, for future-proofing cylc. Probably quite straightforward, but depends on #1872 because the current Pyro-3 communications layer is not Python 3 compatible. [that's done]

Edited by @matthewrmshin:

There is now a clear road map for this. See https://github.com/cylc/admin/blob/master/cylc-8-roadmap.md for example.

Issues that will hopefully be fixed as part of our Python 3 porting effort:

@hjoliver
Copy link
Member Author

Once we get on to this can we go drop Python-2 support, or will we need to support both for a while?

@hjoliver hjoliver added this to the later milestone May 31, 2016
@matthewrmshin
Copy link
Contributor

At the moment, I think most distros include both python2 and python3 commands these days. When we port cylc to Python 3, we can modify our #! lines to run python3, then we should no longer have to support both.

@benfitzpatrick
Copy link
Contributor

Most new distros...

@matthewrmshin
Copy link
Contributor

new should no longer be that new by the time we tackle this issue.

@hjoliver
Copy link
Member Author

hjoliver commented Jun 21, 2016

[meeting]

  • can't do it yet - major sites (HPC!) don't have proper access to Python 3 yet

@hjoliver hjoliver modified the milestones: some-day, later Jun 23, 2016
@duncanwp
Copy link

I've just started trying to use Cylc and while I appreciate that not everywhere is ready to move to Python 3 completely, not supporting it at all is a bit of a pain. Would it be much work to make it version agnostic, using six where necessary?

It would also make the transition smoother for those still on 2...

@hjoliver
Copy link
Member Author

@duncanwp - six looks useful. We're very open to contributions if you want to take a crack at this yourself! Otherwise we'll definitely get to it in due course, but it has not become a top priority yet for the reasons above.

@matthewrmshin
Copy link
Contributor

Hi @duncanwp Can you explain why it is a pain, please? What platform are you using cylc on?

We do want to port cylc to Python 3 eventually, but it is not a priority at the moment, as our main user bases don't have proper access to Python 3 yet.

@duncanwp
Copy link

duncanwp commented Apr 26, 2017

@matthewrmshin I'm looking to use it on a Centos 6 cluster (which is Python 2 based) but I run my own conda (Python 3) environment to install a number of packages which aren't pre-installed. I can work around it, I've just started getting used to the Python 3 niceness ;-)

@hjoliver I'll have a look at what's involved and maybe put together a pull-request.

@duncanwp
Copy link

@matthewrmshin Which actually reminds me of another question: Are you likely to create a conda package for Cylc, possibly through conda-forge?

@matthewrmshin
Copy link
Contributor

Hi @duncanwp As a quick workaround, you may try:

  1. Only use your conda environment in applications that require the special Python 3 environment. This will ensure that any other applications implemented in Python 2 (like cylc) do not break.

  2. Alternatively, you can try modifying all #!/usr/bin/env python lines in your local installation of Cylc to use #!/usr/bin/python to ensure that it is not breakable by the version of Python in your current environment.

Are you likely to create a conda package for Cylc, possibly through conda-forge?

Yes, but not a priority at the moment.

@tleeuwenburg
Copy link

Hi all, I'm more than happy to work on this a bit. Anyone who is having trouble getting Python 3 into a legacy environment should consider using the Anaconda distribution to get in on there -- that's what we're currently planning to do. I have found not all Linux environments are necessarily packaging PyGTK for Python 2 any more, so there is an increasing need to at least support Python 3.

@tleeuwenburg
Copy link

The main stumbling block I see for packaging Cylc itself with conda is the PyGTK dependency. It seems to have a very strange status, in that it doesn't actually install properly through pip on Linux at all, despite GTK being a Linux library! It might be more tractable to package a "no gui" Cylc through conda initially, and then circle back on the PyGTK support.

@hjoliver
Copy link
Member Author

hjoliver commented May 2, 2017

@tleeuwenburg - that'd be good, bearing in mind the comments above on the need to support both Python-2 and -3 for a while from the same evolving Cylc code base - might need to use six as suggested above

@matthewrmshin
Copy link
Contributor

#411 marked as superseded by this.

@matthewrmshin matthewrmshin modified the milestones: some-day, later Apr 19, 2018
@matthewrmshin
Copy link
Contributor

I am moving the milestone to later for this issue - as it will become one of our main priorities in the not-so-distant future. However, it is worth noting that we are still unlikely to look at this until mid-2019 earliest when we are done with #1873.

@kinow
Copy link
Member

kinow commented Jun 12, 2018

Had some spare time today, and was a bit tired of Java, so started syncing my local cylc repository. 🎉 Lots of interesting things!

All the projects I've migrated from 2 to 3 were smaller, and did no required compatibility with Python 2. So never used 2to3, six, etc.

Here's an initial version that works at least for the my.suite example. Tests fails, but also fail in my environment building from master. Perhaps somebody would have some spare time to review & maybe try it with a better test scenario?

master...kinow:p2to3-take1

Here's the contents of a notepad I used for notes:

# quickly fix some files
2to3 -w -n .

# commit #1

# load anaconda python 2 with cylc modules installed
source ~/Development/python/sa2.sh
# test Python 2
find . -name "*.py" -exec python2 -m py_compile {} \;
# issue with print("", file=sys.stderr) (file=...)
# @see http://python-future.org/compatible_idioms.html
# manually patch the files on sublime
grep -r -H -e "print(.*file=" . | awk '{print $1}' | sort | uniq | xargs -I{} subl {}

# commit #2

# test Python 2
find . -name "*.py" -exec python2 -m py_compile {} \;

# issue with class PointBase(object, metaclass=ABCMeta): (metaclass=...)
# s/object, metaclass=ABCMeta):/six.with_metaclass(ABCMeta, object)):/g
# @see http://portingguide.readthedocs.io/en/latest/classes.html
subl ./lib/cylc/cycling/__init__.py

# commit #3

# test Python 2
find . -name "*.py" -exec python2 -m py_compile {} \;

# issue with urllib ... cherrypy's _cpcompat.py has a good try/catch...
# @see http://python3porting.com/stdlib.html
grep -r -H "urllib" . | grep "\.py" | awk '{print $1}' | grep -v "Binary" | grep -v "cherrypy" | grep -v "INSTALL"  | sort | uniq | xargs -I{} subl {}

# FIXME bug in _compat.py?

Was

try:
    from urllib.parse import quote_from_bytes as url_quote
except ImportError:
    from urllib.parse import quote as url_quote

Is now

try:
    from urllib.parse import quote_from_bytes as url_quote
except ImportError:
    from urllib import quote as url_quote
# issue with izip
grep -r -H "izip" | grep "\.py" | grep -v Binary

# issue with imap
grep -r -H "imap" | grep "\.py" | grep -v Binary

# Undo changes done by 2to3 in libraries
git checkout origin lib/jinja2
git checkout origin lib/cherrypy

# implement missing __nonzero__ abstract method?

# -- OK! --

# load anaconda python 3 with cylc modules installed
source ~/Development/python/sa3.sh

# test Python 3
find . -name "*.py" -exec python3 -m py_compile {} \;

# -- OK! --

# unit tests et static analysis?
pip install pycodestyle pyopenssl
export PATH="$PWD/bin:$PATH"
export CYLC_TEST_DIFF_CMD='diff -I Xlib -u' # I have an X server, but harmless?
export CYLC_TEST_RUN_PLATFORM=false
export CYLC_TEST_DEBUG=true
export PATH=$PATH:./bin
cylc test-battery --state=save -j 5 || (echo -e "\n\nRerunning Failed Tests...\n\n"; cylc test-battery --state=save,failed -j 5)

A few tests started failing, and I realized I had spent too much time on this, so decided to comment here :-)

@hjoliver
Copy link
Member Author

Hey @kinow - that seems like good news, maybe the port will not be so difficult! FYI we just discussed this yesterday (had not updated the issue yet) and decided we probably will not support Python 2 and 3 at the same time, just make a clean break when the time comes (those stuck with Python 2 - if any - will just have to use older cylc versions for a while). Note the blocker for actually doing this is the old Python 2 PyGTK GUI - first big priority is web GUI, then Python 3.

@kinow
Copy link
Member

kinow commented Jun 12, 2018

Hey @hjoliver

I looked at the existing issues, and many of them are related to the current GUI. I think moving to the web GUI would be a great milestone for cylc 🎉

And if we won't need to support both, then it might be even easier. For me the hardest part is making sure cylc is still running fine after the change.

@hjoliver
Copy link
Member Author

hjoliver commented Aug 8, 2018

@cylc/core - technically we could port the suite server program and command line clients to Python 3 any time, leaving just the old GUI to run under Python 2. I was thinking it made more sense to go all at once (with the new GUI in place) but if it turned out to be pretty easy (see @kinow's post above) there would be a "reputational" advantage to having the bulk of the code modernized.

@oliver-sanders
Copy link
Member

leaving just the old GUI to run under Python 2

It is possible but may be a bit tricky. Take #164 for instance where the GUI manipulates the suite configuration to hold visualisation settings.

As an aside we would probably have to package and distribute the old GUI separately. Here is a list of Python modules which the GUI imports from outside of the GUI and standard libraries:

cylc.cfgspec.gcylc
cylc.cfgspec.glbl_cfg
cylc.cfgspec.gscan
cylc.cylc_xdot
cylc.dump
cylc.flags
cylc.graphing
cylc.hostuserutil
cylc.mkdir_p
cylc.network.httpclient
cylc.network.port_scan
cylc.run_get_stdout
cylc.suite_logging
cylc.suite_srv_files_mgr
cylc.suite_status
cylc.task_id
cylc.task_job_logs
cylc.task_state
cylc.task_state_prop
cylc.version
cylc.wallclock
isodatetime.data
isodatetime.parsers
xdot

@kinow
Copy link
Member

kinow commented Sep 6, 2018

I also had the impression it would be easier to investigate first making it compatible with Python 2 and 3. Probably with the six module, and perhaps a compatibility layer (e.g. Pandas compat package).

But I hadn't thought about the GUI until @oliver-sanders ' comment. Normally I use Anaconda environments for Python, except for cylc gui, which I am still running on my Ubuntu Python, in order to be able to import the gtk module... will have a look over the next days to understand more about it.

I agree that doing both in parallel would be a distraction. However, i am not sure if implementing a new web layer with Flask/Tornado/etc in Python 2 would work without any hiccups. The documentation for both Flask and Tornado state that Python 2 is still supported, but that it's preferred to use the latest version of Python 3.

Quickly looking at the issues in Flask, it's possible to find a few minor issues regarding issues with Python 2, e.g.

Tornado's current release version is 5.x, but they have already announced 6.x will drop Python 2. Their current Travis CI configuration doesn't even seem to include Python 2 any longer.

I think we could run the risk of having to use an older version, or workaround limitations in the frameworks due to Python 2.

If we measure the test coverage, and include any necessary tests to reach a reasonable coverage, plus some manual tests, then maybe we could (assuming it's possible to run the GUI in Python 3, of course)

  • port whatever can be ported and is backward compatible (e.g. print "hello" to print("hello"))
  • create the compatibility layer if necessary, or try six
  • port the rest of the code to use the compatibility layer
  • run the automated and manual tests
  • leave users testing this version, and then start working on the new web layer, already with enough tests + Python 3

From a previous comment, sounds like there was already a plan with order of activities, and that in that plan Python 3 would come after the web layer. If that's a hard requirement, then all good for me. But otherwise, I would be happy to increase my dosage of caffeine and have a crack at it 🕺

Cheers
Bruno

@matthewrmshin
Copy link
Contributor

There is no hard requirement for the order of activities. The main issue has always been the lack of developers and that we do want to draw a close to some of the issues that we have promised to solve. And I'm still trying to see all current PRs merged before doing anything further.

I agree with the analysis that writing new web layer code in Python 2 sounds crazy - and this activity will have to start soon!

I'm just thinking that we may end up having to maintain a development branch for a prolonged period of time - with code that can be upgraded ported to Python 3 and web UI development built on top of this branch.

In which case, perhaps feel free to take a crack at this, but expect to have to maintain your branch for a period of time (like one of my PRs) before it gets the full attention by the rest of the team.

@hjoliver
Copy link
Member Author

hjoliver commented Sep 6, 2018

Presumably the new reverse proxy can be done in Python 3 from the outset, because it only interacts with the suites servers via the network API. But for the suite server side changes, yes we'll need to figure this out ...

@kinow
Copy link
Member

kinow commented Sep 6, 2018

In which case, perhaps feel free to take a crack at this, but expect to have to maintain your branch for a period of time (like one of my PRs) before it gets the full attention by the rest of the team.

That works for me @matthewrmshin . Happy to rebase & fix the code. If things go well, and it's not an epic task to make the code compatible with both versions, then I expect to have a few patterns of code that can be easily migrated/adjusted.

@takluyver
Copy link

When you do come to do the port, python-modernize and python-future are two tools that can automate making a lot of the easy changes for you (like print 1 -> print(1)). They're similar to 2to3, but they aim to keep the code working on Python 2 as well. I just scanned through the thread and haven't seen them mentioned, so I thought a link might be useful.

Of those two, modernize is more conservative in its changes, and adds calls to six where needed. Python-future's futurize tool aims to make a more idiomatic Python 3 codebase, and inserts its own compatibility layer to make things work on Python 2.

@kinow
Copy link
Member

kinow commented Oct 2, 2018

@takluyver thanks for the tips! I have a branch where I did the following (and seems to be going in a good direction):

For each package, first I am running locally and printing to the console, without changing the files. Then I analyzed the output to see if everything made sense... I think there was only one or two parts that I had to open another terminal and play with the code to see why something was done the way it was (so many list(...) calls).

My only concern right now is that there are a few parts of Cylc not covered by tests, and even though python-future seems to be doing a good job, there's still the risk that we may have issues later I think.

Perhaps modernize would be simpler. I jumped directly from six and 2to3 to futurize. Will have a look at modernize as well!

Thanks!

@takluyver
Copy link

Great, sounds like you're already on top of that. Modernize vs futurize is largely a matter of taste, I think: you'll still have to do similar checking with either.

The only suggestion I might have is to let futurize write its changes, and then examine them using git. Lots of GUI diff tools (like Meld) and IDEs can highlight changes from git for you. I'd find that easier than examining the patches in a terminal, but YMMV. 😃

@kinow
Copy link
Member

kinow commented Oct 2, 2018

The only suggestion I might have is to let futurize write its changes, and then examine them using git.

Here's my current workflow:

> cd $CYLC_HOME
> futurize lib/markupsafe/
> # here I compare the output, and if I find anything weird or that I don't understand, then I fire up python to play with some code. If everything looks good...
> futurize -n -w lib/markupsafe
> git diff # or gitk to have a more visual output

Then finish with git commit -am "futurize $module_name", and push to my branch.

@takluyver
Copy link

Gotcha. I was suggesting skipping the first call to futurize, and doing all the checking after it writes the files. Up to you, though!

@hjoliver
Copy link
Member Author

hjoliver commented Oct 3, 2018

@takluyver - thanks from me also, for the excellent tips. Note (not sure this is mentioned above) we've made the decision to go full-Python 3 on a long-lived development branch (largely because the old GUI can't be ported, but also some other reasons) for the new cylc web architecture. So we don't need to bother with six etc to support both 2 and 3. Hopefully that will simplify the process a bit.

@oliver-sanders
Copy link
Member

`The next question is what will be the new minimum Python version for Cylc, here is a fairly random smattering of some of the more significant features introduced since 3.0 which might be of interest to us:

3.1

  • collections.OrderedDict, collections.Counter
  • Run python from a zip file
  • str.format changes
  • unittest improvements, incl ability to skip tests

3.2

  • argparse (PEP 389)
  • concurrent.futures (PEP 3148)
  • .pyc files names extended with the python interpreter (PEP 3147)

3.3

  • virtualenv (PEP 405)
  • Changes to os and io exceptions (PEP 3151)
  • Generator improvements (PEP 380)
  • raise Exception() from None (PEP 409)
  • Unicode++ (PEPs 414, 393)
  • Method for reducing the memory taken by class instances - dict key sharing (PEP 412)

3.4

  • Python "isolated mode" - ignore PYTHONPATH etc (bpo 16499)
  • ascyncio (PEP 3156)
  • enum (PEP 435)
  • Single-dispatch generic functions (PEP 443)

3.5

  • coroutines with async and await syntax (PEP 492)
  • collections.OrderedDict is now implemented in C

3.6

  • dictionaries now use the collections.OrderedDict implementation (CPython)
  • asyncio++
  • formatted string literals (PEP 498)
  • syntax for variable annotations (PEP 526)
  • asynchronous generators and comprehensions (PEP 525, 530)
  • secrets module for generating cryptographically strong pseudo-random numbers

3.7

  • asyncio++
  • dictionaries are insertion ordered (all Python implementations)

A lot of the features of the most recent Python versions are of great interest to us, what should be our new minimum requirement. If we go down the conda packaging route requiring a more contemporary Python might not be such a big issue?

@matthewrmshin
Copy link
Contributor

matthewrmshin commented Oct 3, 2018

Yes, I think we can choose whatever we like if we go down the Conda packaging route - so the answer hopefully is going to be 3.7. (The only thing to look out for is whether there are any libraries we are going to use that may have any issues with cutting edge Python.)

@oliver-sanders
Copy link
Member

so the answer hopefully is going to be 3.7

+1

@takluyver
Copy link

For reference, various projects I work on currently have 3.3, 3.4 or 3.5 as the minimum supported version. Ubuntu 18.04 (the latest long term support release) includes Python 3.6.

Conda is popular, so preparing conda packages is a good idea. But expect pushback if you make that the only practical way to install your software: for a variety of reasons, people like to have other installation options.

@tleeuwenburg
Copy link

tleeuwenburg commented Oct 4, 2018 via email

@matthewrmshin matthewrmshin modified the milestones: later, soon Nov 14, 2018
@hjoliver hjoliver mentioned this issue Mar 7, 2019
5 tasks
@matthewrmshin matthewrmshin modified the milestones: soon, cylc-8.0.0 Mar 7, 2019
@matthewrmshin
Copy link
Contributor

Close. The list in the description should now have follow-on issues or else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants