Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zmq: use global client context #3015

Merged

Conversation

oliver-sanders
Copy link
Member

@oliver-sanders oliver-sanders commented Mar 19, 2019

@oliver-sanders oliver-sanders requested a review from kinow March 19, 2019 11:29
@oliver-sanders oliver-sanders self-assigned this Mar 19, 2019
@oliver-sanders oliver-sanders added this to the cylc-8.0.0 milestone Mar 19, 2019
@oliver-sanders oliver-sanders added the bug Something is wrong :( label Mar 19, 2019
@oliver-sanders
Copy link
Member Author

@kinow hopefully this will solve the issue you were seeing when trying to open multiple clients.

Why Jupyterhub has issue with asyncio.run I don't know.

@kinow
Copy link
Member

kinow commented Mar 19, 2019

With the command line, when I create a second client for the same suite I still get an error.

Python 3.7.2 (default, Dec 29 2018, 06:19:36) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cylc.network.client import SuiteRuntimeClient
>>> client = SuiteRuntimeClient('five')
>>> client('ping_suite')
True
>>> client = SuiteRuntimeClient('five')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kinow/Development/python/workspace/cylc/lib/cylc/network/client.py", line 242, in __new__
    timeout_handler=lambda: cls._timeout_handler(suite, host, port)
  File "/home/kinow/Development/python/workspace/cylc/lib/cylc/network/client.py", line 118, in __init__
    self.socket = CONTEXT.socket(zmq.REQ)
  File "/home/kinow/Development/python/anaconda3/lib/python3.7/site-packages/zmq/sugar/context.py", line 146, in socket
    s = self._socket_class(self, socket_type, **kwargs)
  File "/home/kinow/Development/python/anaconda3/lib/python3.7/site-packages/zmq/_future.py", line 136, in __init__
    self.io_loop = io_loop or self._default_loop()
  File "/home/kinow/Development/python/anaconda3/lib/python3.7/site-packages/zmq/asyncio/__init__.py", line 28, in _default_loop
    return asyncio.get_event_loop()
  File "/home/kinow/Development/python/anaconda3/lib/python3.7/asyncio/events.py", line 644, in get_event_loop
    % threading.current_thread().name)
RuntimeError: There is no current event loop in thread 'MainThread'.
>>> 

Jupyter Notebook still unhappy even with the first client, but may be something else.

image

@oliver-sanders
Copy link
Member Author

when I create a second client for the same suite I still get an error.

I'm unable to replicate this.

from cylc.network.client import SuiteRuntimeClient

client_1 = SuiteRuntimeClient('foo')
client_2 = SuiteRuntimeClient('foo')
client_3 = SuiteRuntimeClient('bar')

print([
    client_1('ping_suite'),
    client_2('ping_suite'),
    client_3('ping_suite')
])
$ python script.py
[True, True, True]
$ python --version
Python 3.7.2

@oliver-sanders
Copy link
Member Author

Jupyter Notebook still unhappy even with the first client

This is definitely a different issue, I don't understand what is different about notebooks to plain Python to make this happen.

@matthewrmshin
Copy link
Contributor

Have you tried a second client from a different Python process?

@kinow
Copy link
Member

kinow commented Mar 20, 2019

when I create a second client for the same suite I still get an error.

I'm unable to replicate this.

Just tried from home and it worked! I might have used a version that was not from git, but from my Anaconda environment that was installed during some tests with setup.py. Sorry, your fix worked, thanks!

This is definitely a different issue, I don't understand what is different about notebooks to plain Python to make this happen.

Sorry for insisting on the Notebook part. It is still failing here for me. I agree that Jupyter Notebook failing should not be a blocker for this issue.

But as the Notebook is a Tornado client, it is a cheap/easy test to confirm that the code also works from within a Tornado app with a main loop.

Here's a test with the current Cylc UI Server.

$ git diff
diff --git a/cylc_singleuser.py b/cylc_singleuser.py
index 3b6e144..0483c24 100644
--- a/cylc_singleuser.py
+++ b/cylc_singleuser.py
@@ -11,6 +11,18 @@ from tornado import web, ioloop
 from handlers import *
 
 
+class TestHandler(web.RequestHandler):
+
+    def get(self):
+        from cylc.network.client import SuiteRuntimeClient
+        client_1 = SuiteRuntimeClient('five')
+        response = client_1('ping_suite')
+        if response:
+            self.write("PINGED!")
+        else:
+            self.write("OPS!")
+
+
 class MainHandler(HubOAuthenticated, web.RequestHandler):
 
     # hub_users = ["kinow"]
@@ -85,6 +97,8 @@ class CylcUIServer(object):
                 (url_path_join(self._jupyter_hub_service_prefix, 'userprofile'), UserProfileHandler),
                 (url_path_join(self._jupyter_hub_service_prefix, 'suites'), CylcScanHandler),
 
+                (url_path_join(self._jupyter_hub_service_prefix, 'test'), TestHandler),
+
                 (self._jupyter_hub_service_prefix, MainHandler, {"path": self._static}),
             ],
             # FIXME: decide (and document) whether cookies will be permanent after server restart.

That creates an endpoint that calls the same code we are calling from command line and from Jupyter Notebooks, but from within our Tornado app. Calling the endpoint returns the same error I believe.

image

Maybe we are doing something wrong in our Tornado app. So if others prefer to merge it now, I can create another ticket to investigate why this happens 👍

@oliver-sanders
Copy link
Member Author

Agreed this code should work in a Notebook, let's punt this to another ticket though.

I think it might be fixable by obtaining the asyncio task loop and more manually inserting a task into it.

Copy link
Member

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed my issue with multiple clients. Tested locally.

@oliver-sanders
Copy link
Member Author

oliver-sanders commented Mar 20, 2019

Have you tried a second client from a different Python process?

This has always worked.

from multiprocessing import Process
from time import sleep

from cylc.network.client import SuiteRuntimeClient

def ping(suite):
    print('#', suite)
    client = SuiteRuntimeClient(suite)
    sleep(3)
    print('%', suite, client('ping_suite'))

procs = [
    Process(target=ping, args=(suite,))
    for suite in ['foo', 'foo', 'bar']
]
for proc in procs:
    proc.start()
for proc in procs:
    proc.join()
$ python my_script.py
# foo
# foo
# bar
% bar True
% foo True
% foo True

@kinow
Copy link
Member

kinow commented Mar 21, 2019

FWIW, I spent some 30 minutes looking around how to either re-use Tornado's main loop in the ZMQ client, or other ways to integrate both.

PyZMQ seems to mention the Tornado loop in some of its documentation, and even include a code that seems could have worked here. I tried adding the following to cylc.network.client just before where the global context is created:

from zmq.eventloop import ioloop
ioloop.install()

Same error.

Then decided to read about it in JupyterHub and Jupyter Notebook issues... but they would have had the same problem with some user, at some point: jupyter/notebook#3397 (comment)

In that discussion, there was one solution mentioned that was harder on users/devs. But then one guy created nest_asyncio, for nesting asyncio loops. The main code is in one file in the project repo. Quite interesting, like a sophisticated mokey-patching 😁

Anyhoo, adding the following to the same place in the client.py file:

diff --git a/lib/cylc/network/client.py b/lib/cylc/network/client.py
index c8c88209d..cb31c987e 100644
--- a/lib/cylc/network/client.py
+++ b/lib/cylc/network/client.py
@@ -34,6 +34,9 @@ from cylc.suite_srv_files_mgr import (
     SuiteSrvFilesManager, SuiteServiceFileError)
 
 
+import nest_asyncio
+nest_asyncio.apply()
+
 CONTEXT = zmq.asyncio.Context()

After installing nest asyncio via pip fixed the problem (there are two lines of code installing another loop, but that's left-over that was not working, and did not work untill nest_async was installed).

Screenshot from 2019-03-21 15-43-23

This also works if I add it to the notebook, and as the maintainer of pyzmq/jupyter/etc said:

So my inclination is for now, recommend nest_asyncio at the user-level and finish shipping ipykernel 5 / ipython 7 with top-level await / mainloop. We could even add a special exception handler for this RuntimeError to point folks at nest_asyncio or autoawait:

we can add it as a comment in the created ZMQClient instance or some webpage, saying something like "if you use the client with an existing asyncio loop, you may have issues due to blabla, please refer to this or that page about it, and use this workaround" 👍

image

This last test had no changes to cylc/cylc code. Only to the Jupyter Notebook client. Which means this way we don't need to do anything else here, except for maybe some documentation for people instantiating the client. We can then adjust the code in Cylc UI Server 👍

@oliver-sanders oliver-sanders force-pushed the zmq-global-client-context branch from 65a2ae2 to 80d7d8a Compare March 21, 2019 09:55
@oliver-sanders oliver-sanders modified the milestones: cylc-8.0.0, cylc-8.0a1 Mar 26, 2019
@oliver-sanders oliver-sanders force-pushed the zmq-global-client-context branch from 80d7d8a to 574b7ea Compare April 5, 2019 10:20
@oliver-sanders
Copy link
Member Author

Deconflicted.

@kinow
Copy link
Member

kinow commented Apr 5, 2019

👍 LGTM. If Travis fails probably just a flaky test.

@matthewrmshin matthewrmshin merged commit 4952239 into cylc:master Apr 5, 2019
@oliver-sanders oliver-sanders deleted the zmq-global-client-context branch May 27, 2020 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :(
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants