Add task- and run-local storage.
njsmith committed May 28, 2017
1 parent 9a7ed6c commit 0b0f3be
Showing 7 changed files with 376 additions and 5 deletions.
80 changes: 77 additions & 3 deletions docs/source/reference-core.rst
@@ -1105,10 +1105,84 @@ Result objects
create and access :class:`Result` objects from any thread you like.


Task-local storage and run-local storage
----------------------------------------
Task-local storage
------------------

Suppose you're writing a server that responds to network requests, and
you log some information about each request as you process it. If the
server is busy and there are multiple requests being handled at the
same time, then you might end up with logs like this:

.. code-block:: none

   Request handler started
   Request handler started
   Request handler finished
   Request handler finished

In this log, it's hard to know which lines came from which
request. (Did the request that started first also finish first, or
not?) One way to solve this is to assign each request a unique
identifier, and then include this identifier in each log message:

.. code-block:: none

   request 1: Request handler started
   request 2: Request handler started
   request 2: Request handler finished
   request 1: Request handler finished

This way we can see that request 1 was slow: it started before request
2 but finished afterwards. (You can also get `much fancier
<http://opentracing.io/documentation/>`__, but this is enough for an
example.)

Now, here's the problem: how does the logging code know what the
request identifier is? One approach would be to explicitly pass it
around to every function that might want to emit logs... but that's
basically every function, because you never know when you might need
to add a ``log.debug(...)`` call to some utility function buried deep
in the call stack, and when you're in the middle of debugging a
nasty problem, the last thing you want is to have to stop first and
refactor everything to pass through the request identifier! It would
be much more convenient if we could store the identifier in a global
variable, so that the logging function could look it up whenever it
needed it. Except... a global variable can only have one value at a
time, so if we have multiple handlers running at once then this isn't
going to work. What we need is something that's *like* a global
variable, but that can have different values depending on which
request handler is accessing it.

That's what :class:`trio.TaskLocal` gives you:

.. autoclass:: TaskLocal

And here's a toy example demonstrating how to use :class:`TaskLocal`:

.. literalinclude:: reference-core/tasklocal-example.py

Example output (yours may differ slightly):

.. code-block:: none
`Not implemented yet! <https://github.com/python-trio/trio/issues/2>`__
   request 1: Request handler started
   request 2: Request handler started
   request 0: Request handler started
   request 2: Helper task a started
   request 2: Helper task b started
   request 1: Helper task a started
   request 1: Helper task b started
   request 0: Helper task b started
   request 0: Helper task a started
   request 2: Helper task b finished
   request 2: Helper task a finished
   request 2: Request received finished
   request 0: Helper task a finished
   request 1: Helper task a finished
   request 1: Helper task b finished
   request 1: Request received finished
   request 0: Helper task b finished
   request 0: Request received finished

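For comparison, here is a rough sketch of the same pattern using only the standard library. It is an analogy, not trio's API: ``contextvars.ContextVar`` plays the role of :class:`TaskLocal`, and ``asyncio`` tasks stand in for trio tasks. The names ``request_tag`` and ``log_lines`` are made up for this illustration, and log lines are collected in a list rather than printed so they can be inspected afterwards.

```python
import asyncio
import contextvars

# Hypothetical name: plays the role of ``request_info.tag`` in the toy example.
request_tag = contextvars.ContextVar("request_tag")

# Collected log output, so the result can be checked after the run.
log_lines = []


def log(msg):
    # Reads the value that belongs to whichever task is currently running.
    log_lines.append("request {}: {}".format(request_tag.get(), msg))


async def concurrent_helper(job):
    log("Helper task {} started".format(job))
    await asyncio.sleep(0)  # give other tasks a chance to run
    log("Helper task {} finished".format(job))


async def handle_request(tag):
    # Only visible to this task and to tasks created from it afterwards.
    request_tag.set(tag)
    log("Request handler started")
    # Each new asyncio task runs in a copy of the current context,
    # so both helpers see this handler's tag.
    await asyncio.gather(concurrent_helper("a"), concurrent_helper("b"))
    log("Request handler finished")


async def main():
    await asyncio.gather(*(handle_request(i) for i in range(3)))


asyncio.run(main())
```

Every entry in ``log_lines`` carries the tag of the handler it belongs to, even though ``log`` itself never receives the tag as an argument.
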
.. _synchronization:
39 changes: 39 additions & 0 deletions docs/source/reference-core/tasklocal-example.py
@@ -0,0 +1,39 @@
import random
import trio

request_info = trio.TaskLocal()

# Example logging function that tags each line with the request identifier.
def log(msg):
    # Read from task-local storage:
    request_tag = request_info.tag

    print("request {}: {}".format(request_tag, msg))

# An example "request handler" that does some work itself and also
# spawns some helper tasks to do some concurrent work.
async def handle_request(tag):
    # Write to task-local storage:
    request_info.tag = tag

    log("Request handler started")
    await trio.sleep(random.random())
    async with trio.open_nursery() as nursery:
        nursery.spawn(concurrent_helper, "a")
        nursery.spawn(concurrent_helper, "b")
    await trio.sleep(random.random())
    log("Request received finished")

async def concurrent_helper(job):
    log("Helper task {} started".format(job))
    await trio.sleep(random.random())
    log("Helper task {} finished".format(job))

# Spawn several "request handlers" simultaneously, to simulate a
# busy server handling multiple requests at the same time.
async def main():
    async with trio.open_nursery() as nursery:
        for i in range(3):
            nursery.spawn(handle_request, i)

trio.run(main)
6 changes: 4 additions & 2 deletions docs/source/reference-hazmat.rst
@@ -141,8 +141,10 @@ TODO: these are currently more of a sketch than anything real. See
:with: queue


System tasks
============
Global state: system tasks and run-local storage
================================================

.. autoclass:: RunLocal

.. autofunction:: spawn_system_task

2 changes: 2 additions & 0 deletions trio/_core/__init__.py
@@ -40,6 +40,8 @@ def _hazmat(fn):
from ._unbounded_queue import *
__all__ += _unbounded_queue.__all__

from ._local import *
__all__ += _local.__all__

if hasattr(_run, "wait_readable"):
    import socket as _stdlib_socket
86 changes: 86 additions & 0 deletions trio/_core/_local.py
@@ -0,0 +1,86 @@
# Task- and Run-local storage

from . import _hazmat, _run

__all__ = ["TaskLocal", "RunLocal"]

# Our public API is intentionally almost identical to that of threading.local:
# the user allocates a trio.{Task,Run}Local() object, and then can attach
# arbitrary attributes to it. Reading one of these attributes later will
# return the last value that was assigned to this attribute *by code running
# inside the same task or run*.

# This is conceptually a method on _LocalBase, but given the way we're playing
# with attribute access making it a free-standing function is simpler:
def _local_dict(local_obj):
    locals_type = object.__getattribute__(local_obj, "_locals_key")
    try:
        refobj = getattr(_run.GLOBAL_RUN_CONTEXT, locals_type)
    except AttributeError:
        raise RuntimeError("must be called from async context") from None
    return refobj._locals.setdefault(local_obj, {})


# Ughhh subclassing I feel so dirty
class _LocalBase:
    __slots__ = ("__dict__",)

    def __getattribute__(self, name):
        ld = _local_dict(self)
        if name == "__dict__":
            return ld
        try:
            return ld[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        _local_dict(self)[name] = value

    def __delattr__(self, name):
        try:
            del _local_dict(self)[name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        return list(_local_dict(self))


class TaskLocal(_LocalBase):
    """Task-local storage.

    Instances of this class have no particular attributes or methods. Instead,
    they serve as a blank slate to which you can add whatever attributes you
    like. Modifications made within one task will only be visible to that task
    – with one exception: when you ``spawn`` a new task, then any
    :class:`TaskLocal` attributes that are visible in the spawning task will
    be inherited by the child. This inheritance takes the form of a shallow
    copy: further changes in the parent will *not* affect the child, and
    changes in the child will not affect the parent. (If you're familiar with
    how environment variables are inherited across processes, then
    :class:`TaskLocal` inheritance is somewhat similar.)

    If you're familiar with :class:`threading.local`, then
    :class:`trio.TaskLocal` is very similar, except adapted to work with tasks
    instead of threads, and with the added feature that values are
    automatically inherited across tasks.

    """
    __slots__ = ()
    _locals_key = "task"


@_hazmat
class RunLocal(_LocalBase):
    """Run-local storage.

    :class:`RunLocal` objects are very similar to :class:`trio.TaskLocal`
    objects, except that attributes are shared across all the tasks within a
    single call to :func:`trio.run`. They're also very similar to
    :class:`threading.local` objects, except that :class:`RunLocal` objects
    are automatically wiped clean when :func:`trio.run` returns.

    """
    __slots__ = ()
    _locals_key = "runner"
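
The lookup scheme in ``_local.py`` can be tried out in isolation with a small standalone sketch. The code below is not trio's implementation: ``Owner``, ``context``, ``ToyLocalBase``, ``ToyTaskLocal`` and ``ToyRunLocal`` are hypothetical stand-ins, and ``context`` is a plain namespace that the sketch switches by hand where trio would consult ``GLOBAL_RUN_CONTEXT``. It only illustrates how a ``_locals_key`` class attribute can select whose ``_locals`` dictionary backs attribute access. Task-to-task inheritance is left out.

```python
from types import SimpleNamespace


class Owner:
    """Stand-in for a Task or Runner: anything that has a ``_locals`` dict."""

    def __init__(self):
        self._locals = {}


# Hypothetical stand-in for the run context; set by hand in this sketch.
context = SimpleNamespace()


def local_dict(local_obj):
    # Bypass our own __getattribute__ to read the class-level key.
    key = object.__getattribute__(local_obj, "_locals_key")
    try:
        owner = getattr(context, key)
    except AttributeError:
        raise RuntimeError("no current task/run in this sketch") from None
    # One private dict per (owner, local object) pair.
    return owner._locals.setdefault(local_obj, {})


class ToyLocalBase:
    def __getattribute__(self, name):
        try:
            return local_dict(self)[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        local_dict(self)[name] = value


class ToyTaskLocal(ToyLocalBase):
    _locals_key = "task"


class ToyRunLocal(ToyLocalBase):
    _locals_key = "runner"


per_task, per_run = ToyTaskLocal(), ToyRunLocal()
task1, task2 = Owner(), Owner()
context.runner = Owner()

context.task = task1          # pretend task1 is running
per_task.tag = "one"
per_run.counter = 10

context.task = task2          # pretend task2 is running
tag_missing_in_task2 = not hasattr(per_task, "tag")
per_task.tag = "two"
counter_seen_from_task2 = per_run.counter

context.task = task1          # back to task1
tag_seen_from_task1 = per_task.tag
```

In this sketch the task-keyed object gives each stand-in task its own ``tag``, while the runner-keyed object exposes the same ``counter`` to both, which mirrors the distinction the two docstrings draw.
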
13 changes: 13 additions & 0 deletions trio/_core/_run.py
@@ -371,6 +371,9 @@ class Task:
    _next_send = attr.ib(default=None)
    _abort_func = attr.ib(default=None)

    # Task-local values, see _local.py
    _locals = attr.ib(default=attr.Factory(dict))

    # XX maybe these should be exposed as part of a statistics() method?
    _cancel_points = attr.ib(default=0)
    _schedule_points = attr.ib(default=0)
@@ -518,6 +521,9 @@ class Runner:
    instruments = attr.ib()
    io_manager = attr.ib()

    # Run-local values, see _local.py
    _locals = attr.ib(default=attr.Factory(dict))

    runq = attr.ib(default=attr.Factory(deque))
    tasks = attr.ib(default=attr.Factory(set))
    r = attr.ib(default=attr.Factory(random.Random))
@@ -656,6 +662,13 @@ def spawn_impl(
                scope._add_task(task)
        coro.cr_frame.f_locals.setdefault(
            LOCALS_KEY_KI_PROTECTION_ENABLED, ki_protection_enabled)
        if nursery is not None:
            # Task locals are inherited from the spawning task, not the
            # nursery task. The 'if nursery' check is just used as a guard to
            # make sure we don't try to do this to the root task.
            parent_task = current_task()
            for local, values in parent_task._locals.items():
                task._locals[local] = dict(values)
        self.instrument("task_spawned", task)
        # Special case: normally next_send should be a Result, but for the
        # very first send we have to send a literal unboxed None.
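
The inheritance loop in this hunk copies each per-local dictionary exactly one level deep. A minimal sketch with plain dictionaries shows what that implies. The keys ``local_a`` and ``local_b`` and the stored values are invented for illustration, and ordinary objects stand in for :class:`TaskLocal` instances.

```python
# Stand-ins for two TaskLocal objects; any hashable key works for this sketch.
local_a, local_b = object(), object()

# What the spawning task's ``_locals`` might look like (illustrative values).
parent_locals = {
    local_a: {"tag": 1, "history": ["started"]},
    local_b: {"user": "alice"},
}

# Same shape as the loop in spawn_impl: a fresh inner dict per local object.
child_locals = {}
for local, values in parent_locals.items():
    child_locals[local] = dict(values)

# Rebinding an attribute in the child leaves the parent untouched...
child_locals[local_a]["tag"] = 2

# ...but the copy is shallow, so a mutable value is still shared by both.
child_locals[local_a]["history"].append("child ran")
```

After this runs, the parent still has ``tag == 1``, while the ``history`` list shows the child's append in both places. If a task-local value is a mutable object, parent and child keep sharing it unless one of them assigns a new object.
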