
Run local code remotely on a worker #4003

Open
mrocklin opened this issue Jul 31, 2020 · 35 comments

@mrocklin
Member

I find myself often wanting to run code on a worker, rather than on my local client. This happens in a few settings:

  1. My workers have access to a data store that I don't, so I need to call something like dd.read_parquet remotely (cc @martindurant @jcrist )
  2. My workers are far away from my client, so client-heavy operations like joblib or Hyperband incur a serious bottleneck from client-scheduler communication (cc @stsievert )
  3. My workers have hardware or libraries like GPUs/RAPIDS that I don't have locally (cc @quasiben @kkraus14)

Today I can do this by writing a function and submitting that function as a task

def f():
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    return df.sum().compute()

result = client.submit(f).result()

It might make sense to provide syntax around this to make it more magical (or it might not). We might do something like the following:

with dask.distributed.remote as result:
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    result = df.sum().compute()

I know that @eriknw has done magic like this in the past. We could enlist this help. However, we may not want to do this due to the magical and novel behavior.

@martindurant
Member

"explicit is better than implicit". :)

This feels something like working in a remote ipython kernel, which I suppose does still work (although I haven't seen anyone do it for quite some time).

@stsievert
Member

What advantages would the additional syntax provide over submitting a function? Could it supply any debugging utility?

@eriknw
Member

eriknw commented Aug 10, 2020

I took the bait and may have gotten carried away. This probably isn't quite what you're looking for, but it's... something ;)

https://github.com/eriknw/innerscope

@drorspei
Contributor

drorspei commented Aug 10, 2020

Sounds a lot like ipyparallel, specifically the sync imports context manager and the px magic

@quasiben
Member

What advantages would the additional syntax provide over submitting a function ?

The advantages here would be ergonomic. In the past I've done work submitting to a remote cluster where GPUs are available but the edge/login node has no GPU. Having some syntactic sugar here would be helpful.

@mrocklin
Member Author

mrocklin commented Jul 4, 2021

I took the bait and may have gotten carried away. This probably isn't quite what you're looking for, but it's... something ;)

@eriknw It certainly shows that magic is possible. I'm curious, do you have any thoughts on how this might be used to provide a context-manager like experience? My objective here is to get away from constructing and then calling functions, and towards remote execution.

The closest thing we have today in common usage are IPython magics (indeed, I think that IPyParallel had remote execution cell magics that were awesome). I'd love to find some way to achieve this, but in a way that didn't require an IPython kernel.

@eriknw
Member

eriknw commented Jul 5, 2021

As always, the big challenge with context managers is that the body does not run in a separate frame (it is a separate block of code, but within the same frame as the surrounding code).

Nevertheless, here are some (probably totally doable) syntax ideas that use context managers:

with get_result as rv, remotely:
    ...
    rv = ...
# rv is a Future

# specify the number or names of results we want to keep
with get_results(2) as (x, y), remotely:
    ...
    x = ...
    y = ...
# or
with get_results('x', 'y') as (x, y), remotely:
    ...
    x = ...
    y = ...
# x, y are now futures

The key here is context manager chaining. remotely will raise an exception that is caught by the first context manager, and then we can do magic.
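The trick above can be sketched in plain Python. This is an illustrative sketch, not afar's actual implementation: in `with A() as a, B():`, if `B.__enter__` raises, `A.__exit__` receives the exception and may suppress it, so the block's body is skipped entirely (the magic would then re-execute that body elsewhere). All names here (`Interceptor`, `Remotely`, `_Skip`) are hypothetical.

```python
# Sketch of how chained context managers can intercept execution.
# If the second manager's __enter__ raises, the first manager's __exit__
# sees the exception and can suppress it, skipping the with-body.

class _Skip(Exception):
    pass

class Interceptor:
    def __init__(self):
        self.body_ran = False
        self.intercepted = False
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        if exc_type is _Skip:
            self.intercepted = True
            return True  # suppress the exception: the body never executes
        return False

class Remotely:
    def __enter__(self):
        raise _Skip  # abort before the body runs
    def __exit__(self, *args):
        return False

grab = Interceptor()
with grab, Remotely():
    grab.body_ran = True  # never reached

assert grab.intercepted and not grab.body_ran
```

At this point `grab` knows the block was skipped and could, in principle, fetch the block's source lines and ship them to a worker.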

We can come up with variations of this spelling, such as:

with running_remotely as futures, get_results('x', 'y'):
    ...
# futures.x and futures.y are now Futures

The magic is easier if we actually put the code we want to run inside of a function body, such as:

@with_running_remotely
def rv():
    ...
    rv = ...
# rv (obtained from the name of the function) is a Future


@with_running_remotely('x', 'y')
def futures():
    ...
    x = ...
    y = ...
# futures.x and futures.y are now Future objects

Maybe something along one of these lines would provide for a convenient user experience. Thoughts? Reactions? I'm open to suggestions for how to improve any particular spelling to make it clearer.

@eriknw
Member

eriknw commented Jul 5, 2021

I've started to whip this up. I don't think it'll be too hard or take too long, but I have other things I'm doing today, so please be patient :)

Here's the spelling I'm targeting:

from somewhere import runcode

with runcode() as result, remotely:
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    result = df.sum().compute()

Here, we get the name to retrieve, result, from the target of the as statement. I kind of like , remotely:, which makes the intent explicit (even if it's rather nonstandard usage of syntax).

We can also be more specific and ask to get multiple results:

x = 1

with runcode('y', 'result') as (y, result), remotely:
    y = 2
    result = x + y

# Or
with runcode('y', 'result') as results, remotely:
    y = 2
    result = x + y

Questions

What do you want to do about variables such as x above that are used within the context, but defined outside it?

We have no constraints. We can send them to the remote worker, do the right thing with dask futures, or raise. Or let the user specify the desired behavior.

Do you want the results to be dask futures or the final result (e.g., future.result())?

Futures seem more natural to me most of the time. Perhaps we could find suitable names if both behaviors are wanted.

More variations and brainstorming

As a function decorator, use the return value:

@runcode.remotely
def result():
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    return df.sum().compute()

Specify where to run programmatically:

if run_locally:
    where = somewhere.locally
else:
    where = somewhere.remotely

with runcode() as result, where:
    ...

Also support locally, which may be useful for testing and debugging the magic:

with runcode() as result, locally:
    ...

@runcode.locally
def result():
    ...

Recall remotely and locally don't need to be defined when used as context managers, and we can distinguish them by name.

The common misspelling (missing remotely) can run the code as normal and emit a warning, or raise before running any code:

with runcode() as result:
    ...

If you only want to run the code and don't need the result, simply do:

with runcode(), remotely:
    ...

I'm not sold on the name runcode. Suggestions?

Also, this will be pretty magical and experimental. It would probably be best for it to live outside dask and distributed for now.

If anybody hates (or loves) this spelling (and I'm sure some people will), feel free to chime in even if you don't think you can do so constructively 😎

@eriknw
Member

eriknw commented Jul 6, 2021

Started: https://github.com/eriknw/afar

Oh, god in heaven, what have I done?

@mrocklin
Member Author

mrocklin commented Jul 6, 2021

Sorry for the slow response. I'm on vacation this week and checking github infrequently. (my apologies for restarting this conversation and then ghosting by the way).

I'm excited by this. I have a couple of questions:

  1. Do we need the locally/remotely addition, or is it possible to do this with just a single context manager?

  2. It seems like there is an open question about how to ask for a result. In afar it looks like you're returning all named variables. This seems potentially challenging because there might be many intermediate variables

    Personally, I would probably use either

    1. the expression on the last line
    2. a standard name, like result
    3. Optionally with a list of names of variables to collect that defaults to result, but could be overridden, like def runcode(output=["result"])

Questions from Erik

What do you want to do about variables such as x above that are used within the context, but defined outside it?

We have no constraints. We can send them to the remote worker, do the right thing with dask futures, or raise. Or let the user specify the desired behavior.

I think that it's likely that people will want to reach outside of local scope for local variables. Sending state somehow, either as pickled state or as Dask futures seems fine to me.

Do you want the results to be dask futures or the final result (e.g., future.result())?
Futures seem more natural to me most of the time. Perhaps we could find suitable names if both behaviors are wanted.

So, this is interesting I think and gets to a larger question of how we handle repeated computations

with runcode() as df:
    df = dd.read_parquet(...)
    result = df
with runcode() as result:
    result = df.groupby(...).mean().compute()

In this case returning the result as a Dask future seems best. This saves us from having to pull back a result that we may not be able to deal with locally (because, for example, we don't have cudf installed or an appropriate GPU device locally). How to pass handles between remote blocks is an interesting question though.

@eriknw
Member

eriknw commented Jul 6, 2021

Thanks for the reply and interest!

  1. Do we need the locally/remotely addition, or is it possible to do this with just a single context manager?

Yes/maybe. We currently need , remotely: given the magic I use. A single context manager's __enter__ can't both return a value (to be used by as result) and raise an exception to interrupt execution. There is likely other (darker?) magic that we could use to circumvent this limitation. For now, adding , remotely: is easiest and lets us continue exploring.

  2. It seems like there is an open question about how to ask for a result. In afar it looks like you're returning all named variables. This seems potentially challenging because there might be many intermediate variables

I agree: returning all named variables is not the end target. That was the easiest thing to do that let me demonstrate functionality and upload a package to PyPI that isn't name-squatting.

I don't have a strong opinion on how to specify which result to get, or whether the result in as result should be a single future or collection of futures. I can try your suggestions. I'll let you know when I have something working with dask that you can play around with.

@eriknw
Member

eriknw commented Jul 7, 2021

Actually, as results is completely unnecessary. We can do the following:

with afar.run, remotely:
    x = 1

with afar.run, remotely:
    y = 2
    z = x + y

result = z.result()

I like your idea of using the final assignment as the only value to "return" by default.

Note that x and y are dask futures when used in the main scope, and x.result() is needed to pull the result locally. However, note how the value of x is used in the second context. It's as if we did client.submit(func, x), which is indeed what we'll do.
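The "as if we did client.submit(func, x)" semantics can be illustrated with only the standard library. Dask's Client.submit resolves Future arguments to their values before calling the function; concurrent.futures does not do this automatically, so the hypothetical helper below (`submit_resolving`) makes the resolution step explicit:

```python
# Sketch of future-resolution semantics: replace any Future arguments
# with their results before invoking the function, mirroring what
# dask's client.submit does for Future arguments.
from concurrent.futures import Future, ThreadPoolExecutor

def submit_resolving(pool, fn, *args):
    """Submit fn, first resolving any Future arguments to their values."""
    def call():
        resolved = [a.result() if isinstance(a, Future) else a for a in args]
        return fn(*resolved)
    return pool.submit(call)

with ThreadPoolExecutor() as pool:
    x = submit_resolving(pool, lambda: 1)                  # like: x = 1, remotely
    y = submit_resolving(pool, lambda: 2)                  # like: y = 2, remotely
    z = submit_resolving(pool, lambda a, b: a + b, x, y)   # x, y are futures here

assert z.result() == 3
```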

We can still support with afar.run("x", "y") as results, but even here as results is unnecessary.

Given your initial example, we could execute the code on a worker, and copy the result locally. Is this syntax clear for that behavior?

with afar.run, locally:
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    result = df.sum().compute()

Because we used locally, result is a value, not a dask Future. Perhaps locally is not the right word to use here, because the code was still executed on a dask worker; it's only the result that we copied locally.

We can also put arguments for client.submit in remotely (my preference) or run, such as:

from afar import remotely
on_gpus = remotely(resources={"GPU": 1})

with afar.run, on_gpus:
    ...

with afar.run(resources={"GPU": 1}), remotely:
    ...

I'm feeling a little better about this. There may be something useful (or at least convenient) here.

@mrocklin
Member Author

mrocklin commented Jul 7, 2021

This looks really slick to me:

with afar.run, remotely:
    x = 1

with afar.run, remotely:
    y = 2
    z = x + y

result = z.result()

We can also put arguments for client.submit in remotely (my preference) or run, such as:

I agree that having a place to support constraints would be good. I don't currently have thoughts on where is best.

@mrocklin
Member Author

This code is highly experimental and magical!

I like the warning by the way :)

It's also impressive that it's 99 lines of code (although I suppose that innerscope handles a bit)

@eriknw
Member

eriknw commented Jul 16, 2021

oh, the trouble you get me into with your encouragement ;)

I have this minimally working. You can pip install afar to try it out. See my simple (and pretty much only) test here.

I'm sure there are sharp corners and severe/weird limitations with afar. Please report any that are encountered.

It would be super-duper handy if we could easily and reliably modify frame.f_locals. This may be coming as soon as Python 3.11. See PEP 558 and the PR. afar probably isn't what they have in mind for this change :)
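The limitation here is easy to see with a small sketch. Reading a function frame's f_locals works everywhere; before PEP 667 (which landed in Python 3.13), *writing* to it from inside a function was not reliably reflected back into the running function, which is what makes injecting results into the caller's scope hard:

```python
# Sketch of the frame.f_locals behavior mentioned above.
# Reading the snapshot is fine; writes to f_locals inside a function
# were not reliably propagated back before PEP 667 (Python 3.13).
import sys

def peek():
    secret = 42
    frame = sys._getframe()
    return frame.f_locals["secret"]  # reading works everywhere

assert peek() == 42
```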

@mrocklin
Member Author

Question: does the remote side need to have afar installed?

@mrocklin
Member Author

I was getting an error to that effect; however, I suspect it may be caused by trying to pull in and serialize local state.

@eriknw
Member

eriknw commented Jul 16, 2021

It shouldn't need to, but, well, it may need to right now. innerscope is currently required locally and remotely. If afar is worth pursuing, I could make it (and test it) such that neither are required remotely.

Thanks for giving afar a try!

@eriknw
Member

eriknw commented Jul 16, 2021

Oh wait, yeah, afar is required remotely right now, because we need to run this function on the worker.

@mrocklin
Member Author

OK. Good to know.

Also, big +1 on the name.

@mrocklin
Member Author

Thanks for giving afar a try!

It's fun! And it's scratching a long-standing itch of mine.

oh, the trouble you get me into with your encouragement ;)

You are a particularly easy mark it turns out :)

@eriknw
Member

eriknw commented Jul 17, 2021

:-P

I'm also fond of the name. from afar import ... sounds almost poetic to me.

Anyway, I added two more features to 0.1.1:

  • afar.get automatically gathers the data locally. No need for .result()
  • remotely(**kwargs) passes keyword arguments to client.submit

Your initial example can now be:

import afar

on_gpus = afar.remotely(resources={"GPU": 1})

with afar.get, on_gpus:
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    result = df.sum().compute()

# Now use `result` directly.  No need for `result.result()`!

I don't know if afar.get is the right word to use, but we'll see. It's short.

I think my itch is satisfied. I'll probably broadcast this out a bit more, but I don't plan to work much more on it. Unless, you know, I'm encouraged to ;)

@mrocklin
Member Author

I recommend using different keynames, maybe afar-run and afar-get?

[screenshot: Screen Shot 2021-07-16 at 7 18 21 PM]

@eriknw
Member

eriknw commented Jul 17, 2021

Will do! Give me a few minutes.

@eriknw
Member

eriknw commented Jul 17, 2021

Done. I also set up CI, which took a bit longer than expected (because I had to fix some things). But tests now pass for Python 3.7, 3.8, 3.9, and PyPy. Hooray! afar 0.1.2 released.

@eriknw
Member

eriknw commented Jul 17, 2021

0.1.3 released.

Since we can't always update the locals of a frame (and I'm not sure I want to write the hack to do so), it may be more convenient at times to use our own mapping. For example:

def some_func():
    ...
    run = afar.run(data={"a": 1})
    with run, remotely:
        b = a + 1
    # b doesn't exist locally, because we can't update `frame.f_locals`
    assert run.data["b"].result() == 2

    # continue using the data...
    with run, remotely:
        c = a + b

Note that the singleton afar.run doesn't keep data around, because I'm paranoid.

And for warm-fuzzies, I now test on Linux, OS X, and Windows, and with pip and conda (I was seeing differences between these two).

@mrocklin
Member Author

Another suggestion, if the last statement isn't an assignment, but just an expression like the following:

with afar.run, remotely:
    df

Maybe we should call repr or _repr_html_ (context dependently) and print the result out. I'm actually not sure how systems like IPython and Jupyter handle context managers when repr'ing. Maybe this would be atypical.
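The "context dependently" part can be sketched with a small dispatcher. This is a hedged illustration of the idea (the names `display_text` and `FakeFrame` are hypothetical): prefer an object's `_repr_html_` when it provides one, as IPython/Jupyter's display protocol does, and fall back to `repr` otherwise:

```python
# Sketch: choose a rich HTML repr when available, else the plain repr,
# mimicking the _repr_html_ convention used by IPython/Jupyter.
def display_text(obj):
    html = getattr(obj, "_repr_html_", None)
    if callable(html):
        return html()
    return repr(obj)

class FakeFrame:
    """Stand-in for an object with a rich HTML repr (e.g. a DataFrame)."""
    def _repr_html_(self):
        return "<table><tr><td>1</td></tr></table>"

assert display_text(FakeFrame()).startswith("<table>")
assert display_text([1, 2]) == "[1, 2]"
```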

@eriknw
Member

eriknw commented Jul 17, 2021

Yeah, that's pretty atypical, but so is afar: eriknw/afar#2

This shows I can return the final expression and display it (but it's not 100% reliable yet). What I don't know is how to compute the appropriate repr remotely and copy that instead of the original object.

Would you expect anything else to be done with the final statement? Should this be made available to the user somehow, or is the repr enough?

@eriknw
Member

eriknw commented Jul 18, 2021

Also, I'd wait a bit before broadcasting this out. I'm testing and fixing bugs.

@mrocklin
Member Author

Yup. No problem. I'm iterating with a GPU workload on Coiled and it's interesting getting both smooth at the same time.

FWIW this is already a very useful tool for me. I'm finding that GPUs-on-the-cloud feel closer at hand already.

@eriknw
Member

eriknw commented Jul 20, 2021

Great to hear!

afar 0.2.0 released. It's more reliable and trustworthy for me. Hopefully it is for you too. Please let me know if it's not.

See release notes here: https://github.com/eriknw/afar/releases

Notably, you can now look at afar.run.context_body to see the source lines that it uses.

Also, added with afar.run, later:, which doesn't execute the context block.

Note that the main limitation of afar is that it requires the source lines to be available, which isn't always the case. This makes things so much easier. We could try to manipulate bytecode instead, but doing so is very dirty and likely to be a headache for each new Python release. As it is, the current approach actually seems pretty reliable. It does peek at bytecode a little bit, so afar probably only works with CPython and PyPy. I am continually amazed by PyPy.
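The source-availability limitation is easy to demonstrate: inspect can recover source for code loaded from a file, but not for code compiled from a string, which is exactly the case where an afar-style approach has nothing to re-execute:

```python
# Sketch of the limitation noted above: a function compiled from a string
# has no file on disk, so inspect.getsource cannot recover its source.
import inspect

ns = {}
exec(compile("def g():\n    return 1\n", "<string>", "exec"), ns)

try:
    inspect.getsource(ns["g"])
    source_available = True
except OSError:
    source_available = False

assert not source_available  # no source file, so nothing to ship remotely
```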

@eriknw
Member

eriknw commented Aug 30, 2021

Update: afar 0.5 now supports IPython magics!

%load_ext afar
%%afar
import dask_cudf
df = dask_cudf.read_parquet("s3://...")
result = df.sum().compute()

instead of the original

def f():
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    return df.sum().compute()

result = client.submit(f).result()

More examples:

%%afar x, y  # save both x and y as Dask Futures
x = 1
y = x + 1
z = %afar x + y

or

%afar z = x + y

I think this is starting to get pretty nice.

@jrbourbeau
Member

Ah great, I look forward to taking them for a spin. Thanks @eriknw!
