feat(sandbox): Candidate Implementation of Sandbox Plugin to Support Jupyter #1255

xingyaoww · 2024-04-20T17:22:51Z

As discussed in #1251, we need a better way (instead of the not ideal attempt in #1215) to incorporate different plugins/tools (e.g., jupyter kernel for interactive python execution).

The goal is "If the agent implementer wants to make any assumptions about the environment beyond "it's bash", the agent needs to make sure everything is set up." The current implementation assumes different behavior for each sandbox that may go against the goal.

In this PR, I attempt to implement one that satisfies this goal by allowing any arbitrary sandbox to be initialized with plugins (a setup.sh that automatically sets everything needed). I added PluginMixin, which can be mixed with all Sandbox that leverage their .execute method to complete the setup process.

I also mapped opendevin/.cache to ~/.cache inside the sandbox so that the user doesn't need to download the pip install packages everytime they re-start the container -- these will be automatically cached. The user can also clean-up these caches (e.g., if they want to switch to a different sandbox) by running make clean.

I have only implemented this for DockerSSHBox yet, and i hope to get some feedback before proceeding to include this Mixin for all boxes :)

Once setup.sh is completed, you can send code to jupyter kernel for execution by using echo "import math" | execute_cli.

rbren

The one thing I don't see here is: how does the agent send a python command to jupyter?

opendevin/sandbox/plugins/mixin.py

opendevin/sandbox/docker/ssh_box.py

rbren · 2024-04-20T18:24:14Z

opendevin/sandbox/docker/ssh_box.py

+            # chown the home directory
+            exit_code, logs = self.container.exec_run(
+                ['/bin/bash', '-c', 'chown opendevin:root /home/opendevin'],
+                workdir=SANDBOX_WORKSPACE_DIR,
+            )
+            if exit_code != 0:
+                raise Exception(
+                    f'Failed to chown home directory for opendevin in sandbox: {logs}')


Maybe we want to pull this out in to the core Sandbox implementation, as initialize_homedir. Some sandboxes might override it, but most are going to do exactly this.

(TBH there's probably a lot of logic we could deduplicate between SSH, Exec, and Local)

I agree! I was also thinking heavily about this while implementing this, maybe we should make a BaseClass to keep all the common logic (including this one!)

well... just realized this might just work on DockerSSHBox and ExecBox, since in LocalBox we are just re-using the current user.

Hmm good point. I could see putting this logic in the base class, and then having a pass override for the LocalBox that makes it a no-op

I guess once we deprecate ExecBox, this logic will not be duplicated since DockerSSHBox, LocalBox and E2BBox will likely to be very different from each other in terms of implementation. So maybe we can leave this as is, and then try to deprecate ExecBox in the next few days/weeks?

rbren · 2024-04-20T18:24:50Z

opendevin/sandbox/plugins/jupyter/execute_cli

@@ -0,0 +1,13 @@
+#!/bin/bash


is this script getting used anywhere?

To answer this & your previous question: "how does the agent send a python command to jupyter?"

opendevin/sandbox/plugins will be mounted to /opendevin/plugins/ inside the sandbox container, and the setup.sh script will then add /opendevin/plugins/jupyter to $PATH, making sure that the sandbox .execute function has access to execute_cli tool.

Whenever the agent wants to execute Python, they can just do echo "import math" | execute_cli in bash and get the response in stdout (this part will likely be implemented by agent-specific logic, rather than prompting agents to do it.

Oh nice! Maybe just add a note to this file that it's a convenience for the agent?

Should the setup script copy this file over to the sandbox environment?

huh, do you mean we should just copy the execute_cli into some ..bin folder, instead of adding the entire /opendevin/plugins/jupyter into $PATH? That could be another alternative too!

Exactly! That way we don't have to mount additional directories. That'll make it easier to e.g. support jupyter on e2b

done! i added copy_to method to the Sandbox interface, and supports how Plugins works through copying

opendevin/sandbox/docker/ssh_box.py

rbren · 2024-04-20T18:26:44Z

That was super fast! This is looking awesome. I think you're definitely on the right track

Co-authored-by: Robert Brennan <accounts@rbren.io>

xingyaoww · 2024-04-21T08:41:48Z

@rbren After trying to extend this Plugin capability to different sandboxes, I realized that it could be challenging since different sandbox would assume different "user" and levels of system access:

ExecBox can potentially work, but its .execute is not stateful, which make this Sandbox different (deviating from our original goal that each Sandbox should has a consistent interface?)
LocalBox is executed locally, which means it may not have sudo commands available; Also, it is not stateful (e.g., environment variables set in the previous turn is no longer accessible in subsequent turns).

If we managed to get the DockerSSHBox working stably with the DOOD route, do you think if we can deprecate LocalBox and ExecBox and solely focus on SSHBox for future development? It could saves a tons of efforts (but of course we need to double-check Windows user can run this DOOD without issues)

rbren · 2024-04-21T19:00:30Z

@xingyaoww I definitely want to get rid of ExecBox, for the reason you stated--it's not stateful, and we want something that's stateful. I think with the --add-host trick we might be ready?

LocalBox seems potentially useful still. And we want to keep supporting e2b.

I think so long as we make sure the plugin script fails with a good error (e.g. "Failed to install $PLUGIN on $SANDBOX_TYPE: $ERROR_MESSAGE"), we can make things more consistent over time. Really what I'd like is for sandbox maintainers (e.g. mlejva for e2b) to be responsible for ensuring their sandbox env is as consistent as possible, with SSHBox being our "north star".

…e-jupyter

xingyaoww · 2024-04-22T06:59:21Z

opendevin/sandbox/docker/exec_box.py

@@ -122,6 +124,37 @@ def run_command(container, command):
                return -1, f'Command: "{cmd}" timed out'
        return exit_code, logs.decode('utf-8')

+    def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False):


Code duplication from DockerSSHBox which is not ideal, but i guess we can leave them temporarily for this PR and plan to deprecate ExecBox in the next few days.

xingyaoww · 2024-04-22T07:00:46Z

opendevin/sandbox/docker/ssh_box.py

+            # chown the home directory
+            exit_code, logs = self.container.exec_run(
+                ['/bin/bash', '-c', 'chown opendevin:root /home/opendevin'],
+                workdir=SANDBOX_WORKSPACE_DIR,
+            )
+            if exit_code != 0:
+                raise Exception(
+                    f'Failed to chown home directory for opendevin in sandbox: {logs}')


I guess once we deprecate ExecBox, this logic will not be duplicated since DockerSSHBox, LocalBox and E2BBox will likely to be very different from each other in terms of implementation. So maybe we can leave this as is, and then try to deprecate ExecBox in the next few days/weeks?

xingyaoww · 2024-04-22T07:03:19Z

opendevin/sandbox/e2b/sandbox.py

@@ -61,6 +61,10 @@ def execute(self, cmd: str) -> Tuple[int, str]:
        assert process_output.exit_code is not None
        return process_output.exit_code, logs_str

+    def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False):
+        # FIXME
+        raise NotImplementedError('Copying files to E2B sandbox is not implemented yet')


I guess this shouldn't be so hard to implement for E2B since we already have .filesystem ready? cc @mlejva

@xingyaoww yes, shouldn't be hard. It's possible to upload files (up to 100MB) to the sandbox or one can just write to a file

@rbren let me quickly implement this before merging

xingyaoww · 2024-04-22T07:18:34Z

@xingyaoww I definitely want to get rid of ExecBox, for the reason you stated--it's not stateful, and we want something that's stateful. I think with the --add-host trick we might be ready?

LocalBox seems potentially useful still. And we want to keep supporting e2b.

I think so long as we make sure the plugin script fails with a good error (e.g. "Failed to install $PLUGIN on $SANDBOX_TYPE: $ERROR_MESSAGE"), we can make things more consistent over time. Really what I'd like is for sandbox maintainers (e.g. mlejva for e2b) to be responsible for ensuring their sandbox env is as consistent as possible, with SSHBox being our "north star".

Agree! Btw, I think ExecBox will suffer from the exact same issue of --network=host, since it also relies on docker, right? i.e., flask server started by ExecBox also cannot be accessed from outside the container.

Also, just realizing the --network=host is supported with the newest version of Docker Desktop! If this works, i guess we can probably deprecate the ExecBox, and having DOOD with DockerSSHBox as the default setting (if they also work well on Linux).

Version Docker Desktop 4.29.0
Settings ⇒ Features in development ⇒ Enable host networking
Host networking allows containers that are started with --net=host to use localhost to connect to TCP and UDP services on the host. It will automatically allow software on the host to use localhost to connect to TCP and UDP services in the container. Sign in required.

Originally posted by @4-FLOSS-Free-Libre-Open-Source-Software in docker/roadmap#238 (comment)

rbren · 2024-04-22T14:35:05Z

opendevin/config.py

+_cache_dir = config.get('CACHE_DIR')
+if _cache_dir:
+    pathlib.Path(_cache_dir).mkdir(parents=True, exist_ok=True)


I'm not entirely certain of the purpose of cache dir--do we still need this?

We need this (at least for now) to cache the pip install commands in Jupyter requirement dependency - without this, you need to re-download the whl for every sandbox initialization, which can be way too slow. I still felt it is a bit slow with cache (you still need to spend the CPU hours)... we probably need to figure out a way to cache it in the future further (e.g., save the docker image with the requirement installed and re-use it as much as possible)

rbren

LGTM! I think this is ready to go. Just one question

mlejva · 2024-04-23T00:00:33Z

@xingyaoww @rbren I implemented the copy_to method for E2BBox in a separate PR #1298

xingyaoww added 12 commits April 20, 2024 11:23

initialize plugin definition

7c76759

initialize plugin definition

ba07415

simplify mixin

fdc5109

further improve plugin mixin

1ada575

add cache dir for pip

9bed95f

support clean up cache

abf0fbb

add script for setup jupyter and execution server

bbac1a6

integrate JupyterRequirement to ssh_box

18cf4ba

source bashrc at the end of plugin load

b7f2f60

add execute_cli that accept code via stdin

cf33615

make JUPYTER_EXEC_SERVER_PORT configurable via env var

3908ae7

increase background cmd sleep time

fadb609

xingyaoww requested a review from rbren April 20, 2024 17:30

rbren reviewed Apr 20, 2024

View reviewed changes

xingyaoww and others added 6 commits April 21, 2024 10:31

Update opendevin/sandbox/plugins/mixin.py

d3c829c

Co-authored-by: Robert Brennan <accounts@rbren.io>

add mixin to base class

9481c7b

make jupyter requirement a dataclass

456d9cb

source plugins only when >0 requirements

0d788ef

add sandbox_plugins for each agent & have controller take care of it

a436c8b

update build.sh to make logs available in /opendevin/logs

bac492a

xingyaoww added 2 commits April 21, 2024 06:45

switch to use config for lib and cache dir

3f5030d

fix permission issue with /workspace

a7b4f2e

xingyaoww added 4 commits April 21, 2024 14:34

use python to implement execute_cli to avoid stdin escape issue

6be6ac8

wait until jupyter is avaialble

c1a4e56

Merge commit '2242702cf94eab7275f2cb148859135018d9b280' into integrat…

76b5cfa

…e-jupyter

support plugin via copying instead of mounting

da1b3ba

xingyaoww commented Apr 22, 2024

View reviewed changes

rbren reviewed Apr 22, 2024

View reviewed changes

rbren approved these changes Apr 22, 2024

View reviewed changes

rbren mentioned this pull request Apr 22, 2024

fix(sandbox): Extend the sandbox execute method to support python instead of using additional method #1228

Closed

xingyaoww mentioned this pull request Apr 22, 2024

feat(CodeActAgent): Support Agent-User Interaction during Task Execution and the Full Integration of CodeActAgent #1290

Merged

mlejva mentioned this pull request Apr 22, 2024

Integrate copy_to in E2BBox #1298

Merged

xingyaoww merged commit fc5e075 into All-Hands-AI:main Apr 23, 2024

xingyaoww mentioned this pull request Apr 23, 2024

Add SWE-agent tools as sandbox plugins #1305

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(sandbox): Candidate Implementation of Sandbox Plugin to Support Jupyter #1255

feat(sandbox): Candidate Implementation of Sandbox Plugin to Support Jupyter #1255

xingyaoww commented Apr 20, 2024 •

edited

Loading

rbren left a comment

rbren Apr 20, 2024

xingyaoww Apr 21, 2024

xingyaoww Apr 21, 2024

rbren Apr 21, 2024

xingyaoww Apr 22, 2024

rbren Apr 20, 2024

xingyaoww Apr 21, 2024

rbren Apr 21, 2024

xingyaoww Apr 21, 2024

rbren Apr 21, 2024

xingyaoww Apr 22, 2024

rbren commented Apr 20, 2024

xingyaoww commented Apr 21, 2024 •

edited

Loading

rbren commented Apr 21, 2024

xingyaoww Apr 22, 2024

xingyaoww Apr 22, 2024

xingyaoww Apr 22, 2024

mlejva Apr 22, 2024 •

edited

Loading

mlejva Apr 22, 2024

xingyaoww commented Apr 22, 2024 •

edited

Loading

rbren Apr 22, 2024

xingyaoww Apr 22, 2024 •

edited

Loading

rbren left a comment

mlejva commented Apr 23, 2024

feat(sandbox): Candidate Implementation of Sandbox Plugin to Support Jupyter #1255

feat(sandbox): Candidate Implementation of Sandbox Plugin to Support Jupyter #1255

Conversation

xingyaoww commented Apr 20, 2024 • edited Loading

rbren left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rbren commented Apr 20, 2024

xingyaoww commented Apr 21, 2024 • edited Loading

rbren commented Apr 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mlejva Apr 22, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xingyaoww commented Apr 22, 2024 • edited Loading

Choose a reason for hiding this comment

xingyaoww Apr 22, 2024 • edited Loading

Choose a reason for hiding this comment

rbren left a comment

Choose a reason for hiding this comment

mlejva commented Apr 23, 2024

xingyaoww commented Apr 20, 2024 •

edited

Loading

xingyaoww commented Apr 21, 2024 •

edited

Loading

mlejva Apr 22, 2024 •

edited

Loading

xingyaoww commented Apr 22, 2024 •

edited

Loading

xingyaoww Apr 22, 2024 •

edited

Loading