Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sandbox): Candidate Implementation of Sandbox Plugin to Support Jupyter #1255

Merged
merged 24 commits into from
Apr 23, 2024

Conversation

xingyaoww
Copy link
Collaborator

@xingyaoww xingyaoww commented Apr 20, 2024

As discussed in #1251, we need a better way (instead of the not ideal attempt in #1215) to incorporate different plugins/tools (e.g., jupyter kernel for interactive python execution).

The goal is "If the agent implementer wants to make any assumptions about the environment beyond "it's bash", the agent needs to make sure everything is set up." The current implementation assumes different behavior for each sandbox that may go against the goal.

In this PR, I attempt to implement one that satisfies this goal by allowing any arbitrary sandbox to be initialized with plugins (a setup.sh that automatically sets everything needed). I added PluginMixin, which can be mixed with all Sandbox that leverage their .execute method to complete the setup process.

I also mapped opendevin/.cache to ~/.cache inside the sandbox so that the user doesn't need to download the pip install packages everytime they re-start the container -- these will be automatically cached. The user can also clean-up these caches (e.g., if they want to switch to a different sandbox) by running make clean.

I have only implemented this for DockerSSHBox yet, and i hope to get some feedback before proceeding to include this Mixin for all boxes :)

image

Once setup.sh is completed, you can send code to jupyter kernel for execution by using echo "import math" | execute_cli.
image

@xingyaoww xingyaoww requested a review from rbren April 20, 2024 17:30
Copy link
Collaborator

@rbren rbren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one thing I don't see here is: how does the agent send a python command to jupyter?

opendevin/sandbox/plugins/mixin.py Outdated Show resolved Hide resolved
opendevin/sandbox/docker/ssh_box.py Outdated Show resolved Hide resolved
Comment on lines +143 to +150
# chown the home directory
exit_code, logs = self.container.exec_run(
['/bin/bash', '-c', 'chown opendevin:root /home/opendevin'],
workdir=SANDBOX_WORKSPACE_DIR,
)
if exit_code != 0:
raise Exception(
f'Failed to chown home directory for opendevin in sandbox: {logs}')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we want to pull this out in to the core Sandbox implementation, as initialize_homedir. Some sandboxes might override it, but most are going to do exactly this.

(TBH there's probably a lot of logic we could deduplicate between SSH, Exec, and Local)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! I was also thinking heavily about this while implementing this, maybe we should make a BaseClass to keep all the common logic (including this one!)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well... just realized this might just work on DockerSSHBox and ExecBox, since in LocalBox we are just re-using the current user.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm good point. I could see putting this logic in the base class, and then having a pass override for the LocalBox that makes it a no-op

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess once we deprecate ExecBox, this logic will not be duplicated since DockerSSHBox, LocalBox and E2BBox will likely to be very different from each other in terms of implementation. So maybe we can leave this as is, and then try to deprecate ExecBox in the next few days/weeks?

@@ -0,0 +1,13 @@
#!/bin/bash
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this script getting used anywhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer this & your previous question: "how does the agent send a python command to jupyter?"

opendevin/sandbox/plugins will be mounted to /opendevin/plugins/ inside the sandbox container, and the setup.sh script will then add /opendevin/plugins/jupyter to $PATH, making sure that the sandbox .execute function has access to execute_cli tool.

Whenever the agent wants to execute Python, they can just do echo "import math" | execute_cli in bash and get the response in stdout (this part will likely be implemented by agent-specific logic, rather than prompting agents to do it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice! Maybe just add a note to this file that it's a convenience for the agent?

Should the setup script copy this file over to the sandbox environment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, do you mean we should just copy the execute_cli into some ..bin folder, instead of adding the entire /opendevin/plugins/jupyter into $PATH? That could be another alternative too!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly! That way we don't have to mount additional directories. That'll make it easier to e.g. support jupyter on e2b

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done! i added copy_to method to the Sandbox interface, and supports how Plugins works through copying

opendevin/sandbox/docker/ssh_box.py Show resolved Hide resolved
@rbren
Copy link
Collaborator

rbren commented Apr 20, 2024

That was super fast! This is looking awesome. I think you're definitely on the right track

@xingyaoww
Copy link
Collaborator Author

xingyaoww commented Apr 21, 2024

@rbren After trying to extend this Plugin capability to different sandboxes, I realized that it could be challenging since different sandbox would assume different "user" and levels of system access:

  • ExecBox can potentially work, but its .execute is not stateful, which make this Sandbox different (deviating from our original goal that each Sandbox should has a consistent interface?)
  • LocalBox is executed locally, which means it may not have sudo commands available; Also, it is not stateful (e.g., environment variables set in the previous turn is no longer accessible in subsequent turns).

If we managed to get the DockerSSHBox working stably with the DOOD route, do you think if we can deprecate LocalBox and ExecBox and solely focus on SSHBox for future development? It could saves a tons of efforts (but of course we need to double-check Windows user can run this DOOD without issues)

@rbren
Copy link
Collaborator

rbren commented Apr 21, 2024

@xingyaoww I definitely want to get rid of ExecBox, for the reason you stated--it's not stateful, and we want something that's stateful. I think with the --add-host trick we might be ready?

LocalBox seems potentially useful still. And we want to keep supporting e2b.

I think so long as we make sure the plugin script fails with a good error (e.g. "Failed to install $PLUGIN on $SANDBOX_TYPE: $ERROR_MESSAGE"), we can make things more consistent over time. Really what I'd like is for sandbox maintainers (e.g. mlejva for e2b) to be responsible for ensuring their sandbox env is as consistent as possible, with SSHBox being our "north star".

@@ -122,6 +124,37 @@ def run_command(container, command):
return -1, f'Command: "{cmd}" timed out'
return exit_code, logs.decode('utf-8')

def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code duplication from DockerSSHBox which is not ideal, but i guess we can leave them temporarily for this PR and plan to deprecate ExecBox in the next few days.

Comment on lines +143 to +150
# chown the home directory
exit_code, logs = self.container.exec_run(
['/bin/bash', '-c', 'chown opendevin:root /home/opendevin'],
workdir=SANDBOX_WORKSPACE_DIR,
)
if exit_code != 0:
raise Exception(
f'Failed to chown home directory for opendevin in sandbox: {logs}')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess once we deprecate ExecBox, this logic will not be duplicated since DockerSSHBox, LocalBox and E2BBox will likely to be very different from each other in terms of implementation. So maybe we can leave this as is, and then try to deprecate ExecBox in the next few days/weeks?

@@ -61,6 +61,10 @@ def execute(self, cmd: str) -> Tuple[int, str]:
assert process_output.exit_code is not None
return process_output.exit_code, logs_str

def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False):
# FIXME
raise NotImplementedError('Copying files to E2B sandbox is not implemented yet')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this shouldn't be so hard to implement for E2B since we already have .filesystem ready? cc @mlejva

Copy link
Contributor

@mlejva mlejva Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xingyaoww yes, shouldn't be hard. It's possible to upload files (up to 100MB) to the sandbox or one can just write to a file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rbren let me quickly implement this before merging

@xingyaoww
Copy link
Collaborator Author

xingyaoww commented Apr 22, 2024

@xingyaoww I definitely want to get rid of ExecBox, for the reason you stated--it's not stateful, and we want something that's stateful. I think with the --add-host trick we might be ready?

LocalBox seems potentially useful still. And we want to keep supporting e2b.

I think so long as we make sure the plugin script fails with a good error (e.g. "Failed to install $PLUGIN on $SANDBOX_TYPE: $ERROR_MESSAGE"), we can make things more consistent over time. Really what I'd like is for sandbox maintainers (e.g. mlejva for e2b) to be responsible for ensuring their sandbox env is as consistent as possible, with SSHBox being our "north star".

Agree! Btw, I think ExecBox will suffer from the exact same issue of --network=host, since it also relies on docker, right? i.e., flask server started by ExecBox also cannot be accessed from outside the container.

Also, just realizing the --network=host is supported with the newest version of Docker Desktop! If this works, i guess we can probably deprecate the ExecBox, and having DOOD with DockerSSHBox as the default setting (if they also work well on Linux).

Version Docker Desktop 4.29.0
Settings ⇒ Features in development ⇒ Enable host networking
Host networking allows containers that are started with --net=host to use localhost to connect to TCP and UDP services on the host. It will automatically allow software on the host to use localhost to connect to TCP and UDP services in the container. Sign in required.

Enable host networking

Originally posted by @4-FLOSS-Free-Libre-Open-Source-Software in docker/roadmap#238 (comment)

Comment on lines +151 to +153
_cache_dir = config.get('CACHE_DIR')
if _cache_dir:
pathlib.Path(_cache_dir).mkdir(parents=True, exist_ok=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely certain of the purpose of cache dir--do we still need this?

Copy link
Collaborator Author

@xingyaoww xingyaoww Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this (at least for now) to cache the pip install commands in Jupyter requirement dependency - without this, you need to re-download the whl for every sandbox initialization, which can be way too slow. I still felt it is a bit slow with cache (you still need to spend the CPU hours)... we probably need to figure out a way to cache it in the future further (e.g., save the docker image with the requirement installed and re-use it as much as possible)

Copy link
Collaborator

@rbren rbren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I think this is ready to go. Just one question

@mlejva
Copy link
Contributor

mlejva commented Apr 23, 2024

@xingyaoww @rbren I implemented the copy_to method for E2BBox in a separate PR #1298

@xingyaoww xingyaoww merged commit fc5e075 into All-Hands-AI:main Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants