docs: improve code sandbox documentation
1 parent 1668779, commit f68162a
Showing 2 changed files with 189 additions and 17 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,191 @@ | ||
# Code Sandbox | ||
|
||
<!-- Placeholder --> | ||
`tablegpt-agent` directs `tablegpt` to generate Python code for data analysis. However, the generated code may contain vulnerabilities or unexpected errors. Running such code directly in a production environment could threaten the system's stability and security. | ||
|
||
`Code Sandbox` is designed to address this challenge. By leveraging sandbox technology, it confines code execution to a controlled environment, effectively preventing malicious or unexpected behaviors from impacting the main system. This provides an isolated and reliable space for running code safely. | ||
|
||
`Code Sandbox` is built on the [pybox](https://github.com/edwardzjl/pybox) library and supports three main execution modes: | ||
|
||
- **Local Environment**: Executes code in a local sandbox for quick *deployment* and *validation*. | ||
- **Remote Environment**: Creates remote environments through `Jupyter Enterprise Gateway` to achieve shared computing. | ||
- **Cluster Environment**: Bypasses the need for proxy services such as `Jupyter Enterprise Gateway` by communicating directly with kernel pods. | ||
|
||
Code Sandbox is designed based on the following key principles: | ||
|
||
- **Security**: Limits code access using sandbox technology to ensure a safe and reliable execution environment. | ||
- **Isolation**: Provides independent execution environments for each task, ensuring strict separation of resources and data. | ||
- **Scalability**: Adapts to diverse computing environments, from local setups to Kubernetes clusters, supporting dynamic resource allocation and efficient task execution. | ||
|
||
|
||
## Local Environment | ||
|
||
In a local environment, Code Sandbox utilizes the `pybox` library to create and manage sandbox environments, providing a secure code execution platform. By isolating code execution from the host system's resources and imposing strict permission controls, it ensures safety and reliability. This approach is especially suitable for **development** and **debugging** scenarios. | ||
|
||
If you want to run `tablegpt-agent` in a local environment, you can enable the **local mode**. Below are the installation steps and a detailed operation guide. | ||
|
||
### Installing | ||
|
||
To use `tablegpt-agent` in local mode, install the library with the following command: | ||
|
||
```sh | ||
pip install tablegpt-agent[local] | ||
``` | ||
|
||
### Configuring | ||
|
||
`tablegpt-agent` comes with several built-in features, such as auxiliary methods for data analysis and support for displaying Chinese fonts. **These features are automatically added to the sandbox environment by default**. If you need advanced customization (e.g., adding specific methods or fonts), refer to the [TableGPT IPython Kernel Configuration Documentation](https://github.com/tablegpt/tablegpt-agent/tree/main/ipython) for further guidance. | ||
|
||
### Creating and Running | ||
|
||
The following code demonstrates how to use the pybox library to set up a sandbox, execute code, and retrieve results in a local environment: | ||
|
||
```python | ||
from uuid import uuid4 | ||
from pybox import LocalPyBoxManager, PyBoxOut | ||
|
||
# Initialize the local sandbox manager | ||
pybox_manager = LocalPyBoxManager() | ||
|
||
# Assign a unique Kernel ID for the sandbox | ||
kernel_id = str(uuid4()) | ||
|
||
# Start the sandbox environment | ||
box = pybox_manager.start(kernel_id) | ||
|
||
# Define the test code to execute | ||
test_code = """ | ||
import math | ||
result = math.sqrt(16) | ||
result | ||
""" | ||
|
||
# Run the code in the sandbox | ||
out: PyBoxOut = box.run(code=test_code) | ||
|
||
# Print the execution result | ||
print(out) | ||
``` | ||
|
||
### Example Output | ||
|
||
After running the above code, the system will return the following output, indicating successful execution with no errors: | ||
```text | ||
data=[{'text/plain': '4.0'}] error=None | ||
``` | ||
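The printed result suggests an object with a `data` list of MIME-keyed dictionaries and an `error` field. As a minimal sketch of how such a result could be checked, the block below uses a hypothetical stand-in class, `SandboxResult`, that only mirrors the printed shape; it is not the `PyBoxOut` class itself, and the helper name `plain_text_outputs` is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SandboxResult:
    """Hypothetical stand-in mirroring the printed shape `data=[...] error=None`."""

    data: list[dict[str, Any]] = field(default_factory=list)
    error: Optional[Any] = None


def plain_text_outputs(result: SandboxResult) -> list[str]:
    """Return every 'text/plain' payload, raising if the run reported an error."""
    if result.error is not None:
        raise RuntimeError(f"Sandbox execution failed: {result.error}")
    return [item["text/plain"] for item in result.data if "text/plain" in item]


# Same values as the example output above
example = SandboxResult(data=[{"text/plain": "4.0"}], error=None)
print(plain_text_outputs(example))  # ['4.0']
```

Checking `error` before reading `data` keeps a failed run from being mistaken for an empty result.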
|
||
With `Code Sandbox` in local execution mode, developers can enjoy the safety of sandbox isolation at minimal cost while maintaining flexibility and efficiency. This lays a solid foundation for more complex remote or cluster-based scenarios. | ||
|
||
|
||
## Remote Environment | ||
|
||
In a remote environment, `Code Sandbox` uses the `pybox` library and its `RemotePyBoxManager` to create and manage sandbox environments. The remote mode relies on the [Enterprise Gateway](https://github.com/jupyter-server/enterprise_gateway) service to create remote sandboxes on demand and execute code in them. This mode allows multiple services to connect to the same remote environment, enabling shared access to resources. | ||
|
||
### Configuring | ||
|
||
If `tablegpt-agent` is used in **remote mode**, the first step is to start the `enterprise_gateway` service. You can refer to the [Enterprise Gateway Deployment Guide](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/operators/index.html#deploying-enterprise-gateway) for detailed instructions on configuring and starting the service. | ||
|
||
Once the service is up and running, ensure that the service address is accessible. For example, assume the `enterprise_gateway` service is available at `http://example.com`. | ||
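One way to confirm that the address is accessible is to request the gateway's kernel spec listing. The sketch below uses only the standard library and assumes the service exposes the Jupyter REST endpoint `/api/kernelspecs` at the given host; the helper names `kernelspecs_url` and `gateway_is_reachable` are hypothetical and not part of `pybox`.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def kernelspecs_url(host: str) -> str:
    """Build the kernel spec listing URL for a gateway host (assumed endpoint)."""
    return urljoin(host.rstrip("/") + "/", "api/kernelspecs")


def gateway_is_reachable(host: str, timeout: float = 5.0) -> bool:
    """Return True if the gateway answers the kernel spec request with HTTP 200."""
    try:
        with urlopen(kernelspecs_url(host), timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP errors
        return False


print(kernelspecs_url("http://example.com"))  # http://example.com/api/kernelspecs
```

Calling `gateway_is_reachable("http://example.com")` before creating the manager turns a misconfigured address into an early, explicit failure.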
|
||
### Creating and Running | ||
|
||
The following code demonstrates how to create a remote sandbox using `RemotePyBoxManager` and execute code within it: | ||
|
||
```python | ||
from uuid import uuid4 | ||
from pybox import RemotePyBoxManager, PyBoxOut | ||
|
||
# Initialize the remote sandbox manager; replace the host with your actual Enterprise Gateway service address | ||
pybox_manager = RemotePyBoxManager(host="http://example.com") | ||
|
||
# Assign a unique Kernel ID | ||
kernel_id = str(uuid4()) | ||
|
||
# Start the remote sandbox environment | ||
box = pybox_manager.start(kernel_id) | ||
|
||
# Define the test code | ||
test_code = """ | ||
import math | ||
result = math.sqrt(16) | ||
result | ||
""" | ||
|
||
# Run the code in the sandbox | ||
out: PyBoxOut = box.run(code=test_code) | ||
|
||
# Print the execution result | ||
print(out) | ||
``` | ||
|
||
### Example Output | ||
|
||
After executing the above code, the system will return the following output, indicating successful execution without any errors: | ||
|
||
```plaintext | ||
data=[{'text/plain': '4.0'}] error=None | ||
``` | ||
|
||
### Advanced Environment Configuration | ||
|
||
The `RemotePyBoxManager` provides the following advanced configuration options to allow for flexible customization of the sandbox execution environment: | ||
|
||
1. **`env_file`**: Allows you to load environment variables from a file to configure the remote sandbox. | ||
2. **`kernel_env`**: Enables you to pass environment variables directly as key-value pairs, simplifying the setup process. | ||
|
||
To learn more about the parameters and configuration options, refer to the [Kernel Environment Variables](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/users/kernel-envs.html) documentation. | ||
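The two options carry the same kind of information in different forms: a file on disk versus an in-memory mapping. The sketch below is not `pybox` code. It assumes a dotenv-style file of `KEY=VALUE` lines with `#` comments and shows how such a file corresponds to a dictionary of the kind `kernel_env` takes. The helper `parse_env_file` and the variable values are hypothetical, and the exact file format `pybox` expects should be checked against its documentation.

```python
import tempfile
from pathlib import Path


def parse_env_file(path: str) -> dict[str, str]:
    """Parse KEY=VALUE lines into a dict, skipping blanks and '#' comments."""
    env: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


# Write a small example file, then read it back as a dictionary
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / "kernel.env"
    env_path.write_text(
        "# example kernel settings\nKERNEL_USERNAME=alice\nKERNEL_NAMESPACE=sandbox\n",
        encoding="utf-8",
    )
    kernel_env = parse_env_file(str(env_path))

print(kernel_env)  # {'KERNEL_USERNAME': 'alice', 'KERNEL_NAMESPACE': 'sandbox'}
```

A file suits settings shared across deployments, while a dictionary is convenient when values are computed at runtime.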
|
||
|
||
## Cluster Environment | ||
|
||
In a Kubernetes cluster, `Code Sandbox` leverages the `KubePyBoxManager` provided by the `pybox` library to create and manage sandboxes. Unlike the `remote environment`, the cluster environment **communicates directly with Kernel Pods** created by the [Jupyter Kernel Controller](https://github.com/edwardzjl/jupyter-kernel-controller), eliminating the need for an intermediary service like `Enterprise Gateway`. | ||
|
||
### Configuring | ||
|
||
Before using the cluster environment, you need to deploy the `jupyter-kernel-controller` service. You can quickly create the required CRDs and Deployments using the [Deploy Documentation](https://github.com/edwardzjl/jupyter-kernel-controller?tab=readme-ov-file#build-run-deploy). | ||
|
||
### Creating and Running | ||
|
||
Once the `jupyter-kernel-controller` service is successfully deployed and running, you can create and run a cluster sandbox using the following code: | ||
|
||
```python | ||
from uuid import uuid4 | ||
from pybox import KubePyBoxManager, PyBoxOut | ||
|
||
# Initialize the cluster sandbox manager; replace the placeholders with your own path and environment variables | ||
pybox_manager = KubePyBoxManager( | ||
    env_file="YOUR_ENV_FILE_PATH",  # Path to the environment variable file | ||
    kernel_env={"YOUR_VAR_NAME": "YOUR_VAR_VALUE"},  # Kernel environment variables as key-value pairs (placeholder entry) | ||
) | ||
|
||
# Assign a unique Kernel ID | ||
kernel_id = str(uuid4()) | ||
|
||
# Start the cluster sandbox environment | ||
box = pybox_manager.start(kernel_id) | ||
|
||
# Define the test code | ||
test_code = """ | ||
import math | ||
result = math.sqrt(16) | ||
result | ||
""" | ||
|
||
# Run the code in the sandbox | ||
out: PyBoxOut = box.run(code=test_code) | ||
|
||
# Print the execution result | ||
print(out) | ||
``` | ||
|
||
### Example Output | ||
|
||
After executing the code above, the following output will be returned, indicating successful execution without any errors: | ||
|
||
```plaintext | ||
data=[{'text/plain': '4.0'}] error=None | ||
``` | ||
|
||
**NOTE:** The `env_file` and `kernel_env` parameters required by `KubePyBoxManager` are essentially the same as those for `RemotePyBoxManager`. For detailed information about these parameters, please refer to the [RemotePyBoxManager Advanced Environment Configuration](#advanced-environment-configuration). | ||
|
||
|
||
With the above configuration, you can create and manage secure, isolated sandboxes in a Kubernetes cluster and run generated code in them directly, without an intermediary gateway. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,3 @@ | ||
# Incluster Code Execution | ||
|
||
The `tablegpt-agent` directs `tablegpt` to generate Python code for data analysis. This code is then executed within a sandbox environment to ensure system security. The execution is managed by the [pybox](https://github.com/edwardzjl/pybox) library, which provides a simple way to run Python code outside the main process. | ||
|
||
## Usage | ||
|
||
If you're using the local executor (`pybox.LocalPyBoxManager`), follow these steps to configure the environment: | ||
|
||
1. Install the dependencies required for the `IPython Kernel` using the following command: | ||
|
||
```sh | ||
pip install -r ipython/requirements.txt | ||
``` | ||
|
||
2. Copy the code from the `ipython/ipython-startup-scripts` folder to the `$HOME/.ipython/profile_default/startup/` directory. | ||
|
||
This folder contains the functions and configurations needed to perform data analysis with `tablegpt-agent`. | ||
|
||
Note: The `~/.ipython` directory must be writable by the process launching the kernel. Otherwise, IPython emits the warning `UserWarning: IPython dir '/home/jovyan/.ipython' is not a writable location, using a temp directory.` and the startup scripts won't take effect.
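
The copy step and the writability requirement can be combined into one check. This is a minimal sketch using only the standard library. The helper `install_startup_scripts` is hypothetical, and the demonstration runs against temporary directories with a placeholder script rather than the real `ipython/ipython-startup-scripts` folder and `$HOME/.ipython`.

```python
import os
import shutil
import tempfile
from pathlib import Path


def install_startup_scripts(source_dir: Path, ipython_dir: Path) -> list[Path]:
    """Copy every file in source_dir into the default profile's startup directory."""
    startup_dir = ipython_dir / "profile_default" / "startup"
    startup_dir.mkdir(parents=True, exist_ok=True)
    if not os.access(startup_dir, os.W_OK):
        raise PermissionError(f"{startup_dir} is not writable; startup scripts would be ignored")
    copied = []
    for script in sorted(p for p in source_dir.iterdir() if p.is_file()):
        copied.append(Path(shutil.copy2(script, startup_dir)))
    return copied


# Demonstration with temporary directories standing in for the real locations
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "ipython-startup-scripts"
    source.mkdir()
    (source / "00-example.py").write_text("# placeholder startup script\n", encoding="utf-8")
    copied = install_startup_scripts(source, Path(tmp) / ".ipython")
    copied_names = [path.name for path in copied]

print(copied_names)  # ['00-example.py']
```

For a real setup, the call would take the repository's startup script folder and `Path.home() / ".ipython"` as arguments.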