Merge pull request #1545 from cyzus/sela-readme
indentation on readme, renaming
garylin2099 authored Oct 29, 2024
2 parents cf03c5d + 37698b3 commit df51f45
Showing 16 changed files with 288 additions and 277 deletions.
279 changes: 48 additions & 231 deletions metagpt/ext/sela/README.md
@@ -1,29 +1,26 @@
# SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning



## 1. Data Preparation

- Download Datasets: https://deepwisdom.feishu.cn/drive/folder/RVyofv9cvlvtxKdddt2cyn3BnTc?from=from_copylink
- Download and prepare datasets from scratch:
```
cd data
python dataset.py --save_analysis_pool
python hf_data.py --save_analysis_pool
```
You can either download the datasets from the link or prepare the datasets from scratch.
- **Download Datasets:** [Dataset Link](https://deepwisdom.feishu.cn/drive/folder/RVyofv9cvlvtxKdddt2cyn3BnTc?from=from_copylink)
- **Download and prepare datasets from scratch:**
```bash
cd data
python dataset.py --save_analysis_pool
python hf_data.py --save_analysis_pool
```

## 2. Configs
## 2. Configurations

### Data Config

`datasets.yaml` Provide base prompts, metrics, target columns for respective datasets

- Modify `datasets_dir` to the root directory of all the datasets in `data.yaml`

- **`datasets.yaml`:** Provide base prompts, metrics, and target columns for respective datasets.
- **`data.yaml`:** Modify `datasets_dir` to the base directory of all prepared datasets.

### LLM Config

```
```yaml
llm:
  api_type: 'openai'
  model: deepseek-coder
@@ -32,237 +29,57 @@ llm:
  temperature: 0.5
```

### Budget
Experiment rollouts k = 5, 10, 20


### Prompt Usage

- Use the function `generate_task_requirement` in `dataset.py` to get task requirement.
- If the method is non-DI-based, set `is_di=False`.
- Use `utils.DATA_CONFIG` as `data_config`


## 3. SELA

### Run SELA

#### Setup
In the root directory,

```
```bash
pip install -e .
cd expo
cd metagpt/ext/sela
pip install -r requirements.txt
```

#### Run

- Examples
```
python run_experiment.py --exp_mode mcts --task titanic --rollouts 10
python run_experiment.py --exp_mode mcts --task house-prices --rollouts 10 --low_is_better
```


- `--rollouts` - The number of rollouts

- `--use_fixed_insights` - In addition to the generated insights, include the fixed insights saved in `expo/insights/fixed_insights.json`

- `--low_is_better` - If the dataset has reg metric, remember to use `--low_is_better`

- `--from_scratch` - Do not use pre-processed insight pool, generate new insight pool based on dataset before running MCTS, facilitating subsequent tuning to propose search space prompts

- `--role_timeout` - The timeout for the role
- This feature limits the duration of a single simulation, making the experiment duration more controllable (for example, if you do ten rollouts and set role_timeout to 1,000, the experiment will stop at the latest after 10,000s)
#### Running Experiments


- `--max_depth` - The maximum depth of MCTS, default is 4 (nodes at this depth directly return the previous simulation result without further expansion)

- `--load_tree` - If MCTS was interrupted due to certain reasons but had already run multiple rollouts, you can use `--load_tree`.
- For example:
```
- **Examples:**
```bash
python run_experiment.py --exp_mode mcts --task titanic --rollouts 10
python run_experiment.py --exp_mode mcts --task house-prices --rollouts 10 --low_is_better
```
- If this was interrupted after running three rollouts, you can use `--load_tree`:
```
python run_experiment.py --exp_mode mcts --task titanic --rollouts 7 --load_tree
```
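The resume example above implies that `--rollouts` counts only the rollouts still to be run when `--load_tree` is set (10 planned, 3 completed, 7 remaining). A minimal sketch of that arithmetic follows; `resume_command` is a hypothetical helper, not part of SELA, and the "remaining rollouts" reading is an assumption drawn from the example rather than from the code.

```python
import shlex


def resume_command(task: str, planned_rollouts: int, completed_rollouts: int) -> str:
    """Build the command line that finishes an interrupted MCTS run.

    Hypothetical helper: assumes `--rollouts` together with `--load_tree`
    means "rollouts still to run", as the README example suggests.
    """
    remaining = planned_rollouts - completed_rollouts
    if remaining <= 0:
        raise ValueError("all planned rollouts have already completed")
    args = [
        "python", "run_experiment.py",
        "--exp_mode", "mcts",
        "--task", task,
        "--rollouts", str(remaining),
        "--load_tree",
    ]
    # shlex.join quotes arguments only where the shell would need it
    return shlex.join(args)


print(resume_command("titanic", planned_rollouts=10, completed_rollouts=3))
```

For the titanic case this prints the same command as the example above.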
#### Ablation Study
**DI RandomSearch**
- Single insight
`python run_experiment.py --exp_mode rs --task titanic --rs_mode single`
- Set insight
`python run_experiment.py --exp_mode rs --task titanic --rs_mode set`
## 4. Evaluation
Each baseline needs to produce `dev_predictions.csv` and `test_predictions.csv`. Each CSV file only needs a `target` column.
- Use the function `evaluate_score` to evaluate.
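
As an illustration of the expected file format only, the sketch below writes two prediction files with a single `target` column using the standard library. `write_predictions`, the scratch directory, and the sample values are hypothetical and made up for the demo; the signature of `evaluate_score` is not shown in this README, so it is not called here.

```python
import csv
import tempfile
from pathlib import Path


def write_predictions(path, predictions):
    """Write predictions as a CSV file with one `target` column (hypothetical helper)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["target"])  # header row: the only required column
        writer.writerows([p] for p in predictions)  # one prediction per row


# Scratch directory for the demo; a real baseline would write to its own output folder.
out_dir = Path(tempfile.mkdtemp())
write_predictions(out_dir / "dev_predictions.csv", [0, 1, 1])
write_predictions(out_dir / "test_predictions.csv", [1, 0])
print(sorted(p.name for p in out_dir.iterdir()))
```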
#### MLE-Bench
**Note: mle-bench requires python 3.11 or higher**
```
git clone https://github.com/openai/mle-bench.git
cd mle-bench
pip install -e .
```
```
mlebench prepare -c <competition-id> --data-dir <dataset-dir-save-path>
```
Enter the following command to run the experiment:
```
python run_experiment.py --exp_mode mcts --custom_dataset_dir <dataset-dir-save-path/prepared/public> --rollouts 10 --from_scratch --role_timeout 3600
```
## 5. Baselines
### AIDE
#### Setup
The version of AIDE we use is dated September 30, 2024
```
git clone https://github.com/WecoAI/aideml.git
git -C aideml checkout 77953247ea0a5dc1bd502dd10939dd6d7fdcc5cc
```

Modify `aideml/aide/utils/config.yaml` - change `k_fold_validation`, `code model`, and `feedback model` as follows:
```yaml
# agent hyperparams
agent:
  # how many improvement iterations to run
  steps: 10
  # whether to instruct the agent to use CV (set to 1 to disable)
  k_fold_validation: 1
  # LLM settings for coding
  code:
    model: deepseek-coder
    temp: 0.5
  # LLM settings for evaluating program output / tracebacks
  feedback:
    model: deepseek-coder
    temp: 0.5
  # hyperparameters for the tree search
  search:
    max_debug_depth: 3
    debug_prob: 0.5
    num_drafts: 5
```

Since DeepSeek is compatible with OpenAI's API, set `base_url` to your own URL and `api_key` to your API key:

```
export OPENAI_API_KEY="your api key"
export OPENAI_BASE_URL="your own url"
```

Modify `aideml/aide/backend/__init__.py`'s line 30 and below:

```python
model_kwargs = model_kwargs | {
    "model": model,
    "temperature": temperature,
    "max_tokens": max_tokens,
}
if "claude-" in model:
    query_func = backend_anthropic.query
else:
    query_func = backend_openai.query
```

Since DeepSeek-V2.5 no longer supports a system message together with function calling, modify line 312 of `aideml/aide/agent.py`:

```python
response = cast(
    dict,
    query(
        system_message=None,
        user_message=prompt,
        func_spec=review_func_spec,
        model=self.acfg.feedback.model,
        temperature=self.acfg.feedback.temp,
    ),
)
```

Modify and install:

```
cd aideml
pip install -e .
```

#### Run

Run the following script to produce the results. It generates a `log` folder and a `workspace` folder in the current directory: the `log` folder contains the experiment configuration and the generated plan, and the `workspace` folder stores the final results generated by AIDE.

```
python experimenter/aide.py
```

### Autogluon
#### Setup
```
pip install -U pip
pip install -U setuptools wheel
pip install autogluon==1.1.1
```

For Tabular data:
```
python run_experiment.py --exp_mode autogluon --task {task_name}
```
For Multimodal data:
```
python run_experiment.py --exp_mode autogluon --task {task_name} --is_multimodal
```
Replace {task_name} with the specific task you want to run.


### AutoSklearn
#### System requirements
auto-sklearn has the following system requirements:

- Linux operating system (for example Ubuntu)

- Python (>=3.7)

- C++ compiler (with C++11 support)

In case you try to install Auto-sklearn on a system where no wheel files for the pyrfr package are provided (see here for available wheels) you also need:

- SWIG ([get SWIG here](https://www.swig.org/survey.html)).

For an explanation of missing Microsoft Windows and macOS support please check the Section [Windows/macOS compatibility](https://automl.github.io/auto-sklearn/master/installation.html#windows-macos-compatibility).

#### Setup
```
pip install auto-sklearn==0.15.0
```

#### Run
```
python run_experiment.py --exp_mode autosklearn --task titanic
```
#### Parameters

- **`--rollouts`:** The number of rollouts.
- **`--use_fixed_insights`:** Include fixed insights saved in `expo/insights/fixed_insights.json`.
- **`--low_is_better`:** Use this if the dataset has a regression metric.
- **`--from_scratch`:** Generate a new insight pool based on the dataset before running MCTS.
- **`--role_timeout`:** Limits the duration of a single simulation (e.g., 10 rollouts with a timeout of 1,000 s take at most 10,000 s).
- **`--max_depth`:** Set the maximum depth of MCTS (default is 4).
- **`--load_tree`:** Load an existing MCTS tree if the previous experiment was interrupted.
- Example:
```bash
python run_experiment.py --exp_mode mcts --task titanic --rollouts 10
```
- To resume:
```bash
python run_experiment.py --exp_mode mcts --task titanic --rollouts 7 --load_tree
```
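
The `--role_timeout` bound described above is a simple product. The sketch below works it through, assuming rollouts run one after another and each rollout performs one simulation; `worst_case_seconds` is a hypothetical helper, not part of SELA.

```python
def worst_case_seconds(rollouts: int, role_timeout: int) -> int:
    """Upper bound on experiment duration in seconds.

    Assumes rollouts run sequentially and each one is cut off after
    at most `role_timeout` seconds (hypothetical helper).
    """
    return rollouts * role_timeout


# The README's own example: 10 rollouts with a 1,000 s timeout.
print(worst_case_seconds(10, 1000))

# The MLE-Bench command uses --rollouts 10 --role_timeout 3600; bound in hours.
print(worst_case_seconds(10, 3600) / 3600)
```

Under these assumptions the first call gives 10,000 s and the second gives a 10-hour ceiling for the MLE-Bench run.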

### Ablation Study

**RandomSearch**

- **Use a single insight:**
```bash
python run_experiment.py --exp_mode rs --task titanic --rs_mode single
```

### Base DI
For setup, check 4.
- `python run_experiment.py --exp_mode base --task titanic --num_experiments 10`
- Specifically instruct DI to use AutoGluon: `--special_instruction ag`
- Specifically instruct DI to use the stacking ensemble method: `--special_instruction stacking`
- **Use a set of insights:**
```bash
python run_experiment.py --exp_mode rs --task titanic --rs_mode set
```
2 changes: 1 addition & 1 deletion metagpt/ext/sela/data.yaml
@@ -1,3 +1,3 @@
datasets_dir: "path/to/datasets" # path to the datasets directory
work_dir: ../../workspace # path to the workspace directory
work_dir: ../../../workspace # path to the workspace directory
role_dir: storage/SELA # path to the role directory
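
To see what the two `work_dir` values point at, the sketch below normalizes each against `metagpt/ext/sela`. Treating that directory as the base for relative paths is an assumption made for illustration; the diff does not say how `work_dir` is resolved.

```python
import posixpath

# Assumed base for relative paths (not stated in the diff).
package_dir = "metagpt/ext/sela"

old = posixpath.normpath(posixpath.join(package_dir, "../../workspace"))
new = posixpath.normpath(posixpath.join(package_dir, "../../../workspace"))

print(old)  # metagpt/workspace
print(new)  # workspace
```

Under that assumption, the new value resolves to a `workspace` folder at the repository root.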
2 changes: 1 addition & 1 deletion metagpt/ext/sela/data/custom_task.py
@@ -1,7 +1,7 @@
import os

from metagpt.ext.sela.data.dataset import SPECIAL_INSTRUCTIONS
from metagpt.ext.sela.experimenter.mle_bench.instructions import (
from metagpt.ext.sela.runner.mle_bench.instructions import (
    ADDITIONAL_NOTES,
    INSTRUCTIONS,
    INSTRUCTIONS_OBFUSCATED,
@@ -60,7 +60,7 @@ async def wrapper(self, *args, **kwargs):
    return decorator


class ResearchAssistant(DataInterpreter):
class Experimenter(DataInterpreter):
    node_id: str = "0"
    start_task_id: int = 1
    state_saved: bool = False
@@ -78,7 +78,7 @@ def change_next_instruction(self, new_instruction):
        self.planner.plan.task_map[str(self.start_task_id)].instruction = new_instruction
        self.remap_tasks()

    def update_til_start_task(self, role: ResearchAssistant, backward: bool = True):
    def update_til_start_task(self, role: Experimenter, backward: bool = True):
        if backward:
            # make sure the previous task instructions are matched
            assert (