
[WIP] Update v3 #150

Closed · wants to merge 5 commits into from

Conversation

@U-C4N (Contributor) commented Aug 31, 2024:

Optimize and refactor mle/cli.py, mle/model.py, and mle/agents/coder.py

mle/cli.py:

  • Move check_config function to the top of the file for better organization
  • Use CONFIG_FILE constant in check_config for consistency
  • Set SEARCH_API_KEY environment variable immediately after user input in 'new' command
  • Use a variable for config file path in 'new' command for better readability

mle/model.py:

  • Add type hints for better code readability and maintainability
  • Move dependency import logic to separate methods in each model class
  • Simplify the load_model function
  • Improve error messages for missing dependencies

mle/agents/coder.py:

  • Add type hints for better code readability and maintainability
  • Move system prompt generation to a separate method for better organization
  • Simplify the process_summary function using f-strings
  • Use more consistent naming for variables and methods
  • Improve error handling and type checking

These changes keep the current structure and functionality while improving code readability and maintainability, with minor performance gains; the sketch below illustrates the cli.py changes.
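As a rough illustration of the mle/cli.py bullets above, the described pattern might look like the following sketch. The CONFIG_FILE value, prompt text, and yaml layout are hypothetical placeholders, not the repository's actual code:

```python
import os

import click
import yaml

# Hypothetical config path; the repository's actual constant may differ.
CONFIG_FILE = os.path.expanduser('~/.mle/config.yml')


def check_config() -> bool:
    """Defined near the top of cli.py so every command can call it first."""
    return os.path.exists(CONFIG_FILE)


@click.command()
def new():
    """Create a new project."""
    search_key = click.prompt("Search API key", hide_input=True)
    # Export immediately after user input so later steps in this command
    # can rely on the environment variable already being set.
    os.environ["SEARCH_API_KEY"] = search_key
    config_path = CONFIG_FILE  # one variable for the path, reused below
    with open(config_path, 'w') as f:
        yaml.safe_dump({'search_api_key': search_key}, f)
```

Setting the environment variable right after the prompt means any search-backed step invoked later in the same command sees the key without re-reading the config file.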

Closes #

What has been done to verify that this works as intended?

Why is this the best possible solution? Were any other approaches considered?

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

Do we need any specific form for testing your changes? If so, please attach one.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

Before submitting this PR, please make sure you have:

  • confirmed all checks still pass OR confirm CI build passes.
  • verified that any code or assets from external sources are properly credited in comments and/or in
    the credit file.

@dosubot (bot) added labels size:L (This PR changes 100-499 lines, ignoring generated files) and enhancement (New feature or request) on Aug 31, 2024.
@huangyz0918 (Contributor) commented:

Looks good to me! @U-C4N Thank you for the contribution!
@leeeizhang Could you please test the changes?

@HuaizhengZhang (Contributor) commented:

@U-C4N Hi, could you add [MRG] or [WIP] to your title so we can tell whether this PR is ready for review?

Thanks a lot for the contribution.

@leeeizhang (Collaborator) commented:

Hi, @U-C4N. Thank you very much for all your contributions! This pull request significantly improves our code quality.

Before we merge it, please ensure the following two steps are completed:

  1. Complete the docstrings for all functions and classes (see the sketch after this comment).
  2. Conduct a full workflow test to ensure your code works as expected.

Thanks again!
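As an illustration of step 1, a completed docstring might look like the sketch below; this is only an example in a common Google style, not the project's mandated convention:

```python
def load_model(project_dir: str, model_name: str) -> Optional[Model]:
    """Load a model instance based on the project's configured platform.

    Args:
        project_dir: Path to the project directory containing project.yml.
        model_name: Name of the model to instantiate.

    Returns:
        A Model subclass instance, or None if the platform is unrecognized.
    """
```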

@leeeizhang changed the title from "Update v3" to "[WIP] Update v3" on Sep 1, 2024.
@U-C4N (Contributor, Author) commented Sep 1, 2024:

```python
import os
import yaml
import json
import importlib.util
from abc import ABC, abstractmethod
from typing import List, Dict, Any, Optional
from mle.utils import process_function_name  # Adjust the import path as needed

MODEL_OLLAMA = 'Ollama'
MODEL_OPENAI = 'OpenAI'


class Model(ABC):
    def __init__(self):
        self.model_type: Optional[str] = None

    @abstractmethod
    def query(self, chat_history: List[Dict[str, str]], **kwargs) -> str:
        pass

    @abstractmethod
    def stream(self, chat_history: List[Dict[str, str]], **kwargs) -> str:
        pass


class OllamaModel(Model):
    def __init__(self, model: str = 'llama3', host_url: Optional[str] = None):
        super().__init__()
        self.model = model
        self.model_type = MODEL_OLLAMA
        self.ollama = self._import_ollama()
        self.client = self.ollama.Client(host=host_url)

    def _import_ollama(self):
        dependency = "ollama"
        spec = importlib.util.find_spec(dependency)
        if spec is None:
            raise ImportError(
                "Ollama Python package is not installed. "
                "Please install it to use Ollama-related features. "
                "More information: https://github.com/ollama/ollama-python"
            )
        return importlib.import_module(dependency)

    def query(self, chat_history: List[Dict[str, str]], **kwargs) -> str:
        return self.client.chat(model=self.model, messages=chat_history)['message']['content']

    def stream(self, chat_history: List[Dict[str, str]], **kwargs) -> str:
        for chunk in self.client.chat(model=self.model, messages=chat_history, stream=True):
            yield chunk['message']['content']


class OpenAIModel(Model):
    def __init__(self, api_key: str, model: str = 'gpt-3.5-turbo', temperature: float = 0.7):
        super().__init__()
        self.model = model
        self.model_type = MODEL_OPENAI
        self.temperature = temperature
        self.openai = self._import_openai()
        self.client = self.openai(api_key=api_key)

    def _import_openai(self):
        dependency = "openai"
        spec = importlib.util.find_spec(dependency)
        if spec is None:
            raise ImportError(
                "OpenAI Python package is not installed. "
                "Please install it to use OpenAI-related features. "
                "More information: https://openai.com/product"
            )
        return importlib.import_module(dependency).OpenAI

    def query(self, chat_history: List[Dict[str, str]], **kwargs) -> str:
        completion = self.client.chat.completions.create(
            model=self.model,
            messages=chat_history,
            temperature=self.temperature,
            stream=False,
            **kwargs
        )

        resp = completion.choices[0].message
        if resp.function_call:
            function_name = process_function_name(resp.function_call.name)
            arguments = json.loads(resp.function_call.arguments)
            print("[MLE FUNC CALL]: ", function_name)
            result = get_function(function_name)(**arguments)
            chat_history.append({"role": "function", "content": result, "name": function_name})
            return self.query(chat_history, **kwargs)
        else:
            return resp.content

    def stream(self, chat_history: List[Dict[str, str]], **kwargs) -> str:
        arguments = ''
        function_name = ''
        for chunk in self.client.chat.completions.create(
                model=self.model,
                messages=chat_history,
                temperature=self.temperature,
                stream=True,
                **kwargs
        ):
            delta = chunk.choices[0].delta
            if delta.function_call:
                if delta.function_call.name:
                    function_name = process_function_name(delta.function_call.name)
                if delta.function_call.arguments:
                    arguments += delta.function_call.arguments

            if chunk.choices[0].finish_reason == "function_call":
                result = get_function(function_name)(**json.loads(arguments))
                chat_history.append({"role": "function", "content": result, "name": function_name})
                yield from self.stream(chat_history, **kwargs)
            else:
                yield delta.content


def load_model(project_dir: str, model_name: str) -> Optional[Model]:
    with open(os.path.join(project_dir, 'project.yml'), 'r') as file:
        data = yaml.safe_load(file)
    if data['platform'] == MODEL_OPENAI:
        return OpenAIModel(api_key=data['api_key'], model=model_name)
    if data['platform'] == MODEL_OLLAMA:
        return OllamaModel(model=model_name)
    return None
```
Can you try it?

@U-C4N (Contributor, Author) commented Sep 1, 2024:

It should work.

@HuaizhengZhang (Contributor) commented:

Is this PR now [MRG]?

@U-C4N (Contributor, Author) commented Sep 2, 2024:

No, @leeeizhang hasn't tested the latest model.py yet.

@leeeizhang (Collaborator) commented Sep 2, 2024:

> No, @leeeizhang hasn't tested the latest model.py yet.

Hi, @U-C4N. Many thanks for your updates!

Here are some suggestions for the current code:

  1. It seems that you have removed many docstrings in your PR. Please retain these so that others can easily understand the function meanings and types of the arguments. :)
  2. I have noticed that you implemented the Claude model. Have you run any tests for this model? We also implemented the Claude model yesterday ([MRG] add claude model support #154), and you could also rebase your PR once we merge it.

Thanks again!


Review comment on the mle/model.py imports:

```python
from mle.function import get_function, process_function_name
from typing import List, Dict, Any, Optional
from mle.utils import process_function_name  # Adjust the import path as needed
```

@leeeizhang (Collaborator): you may remove the comments here :)

@leeeizhang (Collaborator): change `from mle.utils import process_function_name` to `from mle.function import process_function_name`

Review comment on the added `import anthropic` line:

```python
from mle.function import get_function, process_function_name
from typing import List, Dict, Any, Optional
from mle.utils import process_function_name  # Adjust the import path as needed
import anthropic
```

@leeeizhang (Collaborator): it is suggested to import the anthropic module at Python runtime; you may refer to OpenAIModel, which loads the openai package via importlib.

https://github.com/MLSysOps/MLE-agent/blob/main/mle/model.py#L90
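A minimal sketch of the suggested pattern, modeled on `_import_openai` from the snippet above; the `_import_anthropic` name and the `Anthropic` client attribute are assumptions for illustration, not code from this PR:

```python
import importlib.util


class AnthropicModel(Model):  # assumes the Model base class from the snippet above
    def _import_anthropic(self):
        # Same lazy-import pattern as OpenAIModel._import_openai: the
        # package is only required when an Anthropic model is actually used.
        dependency = "anthropic"
        spec = importlib.util.find_spec(dependency)
        if spec is None:
            raise ImportError(
                "Anthropic Python package is not installed. "
                "Please install it to use Anthropic-related features. "
                "More information: https://github.com/anthropics/anthropic-sdk-python"
            )
        return importlib.import_module(dependency).Anthropic
```

The lazy import keeps anthropic an optional dependency: users who never select the Anthropic platform never need it installed.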

Review comment on the added `class AnthropicModel(Model):` and the new load_model branch (@leeeizhang, Sep 2, 2024):

```python
with open(os.path.join(project_dir, 'project.yml'), 'r') as file:
    data = yaml.safe_load(file)
if data['platform'] == MODEL_OPENAI:
    return OpenAIModel(api_key=data['api_key'], model=model_name)
if data['platform'] == MODEL_OLLAMA:
    return OllamaModel(model=model_name)
if data['platform'] == 'Anthropic':
```

Suggested change, naming the platform string with a constant like the other providers:

```python
MODEL_CLAUDE = 'Anthropic'
if data['platform'] == MODEL_CLAUDE:
    ...
```

Review comment on OpenAIModel.__init__ (@leeeizhang, Collaborator):

```python
    model (str): The model with version.
    temperature (float): The temperature value.
    """
    def __init__(self, api_key: str, model: str = 'gpt-3.5-turbo', temperature: float = 0.7):
```

Running the workflow fails with:

```
Traceback (most recent call last):
  File "/root/miniconda3/envs/mle/bin/mle", line 33, in <module>
    sys.exit(load_entry_point('mle-agent', 'console_scripts', 'mle')())
  File "/root/miniconda3/envs/mle/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/root/miniconda3/envs/mle/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/root/miniconda3/envs/mle/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/root/miniconda3/envs/mle/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/root/miniconda3/envs/mle/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/root/workspace/MLE-agent/mle/cli.py", line 68, in start
    return baseline(os.getcwd(), model)
  File "/root/workspace/MLE-agent/mle/workflow/baseline.py", line 67, in baseline
    advisor_report = advisor.interact("[green]User Requirement:[/green] " + ml_requirement + "\n" + ask_data(dataset))
  File "/root/workspace/MLE-agent/mle/agents/advisor.py", line 131, in interact
    self.report = self.suggest(requirement)
  File "/root/workspace/MLE-agent/mle/agents/advisor.py", line 113, in suggest
    text = self.model.query(
  File "/root/workspace/MLE-agent/mle/model.py", line 85, in query
    result = get_function(function_name)(**arguments)
NameError: name 'get_function' is not defined
```
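The failure matches the earlier import suggestion: the posted model.py calls get_function without importing it. A one-line fix sketch, assuming mle.function exports both helpers as the review comment above indicates:

```python
# Replaces the mle.utils import at the top of mle/model.py.
from mle.function import get_function, process_function_name
```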

@U-C4N closed this on Sep 3, 2024.