
Code Generator Evaluator

This tool evaluates Dify's Code Generator functionality.

Use Cases

Prompt Improvement

Use this tool to evaluate and improve Code Generator prompts by:

  • Testing different prompt variations
  • Analyzing code generation accuracy
  • Identifying areas for prompt optimization

LLM Performance Comparison

Compare code generation accuracy across different LLMs to:

  • Measure code quality metrics
  • Assess performance across the supported programming languages (Python and JavaScript)

Getting Started

1. Clone the repository and move into the evaluator directory

git clone git@github.com:Kota-Yamaguchi/dify-codegenerator-evaluator.git
cd dify-codegenerator-evaluator

2. Set up your .env file with the required variables

cp .env.example .env
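
The authoritative variable names are the ones in .env.example; edit the copied file and fill them in. As a purely hypothetical illustration (these names are assumptions, not taken from the repo), the evaluator needs to know where the Dify API lives and how to authenticate:

# Hypothetical example values -- check .env.example for the real variable names
DIFY_API_BASE_URL=http://localhost:5001
DIFY_API_KEY=your-api-key-here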

3. Start the Dify Backend API

Note: For instructions on setting up the Dify Backend API locally, please refer to the Dify Self-hosted Installation Guide
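
If you take the guide's Docker Compose route, the setup is roughly the following (paths as in the current Dify docs; verify against the guide itself):

git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
docker compose up -d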

4. Build Prerequisites

  • Ensure Go 1.20 or higher is installed
  • Verify GOPATH is properly configured
  • Grant execution permissions to the build script:
chmod +x build.sh
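
You can verify the first two prerequisites from the shell before building:

go version      # should report go1.20 or newer
go env GOPATH   # should print a writable directory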

5. Build Process

./build.sh
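
The script itself is authoritative; as a minimal sketch (assumed, not copied from build.sh) of what a cross-compiling build for the four binaries in step 6 typically looks like:

# Hypothetical sketch -- see build.sh for the actual commands.
# Assumes the main package sits at the repository root.
GOOS=linux   GOARCH=amd64 go build -o bin/evaluate-code-linux .
GOOS=darwin  GOARCH=amd64 go build -o bin/evaluate-code-mac .
GOOS=darwin  GOARCH=arm64 go build -o bin/evaluate-code-mac-arm64 .
GOOS=windows GOARCH=amd64 go build -o bin/evaluate-code.exe .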

6. Run the Evaluator

Run the binary that matches your platform:

# For Linux
./bin/evaluate-code-linux

# For macOS (Intel)
./bin/evaluate-code-mac

# For macOS (Apple Silicon)
./bin/evaluate-code-mac-arm64

# For Windows
./bin/evaluate-code.exe

Adding Test Cases

To add new test cases, append them to testdata/testcases.json.
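
The schema is defined by the existing entries in that file; as a hypothetical illustration only (every field name below is an assumption for this sketch, not taken from the repo), an entry might look like:

{
  "name": "sum_two_numbers",
  "language": "python",
  "instruction": "Write a function that returns the sum of two integers.",
  "expected_behavior": "add(2, 3) returns 5"
}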

Planned Features

  • Generate comparative accuracy graphs for different LLM models
  • Output accuracy metrics based on code complexity levels
