This is an implementation of the model described in: Functionality-Aware Rankers for Code Generation via Contrastive Learning
Multi-sampling is a popular approach used in code generation to improve the likelihood of producing correct code. Despite its effectiveness, multi-sampling generates a vast number of candidate solutions, which can be challenging for users to evaluate. It is therefore imperative to train high-quality rankers that enable users to rapidly identify the best solution. Unfortunately, candidates produced by generation models tend to be highly homogeneous, which requires rankers to understand code functionality rather than simply rely on surface appearance. This paper proposes FareRanker, a novel contrastive-learning-based Functionality-aware Ranker for code generation.
We introduce three components to assist the ranker in comprehending program functionality thoroughly.
(1) A novel hard negative sample construction strategy, FareSample, to inject typical generator errors into correct code.
(2) A novel contrastive objective, FareObject, to align correct code of consistent functionality when perturbed by FareSample, while simultaneously ensuring that correct code receives the highest ranking scores.
(3) A novel functionality-aware neural code ranking framework, FareRanker, to make full use of the learned functional (in)consistency introduced by FareObject, enabling the ranker to capture subtle errors and produce more robust scores at inference time.
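As a rough illustration of the objective described in (2), the sketch below combines an InfoNCE-style alignment term with a margin ranking term. This is a simplified stand-in, not the paper's exact formulation: the function name, the margin/temperature values, and the way scores are passed in are all assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_ranking_loss(anchor, positive, negatives,
                             score_pos, scores_neg,
                             margin=1.0, tau=0.07):
    """Simplified FareObject-style loss (illustrative only).

    anchor, positive: embeddings of two functionally consistent correct codes, shape (d,)
    negatives: embeddings of hard negative codes, shape (k, d)
    score_pos: ranking score of the correct code (scalar tensor)
    scores_neg: ranking scores of the negatives, shape (k,)
    """
    # (a) Alignment: the anchor should be closer to its functionally
    # consistent positive than to any hard negative.
    sims = torch.cat([
        F.cosine_similarity(anchor, positive, dim=-1).unsqueeze(0),
        F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=-1),
    ]) / tau
    align = F.cross_entropy(sims.unsqueeze(0), torch.zeros(1, dtype=torch.long))
    # (b) Ranking: correct code must outscore each negative by a margin.
    rank = F.relu(margin - (score_pos - scores_neg)).mean()
    return align + rank
```

Both terms are non-negative, so the loss bottoms out when the positive dominates the similarity distribution and the correct code outscores every negative by the margin.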
- python3
- tree-sitter
- torch
- transformers==4.8.1
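One possible way to set up the environment from the list above (package names are assumptions; the tree-sitter Python bindings are published on PyPI as `tree-sitter`):

```shell
# Install the listed dependencies; pin transformers as the README specifies.
pip install torch transformers==4.8.1 tree-sitter
```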
You should first download the APPS dataset.
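One way to obtain APPS is via its official GitHub repository, whose README links to the dataset archive (the clone target and final dataset path are up to you; adjust the path arguments in the scripts below to match):

```shell
# Clone the official APPS benchmark repository and follow its
# instructions to download the dataset archive.
git clone https://github.com/hendrycks/apps.git
```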
We use the released checkpoint of the CodeRL model. For GPT-Neo-125M, please first fine-tune the model for 2 epochs by running:
cd finetune
bash train.sh
You can sample code candidates for the APPS training and test sets by running (please change the dataset path arguments yourself):
cd ../sample
bash generate_coderl_apps.sh
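Because sampled candidates tend to be highly homogeneous (as noted above), a simple post-sampling step is to deduplicate near-identical programs before testing and ranking. The helper below is hypothetical, not part of this repository:

```python
def dedup_candidates(candidates):
    """Drop candidates that are identical up to whitespace.

    candidates: list of candidate program source strings.
    Returns the unique candidates in their original order.
    """
    seen, unique = set(), []
    for code in candidates:
        key = " ".join(code.split())  # collapse all whitespace runs
        if key not in seen:
            seen.add(key)
            unique.append(code)
    return unique
```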
To execute the unit tests and obtain test outcomes, we adapt the official implementation of the APPS benchmark. You can run the following commands, configuring the parameters as needed:
cd ../run_test
bash test_one_solution.sh
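Conceptually, this step runs each candidate program against a problem's input/output test cases and records whether it passes. The sketch below shows the idea with a hypothetical helper; the repository itself adapts the official APPS harness, which handles many more edge cases:

```python
import subprocess
import sys

def passes_tests(candidate_src, io_pairs, timeout=4):
    """Run a candidate program on each (stdin, expected stdout) pair.

    Returns True only if every test case's output matches.
    """
    for stdin_text, expected in io_pairs:
        try:
            run = subprocess.run(
                [sys.executable, "-c", candidate_src],
                input=stdin_text, capture_output=True,
                text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False  # infinite loops count as failures
        if run.returncode != 0 or run.stdout.strip() != expected.strip():
            return False
    return True
```

The pass/fail outcomes from this step are what later distinguish positive from negative ranking samples.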
Then, you can build the positive and negative samples and obtain the final training and test datasets by running:
cd ../data_preprocess
bash data_preprocess.sh
cd ../generate_final_dataset
python dataset.py
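The core of this step is to split candidates by their test outcomes and pair them up for contrastive training. The sketch below assumes a hypothetical `(code, passed)` format, which may differ from the scripts' actual intermediate files:

```python
def build_pairs(candidates):
    """Form (positive, negative) ranking pairs from tested candidates.

    candidates: list of (code, passed) tuples, where `passed` is True
    if the candidate passed all unit tests in the previous step.
    """
    positives = [code for code, ok in candidates if ok]
    negatives = [code for code, ok in candidates if not ok]
    return [(p, n) for p in positives for n in negatives]
```

Note that FareSample additionally constructs hard negatives by injecting typical generator errors into correct code, which goes beyond the naive pass/fail split shown here.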
Finally, you can train FareRanker by running:
cd ../train_ranker
bash run_train.sh