AdvEval

This repo contains the official code for the ACL 2024 Findings paper "Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models".

Setup

Install Environment

conda create -n adveval python=3.9
conda activate adveval
pip install -r requirements.txt

Configure API Keys

export OPENAI_API_KEY=[YOUR_OPENAI_API_KEY]
export PALM_API_KEY=[YOUR_PALM_API_KEY]
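
As a quick sanity check before launching a run, the two variables exported above can be verified from Python. This is a minimal sketch and not part of the repo: the helper name `missing_keys` is hypothetical, and it only checks that the variables are non-empty, not that the keys are valid.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PALM_API_KEY")


def missing_keys(env=None):
    """Return the required API-key variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


# Report which keys, if any, still need to be exported.
missing = missing_keys()
print("Missing keys:", ", ".join(missing) if missing else "none")
```

Passing a plain dict as `env` makes the check easy to try without touching the real environment.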

Evaluation

Example commands to evaluate UniEval on the dialogue response generation task:

# To obtain R+ samples
python main.py \
    --task_name response_dailydialog \
    --gold_evaluator_name palm_dialog \
    --victim_evaluator_name unieval_dialog \
    --optimizer_name optimizer_dialog_palm

# To obtain R- samples
python main.py \
    --task_name response_dailydialog \
    --gold_evaluator_name palm_dialog \
    --victim_evaluator_name unieval_dialog \
    --optimizer_name optimizer_dialog_palm \
    --negative_optimization_goal
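
The two commands above differ only in the `--negative_optimization_goal` flag, so they can be assembled from one shared argument list. The sketch below is illustrative and not part of the repo: `build_command` is a hypothetical helper, and it only prints the commands, using the flag names and values exactly as shown above.

```python
import shlex

# Flags and values copied from the example commands above.
BASE_ARGS = [
    "python", "main.py",
    "--task_name", "response_dailydialog",
    "--gold_evaluator_name", "palm_dialog",
    "--victim_evaluator_name", "unieval_dialog",
    "--optimizer_name", "optimizer_dialog_palm",
]


def build_command(negative=False):
    """Return the argument list for an R+ run, or an R- run if negative=True.

    Hypothetical helper for illustration only.
    """
    args = list(BASE_ARGS)
    if negative:
        # The only difference between the two example commands.
        args.append("--negative_optimization_goal")
    return args


# Print both commands as shell-quoted strings without running them.
for label, negative in (("R+", False), ("R-", True)):
    print(label, shlex.join(build_command(negative)))
```

To actually launch a run, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.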

How to Cite

Please cite our paper:

@article{chen2024adveval,
  title={Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models},
  author={Chen, Yiming and Zhang, Chen and Luo, Danqing and D'Haro, Luis Fernando and Tan, Robby T and Li, Haizhou},
  journal={arXiv preprint arXiv:2405.14646},
  year={2024}
}
