An implementation of MetGen: A Module-Based Entailment Tree Generation Framework for Answer Explanation.
- Python 3.8
- Ubuntu 21.04
- Python Packages
conda create -n metgen python=3.8
conda activate metgen
pip install -r requirements.txt
Download the EntailmentBank dataset.
Download our processed data.
The data/ folder should contain four folders:
data
├── entailment_trees_emnlp2021_data_v2 # the EntailmentBank dataset
├── wiki_match # the synthetic data for module training
├── Steps # the annotated/pseudo step data for module training
└── Controller_data # the processed data for controller training
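Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_data_folders` is hypothetical, and it assumes the script is run from the repository root so that `data/` resolves correctly.

```python
from pathlib import Path

# Folder names taken from the layout listed above.
EXPECTED_FOLDERS = [
    "entailment_trees_emnlp2021_data_v2",
    "wiki_match",
    "Steps",
    "Controller_data",
]


def missing_data_folders(data_root="data"):
    """Return the expected sub-folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_data_folders("data")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("data/ layout looks complete.")
```

An empty list means all four folders were found; the check does not inspect the files inside them.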
Follow ./scripts/train_module.sh to train the prefixed module.
- Step1: train the module with synthetic data;
- Step2: train the module with Train-pseudo data.
The trained module will be saved in the /exp/Module_all/para_etree_all folder.
Follow ./scripts/train_controller.sh to train the reasoning controller.
- Step1: make controller training data based on the original dataset and the trained module;
- Step2: train the controller with the data.
We train two controllers, one for Task 1 and one for Task 2/3.
The trained controllers will be saved in the /exp/Controller_task1 and /exp/Controller_task2 folders.
Follow ./scripts/test_task1.sh, ./scripts/test_task2.sh, and ./scripts/test_task3.sh to obtain predictions from the trained module and controllers.
- Step1: select the checkpoint and the hyperparameters of the reasoning algorithm using the dev split;
- Step2: run the reasoning algorithm with the selected checkpoint and hyperparameters on the test split.
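The selection in Step1 amounts to keeping the configuration that scores best on the dev split. The sketch below only illustrates that idea and is not the logic used by the test scripts: the function name and the keys `checkpoint`, `hyperparameters`, and `dev_score` are hypothetical, and which metric serves as the dev score is left to the scripts.

```python
def select_best_config(dev_results):
    """Return the entry with the highest dev score.

    dev_results is a list of dicts with the hypothetical keys
    'checkpoint', 'hyperparameters', and 'dev_score'.
    """
    if not dev_results:
        raise ValueError("no dev results to select from")
    # max() keeps the first entry when several share the top score.
    return max(dev_results, key=lambda result: result["dev_score"])


if __name__ == "__main__":
    # Placeholder values for illustration only, not real results.
    candidates = [
        {"checkpoint": "ckpt_a", "hyperparameters": {"setting": 1}, "dev_score": 0.1},
        {"checkpoint": "ckpt_b", "hyperparameters": {"setting": 2}, "dev_score": 0.2},
    ]
    best = select_best_config(candidates)
    print(best["checkpoint"], best["hyperparameters"])
```

The selected checkpoint and hyperparameters are then the ones passed to the test-split run in Step2.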
The predictions are saved in .json and .csv files.
Use the .csv file and follow the official evaluation code of EntailmentBank to evaluate automatically.
For Task 1 and Task 2, we also provide our own implementation of the evaluation metrics (code/evaluate_metric.py).
We provide the trained models (entailment module and reasoning controller) for direct reproduction.
Unzip the files and place them in the exp/ folder.
Run the following commands to reproduce the results.
cd scripts
sh test_task1.sh
sh test_task2.sh
sh test_task3.sh
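If you prefer to drive the three runs from Python, for example to stop as soon as one of them fails, a thin wrapper like the one below may be convenient. It is a sketch rather than part of the repository: the function name `run_test_scripts` is hypothetical, and it assumes `sh` is on the PATH and that the call is made from the repository root.

```python
import subprocess

# Script names taken from the commands above.
TEST_SCRIPTS = ["test_task1.sh", "test_task2.sh", "test_task3.sh"]


def run_test_scripts(scripts_dir="scripts", run=subprocess.run):
    """Run each test script with sh inside scripts_dir, stopping on the first failure."""
    completed = []
    for script in TEST_SCRIPTS:
        # check=True makes subprocess.run raise CalledProcessError
        # when the script exits with a non-zero status.
        run(["sh", script], cwd=scripts_dir, check=True)
        completed.append(script)
    return completed


if __name__ == "__main__":
    print("Finished:", ", ".join(run_test_scripts()))
```

The `run` parameter exists only so the wrapper can be exercised without launching the real scripts; leaving it at its default calls `subprocess.run`.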
Please contact Ruixin Hong for questions and suggestions.
@inproceedings{DBLP:conf/naacl/HongZYZ22,
author = {Ruixin Hong and
Hongming Zhang and
Xintong Yu and
Changshui Zhang},
editor = {Marine Carpuat and
Marie{-}Catherine de Marneffe and
Iv{\'{a}}n Vladimir Meza Ru{\'{\i}}z},
title = {{M}et{G}en: {A} Module-Based Entailment Tree Generation Framework for
Answer Explanation},
booktitle = {Findings of the Association for Computational Linguistics: {NAACL}
2022, Seattle, WA, United States, July 10-15, 2022},
pages = {1887--1905},
publisher = {Association for Computational Linguistics},
year = {2022},
url = {https://aclanthology.org/2022.findings-naacl.145},
timestamp = {Mon, 18 Jul 2022 17:13:00 +0200},
biburl = {https://dblp.org/rec/conf/naacl/HongZYZ22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}

