Agentic-R: Learning to Retrieve for Agentic Search

If you like our project, please give us a star ⭐ on GitHub.

📣 Latest News

[Apr 6, 2026]: 🔔 Our paper has been accepted to ACL 2026 (Findings)!
[Jan 26, 2026]: 🚀 We released our trained 🤗search agent and 🤗wikipedia corpus.
[Jan 15, 2026]: 🚀 We released our full codebase and our retriever model 🤗Agentic-R_e5.

1. Agentic-R Introduction

💡 1.1 Overview

Agentic-R is a dense retriever tailored for agentic search. To train it, we first design a novel approach to measuring passage utility in agentic search and then propose an Agent-Retriever iterative optimization approach.

📊 1.2 Overall Performance

⚡ 2. Quick Start for testing Agentic-R

📘 2.1 Environment and Preparation

Environment

This step lists the packages required for inference with Agentic-R. We strongly recommend using a separate conda environment.

# ---------------------------------- create env ----------------------------------
conda create -n agentic-r python=3.10 -y
source ~/.bashrc
conda activate agentic-r
# ---------------------------------- install packages ----------------------------------
cd FlashRAG
pip install -e .
pip install vllm==0.10.1
pip install sentence-transformers
pip install pyserini
pip install GPUtil
pip install nvitop
pip install termcolor
pip install numpy==1.26
pip install deepspeed==0.18.0
pip install qwen_omni_utils
pip install modelscope
pip install faiss_gpu==1.7.3
pip install transformers==4.57.1

Preparation

a. After installing the necessary packages, update WORKSPACE_DIR and PROJECT_DIR (both must be absolute paths) in config.py. These two parameters are used by both the inference code and the training code. Here is a recommended directory structure:

{WORKSPACE_DIR}
├── trained_models
│   ├── Agentic-R_e5
│   └── triviaqa_hotpotqa_train-search-r1-ppo-qwen2.5-7b-em-iter1
│
├── data
│   └── FlashRAG_Dataset
│       ├── nq
│       ├── hotpotqa
│       ├── retrieval_corpus
│       └── ...
│
└── {PROJECT_DIR}  (i.e., Agentic-R)
    ├── FlashRAG
    ├── Search-R1
    ├── tevatron
    └── config.py
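
As a rough illustration of step a, config.py might look like the sketch below. Only WORKSPACE_DIR and PROJECT_DIR are named in this README; the example path and the derived constants (TRAINED_MODELS_DIR, DATASET_DIR, CORPUS_PATH) and the check_config helper are hypothetical and simply mirror the recommended layout.

```python
# Sketch of config.py. Only WORKSPACE_DIR and PROJECT_DIR come from this README;
# the example path and everything derived from it are hypothetical.
import os

WORKSPACE_DIR = "/data/workspace"  # placeholder, replace with your own absolute path
PROJECT_DIR = os.path.join(WORKSPACE_DIR, "Agentic-R")

# Hypothetical convenience constants that mirror the recommended directory layout.
TRAINED_MODELS_DIR = os.path.join(WORKSPACE_DIR, "trained_models")
DATASET_DIR = os.path.join(WORKSPACE_DIR, "data", "FlashRAG_Dataset")
CORPUS_PATH = os.path.join(DATASET_DIR, "retrieval_corpus", "wiki18_100w.jsonl")


def check_config():
    """Return a list of problems with the configured paths (empty if none)."""
    problems = []
    for name, path in (("WORKSPACE_DIR", WORKSPACE_DIR), ("PROJECT_DIR", PROJECT_DIR)):
        if not os.path.isabs(path):
            problems.append(f"{name} must be an absolute path: {path}")
    return problems
```

Calling check_config() before launching a long job is a cheap way to catch a relative path early.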

b. Download the test datasets (such as nq, hotpotqa, ...) from FlashRAG_Dataset and put them under {WORKSPACE_DIR}/data/FlashRAG_Dataset/. Download our trained search agent and put it under {WORKSPACE_DIR}/trained_models/. Download the retrieval corpus and put it under {WORKSPACE_DIR}/data/FlashRAG_Dataset/.

c. Download Agentic-R, put it under {WORKSPACE_DIR}/trained_models/, and build the Wikipedia index with the following code:

conda activate agentic-r
model_name=Agentic-R_e5
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m flashrag.retriever.index_builder \
    --retrieval_method ${model_name} \
    --model_path {WORKSPACE_DIR}/trained_models/${model_name} \
    --corpus_path {WORKSPACE_DIR}/data/FlashRAG_Dataset/retrieval_corpus/wiki18_100w.jsonl \
    --save_dir {WORKSPACE_DIR}/data/FlashRAG_Dataset/retrieval_corpus/ \
    --use_fp16 \
    --max_length 256 \
    --batch_size 128 \
    --faiss_type Flat \
    --sentence_transformer \
    --instruction "passage: "
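
The --instruction "passage: " flag suggests that every passage is prefixed with that string before it is encoded. The sketch below shows the idea on an in-memory sample. It assumes the corpus is JSONL with "id" and "contents" fields, which is not stated in this README, and the function name is hypothetical; the real logic lives in flashrag.retriever.index_builder.

```python
import io
import json


def iter_prefixed_passages(jsonl_file, instruction="passage: "):
    """Yield (id, text) pairs with the instruction prepended to each passage."""
    for line in jsonl_file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumed field names: "id" and "contents".
        yield record["id"], instruction + record["contents"]


# Tiny in-memory stand-in for wiki18_100w.jsonl.
sample = io.StringIO(
    '{"id": "0", "contents": "Paris is the capital of France."}\n'
    '{"id": "1", "contents": "Berlin is the capital of Germany."}\n'
)
prefixed = list(iter_prefixed_passages(sample))
```

The same prefix has to be used at indexing time and whenever passages are re-encoded, otherwise the embeddings are not comparable.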

2.2 Testing Agentic-R based on our trained Agent

conda activate agentic-r
cd FlashRAG/examples/methods
bash run_exp.sh

Note: For Agentic-R, the parameter agentic_retriever_input is set to True, so the retriever receives 'Question [SEP] query' as its input.
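
A minimal sketch of what this input format means is given below. The function name is hypothetical, and the exact spacing around [SEP] and any model-specific prefix are assumptions; the repository code is the reference.

```python
def build_retriever_query(question, query, agentic_retriever_input=True):
    """Return the string sent to the retriever for one search step.

    question: the original user question.
    query: the sub-query the agent generated at this step.
    """
    if agentic_retriever_input:
        # Agentic-R sees the original question together with the current query.
        return f"{question} [SEP] {query}"
    # A conventional retriever only sees the current query.
    return query


combined = build_retriever_query("Who directed Inception?", "Inception film director")
plain = build_retriever_query(
    "Who directed Inception?", "Inception film director", agentic_retriever_input=False
)
```

Here combined carries the original question as context, while plain is just the agent's query.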

🔥 3. How to train Agentic-R?

In our work, we design an Agent-Retriever iterative optimization framework that alternately optimizes the search agent and Agentic-R. Below, we use the first iteration as an example to introduce the training code for the search agent and Agentic-R.
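
The alternation can be pictured with the sketch below. It is only a schematic: the three callables stand in for the shell scripts described in the following sections, all names are hypothetical, and the default of two iterations is an assumption.

```python
def iterative_optimization(initial_retriever, train_agent, build_training_data,
                           train_retriever, num_iterations=2):
    """Alternate between agent training and retriever training."""
    retriever = initial_retriever
    agent = None
    for iteration in range(1, num_iterations + 1):
        # 1) Train the search agent against the current retriever.
        agent = train_agent(retriever, iteration)
        # 2) Collect agent trajectories and score candidate passages.
        data = build_training_data(agent, retriever, iteration)
        # 3) Train the next retriever on that data.
        retriever = train_retriever(data, iteration)
    return agent, retriever


# Stub callables that only record what they were given.
agent, retriever = iterative_optimization(
    "E5",
    train_agent=lambda r, i: f"agent-iter{i}(retriever={r})",
    build_training_data=lambda a, r, i: f"data-iter{i}",
    train_retriever=lambda d, i: f"Agentic-R-iter{i}",
)
```

With the stubs above, the second-iteration agent is trained against the retriever produced by the first iteration.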

📘 3.1 Environment and Preparation

a. Install environment

We strongly recommend using a separate conda environment for agent training (following Search-R1).

# ---------------------------------- create env ----------------------------------
conda create -n searchr1 python=3.10 -y
source ~/.bashrc
conda activate searchr1

# ---------------------------------- install packages ----------------------------------
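# the +cu118 builds are hosted on the PyTorch wheel index rather than PyPI
pip install torch==2.4.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118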
pip3 install vllm==0.6.3
cd Search-R1
pip install -e .
pip install wandb
pip install flash_attn==2.7.3
pip install triton==3.0.0
pip install xformers==0.0.27.post2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118

We recommend using another separate conda environment for retriever training.

# ---------------------------------- create env ----------------------------------
conda create -n tevatron python=3.10 -y
conda activate tevatron
cd tevatron
pip install -e .

# ---------------------------------- install packages ----------------------------------
pip install deepspeed==0.18.0
pip install accelerate
pip install transformers==4.57.1
pip install qwen_omni_utils
pip install peft
pip install torch==2.7.0
pip install faiss_gpu==1.7.3
pip install numpy==1.26.0
pip install uvicorn fastapi

b. Download the wiki corpus wiki18_100w.jsonl from retrieval_corpus and put it in {WORKSPACE_DIR}/data/FlashRAG_Dataset/retrieval_corpus/.

c. Use e5-base-v2 to build the Wikipedia index with the following script:

conda activate tevatron
model_name=e5-base-v2
model_path={WORKSPACE_DIR}/llm/$model_name
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m flashrag.retriever.index_builder \
    --retrieval_method ${model_name} \
    --model_path $model_path \
    --corpus_path {WORKSPACE_DIR}/data/FlashRAG_Dataset/retrieval_corpus/wiki18_100w.jsonl \
    --save_dir {WORKSPACE_DIR}/data/FlashRAG_Dataset/retrieval_corpus/ \
    --use_fp16 \
    --max_length 256 \
    --batch_size 256 \
    --faiss_type Flat \
    --sentence_transformer
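
Both index-building commands pass --faiss_type Flat. A FAISS flat index performs exhaustive, exact search: every passage vector is scored against the query. The pure-Python sketch below illustrates that behaviour on toy vectors. Using the inner product as the similarity is an assumption, and the function name is hypothetical; the real index is built and searched by FAISS.

```python
import heapq


def flat_search(query_vec, passage_vecs, top_k=2):
    """Exhaustively score every passage and return the top_k (index, score) pairs."""
    scores = [
        (sum(q * p for q, p in zip(query_vec, vec)), idx)
        for idx, vec in enumerate(passage_vecs)
    ]
    best = heapq.nlargest(top_k, scores)  # highest inner product first
    return [(idx, score) for score, idx in best]


# Three toy 2-dimensional "passage embeddings" and one query.
passages = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
results = flat_search([0.0, 1.0], passages, top_k=2)
```

Exhaustive search gives exact results at the cost of scoring the whole corpus for each query, which is why the index is built across several GPUs here.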

🔥 3.2 Search Agent Training

3.2.1 Launch the retriever

Before training the search agent, run the following code to launch a retriever (we use E5 in the first iteration):

conda activate tevatron
cd Search-R1
bash retrieval_launch.sh
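
Once the retriever is running, the agent queries it over HTTP during training. The sketch below builds such a request without sending it. The URL, the /retrieve route, and the payload fields (queries, topk, return_scores) follow the conventions of the upstream Search-R1 project and are assumptions here; check retrieval_launch.sh and the server script in this repository for the actual values.

```python
import json
import urllib.request

# Assumed endpoint, following upstream Search-R1 conventions; verify locally.
RETRIEVER_URL = "http://127.0.0.1:8000/retrieve"


def build_retrieval_request(queries, topk=3, url=RETRIEVER_URL):
    """Build (but do not send) a POST request for a batch of queries."""
    payload = {"queries": list(queries), "topk": topk, "return_scores": True}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_retrieval_request(["Who wrote Hamlet?"])
# To send it once the server is up:
# with urllib.request.urlopen(request) as response:
#     results = json.loads(response.read())
```

A quick manual request like this is a simple way to confirm the retriever is reachable before starting PPO training.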

3.2.2 Train the Agent

By default, we use hotpotqa and triviaqa as our training data. The training and testing datasets are generated by the scripts qa_search_train_merge.py and qa_search_test_merge.py under Search-R1/scripts/data_process. You can also download our pre-processed data (training.parquet and test.parquet) from here and put them under Search-R1/scripts/data_process/data/.

conda activate searchr1
bash train_ppo.sh

Note: This script also includes the code for training the agent with Agentic-R, which sends 'Question [SEP] query' to the retriever as the query. In that case, the parameter retriever.agentic_retriever_input is set to true.

🔥 3.3 Agentic-R Training

3.3.1 Generate Training Data

a. Generate trajectory of the search agent

conda activate agentic-r
cd FlashRAG/examples/methods
bash step1_generate_trajectory.sh

b. Generate candidate passages

In this step, for each query generated by the search agent, we use a dense retriever to retrieve candidate training passages. In the first iteration the retriever is E5; in the second iteration it is the Agentic-R trained in the first iteration.

conda activate agentic-r
bash step2_generate_passage_candidates.sh

c-1. Generate local relevance (passage utility 1)

conda activate agentic-r
# first generate the sub-answer using Qwen-72B-Instruct
bash step3-0_generate_subanswer.sh
# then score the candidate passages
bash step3-1_generate_local_utility.sh

c-2. Generate final answer correctness (passage utility 2)

conda activate agentic-r
bash step3-2_generate_global_utility.sh

d. Construct retriever training data

python step4_construct_retriever_data.py
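
To make the role of the two utility signals concrete, the sketch below turns scored candidates into one training example. It is not the logic of step4_construct_retriever_data.py: the ranking rule (final answer correctness first, local relevance as tie-breaker), the choice of a single positive, the candidate keys, and the output field names (loosely modeled on Tevatron-style training records) are all assumptions made for illustration.

```python
def build_training_example(query_id, query, candidates, num_negatives=2):
    """Rank candidates by (global, local) utility and split them into positive/negatives.

    Each candidate is a dict with hypothetical keys:
    'docid', 'text', 'local_score' (float), 'answer_correct' (bool).
    """
    if not candidates:
        raise ValueError("at least one candidate passage is required")

    # Assumed rule: answer correctness dominates, local relevance breaks ties.
    ranked = sorted(
        candidates,
        key=lambda c: (c["answer_correct"], c["local_score"]),
        reverse=True,
    )

    def as_passage(candidate):
        return {"docid": candidate["docid"], "text": candidate["text"]}

    return {
        "query_id": query_id,
        "query": query,
        "positive_passages": [as_passage(ranked[0])],
        "negative_passages": [as_passage(c) for c in ranked[1:1 + num_negatives]],
    }


candidates = [
    {"docid": "d1", "text": "passage one", "local_score": 0.9, "answer_correct": False},
    {"docid": "d2", "text": "passage two", "local_score": 0.4, "answer_correct": True},
    {"docid": "d3", "text": "passage three", "local_score": 0.2, "answer_correct": False},
]
example = build_training_example(
    "q0", "Who directed Inception? [SEP] Inception film director", candidates
)
```

In this toy case d2 becomes the positive even though d1 has the higher local score, because only d2 led to a correct final answer.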

We also provide the final training data for the first iteration. You can download it from here and put it under FlashRAG/examples/methods/training_data/.

3.3.2 Retriever Training

cd tevatron/scripts/
bash train_agentic-R.sh
# agentic_retriever_input is set to True; it controls the query input of the retriever.

After training, use the following code to build the index:

cd FlashRAG/scripts/
bash build_index_after_train.sh

📄 Citation

If you find this work helpful, please cite our paper:

@article{liu2026agentic,
  title={Agentic-R: Learning to Retrieve for Agentic Search},
  author={Liu, Wenhan and Ma, Xinyu and Zhu, Yutao and Li, Yuchen and Shi, Daiting and Yin, Dawei and Dou, Zhicheng},
  journal={arXiv preprint arXiv:2601.11888},
  year={2026}
}

🤝 Acknowledgements

Our code is built upon FlashRAG, Search-R1 and tevatron. Our work is based on the Qwen2.5 model series, and we sincerely thank the Qwen team for their outstanding contributions to the open-source community.

📄 License

This project is released under the MIT License.

📞 Contact

For any questions or feedback, please reach out to us at lwh@ruc.edu.cn.
