PhD Candidate in Computer Science (D-INFK), ETH Zurich
Research focus: Multimodal Large Language Models (MLLMs), Vision-Language Reasoning, and Efficient VLM Systems.
I build lightweight, reproducible tools for multimodal research workflows, with an emphasis on:
- retrieval and grounding for document-centric QA
- evaluation pipelines for VLM experiments
- efficiency-oriented methods for visual token reduction
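As a flavor of the token-reduction work, here is a toy sketch (names and the scoring rule are illustrative, not from any released tool): keep only the top-k visual tokens ranked by an importance score, with the token's L2 norm standing in for a learned or attention-based score.

```python
# Hypothetical sketch of score-based visual token reduction:
# keep the top `keep_ratio` fraction of tokens by a per-token score.
import numpy as np

def reduce_visual_tokens(tokens: np.ndarray, keep_ratio: float = 0.25) -> np.ndarray:
    """tokens: (num_tokens, hidden_dim) visual token embeddings.

    Scores each token by its L2 norm (a stand-in for a learned score),
    keeps the top-k, and returns them in their original order.
    """
    scores = np.linalg.norm(tokens, axis=1)      # one importance score per token
    k = max(1, int(len(tokens) * keep_ratio))    # number of tokens to keep
    keep = np.sort(np.argsort(scores)[-k:])      # top-k indices, original order
    return tokens[keep]

# Example: 576 patch tokens (a 24x24 grid) reduced to 144.
tokens = np.random.default_rng(0).normal(size=(576, 768))
reduced = reduce_visual_tokens(tokens, keep_ratio=0.25)
print(reduced.shape)  # (144, 768)
```

Preserving the original token order matters because position information is often encoded implicitly in the sequence fed to the language model.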
- 2024: Started building compact research utilities for multimodal retrieval and evaluation.
- 2025: Expanded to reusable CLI tools, testable pipelines, and benchmark-style experimentation.
- 2026: Focusing on robust multimodal systems for long-context documents and efficient inference.
- multimodal-doc-rag: citation-aware multimodal retrieval and context-building toolkit.
- vlm-eval-lite: minimal, reproducible multimodal QA evaluation runner.
- sparse-vl: simulation toolkit for visual-token sparsification strategies in VLM inference.
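To give a sense of the evaluation-runner style (a minimal sketch in the spirit of vlm-eval-lite; the function names and normalization rule are illustrative, not the tool's actual API): normalized exact-match accuracy over (prediction, reference) pairs.

```python
# Hypothetical minimal QA evaluation loop: normalized exact-match accuracy.
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def exact_match_accuracy(predictions: list[str], answers: list[str]) -> float:
    """Fraction of predictions matching the reference after normalization."""
    hits = sum(normalize(p) == normalize(a) for p, a in zip(predictions, answers))
    return hits / len(answers)

preds = ["A red bus.", "three", "the Eiffel Tower"]
golds = ["a red bus", "3", "Eiffel Tower"]
print(exact_match_accuracy(preds, golds))  # 1 of 3 pairs matches after normalization
```

Keeping the metric this explicit (rather than hidden behind a framework) is what makes small-scale VLM experiments easy to reproduce and audit.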
- grounded multimodal RAG for PDFs and technical reports
- long-context VLM evaluation and failure analysis
- practical methods for reducing multimodal serving cost
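For the grounded-RAG direction, a toy sketch of citation-aware context building (everything here is hypothetical for illustration: the chunk format, the two-dimensional embeddings, and the page-citation prefix): rank page chunks by cosine similarity to the query embedding and emit context lines with page citations attached.

```python
# Hypothetical sketch: rank page chunks by cosine similarity, cite pages.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def build_context(query_emb: list[float], chunks, top_k: int = 2) -> list[str]:
    """chunks: list of (page_number, text, embedding) triples.

    Returns the top_k most similar chunks as citation-prefixed lines.
    """
    ranked = sorted(chunks, key=lambda c: cosine(query_emb, c[2]), reverse=True)
    return [f"[p.{page}] {text}" for page, text, _ in ranked[:top_k]]

chunks = [
    (1, "The model uses 576 visual tokens per image.", [1.0, 0.0]),
    (4, "Training ran for 3 epochs on 8 GPUs.", [0.0, 1.0]),
    (7, "Token pruning halves inference latency.", [0.9, 0.1]),
]
print(build_context([1.0, 0.0], chunks))  # pages 1 and 7 are most similar
```

Carrying the page number through to the generated context is what makes downstream answers attributable back to the source document.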
Last updated: February 2026.