We demonstrate ConceptBank's ability to handle real-world distribution shifts where the original SAM3 fails.
Scenario: In the COCO-Object dataset, the class "mouse" strictly refers to a computer mouse. Observation:
- Original SAM3: Suffers from Concept Drift. It relies on pre-trained open-world knowledge, incorrectly segmenting the animal mouse because it cannot distinguish the dataset-specific definition.
- ConceptBank (Ours): Correctly aligns with the target domain statistics, segmenting only the computer mouse and ignoring the animal mouse.
Scenario: Applying the model to iSAID (Remote Sensing), a domain with significant visual distribution shifts compared to natural scenes. Observation:
- Original SAM3: Misses objects and hallucinates incorrect labels due to Data Drift.
- ConceptBank (Ours): Anchors to visual prototypes from the support set, successfully segmenting all planes without false positives.
The recent introduction of SAM3 has revolutionized Open-Vocabulary Segmentation (OVS) through promptable concept segmentation, which grounds pixel predictions in flexible concept prompts. However, this reliance on pre-defined concepts makes the model vulnerable: when visual distributions shift (data drift) or conditional label distributions evolve (concept drift) in the target domain, the alignment between visual evidence and prompts breaks down.
In this work, we present ConceptBank, a parameter-free calibration framework to restore this alignment on the fly. Instead of adhering to static prompts, we construct a dataset-specific concept bank from the target statistics. Our approach:
- Anchors target-domain evidence via class-wise visual prototypes.
- Mines representative supports to suppress outliers under data drift.
- Fuses candidate concepts to rectify concept drift.
We demonstrate that ConceptBank effectively adapts SAM3 to distribution drifts, including challenging natural-scene and remote-sensing scenarios, establishing a new baseline for robustness and efficiency in OVS.
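At a high level, the three steps can be sketched as follows. This is a toy NumPy illustration under our own naming (`build_concept_bank`, `fuse_concepts`, `keep_ratio`, and `alpha` are all hypothetical), not the actual ConceptBank implementation or the SAM3 API:

```python
import numpy as np

def build_concept_bank(feats, labels, keep_ratio=0.8):
    """Build class-wise visual prototypes with outlier suppression.

    feats:  (N, D) L2-normalized support features.
    labels: (N,)   integer class ids.
    Returns {class_id: (D,) unit-norm prototype}.
    """
    bank = {}
    for c in np.unique(labels):
        f = feats[labels == c]
        # 1) Anchor: class-wise mean prototype from target-domain evidence.
        proto = f.mean(axis=0)
        proto /= np.linalg.norm(proto)
        # 2) Mine: keep only the supports most aligned with the prototype,
        #    suppressing outliers under data drift.
        k = max(1, int(keep_ratio * len(f)))
        keep = f[np.argsort(-(f @ proto))[:k]]
        proto = keep.mean(axis=0)
        bank[int(c)] = proto / np.linalg.norm(proto)
    return bank

def fuse_concepts(proto, candidate_embs, alpha=0.5):
    """3) Fuse: blend the visual prototype with the mean of the candidate
    concept embeddings to rectify concept drift."""
    fused = alpha * proto + (1 - alpha) * candidate_embs.mean(axis=0)
    return fused / np.linalg.norm(fused)
```

The key design point the sketch captures is that the bank is built purely from target statistics, with no parameter updates to the segmentation model itself.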
This project relies on PyTorch, MMSegmentation, and the SAM3 codebase.
- Python: 3.12 (Recommended)
- CUDA: 12.6
We strictly tested the code with the following library versions:
- torch: 2.9.1+cu126
- torchvision: 0.24.1+cu126
- openmim: 0.3.9
- mmengine: 0.10.7
- mmcv: 2.2.0 (must be built from source)
- mmsegmentation: 1.2.2
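To double-check an existing environment against this list, a small stdlib-only snippet can help; the `EXPECTED` mapping simply mirrors the pins above, and the package names are assumed to match their PyPI names:

```python
from importlib.metadata import version, PackageNotFoundError

# Versions pinned in this README; package names assumed to match PyPI.
EXPECTED = {
    "torch": "2.9.1+cu126",
    "torchvision": "0.24.1+cu126",
    "openmim": "0.3.9",
    "mmengine": "0.10.7",
    "mmcv": "2.2.0",
    "mmsegmentation": "1.2.2",
}

def check_versions():
    """Return {package: installed version, or None if missing}."""
    found = {}
    for pkg in EXPECTED:
        try:
            found[pkg] = version(pkg)
        except PackageNotFoundError:
            found[pkg] = None
    return found

if __name__ == "__main__":
    for pkg, got in check_versions().items():
        status = "OK" if got == EXPECTED[pkg] else f"got {got}"
        print(f"{pkg:>16}: want {EXPECTED[pkg]} -> {status}")
```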
- Create a virtual environment:

```shell
conda create -n cb4ovs python=3.12 -y
conda activate cb4ovs
```

- Install PyTorch:

```shell
pip install torch==2.9.1+cu126 torchvision==0.24.1+cu126 --extra-index-url https://download.pytorch.org/whl/cu126
```

- Install MMLab dependencies. First install the basic tools:

```shell
pip install openmim==0.3.9
mim install mmengine==0.10.7
```

Install MMCV (Build from Source):
Since we use a specific PyTorch version, mmcv must be compiled from source to ensure CUDA operator compatibility.
For more details, refer to the MMCV Build Guide.
```shell
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
git checkout v2.2.0
# MMCV_WITH_OPS=1 ensures CUDA ops are compiled
MMCV_WITH_OPS=1 pip install -e .
cd ..
```

Install MMSegmentation:

```shell
mim install mmsegmentation==1.2.2
```

Our method is based on SAM3. You must download the official SAM3 repository and weights.
- Clone SAM3: Download the SAM3 code and place the `sam3` folder in the root of this project.

```shell
# Ensure the folder is named 'sam3' and sits next to our scripts
git clone https://github.com/facebookresearch/sam3.git
```

- Download Checkpoints: Download `sam3.pt` from the Hugging Face repository and move the checkpoint to `sam3/assets/`.

```shell
# Move your downloaded sam3.pt here
mv path/to/sam3.pt sam3/assets/
```

Please organize your datasets in a `data/` directory.
For natural scenes, we evaluate on 8 benchmarks. Please follow the MMSegmentation Dataset Preparation Guide for standard datasets:
Note on COCO-Object: For the COCO-Object dataset conversion, we follow the format used in SCLIP. Please refer to their repository for the conversion scripts:
For remote sensing, we evaluate on 4 benchmarks: LoveDA, Potsdam, Vaihingen, and iSAID. Ensure these are formatted according to MMSegmentation conventions.
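A plausible layout is sketched below; the directory names are illustrative only, and the exact names expected by the dataset configs may differ:

```text
data/
├── coco_object/   # converted with the SCLIP scripts
├── loveda/
├── potsdam/
├── vaihingen/
└── isaid/
```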
We provide pre-constructed Concept Banks to facilitate direct reproduction of our results. These files are located in:
- `configs/concept_bank/cb_sam3_ns.pt` (for Natural Scene): download
- `configs/concept_bank/cb_sam3_rs.pt` (for Remote Sensing): download
You can use the provided scripts to evaluate these banks directly or regenerate them from scratch.
By default, the scripts cb4ovs_ns.sh and cb4ovs_rs.sh are configured to load the provided Concept Banks and run the evaluation immediately.
Natural Scene Evaluation:
```shell
# Usage: bash cb4ovs_ns.sh [NGPUS] [LOGFILE]
bash cb4ovs_ns.sh 4 logs_ns.txt
```

Remote Sensing Evaluation:

```shell
# Usage: bash cb4ovs_rs.sh [NGPUS] [LOGFILE]
bash cb4ovs_rs.sh 4 logs_rs.txt
```

If you wish to re-build the Concept Banks from scratch (e.g., to reproduce the generation process), simply uncomment the concept generation section (the sam3_concept_bank.py command) in the corresponding shell script (cb4ovs_ns.sh or cb4ovs_rs.sh) before running it.
We provide a web-based demo for interactive testing.
Install gradio:

```shell
pip install gradio==6.2.0
```

Concept Bank Demo:

```shell
python app.py
```

We provide a quantitative comparison with state-of-the-art methods on both natural scene and remote sensing benchmarks.
Natural Scene Benchmarks
Remote Sensing Benchmarks
Natural Scene Visualization
Remote Sensing Visualization
We thank the authors of the following excellent open-source projects, which were instrumental in this work:
- SAM3: https://github.com/facebookresearch/sam3
- SCLIP: https://github.com/wangf3014/SCLIP
- TCL: https://github.com/kakaobrain/tcl
- GroupViT: https://github.com/NVlabs/GroupViT
- SegEarth-OV: https://github.com/likyoo/SegEarth-OV