Brainmarks is an open evaluation suite for fMRI foundation models.
```bash
pip install brainmarks
# or
uv add brainmarks
```

Model wrappers for third-party encoders are optional extras:

```bash
pip install "brainmarks[brain-jepa,brainlm,swift,brainharmonix,brain-semantoks,neurostorm]"
```

To install the latest development version from GitHub:

```bash
pip install "brainmarks @ git+https://github.com/MedARC-AI/brainmarks"
```

From source:

```bash
git clone https://github.com/MedARC-AI/brainmarks
cd brainmarks
uv sync --python 3.11
```

Brainmarks has two main evaluation modes.
**Probe** trains a classifier head (linear, attention, or MLP) on a frozen backbone:

```bash
python -m brainmarks.main_probe <model> <representation> <classifier> <dataset>
# e.g.
python -m brainmarks.main_probe brainlm_vitmae_111m patch attn nsd_cococlip
```

**Logistic** extracts embeddings once and fits a logistic regression:

```bash
python -m brainmarks.main_logistic <model> <representation> <dataset>
# e.g.
python -m brainmarks.main_logistic brainlm_vitmae_111m patch aabc_sex
```

`representation` selects which embedding type the model exposes to the head: `cls`, `reg` (registers), or `patch`. Pass `--help` to either command to see the full list of available models and datasets. Use `--config` to pass a YAML config file and `--overrides key=value` for per-run overrides.
```bash
# e.g.
python -m brainmarks.main_logistic \
    brainlm_vitmae_111m \
    patch \
    aabc_sex \
    --overrides \
    batch_size=16 \
    num_workers=4 \
    device=cpu
```

All available options are documented in the default configs: `default_probe.yaml` and `default_logistic.yaml`.
Benchmark datasets are distributed in Huggingface Arrow format, hosted in the MedARC R2 bucket. To request access, fill out this form.
Once you have credentials, configure them as environment variables:
```bash
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_ENDPOINT_URL_S3=...  # Cloudflare R2 endpoint
```

Datasets are downloaded automatically on first use and saved in the Huggingface dataset cache.
Brainmarks uses namespace package plugin discovery. To add a model from your own repo without modifying this one:
- Install `brainmarks` as a dependency in your project environment.
- Create a `brainmarks` namespace package in your repo:

  ```bash
  mkdir -p my_repo/src/brainmarks/models
  ```

- Copy `src/brainmarks/models/template.py` as a starting point and implement `ModelWrapper`, `ModelTransform`, and a `@register_model` constructor (see the sketch after this list).
- Validate with the smoke test:

  ```bash
  python -m brainmarks.models.test_models my_model
  ```

See `template.py` for more details.
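For orientation, a plugin module might look roughly like the one below. This is only a sketch: `ModelWrapper`, `ModelTransform`, and `@register_model` are named above, but the import path, method signatures, and registration convention shown here are assumptions; `template.py` is the authoritative interface.

```python
# my_repo/src/brainmarks/models/my_model.py
# Hypothetical plugin module; the real interfaces live in template.py.
from brainmarks.models import ModelWrapper, ModelTransform, register_model  # assumed import path


class MyTransform(ModelTransform):
    """Convert a raw fMRI sample into the tensor layout the encoder expects."""

    def __call__(self, sample):
        ...  # e.g. normalize and reshape the input


class MyModel(ModelWrapper):
    """Wrap a frozen pretrained encoder and expose its embeddings."""

    def forward(self, batch):
        ...  # run the encoder; return cls/reg/patch representations


@register_model("my_model")  # the name used on the command line
def my_model():
    # Assumed convention: the constructor returns the wrapper and its transform.
    return MyModel(), MyTransform()
```

Once your package is installed in the same environment, `my_model` is picked up by plugin discovery and can be validated with the smoke test above.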
Adding a dataset involves two parts: curation scripts that preprocess raw data into Arrow shards, and a loader module that registers the dataset with Brainmarks.
Curation scripts live in `datasets/`, one subdirectory per source dataset. See `datasets/HCP-YA/` for a reference example: it contains metadata, preprocessing scripts, and a README describing the raw data layout and curation steps.
Loader modules live in `src/brainmarks/datasets/`. Each module defines one or more functions decorated with `@register_dataset` that load Arrow shards (local or from S3) into an `HFDataset`. See `src/brainmarks/datasets/hcpya.py` as a reference.
Dataset loader modules are discovered via the same namespace package plugin mechanism as models, so they can live in an external repo.
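For intuition, a loader module might look roughly like the following. This is a hedged sketch: `@register_dataset` and the `HFDataset` return type come from the description above, while the import path, decorator signature, and file layout are assumptions; `src/brainmarks/datasets/hcpya.py` shows the real pattern.

```python
# Hypothetical loader module, e.g. src/brainmarks/datasets/my_dataset.py.
from datasets import Dataset as HFDataset, load_dataset

from brainmarks.datasets import register_dataset  # assumed import path


@register_dataset("my_dataset")  # the name used on the command line
def my_dataset() -> HFDataset:
    # Load Arrow shards from a local directory; shards hosted on S3/R2 can be
    # fetched with the AWS_* credentials configured above.
    return load_dataset("arrow", data_files="data/my_dataset/*.arrow", split="train")
```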
For help with any issues, reach out to us on MedARC Discord in the #neuro-fm channel.
```bibtex
@article{lane2025scaling,
  title   = {Scaling Vision Transformers for Functional {MRI} with Flat Maps},
  author  = {Lane, Connor and Tripathy, Mihir and Murali, Leema Krishna and
             Grandhi, Ratna Sagari and Yang, Shamus Sim Zi and Gijsen, Sam and
             Das, Debojyoti and Ram, Manish and Singh, Utkarsh Kumar and
             Villanueva, Cesar Kadir Torrico and Wei, Yuxiang and Beddow, Will and
             Cort\'{e}s, Gianfranco and Cho, Suin and Kaplan, Daniel Z. and
             Warner, Benjamin and Abraham, Tanishq Mathew and Scotti, Paul S.},
  journal = {arXiv preprint arXiv:2510.13768},
  year    = {2025},
  url     = {https://arxiv.org/abs/2510.13768}
}
```