This repository is the artifact accompanying the OSDI '26 paper Efficient and Scalable Synchronization via Generalized Cache Coherence. Soul is an end-to-end realization of the Generalized cache-Coherence Protocol (GCP) atop a state-of-the-art Ethernet-based disaggregated shared-memory platform (MIND). Soul exposes standard lock APIs through a user-space library, delivering 1–2 orders of magnitude better real-world application performance than prior disaggregated locks while keeping storage overhead < 8 %.
The artifact bundles four code components:
- `soul_linux/` — the modified Linux kernel, user-space lock libraries, benchmarks, and scripts that run inside the compute and memory VMs.
- `ctrl_scripts/` — the control-server scripts that facilitate experiments: push-button launch scripts, the cluster config, and the experiment configs.
- `soul_switch/` — the Tofino programmable-switch sources: the C++ control-plane program, the P4 data-plane program, and a launcher script for the control plane.
- `soul_gem5/` — a SynchroTrace + gem5 simulator used to produce the simulation-only paper figures.
Soul targets a rack-scale disaggregated cluster. This artifact requires:
| Component | Quantity | Notes |
|---|---|---|
| Intel Tofino programmable switch | 1 | Tofino 1 or Tofino 2 with BF-SDE 9.7.0 installed on the switch host. |
| Compute servers | ≥ 2 | x86_64, Mellanox ConnectX-5 (or newer) RoCE NIC, ≥ 32 GB DRAM. Mellanox OFED for the RoCE NIC. KVM/libvirt to host VMs. Ubuntu 18.04 / 20.04 inside the VMs. |
| Memory servers | ≥ 1 | Same hardware class as compute servers. |
| Control server | 1 | Anything that can SSH into the cluster and the switch host. python3, pyyaml, requests, unbuffer, sftp. |
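The control-server prerequisites can typically be installed as sketched below. The package names are assumptions for Debian/Ubuntu systems; on those distributions `sftp` ships with `openssh-client` and `unbuffer` with `expect`.

```sh
# Illustrative control-server setup (Ubuntu/Debian package names assumed;
# sftp comes with openssh-client, unbuffer with expect).
sudo apt-get install -y python3 python3-pip openssh-client expect
pip3 install pyyaml requests
```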
.
├── README.md (this file)
├── env.sh SITE-SPECIFIC PATH MACROS — EDIT BEFORE RUNNING ANYTHING
├── LICENSE
├── soul_linux/ Soul Linux kernel + user-space lock library
│ ├── test_programs/
│ │ ├── 07_lock_micro_benchmark/ User-space lock library + benchmarks
│ │ │ ├── lock_backends.hpp User-space lock library
│ │ │ └── ... Benchmarks + auxiliary source
│ │ └── ... Benchmarks + tests not present in paper
│ ├── kernel/lock_disagg.c Kernel implementation of GCP: wait queue, shared memory list, and lock manager
│ └── ... Kernel source
├── ctrl_scripts/
│ ├── scripts/
│ │ ├── run_commands.py Control server entry point
│ │ ├── config.yaml Cluster configs
│ │ └── profiles/*.yaml Experiment configs
│ └── ...
├── soul_switch/ Switch source + scripts
│ ├── dataplane/ Data plane source (P4)
│ ├── launch_mind_base_switch.sh Launcher for the control plane binary
│ └── ... Control plane source (C++ + controller/, configs/)
└── soul_gem5/ SynchroTrace + gem5 simulator
All site-specific paths come from the top-level env.sh; cluster topology
(per-VM IPs, MACs, ssh keys, switch-port assignments) lives in a handful of
config files that ship with placeholder values. Edit both before running
anything.
Every shell script, YAML profile, CMakeLists.txt, and Makefile resolves its paths through the variables defined in env.sh. Run `source env.sh` on every machine that runs Soul code (control server, compute / memory VMs, switch host).
| Variable | Purpose |
|---|---|
| `SOUL_PATH` | Artifact location on the compute / memory VMs (default: `$HOME/Soul`). |
| `SOUL_SRC_PATH` | Artifact location on the switch host. The switch C++ build reads `$ENV{SOUL_SRC_PATH}` for kernel headers (default: `$HOME/Downloads/Soul`). |
| `BF_SDE` | Tofino BF-SDE root on the switch host (default: `$HOME/Downloads/bf-sde-9.7.0`). |
| `BF_SDE_INSTALL` | Compiled SDE artifacts (default: `${BF_SDE}/install`). |
| `YCSB_DIR` | YCSB workload files for the 07b / 07d profiles (default: `$HOME/Downloads/ycsb_workloads`). |
| `TPCC_DIR` | TPC-C workload files for 07c (default: `$HOME/Downloads/tpcc`). |
| `TMP_LOG_DIR` | Per-VM intermediate logs (default: `$HOME/Downloads/tmp_kern_logs`). |
| `EVAL_OUT_DIR` | Final aggregated eval output on the control server (default: `$HOME/Downloads/evaluation`). |
| `SOUL_GEM5_SRC` | Root of the `soul_gem5/` checkout (default: `${SOUL_PATH}/soul_gem5`). |
| `GEM5_RESULT_DIR` | Captured-trace inputs + gem5 outputs (default: `$HOME/gem5_results`). |
| `PYTHON2`, `SCONS` | Binaries used to build gem5 (defaults: bare names on `$PATH`). |
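For orientation, an edited env.sh ends up looking roughly like the sketch below, which simply exports the documented defaults from the table above; replace each value with your site-specific path.

```sh
# Illustrative env.sh using the documented defaults; edit every value
# to match your site before sourcing it anywhere.
export SOUL_PATH="$HOME/Soul"                          # on compute / memory VMs
export SOUL_SRC_PATH="$HOME/Downloads/Soul"            # on the switch host
export BF_SDE="$HOME/Downloads/bf-sde-9.7.0"
export BF_SDE_INSTALL="${BF_SDE}/install"
export YCSB_DIR="$HOME/Downloads/ycsb_workloads"
export TPCC_DIR="$HOME/Downloads/tpcc"
export TMP_LOG_DIR="$HOME/Downloads/tmp_kern_logs"
export EVAL_OUT_DIR="$HOME/Downloads/evaluation"
export SOUL_GEM5_SRC="${SOUL_PATH}/soul_gem5"
export GEM5_RESULT_DIR="$HOME/gem5_results"
export PYTHON2=python2                                 # used to build gem5
export SCONS=scons
```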
| File | What to fill in |
|---|---|
| `ctrl_scripts/scripts/config.yaml` | Control-server view of the cluster: per-VM control ip, cluster ip, mac, vm name, user, key, nic. |
| `soul_switch/configs/config_switch1.json` | The switch's view: per-VM {port, ip, mac} for compute and memory VMs. |
| `soul_linux/test_programs/99_nic_scripts/*.sh` | Per-VM ARP / IP. Replace the `<comp_N_mac>` / `<mem_N_mac>` / `<vmhost_mac>` placeholders with the RoCE NIC MACs of your cluster (same values you put in config.yaml and config_switch1.json). |
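A quick way to catch mistakes before booting anything is sketched below; it is illustrative only, and the grep pattern simply matches the placeholder names listed above.

```sh
# Illustrative sanity check: flag leftover placeholders and confirm the two
# hand-edited config files still parse.
grep -n '<comp_[0-9]*_mac>\|<mem_[0-9]*_mac>\|<vmhost_mac>' \
    soul_linux/test_programs/99_nic_scripts/*.sh && echo "placeholders remain!"
python3 -c "import yaml; yaml.safe_load(open('ctrl_scripts/scripts/config.yaml'))"
python3 -c "import json; json.load(open('soul_switch/configs/config_switch1.json'))"
```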
# Step 0 in practice:
vi env.sh # path macros
source env.sh
vi ctrl_scripts/scripts/config.yaml # control-server topology
vi soul_switch/configs/config_switch1.json # switch port/IP/MAC map
vi soul_linux/test_programs/99_nic_scripts/*.sh  # per-VM ARP tables (replace <comp_N_mac> / <mem_N_mac>)

The full Soul pipeline runs on the cluster described above. The high-level stages are: (1) bring up the switch, (2) bring up the VMs and load the kernel modules, (3) launch the workload from the control server, and (4) collect logs and aggregate them.
- Install Intel BF-SDE 9.7.0 following Intel's instructions, and set `BF_SDE` / `BF_SDE_INSTALL` in `env.sh` to match.
- Build the data plane. Drop the Soul P4 sources into the SDE's p4-examples tree and let `p4studio` build them:

  # On the switch host
  source env.sh
  cp -R soul_switch/dataplane/* "${BF_SDE}/pkgsrc/p4-examples/p4_16_programs/"
  cd "${BF_SDE}/p4studio" && sudo ./p4studio build tna_disagg_mind_switch_base

- Build the C++ control plane.

  cd "${SOUL_SRC_PATH}/soul_switch"
  mkdir build && cd build && cmake .. && make

- Launch the control plane.

  cd "${SOUL_SRC_PATH}/soul_switch"
  ./launch_mind_base_switch.sh
# On the control server
source env.sh
cd ctrl_scripts/scripts
# 2a. Push ssh keys to all hosts (one-time).
python3 run_commands.py --profile=profiles/00_init_ssh_conn.yaml
# 2b. Boot / reboot the VMs.
python3 run_commands.py --profile=profiles/01_restart_vms.yaml

# On all compute VMs
source env.sh
cd ${SOUL_PATH}/soul_linux/test_programs/07_lock_micro_benchmark
make kvs    # for 07b_lock_kvs.yaml
make kc     # for 07d_lock_kc.yaml
make bench  # for 07_lock_micro_bench.yaml

Importantly, each make overwrites bin/test_mltthrd, so re-run the matching make target whenever you switch workloads.
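If you script several profiles back to back, a small wrapper like the hypothetical one below (not part of the artifact) keeps the workload binary and the profile in sync.

```sh
# Hypothetical helper (not shipped with the artifact): rebuild the workload
# binary that matches a given profile, since every make target overwrites
# bin/test_mltthrd.
rebuild_for_profile() {
    case "$1" in
        *07b_lock_kvs*)        make kvs   ;;  # KVS workload
        *07d_lock_kc*)         make kc    ;;  # Kyoto Cabinet workload
        *07_lock_micro_bench*) make bench ;;  # micro-benchmark
    esac
}
rebuild_for_profile profiles/07b_lock_kvs.yaml
```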
The KVS / Kyoto Cabinet workloads consume pre-parsed trace files; the raw output of the upstream YCSB / TPC-C generators is not directly compatible. You will need to:
- Download YCSB (https://github.com/brianfrankcooper/YCSB) and TPC-C (any standard generator) and produce a workload trace.
- Convert each trace into the format that the workload binaries expect. See the `parse_workload()` function in `soul_linux/test_programs/07_lock_micro_benchmark/{kvs,kc}.cpp` for the exact format.
- Drop the converted files into the `${YCSB_DIR}` and `${TPCC_DIR}` directories declared in `env.sh`. A shell sketch of this flow follows the list.
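One possible flow is sketched below, using YCSB's basic binding (which prints each operation to stdout). It assumes a prebuilt YCSB release; the conversion step itself is site-specific, and the converted file name is hypothetical.

```sh
# Illustrative trace preparation; adapt to your generator and converter.
# Assumes a prebuilt YCSB release unpacked in ./ycsb.
cd ycsb
./bin/ycsb run basic -P workloads/workloada > /tmp/workloada.raw   # raw YCSB operations
# Convert /tmp/workloada.raw into the format expected by parse_workload()
# in kvs.cpp (conversion script not included here), then stage the result:
cp /tmp/workloada.trace "${YCSB_DIR}/"    # hypothetical converted file name
```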
# On the control server
source env.sh
cd ctrl_scripts/scripts
python3 run_commands.py --profile=profiles/07b_lock_kvs.yaml

The final per-run output lands on the control server at the following location (paths come from env.sh):
${EVAL_OUT_DIR}/soul_micro/<node_num>_<thread_num>_<num_locks>_<rw_ratio>_<lock_type>/
kern.node_0_of_<n>.log
kern.node_1_of_<n>.log
...
<node_num>_<thread_num>_<num_locks>_<rw_ratio>_<lock_type>.stat
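For example, to eyeball one finished run you can do something like the following; the directory name below is a hypothetical parameter combination.

```sh
# Inspect one run; the directory name encodes
# <node_num>_<thread_num>_<num_locks>_<rw_ratio>_<lock_type>.
RUN_DIR="${EVAL_OUT_DIR}/soul_micro/2_8_1024_50_spinlock"   # hypothetical combination
ls "${RUN_DIR}"            # per-node kern.node_*.log files + the aggregated .stat file
cat "${RUN_DIR}"/*.stat    # aggregated statistics for the run
```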
Each figure in the paper is produced by one evaluation config (profile) and the corresponding workload binaries as summarized below. To run different baselines and workloads, edit the evaluation config file and re-run.
| Paper figure | Evaluation config | Workload binary |
|---|---|---|
| KVS (Fig. 9) | `07b_lock_kvs.yaml` | `make kvs` |
| Kyoto Cabinet (Fig. 10) | `07d_lock_kc.yaml` | `make kc` |
| Micro-benchmark (Fig. 11) | `07_lock_micro_bench.yaml` | `make bench` |
| Optimization breakdown (Fig. 12) | `07_lock_micro_bench.yaml` | `make bench` |
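If you want to queue the lock-based figures in one go, something along these lines works from the control server; it assumes the matching workload binary has already been rebuilt on the compute VMs before each profile, as noted above.

```sh
# Illustrative driver loop for Figs. 9-12; each profile expects the matching
# workload binary to be in place on the compute VMs.
cd ctrl_scripts/scripts
for profile in 07b_lock_kvs 07d_lock_kc 07_lock_micro_bench; do
    python3 run_commands.py --profile="profiles/${profile}.yaml"
done
```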
Figures 13–16 are produced by the SynchroTrace + gem5 simulator under soul_gem5/. To reproduce them, first follow the SynchroTrace instructions to capture traces for the KVS and Kyoto Cabinet workloads, then compile the simulator with compile.sh and run the experiments with python3 run.py:
cd soul_gem5
./compile.sh # build the gem5 simulator binary
python3 run.py # replay the captured traces and emit the figure data