BO Forge is a notebook-first Bayesian optimisation campaign tool. The notebook is the user workflow, while the reusable BO logic lives in the bo_forge Python package.
MVP v0.1 is a sequential campaign demo: define a problem, load a CSV log, suggest one experiment, enter one result, reload the log, and repeat.
MVP v0.1 deliberately supports only:
- continuous variables
- one objective
- maximize or minimize direction
- Sobol initial suggestions
- BoTorch SingleTaskGP
- LogEI for single suggestions and qLogEI for batches
- CSV campaign logs
- resume from existing logs
- basic diagnostics
It intentionally does not yet cover categorical variables, constraints, noisy BO, multi-objective optimisation, a CLI, or an app UI.
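As an illustration of the Sobol initial-design step, here is a minimal sketch using SciPy's `scipy.stats.qmc` module — an assumption for the sketch only; bo_forge may generate its Sobol points differently (e.g. via BoTorch). The variable bounds are hypothetical:

```python
from scipy.stats import qmc  # assumption: SciPy available for this sketch

# Hypothetical 2D problem: temperature in [20, 80], flow_rate in [1, 10].
lower = [20.0, 1.0]
upper = [80.0, 10.0]

sampler = qmc.Sobol(d=2, scramble=True, seed=0)
unit = sampler.random(n=4)              # 4 points in [0, 1)^2; a power of two keeps the sequence balanced
points = qmc.scale(unit, lower, upper)  # rescale into the variable bounds
```

Each row of `points` is one suggested initial experiment inside the bounds.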
```mermaid
flowchart LR
    A["YAML config"] --> B["Load CSV log"]
    B --> C["Validate campaign data"]
    C --> D{"Enough observations?"}
    D -- "No" --> E["Sobol suggestion"]
    D -- "Yes" --> F["Fit SingleTaskGP"]
    F --> G["Score LogEI / qLogEI"]
    G --> H["Suggest candidate(s)"]
    E --> H
    H --> I["Append status=suggested"]
    I --> J["Run experiment"]
    J --> K["mark_observed()"]
    K --> B
```
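The "Enough observations?" branch in the flowchart can be sketched as a tiny helper. The function name `choose_source` and the default initial-design size are hypothetical, not part of the bo_forge API:

```python
def choose_source(n_observed: int, n_init: int = 6) -> str:
    """Pick the suggestion source: Sobol until the initial design is
    filled, then the GP-based acquisition (LogEI)."""
    return "sobol" if n_observed < n_init else "log_ei"

print(choose_source(2))   # still filling the initial design -> "sobol"
print(choose_source(10))  # enough data to fit the GP -> "log_ei"
```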
The app/UI layer is intentionally absent in this MVP.
Future interfaces should wrap this backend package rather than moving BO logic into notebooks or app code.
Create a dedicated environment at the project root:
```shell
python3 -m venv .venv
./.venv/bin/pip install -e ".[dev]"
```

The `dev` extra includes pytest, Ruff, and enough Jupyter tooling to open and execute the example notebook from a fresh clone.
Run the test suite:

```shell
./.venv/bin/pytest
```

Run lint checks:

```shell
./.venv/bin/ruff check .
```

Run the clean script example:

```shell
./.venv/bin/python examples/quickstart.py
```

It copies the seed CSV log to an ignored working file, requests one suggestion, simulates one result, records that result with `mark_observed()`, and reloads the campaign log.
The same workflow in minimal Python:

```python
from pathlib import Path
import shutil

from bo_forge import (
    CampaignConfig,
    append_suggestions,
    load_campaign_log,
    mark_observed,
    suggest_next,
)

config = CampaignConfig.from_yaml("configs/simple_2d.yaml")
seed_log_path = Path("examples/simple_2d_campaign_log.csv")
log_path = Path("examples/simple_2d_working_log.csv")
shutil.copyfile(seed_log_path, log_path)

df = load_campaign_log(log_path, config)
suggestions = suggest_next(config, df)
append_suggestions(log_path, suggestions)

# After running the suggested experiment:
mark_observed(log_path, row_id=suggestions.loc[0, "row_id"], objective_value=1.95)
```

Campaign logs use this column order:

```
row_id,iteration,status,source,<variable columns...>,<objective column>,predicted_mean,predicted_std,acquisition
```
Rules:

- `status` is `suggested` or `observed`.
- `source` is `manual`, `sobol`, `log_ei`, or `qlog_ei`.
- Suggested rows have blank objective values.
- Observed rows require objective values.
- A suggested experiment becomes observed by updating the same row with `mark_observed()`; `row_id`, `iteration`, `source`, and variable values are preserved when a result is entered.
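The status/objective rules above can be checked mechanically. Here is a minimal pandas sketch; the helper `check_status_rows` is hypothetical — the real checks live inside bo_forge's validation code:

```python
import pandas as pd

def check_status_rows(df: pd.DataFrame, objective: str) -> list[str]:
    """Return one message per rule violation (hypothetical helper)."""
    errors = []
    for _, row in df.iterrows():
        blank = pd.isna(row[objective])
        if row["status"] == "observed" and blank:
            errors.append(f"row {row['row_id']}: observed but objective is blank")
        elif row["status"] == "suggested" and not blank:
            errors.append(f"row {row['row_id']}: suggested but objective is filled")
    return errors

log = pd.DataFrame({
    "row_id": [1, 2, 3],
    "status": ["observed", "suggested", "suggested"],
    "activity": [1.95, None, 2.3],  # row 3 violates the suggested-row rule
})
problems = check_status_rows(log, "activity")
```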
See CSV_SCHEMA.md for the full column reference, allowed blank values, and status-transition rules.
- `configs/simple_2d.yaml`: maximises a photocatalyst-style `activity` objective.
- `configs/simple_2d_minimize.yaml`: minimises a process `defect_rate` objective.
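A config for a continuous, single-objective problem might look like the sketch below. The exact keys are an assumption — they come from `CampaignConfig` and may differ; check the real files:

```yaml
# Hypothetical sketch; see configs/simple_2d.yaml for the authoritative keys.
objective:
  name: activity
  direction: maximize
variables:
  - name: temperature
    lower: 20.0
    upper: 80.0
  - name: flow_rate
    lower: 1.0
    upper: 10.0
```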
Open notebooks/01_simulated_campaign.ipynb for a simulated end-to-end maximisation campaign using configs/simple_2d.yaml and examples/simple_2d_campaign_log.csv.
Open notebooks/02_minimization_campaign.ipynb for a shorter minimisation campaign using configs/simple_2d_minimize.yaml and examples/simple_2d_minimize_campaign_log.csv. It fills the Sobol initial design, then demonstrates one qLogEI batch BO round.
From a fresh clone:
```shell
python3 -m venv .venv
./.venv/bin/pip install -e ".[dev]"
./.venv/bin/jupyter notebook notebooks/01_simulated_campaign.ipynb
```

The notebook demonstrates the real sequential workflow:

- load the current log
- request one candidate
- append it as `status=suggested`
- run one experiment
- enter one result with `mark_observed()`
- reload the log and repeat
The diagnostics use `bo_forge/plot_style.py`, which captures the bold axes, thicker spines, compact legends, and figure sizing used throughout the local PyTorch & BoTorch tutorial notebooks.
The notebook writes only ignored working files:
- `examples/simple_2d_working_log.csv`
- `examples/latest_suggestions.csv`
The plotting helpers produce report-ready black-on-white figures, even when the notebook or IDE uses a dark theme. `plot_progress()` shows observed and best-so-far objective values, while `plot_diagnostics()` shows the observed design space for one- or two-variable campaigns.
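The best-so-far curve drawn by `plot_progress()` reduces, for a maximisation campaign, to a cumulative maximum over the observed values. A self-contained sketch with NumPy and Matplotlib, using made-up objective values:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch also runs headless
import matplotlib.pyplot as plt

observed = np.array([1.2, 0.9, 1.7, 1.5, 2.1])  # made-up objective values
best_so_far = np.maximum.accumulate(observed)   # cumulative best for maximisation

fig, ax = plt.subplots(figsize=(5, 3))
ax.plot(observed, "o", color="black", label="observed")
ax.step(range(len(observed)), best_so_far, where="post", color="black", label="best so far")
ax.set_xlabel("iteration")
ax.set_ylabel("objective")
ax.legend()
```

For a minimisation campaign the same idea applies with `np.minimum.accumulate`.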
Both functions return `(fig, ax)` and can optionally save figures:

```python
from bo_forge.diagnostics import plot_progress

plot_progress(config, df, filename="progress.png")
```

BO Forge is intentionally strict because users edit YAML and CSV files by hand.
Common errors:
- `Variable 'temperature' has lower >= upper`: check the YAML `lower` and `upper` values.
- `Campaign log must start with canonical columns`: make sure the CSV begins with `row_id,iteration,status,source`.
- `status='observed' but objective ... is blank`: fill the objective value or change the row back to `suggested`.
- `status='suggested' but objective ... is filled`: suggested rows must leave the objective blank until `mark_observed()` is called.
- `Cannot generate new suggestions while unresolved status='suggested' rows exist`: run the experiment and call `mark_observed()` before requesting another suggestion.
- `Row ... has invalid source`: use only `manual`, `sobol`, `log_ei`, or `qlog_ei`.
- `Duplicate row_id`: every row needs a unique `row_id`.
- `Variable ... is outside bounds`: check the variable value against the YAML bounds.
When in doubt, run:
```python
from bo_forge import CampaignConfig, load_campaign_log, validate_campaign_data

config = CampaignConfig.from_yaml("configs/simple_2d.yaml")
df = load_campaign_log("examples/simple_2d_campaign_log.csv", config)
validate_campaign_data(config, df)
```

See COMMON_ERRORS.md for a longer error-message reference with fixes.
See REPOSITORY_STRUCTURE.md for the package layout, file responsibilities, and recommended development workflow.
The primary dependency source is pyproject.toml. A direct-dependency snapshot from the v0.1.2 environment is recorded in requirements-lock.txt.
Angze Li