LanguageEvolution is a composable framework for simulating morphological language change across a social network of agents. It combines configurable agent policies, phonological cascades, and iterated learning dynamics to explore how morphology evolves over time.
- Overview
- Key capabilities
- Quick start
- Repository layout
- Running simulations
- Configuration at a glance
- Output artefacts
- Experiment library
- Development workflow
- Further reading
At the core of the project is core.simulation.Simulation, which wires together a population of agents, their contact network, and a metrics pipeline. Simulations are configured through YAML files (stored in experiments/) and executed with run.py. Outputs land in data/ and include per-agent lexicons, population-level statistics, and visualisations of morphological change.
- Config-driven simulations covering language templates, phonology, networks, and output control.
- Pluggable adoption, induction, compression, and contact policies for agents.
- Support for scheduled events such as network rewiring and phonological cascade injections.
- Optional iterated learning model (ILM) with learner turnover and mentorship assignment.
- Rich metrics collection with plots, GIFs, and tabular exports for downstream analysis.
- Create and activate a virtual environment.
- Install dependencies from the project root.
- Launch the default experiment.
- Inspect the results under data/default/.
Running python run.py with no argument automatically loads experiments/default.yml.
The experiment sweep, result tables, and figures are reproduced through the
pipeline in exploratory_experiments/; see
exploratory_experiments/README.md for
the stage-by-stage guide.
- run.py: CLI entry point that loads configs, seeds RNGs, and executes Simulation.
- core/: Library code implementing agents, networks, phonology, schedulers, caching, and metrics.
- experiments/: YAML configs you can run directly or use as templates.
- exploratory_experiments/: Pipeline for the experiment sweep, result tables, and figures.
- AIHistoricalLinguist/: The AI Historical Linguist (AIHL) assessor/critic evaluation tool.
- data/: Output directory (created at runtime) containing metrics, images, and caches.
- tests/: Test suite covering adoption, agent behaviour, cascade events, ILM, and more.
- requirements.txt: Dependency lock for the Python environment.
- docs/: Additional documentation (configuration guide, architecture notes, and workflow tips).
Simulations are launched by pointing run.py at a config file:
python run.py path/to/experiment.yml
Behind the scenes, core.utils.build_merged_config() merges your overrides with experiments/default.yml, normalises aliases, and derives subsystem seeds. The resulting dictionary instantiates core.simulation.Simulation, which:
- Builds the agent population and contact network.
- Sets up the scheduler, metrics collector, and optional ILM manager.
- Executes the main loop for the configured number of iterations.
- Writes metrics and visualisations according to output settings.
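The merge-and-seed step described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in core.utils: the helpers deep_merge and derive_seed, the hashing scheme, and the field names size and topology are all hypothetical and only illustrate how overrides might be layered on defaults and how one global seed might yield stable per-subsystem seeds.

```python
import copy
import hashlib


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict in which nested override values replace defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so that sibling keys in the default section survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def derive_seed(global_seed: int, subsystem: str) -> int:
    """Derive a stable 32-bit seed for one subsystem (hypothetical scheme)."""
    digest = hashlib.sha256(f"{global_seed}:{subsystem}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


# Hypothetical field names, for illustration only.
defaults = {"seed": 0, "population": {"size": 50, "topology": "ring"}}
overrides = {"seed": 7, "population": {"size": 200}}

config = deep_merge(defaults, overrides)
seeds = {name: derive_seed(config["seed"], name) for name in ("network", "phonology")}
```

The recursive merge means an override file only needs to list the fields it changes; whether build_merged_config() behaves exactly this way should be checked against the source.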
To customise a run, copy one of the configs in experiments/, edit the desired sections, and run it with run.py. See Running Experiments for a step-by-step walkthrough.
The configuration schema is split into seven major sections:
- seed: Global RNG seed (derives subsystem seeds automatically).
- language: Morphological templates, affixes, dialect diversity, and phonology rules.
- agent: Adoption/contact policies and the induction/compression schedule.
- population: Number of agents, utterance length, network topology, and caching.
- simulation: Iteration count, scheduler, and time-indexed events (e.g., cascade injections).
- ilm: Iterated learning parameters (optional).
- output: Logging cadence and which artefacts to write.
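Because overrides are merged by section name, a misspelled top-level key can go unnoticed. The sketch below shows one way to catch that before a run. The section names come from the list above; the helper unknown_sections is hypothetical, and whether the project performs a similar check itself is not stated here.

```python
# Top-level sections named in this README.
KNOWN_SECTIONS = ("seed", "language", "agent", "population", "simulation", "ilm", "output")


def unknown_sections(config: dict) -> list:
    """Return top-level keys that are not recognised section names."""
    return sorted(key for key in config if key not in KNOWN_SECTIONS)


# An override dict with a typo in "output".
override = {"seed": 3, "ouput": {}, "ilm": {}}
print(unknown_sections(override))  # prints ['ouput']
```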
For field-by-field documentation, refer to the Configuration Guide. It also covers advanced features such as phonological cascade events and ILM learners.
Each run writes to data/<run_name>/ (where <run_name> defaults to output.data_dir):
- agent_stats/: Per-agent morphology images, lexicon dumps, signatures, and optional detailed logs.
- pop_stats/: Population-level timeseries, morphology grids, signature prevalence tables, and network frames.
- summary_stats/: Aggregate plots (population_complexity.png, population_entropy.png, etc.) and optional GIFs tracing morphology over time.
- _cache_populations/: (Optional) cached initial morphologies keyed by language parameters and seed.
See Running Experiments for tips on managing output volume and leveraging caches during rapid iteration.
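To get a quick sense of how much a run wrote, the output tree can be summarised with the standard library. This is a small sketch that assumes only the directory names listed above; the helper summarise_run is hypothetical and not part of the project.

```python
from pathlib import Path

# Artefact directories named in this README.
ARTEFACT_DIRS = ("agent_stats", "pop_stats", "summary_stats", "_cache_populations")


def summarise_run(run_dir) -> dict:
    """Count files under each artefact directory; None marks a missing directory."""
    run_dir = Path(run_dir)
    summary = {}
    for name in ARTEFACT_DIRS:
        sub = run_dir / name
        if sub.is_dir():
            # rglob walks nested folders, so per-agent subfolders are included.
            summary[name] = sum(1 for path in sub.rglob("*") if path.is_file())
        else:
            summary[name] = None
    return summary


if __name__ == "__main__":
    # data/default/ is where the default experiment writes its results.
    print(summarise_run("data/default"))
```

A sudden jump in the agent_stats count is a hint that per-agent outputs are worth disabling during rapid iteration.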
Ready-to-run configs in experiments/ include:
- default.yml: Baseline settings with full metrics output.
- default_sfit.yml: Shifts adoption toward stem targets.
- default_sfit_afit.yml: Balances stem and affix adoption pressure.
- default_sfit_afit_randaffix.yml: Similar to the above without splitting explicit affixes.
- cumulative_adoption.yml: Illustrates cumulative adoption dynamics.
- objective_adoption.yml: Demonstrates objective-based adoption tweaks.
- cascade_events.yml: Adds mid-run phonological cascade introductions.
- semantics_router.yml: Enables semantic routing during adoption.
- ilm.yml: Activates the iterated learning model with learners and mentorship.
Run any experiment with python run.py experiments/<name>.yml.
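To run several of these configs in sequence, the command lines can be generated from the directory contents. The sketch below only builds the argument lists using the invocation shown above (run.py followed by a config path); the helper build_commands is hypothetical, and each resulting list could be passed to subprocess.run from the project root.

```python
import sys
from pathlib import Path


def build_commands(config_dir="experiments", python=sys.executable):
    """Return one 'python run.py <config>' argument list per .yml file."""
    configs = sorted(Path(config_dir).glob("*.yml"))
    return [[python, "run.py", str(path)] for path in configs]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch has no side effects.
    for command in build_commands():
        print(" ".join(command))
```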
- When adding new policies or events, update the relevant factory maps and document their parameters in your config.
- When adding phonology rules, implement them in core/phonology/rule_library.py so they can be referenced by name in configs.
- Heavy outputs can slow iteration; temporarily disable them via the output section while developing new features.
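For readers unfamiliar with the factory-map pattern mentioned above, here is a generic sketch of how a name in a config can be resolved to a class. Everything in it is hypothetical (ADOPTION_POLICIES, register, AlwaysAdopt, build_policy, and the rate parameter); the project's actual maps and policy interfaces live in core/ and may look different.

```python
from typing import Callable, Dict

# Hypothetical registry mapping config names to policy factories.
ADOPTION_POLICIES: Dict[str, Callable[..., object]] = {}


def register(name: str):
    """Decorator that adds a factory to the registry under a config name."""
    def decorator(factory):
        if name in ADOPTION_POLICIES:
            raise ValueError(f"duplicate policy name: {name}")
        ADOPTION_POLICIES[name] = factory
        return factory
    return decorator


@register("always")
class AlwaysAdopt:
    """Toy policy that accepts every candidate form."""

    def __init__(self, **params):
        self.params = params

    def adopt(self, candidate) -> bool:
        return True


def build_policy(spec: dict):
    """Instantiate a policy from a config fragment such as {'name': 'always'}."""
    params = dict(spec)  # copy so the caller's dict is not mutated
    name = params.pop("name")
    try:
        factory = ADOPTION_POLICIES[name]
    except KeyError:
        known = ", ".join(sorted(ADOPTION_POLICIES))
        raise ValueError(f"unknown policy {name!r}; known: {known}") from None
    return factory(**params)


policy = build_policy({"name": "always", "rate": 0.5})
```

With this layout, adding a policy means writing the class and registering it under the name that configs will use, which is why the registry and the config documentation need to be updated together.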
- Configuration Guide: Detailed description of every config field and advanced options.
- Running Experiments: Workflow tips for launching, iterating, and analysing simulations.
- Architecture Overview: Component map covering the main modules in core/ and how they interact.
If you only need the core library, import and construct Simulation directly with a normalised config dictionary from core.utils.
The exploratory_experiments directory contains settings for the main experiments
reported in the paper.
Scripts to run the default configurations for the Korean, Romance and Celtic
experiments reported in the supplementary materials can be found in the
paper_run.sh scripts in the respective directories.
Aravinth Kulanthaivelu and Richard Sproat. 2026. "Agent-based models for the evolution of morphological alternation patterns." arXiv.