LanguageEvolution

LanguageEvolution is a composable framework for simulating morphological language change across a social network of agents. It combines configurable agent policies, phonological cascades, and iterated learning dynamics to explore how morphology evolves over time.

Overview

At the core of the project is core.simulation.Simulation, which wires together a population of agents, their contact network, and a metrics pipeline. Simulations are configured through YAML files (stored in experiments/) and executed with run.py. Outputs land in data/ and include per-agent lexicons, population-level statistics, and visualisations of morphological change.

Key capabilities

Config-driven simulations covering language templates, phonology, networks, and output control.
Pluggable adoption, induction, compression, and contact policies for agents.
Support for scheduled events such as network rewiring and phonological cascade injections.
Optional iterated learning model (ILM) with learner turnover and mentorship assignment.
Rich metrics collection with plots, GIFs, and tabular exports for downstream analysis.

Quick start

Create and activate a virtual environment.
Install the dependencies listed in requirements.txt from the project root.
Launch the default experiment with python run.py.
Inspect the results under data/default/.

Running python run.py with no argument automatically loads experiments/default.yml.
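The fallback to the default config can be pictured with a small sketch. This is not the actual code in run.py; it assumes a single optional positional argument, and the names DEFAULT_CONFIG and resolve_config_path are hypothetical.

```python
import argparse
from pathlib import Path

# Assumption: the default config lives at this path relative to the project root.
DEFAULT_CONFIG = Path("experiments/default.yml")


def resolve_config_path(argv):
    """Return the config path for a list of CLI arguments (program name excluded)."""
    parser = argparse.ArgumentParser(description="Run a simulation from a YAML config.")
    # nargs="?" makes the positional optional; the default is used when it is omitted.
    parser.add_argument("config", nargs="?", type=Path, default=DEFAULT_CONFIG)
    return parser.parse_args(argv).config


print(resolve_config_path([]))                       # falls back to the default config
print(resolve_config_path(["experiments/ilm.yml"]))  # explicit config wins
```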

Reproducing the experiments

The experiment sweep, result tables, and figures are reproduced through the pipeline in exploratory_experiments/; see exploratory_experiments/README.md for the stage-by-stage guide.

Repository layout

run.py: CLI entry point that loads configs, seeds RNGs, and executes Simulation.
core/: Library code implementing agents, networks, phonology, schedulers, caching, and metrics.
experiments/: YAML configs you can run directly or use as templates.
exploratory_experiments/: Pipeline for the experiment sweep, result tables, and figures.
AIHistoricalLinguist/: The AI Historical Linguist (AIHL) assessor/critic evaluation tool.
data/: Output directory (created at runtime) containing metrics, images, and caches.
tests/: Test suite covering adoption, agent behaviour, cascade events, ILM, and more.
requirements.txt: Dependency lock for the Python environment.
docs/: Additional documentation (configuration guide, architecture notes, and workflow tips).

Running simulations

Simulations are launched by pointing run.py at a config file:

python run.py path/to/experiment.yml

Behind the scenes, core.utils.build_merged_config() merges your overrides with experiments/default.yml, normalises aliases, and derives subsystem seeds. The resulting dictionary is used to instantiate core.simulation.Simulation, which:

Builds the agent population and contact network.
Sets up the scheduler, metrics collector, and optional ILM manager.
Executes the main loop for the configured number of iterations.
Writes metrics and visualisations according to output settings.
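The merge-and-seed step that precedes these stages can be sketched roughly as follows. This is an illustration only, not the body of core.utils.build_merged_config(): the helpers deep_merge and derive_seed, the hash-based seed derivation, and the field names iterations and plots are all assumptions made for the example.

```python
import copy
import hashlib


def deep_merge(base, overrides):
    """Recursively merge overrides into a copy of base; override values win."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def derive_seed(global_seed, subsystem):
    """Derive a stable 32-bit seed from the global seed and a subsystem label."""
    digest = hashlib.sha256(f"{global_seed}:{subsystem}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


# Hypothetical defaults and overrides; real field names may differ.
defaults = {"seed": 0, "simulation": {"iterations": 100}, "output": {"plots": True}}
overrides = {"seed": 7, "simulation": {"iterations": 10}}

config = deep_merge(defaults, overrides)
seeds = {name: derive_seed(config["seed"], name) for name in ("network", "agents")}
print(config)
```

Deriving each subsystem seed from the global seed plus a label keeps runs reproducible while giving each subsystem its own random stream.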

To customise a run, copy one of the configs in experiments/, edit the desired sections, and run it with run.py. See Running Experiments for a step-by-step walkthrough.

Configuration at a glance

The configuration schema is split into seven major sections:

seed: Global RNG seed (derives subsystem seeds automatically).
language: Morphological templates, affixes, dialect diversity, and phonology rules.
agent: Adoption/contact policies and the induction/compression schedule.
population: Number of agents, utterance length, network topology, and caching.
simulation: Iteration count, scheduler, and time-indexed events (e.g., cascade injections).
ilm: Iterated learning parameters (optional).
output: Logging cadence and which artefacts to write.

For field-by-field documentation, refer to the Configuration Guide. It also covers advanced features such as phonological cascade events and ILM learners.
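As a quick sanity check before a run, the seven section names listed above can be used to catch misspelled top-level keys. The sketch below is a hypothetical helper, not part of the framework, and it only looks at top-level keys rather than individual fields.

```python
# The seven top-level sections named in this README.
KNOWN_SECTIONS = ("seed", "language", "agent", "population", "simulation", "ilm", "output")


def unknown_sections(config):
    """Return top-level keys that are not one of the documented sections."""
    return sorted(key for key in config if key not in KNOWN_SECTIONS)


# "ouput" is a deliberate typo to show what the check reports.
example = {"seed": 42, "population": {}, "ouput": {}}
print(unknown_sections(example))  # ['ouput']
```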

Output artefacts

Each run writes to data/<run_name>/ (where <run_name> defaults to output.data_dir):

agent_stats/: Per-agent morphology images, lexicon dumps, signatures, and optional detailed logs.
pop_stats/: Population-level timeseries, morphology grids, signature prevalence tables, and network frames.
summary_stats/: Aggregate plots (population_complexity.png, population_entropy.png, etc.) and optional GIFs tracing morphology over time.
_cache_populations/: (Optional) cached initial morphologies keyed by language parameters and seed.

See Running Experiments for tips on managing output volume and leveraging caches during rapid iteration.
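To get a feel for how much a run has written, a small script can count the files under each of the directories listed above. This is a sketch under the assumption that the four subdirectories sit directly under the run directory; summarise_run is a hypothetical helper, not something the project ships.

```python
from pathlib import Path

# Subdirectory names taken from the list above.
ARTEFACT_DIRS = ("agent_stats", "pop_stats", "summary_stats", "_cache_populations")


def summarise_run(run_dir):
    """Map each artefact directory to its file count, or None if it is absent."""
    run_dir = Path(run_dir)
    summary = {}
    for name in ARTEFACT_DIRS:
        sub = run_dir / name
        if sub.is_dir():
            # Count regular files at any depth below the subdirectory.
            summary[name] = sum(1 for p in sub.rglob("*") if p.is_file())
        else:
            summary[name] = None
    return summary


print(summarise_run("data/default"))
```

Directories that report None were either disabled in the output section or not yet produced.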

Experiment library

Ready-to-run configs in experiments/ include:

default.yml: Baseline settings with full metrics output.
default_sfit.yml: Shifts adoption toward stem targets.
default_sfit_afit.yml: Balances stem and affix adoption pressure.
default_sfit_afit_randaffix.yml: Similar to the above, but without splitting explicit affixes.
cumulative_adoption.yml: Illustrates cumulative adoption dynamics.
objective_adoption.yml: Demonstrates objective-based adoption tweaks.
cascade_events.yml: Adds mid-run phonological cascade introductions.
semantics_router.yml: Enables semantic routing during adoption.
ilm.yml: Activates the iterated learning model with learners and mentorship.

Run any experiment with python run.py experiments/<name>.yml.

Development workflow

When adding new policies or events, update the relevant factory maps and document their parameters in your config.
When adding phonology rules, implement them in core/phonology/rule_library.py so they can be referenced by name in configs.
Heavy outputs can slow iteration; temporarily disable them via the output section while developing new features.
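The idea of referencing rules by name can be illustrated with a minimal registry. This sketch does not reproduce core/phonology/rule_library.py or the project's factory maps; RULE_REGISTRY, register_rule, apply_rules, and the toy final_devoicing rule are all hypothetical names chosen for the example.

```python
# Hypothetical name-to-function registry; the real factory maps may be organised differently.
RULE_REGISTRY = {}


def register_rule(name):
    """Decorator that stores a rule function under a config-friendly name."""
    def decorator(func):
        if name in RULE_REGISTRY:
            raise ValueError(f"duplicate rule name: {name}")
        RULE_REGISTRY[name] = func
        return func
    return decorator


@register_rule("final_devoicing")
def final_devoicing(word):
    """Toy rule: devoice a word-final b, d, or g."""
    table = {"b": "p", "d": "t", "g": "k"}
    if word and word[-1] in table:
        return word[:-1] + table[word[-1]]
    return word


def apply_rules(word, rule_names):
    """Apply the named rules in order, as a config might list them."""
    for name in rule_names:
        word = RULE_REGISTRY[name](word)
    return word


print(apply_rules("hund", ["final_devoicing"]))  # hunt
```

Keeping the lookup in one place means a config only needs the rule's name, and a misspelled name fails immediately with a KeyError.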

Experimental runs

The exploratory_experiments directory contains settings for the main experiments reported in the paper.

The default configurations for the Korean, Romance and Celtic experiments reported in the supplementary materials can be run with the paper_run.sh script in each of the respective directories.

Reference

Aravinth Kulanthaivelu and Richard Sproat. 2026. "Agent-based models for the evolution of morphological alternation patterns." arXiv.
