Two-Timescale Simulations

These pages document the simulation suite implemented in the companion repository and summarize what each model contributes.

Working definition. These simulations test how within-lifetime learning and between-generation selection jointly shape cooperation under controlled social interaction structures.

The common architecture is:

Fast timescale: learning within lifetimes
Slow timescale: selection across generations
Shared topology: all three models place agents on a ring network — see Appendix: The ring network for the rationale and per-model neighbor counts.
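
The nesting of the two timescales on a ring can be sketched roughly as follows. This is a minimal illustration and not the repository's code: `ring_neighbors`, `two_timescale_skeleton`, and the default sizes are hypothetical, and the number of neighbors per side (`k`) differs between models. The sketch only counts interactions, to show where learning and selection would sit.

```python
def ring_neighbors(i, n, k=1):
    """Return the k nearest neighbors on each side of agent i on a ring of n agents.

    Assumes n > 2 * k so that no neighbor is listed twice.
    """
    return [(i + d) % n for d in range(-k, k + 1) if d != 0]


def two_timescale_skeleton(n_agents=6, generations=3, rounds=4, k=1):
    """Count pairwise interactions to show how the two loops nest."""
    interactions = 0
    for _generation in range(generations):      # slow timescale: selection
        for _round in range(rounds):            # fast timescale: learning
            for i in range(n_agents):
                for _j in ring_neighbors(i, n_agents, k):
                    # Agent i would play its neighbor here and then
                    # update its learned state (trust or Q-values).
                    interactions += 1
        # Payoff-based reproduction would happen here, once per generation,
        # with offspring starting from fresh (reset) learned state.
    return interactions
```

With the defaults above, each agent meets two neighbors per round, so the inner update runs `generations * rounds * n_agents * 2` times while selection runs only `generations` times.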

Model progression

#	Script	Learning mechanism	Extra social features
1	two_timescale_reciprocity.py	Simple trust update (Rescorla-Wagner style)	None
2	two_timescale_q_learning.py	Q-learning (action-value learning)	None
3	two_timescale_extended.py	Q-learning	Reputation, partner choice, forgiveness

Display 1: Three-model progression in learning and social complexity.
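
To make the difference between the two learning mechanisms concrete, here is a hedged sketch of both update rules in their textbook form. The function names, the use of the partner's last action as the Q-learning state, and the reward value are assumptions made for illustration; the scripts may define states and rewards differently. The parameter names `learning_rate`, `alpha`, `gamma`, and `epsilon` follow the parameter list used elsewhere on this page.

```python
import random


def trust_update(trust, partner_cooperated, learning_rate):
    """Rescorla-Wagner style update: shift trust toward the observed
    outcome by a fraction of the prediction error."""
    outcome = 1.0 if partner_cooperated else 0.0
    return trust + learning_rate * (outcome - trust)


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Tabular Q-learning update, applied in place to the nested dict q."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q[state]))
    return max(q[state], key=q[state].get)


# Assumed state space: the partner's previous action, "C" or "D".
q_table = {s: {"C": 0.0, "D": 0.0} for s in ("C", "D")}
q_update(q_table, "C", "C", reward=3.0, next_state="C", alpha=0.5, gamma=0.9)
greedy_choice = epsilon_greedy(q_table, "C", epsilon=0.0, rng=random.Random(0))
```

The trust rule tracks a single expectation about a partner, whereas the Q-learning rule estimates a value for each action and keeps exploring whenever `epsilon` is above zero.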

Core takeaway

Across all three models, cooperation is not a fixed trait. It is an adaptive outcome that depends on interaction structure, learning dynamics, and selective pressures acting over generations.

For which cooperation mechanisms are included and which are out of scope, see Appendix: Cooperation mechanisms and model scope.

What the theory page predicts — and what these simulations test

The theory page sets out a broader conceptual framework than any single simulation can cover. The table below maps each theoretical concept to its status in this simulation suite.

Theoretical concept	Status	Notes
Fast timescale — learning within lifetimes	✅ Implemented	All three models. Trust update (Model 1), Q-learning (Models 2–3).
Slow timescale — selection across generations	✅ Implemented	All three models. Payoff-proportional reproduction with mutation.
Selection on learning parameters	✅ Implemented	Evolution acts on `trust_prior`, `learning_rate`, `responsiveness`, `alpha`, `epsilon`, `gamma`, `initial_q_bias`, and social parameters.
Fitness landscape smoothing by learning	✅ Demonstrated	Agents discover cooperation during life, raising their fitness and guiding selection toward cooperation-friendly parameters.
Interaction regimes (learning accelerates / masks / opposes evolution)	⚠️ Partial	The one-shot vs repeated comparison tests the accelerating and masking regimes. The opposing regime (short-term defection winning) appears transiently as invasion events but is not isolated experimentally.
Baldwin effect — steps 1 & 2 (plasticity enables cooperation; selection favors learnability)	✅ Demonstrated	Agents that learn cooperation reproduce more; selection shifts the population toward parameter combinations that make learning succeed faster and more robustly.
Baldwin effect — step 3 (genetic assimilation: learned behavior becomes innate)	❌ Not implemented	Offspring always start with reset memories. Cooperation is never directly encoded in genes — it must be relearned every generation. Assimilation would require heritable memory or a genetically fixed cooperative action.
Testable prediction: repeated interaction → higher cooperation than one-shot	✅ Confirmed	All three models show markedly higher cooperation under repeated interaction.
Testable prediction: selection favors partner-discrimination parameters	✅ Confirmed	`responsiveness` and `rejection_threshold` evolve upward under repeated interaction.
Testable prediction: reputation mechanisms outperform partner-memory in stranger-rich environments	✅ Confirmed	Network diversity experiment shows the extended model dominates above ~50% stranger fraction.
Testable prediction: trust learning vs Q-learning produce different cooperation–payoff trade-offs	✅ Confirmed	Trust learning maximises cooperation rate; Q-learning maximises payoff by retaining exploration.

Display 2: Theory–simulation correspondence.
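
The "payoff-proportional reproduction with mutation" and "reset memories" rows above can be illustrated with a small sketch. Everything here is an assumption made for the example: the function names `reproduce` and `spawn`, Gaussian mutation, clipping parameters to [0, 1], non-negative payoffs, and the uniform fallback when all payoffs are zero. Only the parameter names `trust_prior` and `learning_rate` come from this page.

```python
import random


def reproduce(genomes, payoffs, mutation_sd=0.05, rng=None):
    """Sample parents with probability proportional to payoff, then mutate.

    Assumes payoffs are non-negative; falls back to uniform sampling
    if every payoff is zero.
    """
    rng = rng or random.Random()
    weights = payoffs if sum(payoffs) > 0 else [1.0] * len(payoffs)
    parents = rng.choices(genomes, weights=weights, k=len(genomes))
    offspring = []
    for parent in parents:
        child = {
            # Gaussian mutation, clipped to [0, 1] (an assumed parameter range).
            name: min(1.0, max(0.0, value + rng.gauss(0.0, mutation_sd)))
            for name, value in parent.items()
        }
        offspring.append(child)
    return offspring


def spawn(genome):
    """Create a newborn agent: parameters are inherited, learned state is not."""
    return {"genome": genome, "trust": genome["trust_prior"], "memory": {}}


parents = [
    {"learning_rate": 0.1, "trust_prior": 0.2},
    {"learning_rate": 0.5, "trust_prior": 0.5},
    {"learning_rate": 0.9, "trust_prior": 0.8},
]
children = reproduce(parents, [0.0, 0.0, 10.0], mutation_sd=0.0, rng=random.Random(1))
newborn = spawn(children[0])
```

Because `spawn` copies only the genome, anything an agent learned during its life is lost at reproduction. This is why the suite can show selection on learnability (steps 1 and 2 of the Baldwin effect) but not genetic assimilation (step 3).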
