Spatial Prisoner's Dilemma
Spatial Prisoner's Dilemma is the local-game evolved-cooperation case study on this site. It asks a different question from Spatial Altruism and Cooperative Hunting: what kinds of inherited response rules spread when agents repeatedly face a local Prisoner's Dilemma, can move only when isolated, and reproduce into nearby empty cells using the energy they accumulated from those games?
The Puzzle
The classic Prisoner's Dilemma makes selfish temptation obvious.
- mutual cooperation is better than mutual defection
- unilateral defection beats cooperating against a cooperator
- unconditional cooperation can therefore be exploited
- unconditional defection can destroy the mutual gain that makes cooperation attractive in the first place
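The four bullets above are just the standard Prisoner's Dilemma inequalities. As a quick sanity check, here is a minimal sketch (not code from the repository) that verifies them for the payoffs the frozen run uses:

```python
def is_prisoners_dilemma(cc, cd, dc, dd):
    """Standard PD conditions: temptation > reward > punishment > sucker,
    and mutual cooperation beats alternating exploitation."""
    return dc > cc > dd > cd and 2 * cc > dc + cd

# Payoffs from the frozen website-demo configuration
print(is_prisoners_dilemma(cc=3.0, cd=-1.0, dc=5.0, dd=0.0))  # → True
```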
This model adds three complications that matter evolutionarily.
- interactions are local rather than well mixed
- successful agents reproduce into nearby empty cells
- the default encoding lets one agent use one strategy against same-trait neighbors and another against other-trait neighbors
So the question is not just whether cooperation can ever appear in a matrix game. The question is whether local selection in a sparse spatial ecology favors unconditional defection, unconditional cooperation, or more conditional inherited rules.
What Kind Of Model This Is
This is a spatial agent ecology with selection on inherited discrete strategies, not a model of reasoning, bargaining, or within-lifetime learning.
It represents a toroidal lattice in which:
- each occupied cell contains one agent and empty cells remain available for later movement or reproduction
- each agent has an inherited trait label in 0..trait_count-1
- each agent also carries inherited response rules for Prisoner's Dilemma encounters
- energy rises or falls through local game payoffs, movement cost, reproduction cost, and cost of living
- offspring inherit the parental trait and strategy encoding, optionally with mutation
One useful intuition is to imagine a population of lineages spreading across local empty space. Game success does not matter by itself. It matters because local game success changes which lineages survive long enough to reproduce into neighboring cells.
The Frozen Replay Setup
The browser replay below is a specific seeded run from the frozen website-demo configuration, not a schematic animation. It uses:
- a 60 × 60 toroidal grid
- 576 initial agents from 16% initial occupancy
- a hard cap of 1800 agents from 50% carrying-capacity occupancy
- initial energy drawn from a Gaussian with mean 50.0, standard deviation 10.0, and minimum 5.0
- Prisoner's Dilemma payoffs CC = 3.0, CD = -1.0, DC = 5.0, DD = 0.0
- 4 inherited trait labels
- default same-vs-other contingent encoding with pure_strategy = false and strategy_per_trait = false
- no mutation in the frozen replay run
- random seed 0
- 200 simulation steps sampled every 4 steps
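Collected in one place, those settings can be written down as a small configuration sketch. The key names here are illustrative assumptions except where they match parameters quoted on this page (pure_strategy, strategy_per_trait, carrying_capacity_fraction):

```python
# Illustrative reconstruction of the frozen website-demo configuration;
# not the repository's actual config file.
FROZEN_CONFIG = {
    "grid_size": (60, 60),
    "initial_occupancy": 0.16,          # 576 agents on 3600 cells
    "carrying_capacity_fraction": 0.5,  # hard cap of 1800 agents
    "initial_energy": {"mean": 50.0, "std": 10.0, "min": 5.0},
    "payoffs": {"CC": 3.0, "CD": -1.0, "DC": 5.0, "DD": 0.0},
    "trait_count": 4,
    "pure_strategy": False,
    "strategy_per_trait": False,
    "mutation_rate": 0.0,
    "seed": 0,
    "steps": 200,
    "sample_every": 4,
}

cells = FROZEN_CONFIG["grid_size"][0] * FROZEN_CONFIG["grid_size"][1]
print(round(cells * FROZEN_CONFIG["initial_occupancy"]))          # → 576
print(int(cells * FROZEN_CONFIG["carrying_capacity_fraction"]))   # → 1800
```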
In the viewer, empty cells are beige. Occupied cells are colored by the strategy each agent uses against same-trait neighbors:
- light blue for co-op
- rust red for defect
- deep blue for tit-for-tat
- ochre for random
That means the replay is intentionally showing one layer of the state rather than every variable at once. The underlying export also keeps each agent's trait label and separate other-trait response.
Display 1: Frozen Overview
Display 1 gives one static read of the same frozen run used by the replay below. It combines three sampled lattice states with the full same-trait strategy history from the canonical export.
Three things are easiest to see in Display 1.
- the lattice fills very quickly from the sparse start to the 1800-agent cap
- same-trait tit-for-tat becomes the largest family by the middle of the run and stays there
- the end state remains mixed rather than collapsing to one universal rule
How One Step Works
Before introducing the formal notation later on, it helps to understand what one update does.
- Every agent searches its eight Moore-neighborhood cells and draws a local die roll.
- Each neighboring pair is ordered by that roll, so one side challenges and the other responds.
- The Prisoner's Dilemma is resolved across eight directional subrounds, with directional tit-for-tat memory slots if that strategy is active.
- Agents that played no games pay the travel cost and may compete to move into an adjacent empty cell.
- Agents above the reproduction threshold may compete to place offspring into adjacent empty cells.
- Newborns are appended to the population.
- Agents beyond the hard population cap are culled.
- Remaining non-newborn agents pay the environmental cost of living.
The evolutionary force therefore comes from a full local ecology rather than from the payoff matrix alone. Payoffs matter because they change energy, and energy matters because it changes survival and local reproduction.
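The phase ordering above can be sketched in code. The grid, the eight game subrounds, and movement and birth competition are stubbed out here; the dict-based agent representation, the random cull criterion, and the newborn's starting energy are illustrative assumptions, not the repository's exact rules:

```python
import random

def one_step(agents, cfg, rng):
    """Simplified phase ordering of one update (sketch, not the canonical code)."""
    # 1-3: local games would adjust energy here (omitted in this sketch)
    # 4: agents that played no games pay the travel cost (and may move)
    for a in agents:
        if a["games_played"] == 0:
            a["energy"] -= cfg["travel_cost"]
    # 5-6: agents above the threshold pay the birth cost; offspring
    #      inherit the parent's trait and strategy encoding
    newborns = []
    for a in list(agents):
        if a["energy"] >= cfg["reproduce_min_energy"]:
            a["energy"] -= cfg["reproduce_cost"]
            newborns.append({"energy": cfg["reproduce_cost"],  # assumed start energy
                             "trait": a["trait"], "strategy": a["strategy"],
                             "games_played": 0, "newborn": True})
    agents.extend(newborns)
    # 7: cull randomly down to the hard cap (cull criterion is an assumption)
    while len(agents) > cfg["max_population"]:
        agents.pop(rng.randrange(len(agents)))
    # 8: surviving non-newborn agents pay the cost of living; newborns skip it
    for a in agents:
        if not a.get("newborn"):
            a["energy"] -= cfg["cost_of_living"]
        a["newborn"] = False
    return agents
```

Running one step on a two-agent toy population shows the bookkeeping: an agent that played a game and reproduces ends at 120 − 50 − 1 = 69, an isolated agent at 30 − 0.5 − 1 = 28.5, and the newborn keeps its assumed 50.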
Why Conditional Strategies Matter Here
The default encoding does not force one universal rule for all encounters. Each agent can respond one way to same-trait neighbors and another way to other-trait neighbors.
That matters because:
- repeated local contact makes reciprocal strategies meaningful rather than purely abstract
- local reproduction lets successful neighborhoods clonally spread into adjacent empty space
- fallback movement helps isolated agents search for interaction opportunities early in the run
- once the world approaches the hard cap, movement fades and local encounter success does more of the selection work
In the frozen replay, this does not produce universal cooperation. It produces a mixed regime in which tit-for-tat becomes the largest same-trait strategy family, but co-op, random, and defect all remain present.
Interactive Replay
The browser replay below is based on sampled frames from that same frozen configuration.
The canonical implementation and supporting export logic live in the EvolvedCooperation repository:
Evolved Cooperation
Sampled browser replay of the canonical Python model. Colors show the strategy each agent uses against same-trait neighbors; trait identity and the other-trait response remain part of the exported state but are not mapped to separate colors here.
Reading The Replay
This frozen run is easiest to read in three phases.
Phase 1: a sparse world fills very quickly
At step 0, the run starts with 576 occupied cells and mean energy 50.564. The initial same-trait strategy mix is already varied: 141 co-op, 130 defect, 171 tit-for-tat, and 134 random. Because the world begins sparse, many agents still have empty neighboring cells available for expansion. By step 25, the population has already risen to 1799, mean energy has climbed to 103.966, and same-trait tit-for-tat has grown to 613 agents while same-trait defect reaches 312.
Phase 2: local interaction replaces exploration
By step 50, the world reaches the hard cap of 1800 agents and mean energy has risen further to 141.273. At that point, movement is almost gone because most agents no longer spend whole steps isolated from play. In the canonical history, step 50 records 5345 interaction pairs but only 1 successful move. From that point onward, most of the selection pressure comes from local game payoffs and reproduction rather than from spatial exploration.
Phase 3: a mixed conditional regime persists
The late replay is not a march to one universal rule. By step 200, same-trait tit-for-tat is the largest same-trait family at 626, defect is the smallest at 299, co-op remains at 457, and random remains at 418. The encoding mix also stays mostly contingent rather than collapsing to pure same-and-other symmetry: 1359 agents are contingent at step 200, versus 441 pure encodings.
So the replay does not show cooperation simply taking over. It shows a stable local ecology in which conditional reciprocity expands strongly in same-trait encounters, but multiple inherited response rules continue to coexist.
Formal Ingredients
The narrative above is enough to follow the core result. The definitions below give the exact implementation and notation used by the underlying model.
State Variables
- agent i: position (x_i, y_i), energy E_i, trait t_i, strategy vector s_i
- directional memory slot m_i(d): remembered last response used by the neighbor previously encountered in direction d
- empty cells: available targets for movement and local reproduction
Interpretation:
- t_i is a categorical inherited trait label in 0..trait_count-1
- s_i stores discrete strategies for encounters with each trait category
- under the default website-demo encoding, agents effectively store one strategy for same-trait encounters and one strategy for other-trait encounters
Strategy Encoding
Available strategies are:
- always_cooperate
- always_defect
- tit_for_tat
- random
The frozen replay uses the default contingent encoding:
- pure_strategy = false
- strategy_per_trait = false
So each agent stores one same-trait strategy and one other-trait strategy. The summary identifier is:
strategy_id = 10 × same_trait_strategy + other_trait_strategy
That is why a strategy bin such as 20 means tit-for-tat against same-trait neighbors and co-op against other-trait neighbors.
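A minimal encode/decode sketch makes the identifier concrete. The index order of the strategy list is inferred from the bin-20 example above and from the order the strategies are listed on this page, not confirmed against the repository:

```python
STRATEGIES = ["co-op", "defect", "tit-for-tat", "random"]  # assumed index order 0..3

def encode(same_trait_strategy, other_trait_strategy):
    """strategy_id = 10 * same_trait_strategy + other_trait_strategy"""
    return 10 * same_trait_strategy + other_trait_strategy

def decode(strategy_id):
    """Return (same_trait_strategy, other_trait_strategy)."""
    return divmod(strategy_id, 10)

same, other = decode(20)
print(STRATEGIES[same], "/", STRATEGIES[other])  # → tit-for-tat / co-op
```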
Encounter Rule
Each step begins with a neighborhood search over the wrapped Moore neighborhood.
- each agent draws one local roll
- the higher roll challenges and the lower roll responds
- ties are broken by higher agent ID
- each directed neighborhood edge is resolved at most once per step
If tit-for-tat is active, the remembered response in the relevant directional slot is used for the next choice against that same neighbor in that same direction.
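The pairing logic and the directional memory can be sketched as follows. The roll distribution is an assumption; the tie-break matches the text, and cooperating on first contact is the standard tit-for-tat convention rather than a confirmed detail of this implementation:

```python
import random

def order_pair(a, b, rng):
    """Pick challenger and responder for one neighboring pair (sketch)."""
    roll_a, roll_b = rng.random(), rng.random()
    if roll_a != roll_b:
        return (a, b) if roll_a > roll_b else (b, a)
    return (a, b) if a["id"] > b["id"] else (b, a)  # tie: higher ID challenges

def tft_choice(agent, direction):
    """Tit-for-tat against the neighbor last seen in this direction;
    cooperate ("C") on first contact."""
    return agent["memory"].get(direction, "C")
```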
Payoff Rule
For one pairwise encounter, the payoff matrix is:
CC = 3.0
CD = -1.0
DC = 5.0
DD = 0.0
Interpretation:
- mutual cooperation rewards both agents
- unilateral defection gives the defector the highest single-round payoff
- the cooperator against a defector is punished
- mutual defection is the zero baseline
These payoffs are added directly to agent energy before movement, reproduction, and environmental punishment are applied.
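As a sketch, one resolved round is just a table lookup plus two energy updates; the function and field names here are illustrative, not the repository's:

```python
# Payoffs keyed by (challenger_move, responder_move); "C" cooperate, "D" defect
PAYOFFS = {("C", "C"): (3.0, 3.0), ("C", "D"): (-1.0, 5.0),
           ("D", "C"): (5.0, -1.0), ("D", "D"): (0.0, 0.0)}

def resolve_round(challenger, responder, move_c, move_r):
    """Add one round's payoffs straight onto both agents' energy."""
    pay_c, pay_r = PAYOFFS[(move_c, move_r)]
    challenger["energy"] += pay_c
    responder["energy"] += pay_r
```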
Movement, Reproduction, And Punishment
Demographic rules in the frozen replay are:
- only agents with zero games played that step may attempt movement
- movement costs travel_cost = 0.5
- reproduction requires at least reproduce_min_energy = 100.0
- a successful birth costs reproduce_cost = 50.0
- births claim adjacent empty cells competitively
- newborns skip same-step cost of living
- surviving non-newborn agents pay cost_of_living = 1.0
- population is truncated at the hard cap implied by carrying_capacity_fraction = 0.5
This is what makes the model evolutionary rather than only game-theoretic. Local lineages do not just win or lose abstract payoffs. They win or lose space.
Why This Belongs Under Evolved Cooperation
This model belongs under evolved cooperation because the changing object is an inherited strategy distribution across generations.
- agents do not learn new policies from reward during their lifetime
- offspring inherit strategy encodings and trait labels from successful parents
- the frequency of those encodings changes through local survival and reproduction
- the result is therefore population-level selection, not within-lifetime adaptation
Compared with the learned repeated Prisoner's Dilemma pages elsewhere on the site, the key difference is the timescale. Here, cooperation changes because inherited strategies become more or less common, not because an individual agent updates a policy from experience.
Conclusions
This case study gives a clear local-selection answer to the question of how cooperation can emerge in a spatial Prisoner's Dilemma ecology.
- cooperation does not emerge here as one universal rule; the frozen run ends in a mixed ecology rather than in pure cooperation or pure defection
- the strongest directional shift in this run is toward same-trait tit-for-tat, which rises from 171 agents at step 0 to 626 by step 200
- unconditional defection does not dominate the frozen run even though the single-shot payoff temptation is real; same-trait defect rises early but ends as the smallest same-trait family at 299
- contingent encodings remain the majority at the end of the run, so the model favors conditional inherited responses more than it favors collapsing same-trait and other-trait behavior into one pure rule
- the broader lesson fits the rest of this section of the site: cooperation can emerge through selection when local interaction, local reproduction, and ecological turnover make certain inherited response patterns reproductively advantageous
References
- Axelrod, R., & Hamilton, W. D. (1981). The Evolution of Cooperation. Science, 211(4489), 1390-1396. https://doi.org/10.1126/science.7466396
- doesburg11. (2026). EvolvedCooperation: spatial_prisoners_dilemma module, frozen website-demo config, and replay exporter. GitHub. https://github.com/doesburg11/EvolvedCooperation/tree/main/spatial_prisoners_dilemma
- zeyus-research. (n.d.). FLAMEGPU2-Prisoners-Dilemma-ABM. GitHub. https://github.com/zeyus-research/FLAMEGPU2-Prisoners-Dilemma-ABM