Skip to main content

Model 1 - Trust Learning

Model 1 (two_timescale_reciprocity.py) is the baseline two-timescale model: agents update a scalar trust value per partner within a lifetime, and evolution selects on the inherited parameters that shape that learning.

Key findings

  • In one-shot interaction, cooperation collapses to zero. Selection drives trust_prior strongly negative (−0.86) — innate suspicion is favored because unconditional cooperators are exploited with no chance to learn.
  • In repeated interaction (80 rounds), cooperation stabilises near 98% within ~40 generations.
  • responsiveness is the decisive evolved trait (~2.46 in repeated play): agents who amplify learned trust sharply reward cooperators and punish defectors, locking in the reciprocal equilibrium.
  • learning_rate stays low (~0.19) in both conditions — fast forgetting is not favored because stable trust relationships are valuable.
  • The two timescales reinforce each other: evolution shapes the parameters that make learning effective, and learning produces the cooperation signal that selection acts on.

Learning during a lifetime

Each agent tracks partner-specific trust:

learned_trust[i, j]

The decision score combines inherited tendency and learned partner history:

score_i = trust_prior[i] + responsiveness[i] * learned_trust[i, j]
cooperate_i = score_i > 0.0

The trust update rule follows a Rescorla–Wagner style prediction-error update — see Appendix: Rescorla–Wagner style learning.

Evolution between generations

Agents reproduce proportional to lifetime payoff. Offspring inherit and mutate:

  • trust_prior
  • learning_rate
  • responsiveness

Payoff structure

Donation-game defaults:

  • benefit = 3.0
  • cost = 1.0

For ecological interpretation of why benefit can exceed cost in this abstraction, see Appendix: Ecological realism of benefit > cost.

For one-shot versus repeated scenario rationale, see Appendix: Why compare one-shot and repeated interaction?.

Cooperation theory context: For how this model relates to direct/network reciprocity and which mechanisms are out of scope, see Appendix: Cooperation mechanisms and model scope.

Results summary

MetricOne-shot (rounds=1)Repeated (rounds=80)
Final cooperation rate0.0000.979
Final mean payoff0.000313.300
Final trust prior−0.8641.447
Final learning rate0.0950.186
Final responsiveness1.1262.459
Display 1: Evolved parameter values and cooperation outcomes for Model 1 (Trust Learning) under one-shot and repeated interaction.

One-shot interaction

Without repeated contact, agents cannot learn who cooperates. Reciprocity never gets off the ground and cooperation collapses to zero.

One-shot cooperation
Display 1: Model 1 cooperation trajectory in one-shot interaction.
One-shot traits
Display 2: Model 1 evolved trait trajectories in one-shot interaction.

Repeated interaction

With 80 rounds per generation, cooperation stabilises near full (~98%).

Repeated cooperation
Display 3: Model 1 cooperation trajectory in repeated interaction.
Repeated traits
Display 4: Model 1 evolved trait trajectories in repeated interaction.

Summary

In one-shot interaction, cooperation collapses to zero. Evolution responds by driving trust_prior negative (−0.86): selection favors innate suspicion because unconditional cooperators are exploited. responsiveness stays moderate but is effectively irrelevant when there is nothing useful to learn in a single round.

In repeated interaction, cooperation stabilises near 98%. trust_prior rises to ~1.45 — selection favors agents who start out cooperative, because unconditional cooperators seed mutual cooperation with neighbors. responsiveness rises strongly to ~2.46 — agents who amplify what they have learned become sharply conditional, strongly rewarding cooperators and punishing defectors, reinforcing the reciprocal equilibrium.

The dip around generation 45–55 is a classic invasion event. A defector lineage briefly spreads, trust collapses, and cooperation crashes. The population recovers because reciprocators with high responsiveness re-establish cooperation faster than defectors can spread.

The two timescales reinforce each other in the repeated case. Learning makes cooperation individually rational within a lifetime. Evolution then favors the inherited traits (trust_prior, responsiveness) that make that learning work most effectively. In the one-shot case, the fast timescale provides no useful signal, so evolution strips away cooperative predispositions entirely.