Skip to content

hxrts/evo-ipd

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Genetic Algorithm for Iterated Prisoner's Dilemma Strategy Optimization

This program implements a genetic algorithm to discover effective strategies for the Iterated Prisoner's Dilemma (IPD) game, competing against classic strategies like Tit-for-Tat, Always Cooperate, Always Defect, and Grudger.

Iterated Prisoner's Dilemma

The Iterated Prisoner's Dilemma is a fundamental game theory problem where two players must choose to either cooperate or defect, with payoffs determined by their combined choices. In the iterated version, the game is played repeatedly, allowing for the development of complex strategies based on the player's history of interaction.

Axelrod's Tournament

In 1980, political scientist Robert Axelrod organized a computer tournament to study effective strategies for the IPD. He invited academics to submit strategies, which would complete in a round-robin competition. Surprisingly, the winning strategy was a simple Tit-for-Tat implementation that cooperated on the first move and then copied the opponent's previous move.

The tournament was scored according to the following payoff matrix:

Player 2: Cooperate Player 2: Defect
Player 1: Cooperate (3,3) (0,5)
Player 1: Defect (5,0) (1,1)

Axelrod found that successful strategies shared common traits. They never defected first. They retaliated against defection but forgave and restored cooperation. Simple strategies often outperformed complex ones.

Tournament Implementation

This implementation extends Axelrod's framework with evolutionary optimization. Each strategy plays every other strategy for 200 rounds. Strategies accumulate points according to the payoff matrix. Total scores determine both tournament rankings and evolutionary fitness. Unlike the original tournament, strategies evolve and improve across generations.

Genetic Algorithms

Genetic algorithms use variation and selection to find optimal solutions. They maintain a population of potential solutions and iteratively improve them through an evolutionary process.

The algorithm works in five steps. First, a population represents possible solutions encoded as a "genome". Second, individuals are evaluated based on fitness. Third, better performers are selected as parents. Fourth, offspring are created through crossover (combining parent solutions) and mutation (random modifications). Fifth, new populations replace old ones while preserving top performers.

Genetic algorithms suit strategy optimization because the solution space is too large to search exhaustively. Small strategy changes are meaningful. Solutions encode naturally as probability sequences. Tournament performance provides a well-defined fitness function.

Implementation Details

Strategies are represented by a genome of probabilities. The genome determines cooperation probability based on the history of previous moves. Strategies retain memory of the last N moves (default: 2).

Genetic operations include crossover, where parent strategies combine their genomes to create children. Mutation introduces random changes to maintain diversity. Positive selection preserves the best performing strategies across generations.

Each strategy plays all others in round-robin tournaments. Strategies accumulate scores based on the payoff matrix. Tournament results determine which strategies survive and reproduce.

Base Strategies

The program includes five classic strategies:

Always Cooperate always plays cooperate. Always Defect always plays defect. Tit-for-Tat cooperates on the first move, then copies the opponent's last move. Grudger cooperates until the opponent defects, then always defects. Random randomly chooses between cooperate and defect.

Configuration

The evolutionary process accepts these parameters:

Population Size sets the number of strategies per generation (default: 50). Number of Generations specifies how many generations to evolve (default: 5). Mutation Rate controls the probability of genome mutations (default: 0.1). Survival Count determines how many top strategies are preserved (default: 5). Tournament Rounds sets the number of rounds per game (default: 200).

Building and Running

Build the program:

nix build

Run a single tournament with base strategies only:

nix run .#

Run the tournament with strategy evolution:

nix run .# -- evolve

Results

The program outputs detailed statistics for each generation. It shows how strategies evolve and improve over time. The final tournament compares evolved strategies against the classic base strategies to evaluate their effectiveness.

About

Evolutionary solution to Robert Axelrod's IPD tournament

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published