Peter Zhang

EC 119 F


These notes are for the latter half of Economics 119: Psychology and Economics, based on a collection of readings and lectures slides (4, 5, 6). Notes for the first half are here.



Games and Strategies

A game consists of

  • players: a list of all relevant players
  • moves: what each player can do
  • information: what each player knows
  • payoffs: vNM utilities as a function of every player’s actions

The objects of interest are

  • strategy: a contingent plan of action for each player
  • strategy profile: a collection of strategies for each player
  • solution concept: a strategy profile that is likely

We assume complete information by default. The normal form (or strategic form) of a game involves

  • a set of players: \(N\)
  • the pure-strategy space: \(S_i\)
  • a payoff function for each player \(i\) where \(u_i : S_1 \times ... \time S_n \to \mathbb{R}\)

Payoffs represent the players’ preferences, but aren’t necessary monetary. Models are underdetermined - if one hypothesis is consistent, an infinite number are.

We represent a finite two-player game with two moves as a matrix. In the example below, the “row” player has strategies \(T\) and \(B\), while the “column” player has strategies \(L\) and \(R\). The numbers represent the payoffs of player 1 and player 2.

A pure strategy involves taking actions for sure. A mixed strategy is a probability distribution over pure strategies. A strategy for \(i\) is the best response if there exist no other strategy that yields a higher expected payoff.

Dominance and Equilibrium

A Nash equilibrium is a strategy profile where every player’s strategy is a best response to the strategies played by the other players. A strategy for player \(i\) is strictly dominated if there’s another strategy with a greater payoff for all the strategy profiles of the other player(s). A strategy is weakly dominated if another strategy yields a payoff at least as high for all strategy profiles of the other players and a greater payoff for at least one.

Mixed Strategies and Reaction Functions

Nash equilibria are tricky. Suppose there is a \((1, 1)\) payoff in a two-player two-move game only if they both choose the same move, and \((0, 0)\) otherwise. In addition to the \((A, A)\) and \((B, B)\) cases, another equilibrium is a mixed strategy with equal probabilities for both. In general, each player has a mixed strategy that renders the other player indifferent, and if they simultaneously do so, they reach equilibrium.

A reaction function graphs each player’s best response as a function of the probability the other play puts on one of the actions. Visually, the intersections are Nash equilibria:

Level k Reasoning

A minimax strategy minimizes the maximum payoff of the opponent. In rock-paper-scissors, the optimal strategy is uniformly randomness. In practice, soccer players perform kickoffs consistently with minimax.

One way to model a player’s thought process is level-k reasoning:

  • L0: an ‘unsophisticated’ player who chooses based on some heuristic
  • L1: a strategy based on an opponent’s L0 behavior
  • L2: a strategy based on an opponent’s L1 behavior
  • and so on..

It’s most useful in modeling behavior in novel situations. A similar approach is to have level k players act based on distributions of lower types.

In beauty context, everyone writes down an integer between 0 and 100. The player with the number closest to 2/3rds of the average wins. The higher the level of a player, the lower the number they will guess. Some responses:

  1. Fixed point - zero is the unique equilibrium
  2. Iterated deletion - a rational player never chooses greater than 66; others will know this and never choose greater than 44; so on to 0
  3. Iterated best reply - all players think they’re one level deeper
  4. Iterated best reply II - players can think that other players are on more than one level

In real world experiments, players display behavior from each of these categories, with most people using level-k reasoning. Cognitive hierarchy involves a distribution of step \(k\) players assumed to be Poisson, where \(\lambda = 1.61\). Experimental results aren’t perfect because a categorization could be spurious.

Consider the 11-20 money request game. Each player can choose a number of dollars to receive between 11 and 20; they receive a $20 bonus if they ask for one dollar less than the other person. Most people displayed L1, L2, and L3 reasoning. Increasing the salience of $20 increases the rate of L1 reasoning.

Subjects are often better at coordinating than the mixed-strategy result. K-level reasoning explains why subjects can outperform the mixed-strategy Nash equilibrium.

In hide and seek, hiders must use k-level reasoning to predict which boxes seekers will view as salient. Subjects tend to see endpoints and distinctly labeled boxes as salient, leading both hiders and seekers to avoid those boxes.

In experiments, cognitive ability seems to be positively correlated with level. Agreeable and emotionally stable subjects also had higher levels.

Backward Induction

Backwards induction is about using expect future behave to induce present behavior. The extensive form of a game uses a game tree to represent the order of a game.

A decision node is a point where a player makes a move. The initial decision node is the first point. Each branch represents a choice, and the game ends at a terminal node. An information set links decision nodes so that a player only knows that a node in the information set has been reached. A strategy is sequentially rational if it is optimal at every point in the game tree. Every finite game with perfect information has a pure strategy Nash equilibrium that can be derived through backwards induction. At any given point of induction, the set of Nash equilibria that survive is called the set of Subgame Perfect Nash Equilibria.

Key point: a strategy is a contingent plan of action, which means strategies must be completely specified.

A subgame is a subset of game which begins at an information set containing a single decision node, contains only the successors, and recursively includes information sets. A strategy profile is a subgame perfect Nash equilibrium of a game if it induces a Nash equilibrium in every subgame.

The centipede and race to 100 games both test backward induction skills. Skilled chess players seem to perform better at backward induction.

Repeated Games

In the Chainstore paradox, backward induction implies that a store never has a logical reason to fight an entrant, even to deter future entry. In this sense, the result of SPNE is unconvincing. When the repeated game has multiple Nash equilibria, SPNE is less restrictive.

The grim strategy involves cooperating initially, and then defecting indefinitely whenever the other player first defects. In the infinitely repeated Prisoner’s Dilemma, the grim strategy is mutually preferable for high enough discount rate. This model predicts that firms will collude to set high prices if they are sufficiently patient.

The Folk Theorem states that for any feasible pair of individually rational payoffs \((\pi_1, \pi_2) >> (\not\pi_1, \not\pi_2)\), there exists a \(\delta\) such that \((\pi_1, \pi_2)\) are the average payoffs in SPNE. In experiments, subjects are more likely to cooperate if a game is more likely to continue. Across a longer time span, players may also bet on the chance that the other subject is a “grim” player.



The main games we will examine are the ultimatum/dictator games, public goods contribution games, and trust games.

In the ultimatum game, player 1 proposes a division of $10 and player 2 can either accept or reject the division. In the dictator game, player 2 has no such discretion. In both games, subjects chose to make large, fair offers between 40% and 50%. In experiments, men split more when there is a face which may be because men are less socially aware by default.

In public goods games, people contribute in the hopes of conditional cooperation. The table below displays the incentives for free-riding - if consumers \(1, 2\) split the cost of a public good from which they receive benefit \(b_i\), they each have incentives to not contribute if \(b_i < c\).

In an experiment, subjects that were paid by earning ranking or who simply knew of their rank quickly reduced their contribution to the public good. Across multiple rounds, cooperation decreases among both partners and strangers, suggesting that subjects care about future periods.

In trust games with payback, trust was not reciprocated in over half of cases, yet subjects who knew this still sent money. When individuals are sorted and paired by trustworthiness, they become more cooperative over time. Women seem to return more. Seeing a picture induces more trusting, and senders will pay to see a picture. In simulations of labor markets, a higher wage or thoughtful gift is reciprocated with more work.

Fairness and Norms

People have a preferences for redistribution. The motives come primarily from self-interest and secondarily from income uncertainty; WTP is 0.4% of ones own payoff for a 10% decrease in inequality. Attitudes about social welfare can be easily induced with cues.

Queues are seen as the most fair way to allocate tickets, followed by a lottery and then by an auction. People think raising prices is more fair than opportunistically lowering wages. We tend to weigh actual losses greater than opportunity costs. Removing a discount is seen as much more fair than a price hike.

In games, people will use reputations or punishments to enforce social norm. The Fehr-Schmidt utility function models inequality aversion as follows, where \(\alpha\) is the amount that \(i\) suffers from disadvantageous inequality and \(\Beta_i\) is the amount that \(i\) suffers from advantageous inequality.

Since \(\Beta_i\) is constrained to be less than 1, players will always accept advantageous offers. For higher \(\alpha\), players are more stringent about equality. Other models are based on ERC (equity, reciprocity and competition) and the total/lowest payoffs.

Incentives and Crowding Out

In some settings, cash incentives can crowd out intrinsic motivation to do the right thing. In others, they can crowd in intrinsic incentives. A famous example of a daycare found that charging a fine for lateness increased the number of late-coming parents, with an effect persisting after the fine was lifted. A similar result was found for building a nuclear waste repository.

Two explanations are self-determination (substitution of motivators) and self-esteem (their intrinsic motivation isn’t being recognized). If external interventions are controlling rather than supporting, then self-determination and self-esteem are impaired.

We can model social norm with a modified utility function: \(U = u(c) - \gamma (x - \bar{x})^2\) Notice that losses from deviations are symmetric and increase non-linearly. When researchers measured and modeled social acceptability in the dictator and bully games, the models matched the data very well.

Prosocial incentives seem to be less effective than selfish incentives, mandatory prosocial incentives even less so. In the field, pay-what-you-want models with proportional donations to charities elicit the most payment.



Beliefs are important for behavior, and beliefs update with new information. Bayes’ Theorem models the mathematically correct way to update events: \(Pr(A|B) = \frac{Pr(B|A)Pr(A)}{Pr(B)}\) People are bad at applying Bayesian reasoning.

Biases Galore

They also are bad at randomizing, with a bias towards the number 7 and the color blue. People suffer from the gambler’s fallacy (after three heads in a row, the chance of another is <50%) and the law of small numbers (introducing extra switches in random sequences). Even asylum judges, loan officers, and MLB umpires suffer from these biases. In contrast, the hot hand fallacy causes people to think that random processes enter a “hot” state, especially in sports.

People are insensitive to sample size, failing to understand the law of large numbers. Subjects also suffer from correlation neglect, failing to detect and use correlating information. Simultaneously, people are overconfident in precision and ability, giving overly narrow confidence intervals. Men seem to be more overconfident, on average.

People experience motivated beliefs as a source of optimism or social signaling. They use selective memory, conservative updating of beliefs, asymmetric updating, and information avoidance to improve their self-image or maintain beliefs. In experiments, subjects ignore negative feedback later on, but will consider it with high enough incentives.

Alerting a DM to the expectation of donating causes them to avoid the ask. Similarly, telling them that lying is widespread causes them to lie more.

The conjunction effect is a bias in which people substitute a question about likelihood for a question about representativeness. One explanation is poor wording, but even raising the stakes and picking informed audiences doesn’t erase the effect.

People are bad at portioning probability - especially when asked to consider small portions, they drastically overestimate probabilities. They suffer from projection bias, with shoppers overestimating their hunger. They also suffer diversification bias, preferring a variety only when planning ahead. A model of projection bias for a future state \(s\) is \(\hat{u}(c, s) = (1-\alpha)u(c, s) + \alpha u(c, s')\) where \(\alpha\) is a measure of projection bias and \(s'\) is the current state.


The persuasion rate for a behavioral outcome can be expressed as \(f = 100 \times \frac{y_T - y_C}{e_T - e_C} \frac{1}{1-y_0}\) where:

  • \(T\) and \(C\) refer to the treatment and control groups, respectively.
  • \(e_i\) is the share of group \(i\) that receives the message.
  • \(y_i\) is the share of group \(i\) that adopts the behavior
  • \(y_0\) is the share that would adopt without the message, usually approximated as \(y_c\) when unknown

Persuasion can be mediated by beliefs (with updating), preferences (advertising affect utility), and information (costs of acquiring info and paying attention).

Dialectic Model

The DM makes a best guess about the state of the world. They maintain two frames with means \(\mu_L, \mu_H\) and widths \(w_L, w_H\). The DM’s belief is the weighed average of the two: \(\hat{u} = \frac{w_H^s}{w_L^2 + w_H^s} \mu_L + \frac{w_L^2}{w_L^s + w_H^2} \mu_H\) where \(s\) tracks the level of skepticism. A skeptical DM punishes width but people vary, which explains why politically, conservatives may adopt the strategy of extreme beliefs, appealing to credulous DMs. People also don’t react to more of existing evidence.

Information Cascades

People close ot each other display conformity. Suppose a new behavior has an adaption cost of \(\frac{1}{2}\) and a value \(v\), either 0 or 1 with equal probability. Suppose that each individual has a private signal that is correct with \(p > \frac{1}{2}\). If one person reacts to their private signal, a second person updates their beliefs to be in conformity. Because of these cascades, people can ignore their private signals.

In lab experiments, people quickly form cascades. In terms of incentives, majority rule institutions had the fewest information cascades, followed by individualistic institutions and conformity-rewarding institutions. One form of spillover is awareness: even bad reviews of unknown authors increase sales.

Built with Jekyll on the Swiss theme.