Zhu Han, Dusit Niyato, Walid Saad, Tamer Basar, and Are Hjorungnes

Yüklə 1,94 Mb.

tarix	16.08.2018
ölçüsü	1,94 Mb.
	#63220

Zhu Han, Dusit Niyato, Walid Saad,
Tamer Basar, and Are Hjorungnes

Introduction to Game Theory: Lecture 1

Introduction to Game Theory: Lecture 1
Noncooperative Game: Lecture 1, Chapter 3
Bayesian Game: Lecture 2, Chapter 4
Differential Game Lecture 3, Chapter 5
Evolutional Game: Lecture 4, Chapter 6
Cooperative Game: Lecture 5, Chapter 7
Auction Theory: Lecture 6, Chapter 8
Game Theory Applications: Lecture 7, Part III
Total Lectures are about 8 Hours

John von Neuman (1903-1957) co-authored, Theory of Games and Economic Behavior, with Oskar Morgenstern in 1940s, establishing game theory as a field.

John von Neuman (1903-1957) co-authored, Theory of Games and Economic Behavior, with Oskar Morgenstern in 1940s, establishing game theory as a field.
John Nash (1928 - ) developed a key concept of game theory (Nash equilibrium) which initiated many subsequent results and studies
Since 1970s, game-theoretic methods have come to dominate microeconomic theory and other fields
Nobel prizes

Nobel prize in Economic Sciences 1994 awarded to Nash, Harsanyi (Bayesian games) and Selten (subgame perfect equilibrium)
2005, Auman and Schelling got the Nobel prize for having enhanced our understanding of cooperation and conflict through game theory
2007 Leonid Hurwicz, Eric Maskin and Roger Myerson won Nobel Prize for having laid the foundations of mechanism design theory.

Game theory – mathematical models and techniques developed in economics to analyze interactive decision processes, predict the outcomes of interactions, identify optimal strategies

Game theory – mathematical models and techniques developed in economics to analyze interactive decision processes, predict the outcomes of interactions, identify optimal strategies
Game theory techniques were adopted to solve many protocol design issues (e.g., resource allocation, power control, cooperation enforcement) in wireless networks.
Fundamental component of game theory is the notion of a game.

A game is described by a set of rational players, the strategies associated with the players, and the payoffs for the players. A rational player has his own interest, and therefore, will act by choosing an available strategy to achieve his interest.
A player is assumed to be able to evaluate exactly or probabilistically the outcome or payoff (usually measured by the utility) of the game which depends not only on his action but also on other players’ actions.

Non-cooperative static games:

Non-cooperative static games:

Sports: zero sum game. Boxing: example of equilibrium
Mandayam and Goodman (2001)
Virginia tech

Repeated games: play multiple times

Threat of punishment by repeated game. MAD: Nobel prize 2005.
Tit-for-Tat (infocom 2003):

Dynamic games: (Basar’s book)

ODE for state, Optimization utility over time, HJB and dynamic programming
Evolutional game (Hossain and Dusit’s work)

Stochastic games (Altman’s work)
Cooperative Games

Nash Bargaining Solution
Coalitional Game

Book of Myerson (Nobel Prize 2007), J. Huang, H. Zheng, X. Li

Book of Myerson (Nobel Prize 2007), J. Huang, H. Zheng, X. Li

Basics of definition

Basics of definition
Game in Strategic Form

Dominating strategy
Nash equilibrium
Mixed strategy
Static continuous-Kernel game

Dynamic noncooperative game

Extensive form
Repeated game
Stochastic game

Special game

Potential game
Stackelberge game
Correlated equilibrium
Supermodular game
Wardrop game

Summary

A game in strategic (normal) form is represented by three elements

A game in strategic (normal) form is represented by three elements

A set of players N
Set of strategies of player Si
Set of payoffs (or payoff functions) Ui

Notation si strategy of a player i while s-i is the strategy profile of all other players
Notice that one user’s utility is a function of both this user’s and others’ strategies
A game is said to be one with complete information if all elements of the game are common knowledge. Otherwise, the game is said to be one with incomplete information, or an incomplete information game.

Two suspects in a major crime held for interrogation in separate cells

Two suspects in a major crime held for interrogation in separate cells

If they both stay quiet, each will be convicted with a minor offence and will spend 1 year in prison
If one and only one of them finks, he will be freed and used as a witness against the other who will spend 4 years in prison
If both of them fink, each will spend 3 years in prison

Components of the Prisoner’s dilemma

Rational Players: the prisoners
Strategies: Stay quiet (Q) or Fink (F)
Solution: What is the Nash equilibrium of the game?

Representation in Strategic Form

Dominant strategy is a player's best strategy, i.e., a strategy that yields the highest utility for the player regardless of what strategies the other players choose.

Dominant strategy is a player's best strategy, i.e., a strategy that yields the highest utility for the player regardless of what strategies the other players choose.
A Nash equilibrium is a strategy profile s* with the property that no player i can do better by choosing a strategy different from s*, given that every other player j ≠ i .
In other words, for each player i with payoff function ui ,
No user can change its payoff by Unilaterally changing its strategy, i.e., changing its strategy while s-i is fixed

Does the Nash equilibrium always exist?

Does the Nash equilibrium always exist?
Is it efficient ?
One measure of efficiency is Pareto optimality

A payoff vector x is Pareto optimal if there does not exist any payoff vector y such that
y ≥ x
with at least one strict inequality for an element yi

In some references a strategy profile s that achieves a Pareto optimal payoff distribution can sometimes be referred to as a Pareto optimal strategy

Centralized system: In a centralized system, one seeks to find the social optimum (i.e., the best operating point of the system), given a global knowledge of the parameters. This point is in many respect efficient but often unfair.

Centralized system: In a centralized system, one seeks to find the social optimum (i.e., the best operating point of the system), given a global knowledge of the parameters. This point is in many respect efficient but often unfair.
Decentralized: When the players act noncooperatively and are in competition, one operating point of interest is the Nash equilibrium. This point is often inefficient but stable from the players’ perspective.
The Price of Anarchy (PoA), defined as the ratio of the cost (or utility) function at equilibrium with respect to the social optimum case, measures the price of not having a central coordination in the system
PoA is, loosely, a measure of the loss incurred by having a distributed system!

So far we assumed that the players make deterministic choices from their strategy spaces

So far we assumed that the players make deterministic choices from their strategy spaces
Strategies are pure if a player i selects, in a deterministic manner (probability 1), one strategy out of its strategy set Si
Players can also select a probability distribution over their set of strategies, in which cases the strategies are called mixed
Nash 1950

Every finite strategic form N-player game has a mixed strategy Nash equilibrium

Define σi as a probability mass function over Si, the set of actions of player i

Define σi as a probability mass function over Si, the set of actions of player i
When working with mixed strategies, each player i aim to maximize their expected payoff
Mixed strategies Nash equilibrium

For a general N-player game, finding the set of NEs is not possible in polynomial time!

For a general N-player game, finding the set of NEs is not possible in polynomial time!

Unless the game has a certain structure

Some existing algorithms

Fictitious play (based on empirical probabilities)
Iterative algorithms (can converge for certain classes of games)
Best response algorithms

Popular in some games (continuous kernel games for example)

Useful Reference

D. Fundenberg and D. Levine, The theory of learning in games, the MIT press, 1998.

Action (strategy) sets have uncountably many elements

Action (strategy) sets have uncountably many elements

For example, strategies are intervals

Similar to the definitions before

Best response
Nash equilibrium

Interference channel SINR

Interference channel SINR

Existence and uniqueness of Nash equilibrium depend on the utility function design

Existence and uniqueness of Nash equilibrium depend on the utility function design

Basics of definition

Basics of definition
Game in Strategic Form

Dominating strategy
Nash equilibrium
Mixed strategy
Static continuous-Kernel game

Dynamic noncooperative game

Extensive form
Repeated game
Stochastic game

Special game

Potential game
Stackelberge game
Correlated equilibrium
Supermodular game
Wardrop game

In dynamic games, the notion of time and information is important

In dynamic games, the notion of time and information is important

The strategic form cannot capture this notion
We need a new game form to visualize a game

In extensive form, a game is represented with a game tree.
Extensive form games have the following four elements in common:

Nodes: This is a position in the game where one of the players must make a decision. The first position, called the initial node, is an open dot, all the rest are filled in. Each node is labeled so as to identify who is making the decision.
Branches: These represent the alternative choices that the player faces, and so correspond to available actions.

3. Payoffs: These represent the pay-offs for each player, with the pay-offs listed in the order of players.

3. Payoffs: These represent the pay-offs for each player, with the pay-offs listed in the order of players.

When these payoff vectors are common knowledge the game is said to be one of complete information.
If, however, players are unsure of the pay-offs other players can receive, then it is an incomplete information game.

4. Information sets: When two or more nodes are joined together by a dashed line this means that the player whose decision it is does not know which node he or she is at. When this occurs the game is characterized as one of imperfect information.

When each decision node is its own information set the game is said to be one of perfect information, as all players know the outcome of previous decisions.

While the normal form gives the minimum amount of information necessary to describe a game, the extensive form gives additional details about the game concerning the timing of the decisions to be made and the amount of information available to each player when each decision has to be made.

While the normal form gives the minimum amount of information necessary to describe a game, the extensive form gives additional details about the game concerning the timing of the decisions to be made and the amount of information available to each player when each decision has to be made.
For every extensive form game, there is one and only one corresponding normal form game. For every normal form game, there are, in general, several corresponding extensive form games.
Every finite extensive form game of perfect information has a pure strategy Nash equilibrium.

A subgame of a dynamic noncooperative game consists of a single node in the extensive form representation of the game, i.e., the game tree, and all of its successors down to the terminal nodes.

A subgame of a dynamic noncooperative game consists of a single node in the extensive form representation of the game, i.e., the game tree, and all of its successors down to the terminal nodes.
The information sets and payoffs of a subgame are inherited from the original game.
Moreover, the strategies of the players are restricted to the history of actions in the subgame.

Backward induction

Backward induction

Step

Step

Pick a subgame that does not contain any other subgame.
Compute a Nash equilibrium of this subgame.
Assign the payoff vector associated with this equilibrium to the starting node, and eliminate the subgame.
Iterate this procedure until a move is assigned at every contingency, when there remains no subgame to eliminate.

Nash equilibrium is not necessarily a subgame perfect equilibrium.

A basic technique in CSMA/CA protocol

A basic technique in CSMA/CA protocol
The two devices p1 and p2 are not perfectly synchronized.

p1 decides to transmit or not.
p2 observes p1 before making his own move.

The strategy of p1 is to transmit (T) or to be quiet (Q).
How many pure Nash equilibria do we have?
H. W. Kuhn, ”Extensive Games and the
problem of information”, Contributions to the
Theory of Games II, 1953.
Theorem. (Kuhn, 1953). Every finite
extensive-form game of perfect information has a pure strategy

How do we solve the game?

How do we solve the game?
If player p2 plays the strategy T then the best response of player p1 is to play Q.
However, T is not the best strategy of player p2 if player p1 chooses T
We can eliminate some possibilities by backward induction.
Player p2 knows that he has the last move...
Given all the best moves of p2, player p1
calculates his best moves as well.
It turns out that the backward induction
solution is the historical move (T) then (Q).

Repeated game: average utility (power in our case) over time.

Repeated game: average utility (power in our case) over time.
Discounting factor 
Folk theorem

Ensure cooperation by threat of future punishment.
Any feasible solution can be enforced by repeated game

Enforcing Cooperation by Punishment

Each user tries to maximize the benefit over time.
Short term greedy benefit will be weighted out by the future punishment from others. By maintaining this threat of punishment, cooperation is enforced among greedy users.

Repeated Game Approach

Initialization: Cooperation
Detect the outcome of the game:
If better than a threshold, play cooperation in the next time;
Else, play non-cooperation for T period, and then cooperate.

A repeated game with stochastic (probabilistic) transitions between the different states of the game.

A repeated game with stochastic (probabilistic) transitions between the different states of the game.
A dynamic game composed of a number of stages, and where, at the beginning of each stage the game, is in some state.
In this state, the players select their actions and each player receives a payoff that depends on the current state and the chosen actions.
The game then moves to a new random state whose distribution depends on the previous state and the actions chosen by the players.
The procedure is repeated at the new state and the game continues for a finite or infinite number of stages.
The total payoff to a player is often taken to be the discounted sum of the stage payoffs (similar to the discounted sum of repeated games) or the limit inferior of the averages of the stage payoffs.
Notice that 1-state stochastic game is equal to (infinitely) repeated game, and 1-agent stochastic game is equal to Markov Decision Process (MDP).
Partial observed MDP is widely used model for wireless networking
Bellman equation

Basics of definition

Basics of definition
Game in Strategic Form

Dominating strategy
Nash equilibrium
Mixed strategy
Static continuous-Kernel game

Dynamic noncooperative game

Extensive form
Repeated game
Stochastic game

Special game

Potential game
Stackelberge game
Correlated equilibrium
Supermodular game
Wardrop game

A special class of noncooperative games having a special structure

A special class of noncooperative games having a special structure
In layman’s terms, a potential game is a noncooperative game in which the variations of the users’ utilities can be captured by a single function known as the potential function
Potential games are characterized by their simplicity and the existence of a Nash equilibrium solution
Often, potential games are useful when dealing with continuous-kernel games

Formally,

Formally,
In exact potential games, the difference in individual utilities achieved by each player when changing unilaterally its strategy has the same value as the difference in values of the potential function. In ordinal potential games, only the signs of the differences have to be the same.

Corollary 1: Every finite potential game (exact or ordinal) has at least one pure strategy Nash equilibrium.

Corollary 1: Every finite potential game (exact or ordinal) has at least one pure strategy Nash equilibrium.

Example: Consider a single-cell CDMA, M users

Example: Consider a single-cell CDMA, M users
SINR
Optimization

The player that imposes its own strategy upon the others is called the leader while the other players who react to the leader's declared strategy are called followers.

The player that imposes its own strategy upon the others is called the leader while the other players who react to the leader's declared strategy are called followers.
Stackelberg equilibrium strategy
Every two-person finite game admits a Stackelberg strategy for the leader.
whenever the follower has a single optimal response for every strategy of the leader, then the leader can, at the Stackelberg solution, perform at least as good as at the Nash equilibrium.

Stackelberg games are characterized by a hierarchy

Stackelberg games are characterized by a hierarchy
Stackelberg games are not limited to the single-leader single-follower case
In a single-leader multi-follower case, the Stackelberg equilibrium is basically composed of an optimal policy for the leader with respect to a Nash equilibrium of the followers

It is often desirable to have a unique Nash for the followers game, so as to make the Stackelberg solution tractable
Example application: Pricing for Internet Service Providers

Multi-leader multi-follower Stackelberg games

At the Stackelberg equilibrium, both leaders and followers are in a Nash equilibrium (the Nash equilibria are correlated)
Hard to solve when the followers game has many equilibria

The technique of backward induction is similar to the iterated strict dominance technique.

The technique of backward induction is similar to the iterated strict dominance technique.
It helps reducing the strategy space but becomes very complex for longer extensive form games.
The method provides a technique to identify Stackelberg equilibrium.
Definition: The strategy profile s is a Stackelberg equilibrium with player p1 as the leader and player p2 as the follower if player p1 maximizes his payoff subject to the constraint that player p2 chooses according to his best response function.
Application:

If p1 chooses T then the best response of p2 is to
play Q (payoff of 1-c).
If p1 chooses Q, then the best response of p2 is
T (payoff of 0 for p1).

p1 will therefore choose T which is the Stackelberg equilibria.

Beyond the Nash equilibrium, one can seek a more generalized solution concept for noncooperative games: the correlated equilibrium

Beyond the Nash equilibrium, one can seek a more generalized solution concept for noncooperative games: the correlated equilibrium
A correlated equilibrium is, in essence, a generalization of the Nash equilibrium

Requires an arbitrator who can send (private or public) signals to the players.
These signals allow the players to coordinate their actions and perform joint randomization over strategies.

The arbitrator can be a virtual entity (the players can agree on the first word they hear on the radio) and generate signals that do not depend on the system.
A multi-strategy obtained using the signals is a set of strategies (one strategy for each player which may depend on all the information available to the player including the signal it receives)
It is said to be a correlated equilibrium if no player has an incentive to deviate unilaterally from its part of the multi-strategy.
A special type of “deviation” can be of course to ignore the signals.

Each user to consider the joint distribution of users' actions

Each user to consider the joint distribution of users' actions
Multiple access game
Distributive Opportunistic Spectrum Access for Cognitive Radio using Correlated Equilibrium and No-regret Learning, WCNC 2007

Strategic complementarities: if a player chooses a higher action, the others want to do the same thing.

Strategic complementarities: if a player chooses a higher action, the others want to do the same thing.
Nice properties: existence and achievability of NE

Power control in CDMA networks can often be captured using a supermodular game model

dog barking effect

A. Haurie and P. Marcotte, “On the Relationship between Nash-Cournot and Wardrop Equilibria,” Networks, vol. 15, pp. 295-308, 1985.

A. Haurie and P. Marcotte, “On the Relationship between Nash-Cournot and Wardrop Equilibria,” Networks, vol. 15, pp. 295-308, 1985.
Wardrop (1952) postulated that users in a network game select routes of minimal length. In the asymptotic regime, the non-cooperative game becomes a non-atomic one, in which the impact of a single player on the others is negligible.
In the networking game context, the related solution concept is often called Wardrop equilibrium and is often much easier to compute than the original Nash equilibrium.
Useful for games dealing with network flows and in which there exists a large population of players

Non-cooperative game is the most basic form of the game theory

Non-cooperative game is the most basic form of the game theory
How to carefully design the utility function

Basic components
Convergence and uniqueness

Extensive form
Different type of games and their examples
Price of Anarchy:

Further reading: pricing to improve the performance
Cem Saraydar and Narayan B. Mandayam and David J. Goodman
Efficient Power Control via Pricing in Wireless Data Networks
IEEE Trans. on Communications, vol. 50, No. 2, pp. 291-303,
February 2002

Yüklə 1,94 Mb.

Dostları ilə paylaş:

Zhu Han, Dusit Niyato, Walid Saad, Tamer Basar, and Are Hjorungnes

Zhu Han, Dusit Niyato, Walid Saad,

Tamer Basar, and Are Hjorungnes

Introduction to Game Theory: Lecture 1

Introduction to Game Theory: Lecture 1

Noncooperative Game: Lecture 1, Chapter 3

Bayesian Game: Lecture 2, Chapter 4

Differential Game Lecture 3, Chapter 5

Evolutional Game: Lecture 4, Chapter 6

Cooperative Game: Lecture 5, Chapter 7

Auction Theory: Lecture 6, Chapter 8

Game Theory Applications: Lecture 7, Part III

Total Lectures are about 8 Hours

John von Neuman (1903-1957) co-authored, Theory of Games and Economic Behavior, with Oskar Morgenstern in 1940s, establishing game theory as a field.

John von Neuman (1903-1957) co-authored, Theory of Games and Economic Behavior, with Oskar Morgenstern in 1940s, establishing game theory as a field.

John Nash (1928 - ) developed a key concept of game theory (Nash equilibrium) which initiated many subsequent results and studies

Since 1970s, game-theoretic methods have come to dominate microeconomic theory and other fields

Nobel prizes

Game theory – mathematical models and techniques developed in economics to analyze interactive decision processes, predict the outcomes of interactions, identify optimal strategies

Game theory – mathematical models and techniques developed in economics to analyze interactive decision processes, predict the outcomes of interactions, identify optimal strategies

Game theory techniques were adopted to solve many protocol design issues (e.g., resource allocation, power control, cooperation enforcement) in wireless networks.

Fundamental component of game theory is the notion of a game.

Non-cooperative static games:

Non-cooperative static games:

Repeated games: play multiple times

Dynamic games: (Basar’s book)

Stochastic games (Altman’s work)

Cooperative Games

Book of Myerson (Nobel Prize 2007), J. Huang, H. Zheng, X. Li

Basics of definition

Basics of definition

Game in Strategic Form

Dynamic noncooperative game

Special game

Summary

A game in strategic (normal) form is represented by three elements

A game in strategic (normal) form is represented by three elements

Notation si strategy of a player i while s-i is the strategy profile of all other players

Notice that one user’s utility is a function of both this user’s and others’ strategies

A game is said to be one with complete information if all elements of the game are common knowledge. Otherwise, the game is said to be one with incomplete information, or an incomplete information game.

Two suspects in a major crime held for interrogation in separate cells

Two suspects in a major crime held for interrogation in separate cells

Components of the Prisoner’s dilemma

Representation in Strategic Form

Dominant strategy is a player's best strategy, i.e., a strategy that yields the highest utility for the player regardless of what strategies the other players choose.

Dominant strategy is a player's best strategy, i.e., a strategy that yields the highest utility for the player regardless of what strategies the other players choose.

A Nash equilibrium is a strategy profile s* with the property that no player i can do better by choosing a strategy different from s*, given that every other player j ≠ i .

In other words, for each player i with payoff function ui ,

No user can change its payoff by Unilaterally changing its strategy, i.e., changing its strategy while s-i is fixed

Does the Nash equilibrium always exist?

Does the Nash equilibrium always exist?

Is it efficient ?

One measure of efficiency is Pareto optimality

In some references a strategy profile s that achieves a Pareto optimal payoff distribution can sometimes be referred to as a Pareto optimal strategy

Centralized system: In a centralized system, one seeks to find the social optimum (i.e., the best operating point of the system), given a global knowledge of the parameters. This point is in many respect efficient but often unfair.

Centralized system: In a centralized system, one seeks to find the social optimum (i.e., the best operating point of the system), given a global knowledge of the parameters. This point is in many respect efficient but often unfair.

Decentralized: When the players act noncooperatively and are in competition, one operating point of interest is the Nash equilibrium. This point is often inefficient but stable from the players’ perspective.

The Price of Anarchy (PoA), defined as the ratio of the cost (or utility) function at equilibrium with respect to the social optimum case, measures the price of not having a central coordination in the system

PoA is, loosely, a measure of the loss incurred by having a distributed system!

So far we assumed that the players make deterministic choices from their strategy spaces

So far we assumed that the players make deterministic choices from their strategy spaces

Strategies are pure if a player i selects, in a deterministic manner (probability 1), one strategy out of its strategy set Si

Players can also select a probability distribution over their set of strategies, in which cases the strategies are called mixed

Nash 1950

Define σi as a probability mass function over Si, the set of actions of player i

Define σi as a probability mass function over Si, the set of actions of player i

When working with mixed strategies, each player i aim to maximize their expected payoff

Mixed strategies Nash equilibrium

For a general N-player game, finding the set of NEs is not possible in polynomial time!

For a general N-player game, finding the set of NEs is not possible in polynomial time!

Some existing algorithms

Action (strategy) sets have uncountably many elements

Action (strategy) sets have uncountably many elements

Similar to the definitions before

Interference channel SINR

Interference channel SINR

Existence and uniqueness of Nash equilibrium depend on the utility function design

Existence and uniqueness of Nash equilibrium depend on the utility function design

Basics of definition

Basics of definition

A Nash equilibrium is a strategy profile s* with the property that no player i can do better by choosing a strategy different from s, given that every other player j ≠ i .*