Zhu Han, Dusit Niyato, Walid Saad, Tamer Basar, and Are Hjorungnes



Yüklə 1,94 Mb.
tarix16.08.2018
ölçüsü1,94 Mb.
#63220


  • Zhu Han, Dusit Niyato, Walid Saad,

  • Tamer Basar, and Are Hjorungnes


Introduction to Game Theory: Lecture 1

  • Introduction to Game Theory: Lecture 1

  • Noncooperative Game: Lecture 1, Chapter 3

  • Bayesian Game: Lecture 2, Chapter 4

  • Differential Game Lecture 3, Chapter 5

  • Evolutional Game: Lecture 4, Chapter 6

  • Cooperative Game: Lecture 5, Chapter 7

  • Auction Theory: Lecture 6, Chapter 8

  • Game Theory Applications: Lecture 7, Part III

  • Total Lectures are about 8 Hours



John von Neuman (1903-1957) co-authored, Theory of Games and Economic Behavior, with Oskar Morgenstern in 1940s, establishing game theory as a field.

  • John von Neuman (1903-1957) co-authored, Theory of Games and Economic Behavior, with Oskar Morgenstern in 1940s, establishing game theory as a field.

  • John Nash (1928 - ) developed a key concept of game theory (Nash equilibrium) which initiated many subsequent results and studies

  • Since 1970s, game-theoretic methods have come to dominate microeconomic theory and other fields

  • Nobel prizes

    • Nobel prize in Economic Sciences 1994 awarded to Nash, Harsanyi (Bayesian games) and Selten (subgame perfect equilibrium)
    • 2005, Auman and Schelling got the Nobel prize for having enhanced our understanding of cooperation and conflict through game theory
    • 2007 Leonid Hurwicz, Eric Maskin and Roger Myerson won Nobel Prize for having laid the foundations of mechanism design theory.


Game theory – mathematical models and techniques developed in economics to analyze interactive decision processes, predict the outcomes of interactions, identify optimal strategies

  • Game theory – mathematical models and techniques developed in economics to analyze interactive decision processes, predict the outcomes of interactions, identify optimal strategies

  • Game theory techniques were adopted to solve many protocol design issues (e.g., resource allocation, power control, cooperation enforcement) in wireless networks.

  • Fundamental component of game theory is the notion of a game.

    • A game is described by a set of rational players, the strategies associated with the players, and the payoffs for the players. A rational player has his own interest, and therefore, will act by choosing an available strategy to achieve his interest.
    • A player is assumed to be able to evaluate exactly or probabilistically the outcome or payoff (usually measured by the utility) of the game which depends not only on his action but also on other players’ actions.


Non-cooperative static games:

  • Non-cooperative static games:

  • Repeated games: play multiple times

    • Threat of punishment by repeated game. MAD: Nobel prize 2005.
    • Tit-for-Tat (infocom 2003):
  • Dynamic games: (Basar’s book)

    • ODE for state, Optimization utility over time, HJB and dynamic programming
    • Evolutional game (Hossain and Dusit’s work)
  • Stochastic games (Altman’s work)

  • Cooperative Games

    • Nash Bargaining Solution
    • Coalitional Game


Book of Myerson (Nobel Prize 2007), J. Huang, H. Zheng, X. Li

    • Book of Myerson (Nobel Prize 2007), J. Huang, H. Zheng, X. Li


Basics of definition

  • Basics of definition

  • Game in Strategic Form

    • Dominating strategy
    • Nash equilibrium
    • Mixed strategy
    • Static continuous-Kernel game
  • Dynamic noncooperative game

    • Extensive form
    • Repeated game
    • Stochastic game
  • Special game

    • Potential game
    • Stackelberge game
    • Correlated equilibrium
    • Supermodular game
    • Wardrop game
  • Summary



A game in strategic (normal) form is represented by three elements

  • A game in strategic (normal) form is represented by three elements

    • A set of players N
    • Set of strategies of player Si
    • Set of payoffs (or payoff functions) Ui
  • Notation si strategy of a player i while s-i is the strategy profile of all other players

  • Notice that one user’s utility is a function of both this user’s and others’ strategies

  • A game is said to be one with complete information if all elements of the game are common knowledge. Otherwise, the game is said to be one with incomplete information, or an incomplete information game.



Two suspects in a major crime held for interrogation in separate cells

  • Two suspects in a major crime held for interrogation in separate cells

    • If they both stay quiet, each will be convicted with a minor offence and will spend 1 year in prison
    • If one and only one of them finks, he will be freed and used as a witness against the other who will spend 4 years in prison
    • If both of them fink, each will spend 3 years in prison
  • Components of the Prisoner’s dilemma

    • Rational Players: the prisoners
    • Strategies: Stay quiet (Q) or Fink (F)
    • Solution: What is the Nash equilibrium of the game?
  • Representation in Strategic Form





Dominant strategy is a player's best strategy, i.e., a strategy that yields the highest utility for the player regardless of what strategies the other players choose.

  • Dominant strategy is a player's best strategy, i.e., a strategy that yields the highest utility for the player regardless of what strategies the other players choose.

  • A Nash equilibrium is a strategy profile s* with the property that no player i can do better by choosing a strategy different from s*, given that every other player j ≠ i .

  • In other words, for each player i with payoff function ui ,

  • No user can change its payoff by Unilaterally changing its strategy, i.e., changing its strategy while s-i is fixed



Does the Nash equilibrium always exist?

  • Does the Nash equilibrium always exist?

  • Is it efficient ?

  • One measure of efficiency is Pareto optimality

    • A payoff vector x is Pareto optimal if there does not exist any payoff vector y such that
    • y ≥ x
    • with at least one strict inequality for an element yi
  • In some references a strategy profile s that achieves a Pareto optimal payoff distribution can sometimes be referred to as a Pareto optimal strategy



Centralized system: In a centralized system, one seeks to find the social optimum (i.e., the best operating point of the system), given a global knowledge of the parameters. This point is in many respect efficient but often unfair.

  • Centralized system: In a centralized system, one seeks to find the social optimum (i.e., the best operating point of the system), given a global knowledge of the parameters. This point is in many respect efficient but often unfair.

  • Decentralized: When the players act noncooperatively and are in competition, one operating point of interest is the Nash equilibrium. This point is often inefficient but stable from the players’ perspective.

  • The Price of Anarchy (PoA), defined as the ratio of the cost (or utility) function at equilibrium with respect to the social optimum case, measures the price of not having a central coordination in the system

  • PoA is, loosely, a measure of the loss incurred by having a distributed system!







So far we assumed that the players make deterministic choices from their strategy spaces

  • So far we assumed that the players make deterministic choices from their strategy spaces

  • Strategies are pure if a player i selects, in a deterministic manner (probability 1), one strategy out of its strategy set Si

  • Players can also select a probability distribution over their set of strategies, in which cases the strategies are called mixed

  • Nash 1950

    • Every finite strategic form N-player game has a mixed strategy Nash equilibrium


Define σi as a probability mass function over Si, the set of actions of player i

  • Define σi as a probability mass function over Si, the set of actions of player i

  • When working with mixed strategies, each player i aim to maximize their expected payoff

  • Mixed strategies Nash equilibrium





For a general N-player game, finding the set of NEs is not possible in polynomial time!

  • For a general N-player game, finding the set of NEs is not possible in polynomial time!

      • Unless the game has a certain structure
  • Some existing algorithms

    • Fictitious play (based on empirical probabilities)
    • Iterative algorithms (can converge for certain classes of games)
    • Best response algorithms
      • Popular in some games (continuous kernel games for example)
    • Useful Reference
      • D. Fundenberg and D. Levine, The theory of learning in games, the MIT press, 1998.


Action (strategy) sets have uncountably many elements

  • Action (strategy) sets have uncountably many elements

    • For example, strategies are intervals
  • Similar to the definitions before

    • Best response
    • Nash equilibrium


Interference channel SINR

  • Interference channel SINR



Existence and uniqueness of Nash equilibrium depend on the utility function design

  • Existence and uniqueness of Nash equilibrium depend on the utility function design



Basics of definition

  • Basics of definition

  • Game in Strategic Form

    • Dominating strategy
    • Nash equilibrium
    • Mixed strategy
    • Static continuous-Kernel game
  • Dynamic noncooperative game

    • Extensive form
    • Repeated game
    • Stochastic game
  • Special game

    • Potential game
    • Stackelberge game
    • Correlated equilibrium
    • Supermodular game
    • Wardrop game


In dynamic games, the notion of time and information is important

  • In dynamic games, the notion of time and information is important

    • The strategic form cannot capture this notion
    • We need a new game form to visualize a game
  • In extensive form, a game is represented with a game tree.

  • Extensive form games have the following four elements in common:

    • Nodes: This is a position in the game where one of the players must make a decision. The first position, called the initial node, is an open dot, all the rest are filled in. Each node is labeled so as to identify who is making the decision.
    • Branches: These represent the alternative choices that the player faces, and so correspond to available actions.


3. Payoffs: These represent the pay-offs for each player, with the pay-offs listed in the order of players.

  • 3. Payoffs: These represent the pay-offs for each player, with the pay-offs listed in the order of players.

    • When these payoff vectors are common knowledge the game is said to be one of complete information.
    • If, however, players are unsure of the pay-offs other players can receive, then it is an incomplete information game.
  • 4. Information sets: When two or more nodes are joined together by a dashed line this means that the player whose decision it is does not know which node he or she is at. When this occurs the game is characterized as one of imperfect information.

    • When each decision node is its own information set the game is said to be one of perfect information, as all players know the outcome of previous decisions.




While the normal form gives the minimum amount of information necessary to describe a game, the extensive form gives additional details about the game concerning the timing of the decisions to be made and the amount of information available to each player when each decision has to be made.

  • While the normal form gives the minimum amount of information necessary to describe a game, the extensive form gives additional details about the game concerning the timing of the decisions to be made and the amount of information available to each player when each decision has to be made.

  • For every extensive form game, there is one and only one corresponding normal form game. For every normal form game, there are, in general, several corresponding extensive form games.

  • Every finite extensive form game of perfect information has a pure strategy Nash equilibrium.



A subgame of a dynamic noncooperative game consists of a single node in the extensive form representation of the game, i.e., the game tree, and all of its successors down to the terminal nodes.

  • A subgame of a dynamic noncooperative game consists of a single node in the extensive form representation of the game, i.e., the game tree, and all of its successors down to the terminal nodes.

  • The information sets and payoffs of a subgame are inherited from the original game.

  • Moreover, the strategies of the players are restricted to the history of actions in the subgame.



Backward induction

  • Backward induction



Step

  • Step

    • Pick a subgame that does not contain any other subgame.
    • Compute a Nash equilibrium of this subgame.
    • Assign the payoff vector associated with this equilibrium to the starting node, and eliminate the subgame.
    • Iterate this procedure until a move is assigned at every contingency, when there remains no subgame to eliminate.
  • Nash equilibrium is not necessarily a subgame perfect equilibrium.



A basic technique in CSMA/CA protocol

  • A basic technique in CSMA/CA protocol

  • The two devices p1 and p2 are not perfectly synchronized.

    • p1 decides to transmit or not.
    • p2 observes p1 before making his own move.
  • The strategy of p1 is to transmit (T) or to be quiet (Q).

  • How many pure Nash equilibria do we have?

  • H. W. Kuhn, ”Extensive Games and the

  • problem of information”, Contributions to the

  • Theory of Games II, 1953.

  • Theorem. (Kuhn, 1953). Every finite

  • extensive-form game of perfect information has a pure strategy



How do we solve the game?

  • How do we solve the game?

  • If player p2 plays the strategy T then the best response of player p1 is to play Q.

  • However, T is not the best strategy of player p2 if player p1 chooses T

  • We can eliminate some possibilities by backward induction.

  • Player p2 knows that he has the last move...

  • Given all the best moves of p2, player p1

  • calculates his best moves as well.

  • It turns out that the backward induction

  • solution is the historical move (T) then (Q).



Repeated game: average utility (power in our case) over time.

  • Repeated game: average utility (power in our case) over time.

  • Discounting factor 

  • Folk theorem

    • Ensure cooperation by threat of future punishment.
    • Any feasible solution can be enforced by repeated game
  • Enforcing Cooperation by Punishment

    • Each user tries to maximize the benefit over time.
    • Short term greedy benefit will be weighted out by the future punishment from others. By maintaining this threat of punishment, cooperation is enforced among greedy users.
  • Repeated Game Approach

    • Initialization: Cooperation
    • Detect the outcome of the game:
    • If better than a threshold, play cooperation in the next time;
    • Else, play non-cooperation for T period, and then cooperate.


A repeated game with stochastic (probabilistic) transitions between the different states of the game.

  • A repeated game with stochastic (probabilistic) transitions between the different states of the game.

  • A dynamic game composed of a number of stages, and where, at the beginning of each stage the game, is in some state.

  • In this state, the players select their actions and each player receives a payoff that depends on the current state and the chosen actions.

  • The game then moves to a new random state whose distribution depends on the previous state and the actions chosen by the players.

  • The procedure is repeated at the new state and the game continues for a finite or infinite number of stages.

  • The total payoff to a player is often taken to be the discounted sum of the stage payoffs (similar to the discounted sum of repeated games) or the limit inferior of the averages of the stage payoffs.

  • Notice that 1-state stochastic game is equal to (infinitely) repeated game, and 1-agent stochastic game is equal to Markov Decision Process (MDP).

  • Partial observed MDP is widely used model for wireless networking

  • Bellman equation



Basics of definition

  • Basics of definition

  • Game in Strategic Form

    • Dominating strategy
    • Nash equilibrium
    • Mixed strategy
    • Static continuous-Kernel game
  • Dynamic noncooperative game

    • Extensive form
    • Repeated game
    • Stochastic game
  • Special game

    • Potential game
    • Stackelberge game
    • Correlated equilibrium
    • Supermodular game
    • Wardrop game


A special class of noncooperative games having a special structure

  • A special class of noncooperative games having a special structure

  • In layman’s terms, a potential game is a noncooperative game in which the variations of the users’ utilities can be captured by a single function known as the potential function

  • Potential games are characterized by their simplicity and the existence of a Nash equilibrium solution

  • Often, potential games are useful when dealing with continuous-kernel games



Formally,

  • Formally,

  • In exact potential games, the difference in individual utilities achieved by each player when changing unilaterally its strategy has the same value as the difference in values of the potential function. In ordinal potential games, only the signs of the differences have to be the same.



Corollary 1: Every finite potential game (exact or ordinal) has at least one pure strategy Nash equilibrium.

  • Corollary 1: Every finite potential game (exact or ordinal) has at least one pure strategy Nash equilibrium.



Example: Consider a single-cell CDMA, M users

  • Example: Consider a single-cell CDMA, M users

  • SINR

  • Optimization



The player that imposes its own strategy upon the others is called the leader while the other players who react to the leader's declared strategy are called followers.

  • The player that imposes its own strategy upon the others is called the leader while the other players who react to the leader's declared strategy are called followers.

  • Stackelberg equilibrium strategy

  • Every two-person finite game admits a Stackelberg strategy for the leader.

  • whenever the follower has a single optimal response for every strategy of the leader, then the leader can, at the Stackelberg solution, perform at least as good as at the Nash equilibrium.





Stackelberg games are characterized by a hierarchy

  • Stackelberg games are characterized by a hierarchy

  • Stackelberg games are not limited to the single-leader single-follower case

  • In a single-leader multi-follower case, the Stackelberg equilibrium is basically composed of an optimal policy for the leader with respect to a Nash equilibrium of the followers

    • It is often desirable to have a unique Nash for the followers game, so as to make the Stackelberg solution tractable
    • Example application: Pricing for Internet Service Providers
  • Multi-leader multi-follower Stackelberg games

    • At the Stackelberg equilibrium, both leaders and followers are in a Nash equilibrium (the Nash equilibria are correlated)
    • Hard to solve when the followers game has many equilibria


The technique of backward induction is similar to the iterated strict dominance technique.

  • The technique of backward induction is similar to the iterated strict dominance technique.

  • It helps reducing the strategy space but becomes very complex for longer extensive form games.

  • The method provides a technique to identify Stackelberg equilibrium.

  • Definition: The strategy profile s is a Stackelberg equilibrium with player p1 as the leader and player p2 as the follower if player p1 maximizes his payoff subject to the constraint that player p2 chooses according to his best response function.

  • Application:

    • If p1 chooses T then the best response of p2 is to
    • play Q (payoff of 1-c).
    • If p1 chooses Q, then the best response of p2 is
    • T (payoff of 0 for p1).
  • p1 will therefore choose T which is the Stackelberg equilibria.



Beyond the Nash equilibrium, one can seek a more generalized solution concept for noncooperative games: the correlated equilibrium

  • Beyond the Nash equilibrium, one can seek a more generalized solution concept for noncooperative games: the correlated equilibrium

  • A correlated equilibrium is, in essence, a generalization of the Nash equilibrium

    • Requires an arbitrator who can send (private or public) signals to the players.
    • These signals allow the players to coordinate their actions and perform joint randomization over strategies.
  • The arbitrator can be a virtual entity (the players can agree on the first word they hear on the radio) and generate signals that do not depend on the system.

  • A multi-strategy obtained using the signals is a set of strategies (one strategy for each player which may depend on all the information available to the player including the signal it receives)

  • It is said to be a correlated equilibrium if no player has an incentive to deviate unilaterally from its part of the multi-strategy.

  • A special type of “deviation” can be of course to ignore the signals.



Each user to consider the joint distribution of users' actions

  • Each user to consider the joint distribution of users' actions

  • Multiple access game

  • Distributive Opportunistic Spectrum Access for Cognitive Radio using Correlated Equilibrium and No-regret Learning, WCNC 2007



Strategic complementarities: if a player chooses a higher action, the others want to do the same thing.

  • Strategic complementarities: if a player chooses a higher action, the others want to do the same thing.

  • Nice properties: existence and achievability of NE



  • Power control in CDMA networks can often be captured using a supermodular game model

    • dog barking effect


A. Haurie and P. Marcotte, “On the Relationship between Nash-Cournot and Wardrop Equilibria,” Networks, vol. 15, pp. 295-308, 1985.

  • A. Haurie and P. Marcotte, “On the Relationship between Nash-Cournot and Wardrop Equilibria,” Networks, vol. 15, pp. 295-308, 1985.

  • Wardrop (1952) postulated that users in a network game select routes of minimal length. In the asymptotic regime, the non-cooperative game becomes a non-atomic one, in which the impact of a single player on the others is negligible.

  • In the networking game context, the related solution concept is often called Wardrop equilibrium and is often much easier to compute than the original Nash equilibrium.

  • Useful for games dealing with network flows and in which there exists a large population of players



Non-cooperative game is the most basic form of the game theory

  • Non-cooperative game is the most basic form of the game theory

  • How to carefully design the utility function

    • Basic components
    • Convergence and uniqueness
  • Extensive form

  • Different type of games and their examples

  • Price of Anarchy:

    • Further reading: pricing to improve the performance
    • Cem Saraydar and Narayan B. Mandayam and David J. Goodman
    • Efficient Power Control via Pricing in Wireless Data Networks
    • IEEE Trans. on Communications, vol. 50, No. 2, pp. 291-303,
    • February 2002






Yüklə 1,94 Mb.

Dostları ilə paylaş:




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©genderi.org 2024
rəhbərliyinə müraciət

    Ana səhifə