Resolving the social dilemmas of climate action by Hayden Wilkinson

According to scientific consensus, one problem currently adversely affecting the environment, and perhaps the most serious, is that of greenhouse gas emissions (Stocker et al, 2013). Efforts to curb these emissions, on both interpersonal and international levels, are falling notoriously short of their goals of averting catastrophic global climate change. This failure to act, at least among those parties holding true beliefs, appears to be highly irrational. However, a game-theoretic analysis does shed some light on that seemingly irrational lack of cooperation as, for a large number of players, the scenario may well be a Prisoner’s Dilemma or unfortunately-weighted Assurance Game. For each scenario, I will show that the optimal strategy is not to cooperate but instead to continue to emit. Fortunately, there are remedies to these scenarios, including the provision of rewards and the meting out of punishments, which could potentially direct fully-rational players to act to curb their greenhouse gas emissions.

On a voluntary interpersonal level, each individual has a choice to either behave in a manner which will minimise emissions of carbon dioxide and other greenhouse gases by some threshold amount, or to not do so[1]. Behaving in such a manner might involve: adopting a vegetarian diet which leads to lower methane emissions from livestock (Carlsson-Kanyama, 1998); reducing electricity usage with more efficient appliances, such as LED lightbulbs (U.S. Department of Energy, 2014); changing one’s habits of energy use, perhaps by forgoing air conditioning in favour of electric fans; purchasing private renewable energy sources such as solar panels; choosing to patronise companies which emit less, whether they be electricity companies which invest more heavily in renewable energy or perhaps food brands which source their products locally; working from home rather than commuting, or travelling by public transport rather than private car (Bureau of Infrastructure, Transport and Regional Economics, 2010); and/or a great many other more environmentally sound practices.

Almost all of these practices involve a personal cost – vegetarian diets cost some individuals the pleasure of eating meat, travelling by public transport may be less convenient than by car, changing one’s habits requires some effort, and replacing household appliances incurs a monetary cost, as might purchasing solar panels and patronising the more carbon-conscious companies. However, if a sufficient number of people around the world did the same, then emissions would reduce considerably and we could avoid the possible world in which average global temperatures rise by several degrees and make life in future either unpleasant or impossible (Dhakal et al, 2014). It is also worth noting that the actions are undertaken over a single round, as either the climate changes in such a way or it does not – there is no iteration by which the climate might be repeatedly reset. Thus, the game which is played, while complex in exactly how emission-reduction actions are taken, is played over a single round.

The international case is largely analogous. States can either take costly action or do nothing, and only by a sufficient number of states taking action can the problem be resolved. Actions might include taxing emissions, establishing emission trading schemes, directly subsidising low-emission energy production, or banning high-emission products and processes. As above, all of these actions would likely incur some cost to GDP or direct cost to the nation’s government, and not all nations need do so in order to collectively avoid unpleasant consequences. In this paper, I will focus on the interpersonal case, though the scenarios are almost entirely analogous. As will be shown, rational individuals may well choose not to take emission-reducing actions, and the same applies for nation-states. Then, if there are insufficient numbers of both individuals and states acting to reduce emissions, the unpleasant consequences of climate change will inevitably follow even if this is a world which nobody prefers.

For those individuals who do not hold false beliefs about possible consequences, their preferences may be distributed in a number of different ways. If the above costs result in sufficient disutility for the individual, then that individual may possess the following distribution which constitutes a Prisoner’s Dilemma[2].

[Payoff matrix not reproduced here: an example of cardinal values for player 1 satisfying U(C) > U(A) > U(D) > U(B).]

An individual such as player 1 prefers the world A in which catastrophic climate change is averted to world D where it is not. However, player 1 finds the costs of M sufficiently burdensome to outweigh the benefits of their own tiny contribution to reduced emissions. Since the player’s action does not independently prevent or allow catastrophe (except in exceptionally improbable cases), they will prefer the world C, in which they do not incur those costs but still enjoy a stable climate, over world A in which they do incur costs. They will also prefer the catastrophe of world D over world B in which they both experience catastrophe and fruitlessly incur costs. (The above matrix represents an example of cardinal values fitting this ordering but in what follows I will also consider the general ordinal Prisoner’s Dilemma – that is, if U(W) is the utility value for the player in world W, then all cases in which U(C)>U(A)>U(D)>U(B).)

It seems rational, then, for any such player 1 to choose ~M, as it results in a better world for the player regardless of what the rest of the population does. Whether enough of the population chooses M to avert catastrophe or not, U(C)>U(A) and U(D)>U(B), so it will maximise utility for player 1 to choose ~M. However, there are also an infinite number of mixed strategies so, to confirm that this pure strategy of ~M is the optimal[3] (and hence rational) choice over any possible mixed strategies, we will have to look at the expected values. This may seem excessively rigorous for this particular scenario but will prove useful for later examples.

If Penough(A) is the probability of enough members of the population taking action A to significantly change the resultant world (that is, to change the column in the above diagram), then player 1’s expected utility value for M is:

E1(M) = Penough(M)·U(A) + (1 − Penough(M))·U(B)

The expected utility value for player 1 taking action ~M is then:

E1(~M) = Penough(M)·U(C) + (1 − Penough(M))·U(D)

Also, more generally, the expected value for player 1 adopting any possible mixed strategy (with P(choosing action M)=p) is:

E1(p) = p·E1(M) + (1 − p)·E1(~M)
      = E1(~M) − p·[Penough(M)·(U(C) − U(A)) + (1 − Penough(M))·(U(D) − U(B))]    (*)

(*) is a strictly decreasing linear function with respect to p, so E1(p) is maximised uniquely at the lower bound of p. Therefore, since p is a probability constrained to the interval [0,1], player 1 will always be able to maximise their expected utility by adopting the strategy with p=0, the pure strategy ~M.

Notably, this result emerges not only for these particular cardinal values but for all values satisfying U(C)>U(A)>U(D)>U(B) (i.e. all Prisoner’s Dilemmas). Moreover, the expected utility is maximal only at p=0, so the pure strategy of always choosing ~M does indeed strictly dominate all other mixed strategies. So, regardless of player 1’s knowledge of Penough(M), ~M will always maximise the player’s utility.
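This dominance argument can be checked numerically. Below is a minimal sketch in Python, using illustrative cardinal utilities of my own choosing that satisfy U(C) > U(A) > U(D) > U(B) (the specific numbers are assumptions, not values from the text):

```python
# Illustrative cardinal utilities satisfying U(C) > U(A) > U(D) > U(B);
# the particular numbers are assumed for demonstration only.
U = {"A": 3, "B": 0, "C": 4, "D": 1}

def expected_utility(p, p_enough):
    """Expected utility of the mixed strategy playing M with probability p,
    when enough of the population chooses M with probability p_enough."""
    e_m = p_enough * U["A"] + (1 - p_enough) * U["B"]      # pure strategy M
    e_not_m = p_enough * U["C"] + (1 - p_enough) * U["D"]  # pure strategy ~M
    return p * e_m + (1 - p) * e_not_m

# Whatever the player believes about p_enough, expected utility strictly
# decreases as p grows, so the pure strategy ~M (p = 0) is uniquely optimal.
for p_enough in (0.0, 0.25, 0.5, 0.75, 1.0):
    values = [expected_utility(p / 10, p_enough) for p in range(11)]
    assert all(values[i] > values[i + 1] for i in range(10))
```

The same check passes for any utilities with the Prisoner’s Dilemma ordering, mirroring the ordinal result above.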

Therefore, each individual with the preferences of player 1 would be rational to not minimise their emissions through those various costly measures and instead enjoy their car trips, meat-heavy diets, and the savings of not buying solar panels or more efficient appliances. But if other community members have the same order of preferences, then it is rational for each of them to choose ~M and thereby, if their numbers are sufficient, bring about world D – even though they all prefer world A. Here lies the Prisoner’s Dilemma.

Indeed, this appears to be accurate for a large proportion of the community. For instance, 66% of Australians believe that climate change is occurring and 87% of that group believe that humans are causing that change (Stefanova, 2013), so it appears that a large portion of the Australian community has true beliefs regarding the issue and would likely prefer world A over world D. However, few Australians attempt to minimise emissions – for instance, 73% of households use air conditioning (ABS, 2011) and only 2% of Australians follow a vegetarian diet (Newspoll, 2010). Individuals in this group, then, fail to act collectively to reach world A despite it being highly preferable. This can potentially be explained by a large portion of the population being trapped in the above Prisoner’s Dilemma, and likewise for the collective inaction at an international level.

That is, however, not the only possibility. If the player values their own integrity highly or otherwise dislikes world C for some reason, there is another possible set of preferences which may still produce similar problems.

[Payoff matrix not reproduced here: an example of cardinal values for player 2 satisfying U2(A) > U2(C) > U2(D) > U2(B).]

Both worlds A and D constitute Nash Equilibria – states where no player can improve their expected utility by unilaterally changing strategy – as, given the actions of the population, they are the better worlds for player 2. With two equilibria, this player is hence in an Assurance Game (definition from Skyrms, 2004).

In this game, the player again prefers the worlds in which catastrophic climate change is averted over the worlds in which it isn’t. And again, if catastrophe is to occur, the player prefers not to suffer those aforementioned costs unnecessarily. However, here the order of preferences for A and C are swapped, perhaps due to integrity or perhaps the player finding enjoyment in making a contribution.

With the above weighting, the expected values of each action are as follows[4]:

E2(M) = Penough(M)·U2(A) + (1 − Penough(M))·U2(B)
E2(~M) = Penough(M)·U2(C) + (1 − Penough(M))·U2(D)

However, note that this result relies on the magnitudes of the utilities, rather than just the order as with the Prisoner’s Dilemma. If U2(A) and Penough(M) were large enough then the expected value of M would be greater, making it the more appealing option.

E2(p) = p·E2(M) + (1 − p)·E2(~M)

But the value of p which maximises the expected utility can be either 1 (pure strategy M) or 0 (pure strategy ~M), depending on the value of Penough(M), and so there is no generally optimal strategy.

Such an Assurance Game situation can, therefore, produce a result similar to the Prisoner’s Dilemma. Player 2, with the above weightings and Penough(M) < 2/3, will still rationally choose ~M to maximise their expected utility, even though such players agree that world A is significantly better than D.
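The dependence on Penough(M) can likewise be illustrated in code. The utilities below are my own assumptions, ordered U2(A) > U2(C) > U2(D) > U2(B) and chosen so that the indifference point falls at Penough(M) = 2/3, matching the threshold above:

```python
# Assumed utilities for an Assurance Game: U2(A) > U2(C) > U2(D) > U2(B).
U2 = {"A": 3, "B": -1, "C": 2, "D": 1}

def e_m(p_enough):
    """Expected utility of pure strategy M."""
    return p_enough * U2["A"] + (1 - p_enough) * U2["B"]

def e_not_m(p_enough):
    """Expected utility of pure strategy ~M."""
    return p_enough * U2["C"] + (1 - p_enough) * U2["D"]

def best_action(p_enough):
    """Unlike the Prisoner's Dilemma, the best choice varies with p_enough."""
    return "M" if e_m(p_enough) > e_not_m(p_enough) else "~M"

assert best_action(0.5) == "~M"   # below the 2/3 threshold
assert best_action(0.9) == "M"    # above the 2/3 threshold
```

With these numbers the two expected utilities cross exactly at Penough(M) = 2/3, so a player’s rational choice flips with their confidence in the rest of the population.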

In addition to players 1 and 2, there is the individual with false beliefs (player 3) who, although they may share the ordinal preferences of player 1 or 2, simply believes that anthropogenic climate change is a myth. Whatever their actual preferences, they mistakenly believe that U3(D)>U3(A), and therefore encounter no dilemma in choosing ~M over the alternatives.

There is also the hero, who prefers to minimise their emissions regardless of whether it will bring about the public goods of stable air temperatures and stable climate. Such a hero will prefer world A over C and B over D, perhaps due to a particular set of moral beliefs or the personal satisfaction which it brings them.

[Payoff matrix not reproduced here: an example of cardinal values for player 4 (the hero) satisfying U4(A) > U4(C) and U4(B) > U4(D).]

Given these preferences, we have the following expected value (for any mixed strategy).

E4(p) = p·[Penough(M)·U4(A) + (1 − Penough(M))·U4(B)] + (1 − p)·[Penough(M)·U4(C) + (1 − Penough(M))·U4(D)]

As above, it can easily be shown that E4(M) − E4(~M) = Penough(M)·[U4(A) − U4(C)] + (1 − Penough(M))·[U4(B) − U4(D)] > 0, since both bracketed terms are positive for a hero. So E4(p) is maximised by setting p = 1 (given that p ∈ [0,1]). Thus, the pure strategy M dominates all others, and this holds for all heroes with U(A)>U(C) and U(B)>U(D).

This gives us four distinct classes of players: player 1 in a Prisoner’s Dilemma; player 2 in an Assurance Game; player 3, the ‘climate-denier’, who holds false beliefs and thereby chooses ~M; and player 4, the hero[5]. Individuals who match players 1 or 3 will choose ~M; player 4 will choose M; and player 2 will choose either pure strategy M or pure strategy ~M, depending on the exact utilities and probability of others choosing M. Now, with some distribution of individuals into these classes, it may be possible to construct an accurate model of individuals’ behaviour in the real world. Given that the majority of individuals, in Australia at least, hold true beliefs about anthropogenic climate change (Stefanova, 2013), there is presumably only a small minority of individuals subject to player 3’s misconception. Likewise, judging by the relatively small segment of the population adopting vegetarianism and forgoing air conditioning, not many match player 4. The majority hold true beliefs and still do not act as effectively as they could, so they can be identified with players 1 or 2.

Thus, while it may seem irrational for so many in the population to choose ~M when, collectively, this results in world D which so few people prefer, modelling the behaviour of those individuals through the Prisoner’s Dilemma of player 1 or the Assurance Game of player 2 does provide some explanation. It is rational for each individual in either such game (at least for low Penough(M) in the Assurance Game) to always choose ~M. This results in every rational such individual taking that same action and, given sufficient numbers, bringing about world D rather than the much-preferred world A. Meanwhile, a small number of heroes always choose M and a number of climate-deniers choose ~M. This is consistent with approximately 57% of Australians accepting anthropogenic climate change while only 27% go without air conditioning and 2% adopt an emission-minimising diet.
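The 57% figure follows from the two survey proportions cited earlier – 66% of Australians accepting that climate change is occurring, and 87% of that group attributing it to humans (Stefanova, 2013):

```python
# Survey proportions from Stefanova (2013), as cited in the text.
accept_occurring = 0.66               # believe climate change is occurring
anthropogenic_given_occurring = 0.87  # of those, believe humans cause it

# Share of the population holding true beliefs about anthropogenic change.
share_true_beliefs = accept_occurring * anthropogenic_given_occurring
assert abs(share_true_beliefs - 0.5742) < 1e-9  # approximately 57%
```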

Of course, this model has several flaws, including: that the game is not actually played as a single instance but is instead far more complex in the number of individual actions and variations of actions over many decades; that players do have some knowledge of others’ actions unlike in the games’ traditional forms, although this is somewhat limited in the interpersonal case; that there is no clear distinction between M and ~M, as individuals are able to select a limited number of emission-minimising methods; and that there is no clear distinction between the catastrophic worlds and the non-catastrophic worlds, but instead a spectrum of temperature changes. I am unable to address these issues here, but I will note that, despite these complications, the situations of players 1 and 2 do still constitute social dilemmas, which are both continuous-input and continuous-output[6]. Such players will still benefit more over time the less they adopt actions making up M and still be able to enjoy any public goods of low temperatures and stable climate regardless of their own actions. Their actions are also best modelled by a single round as it is not until the end of several decades that the exact results are determined and the utility values obtained.

Turning now to possible solutions to the players’ dilemmas, we would like some means to reduce those players’ tendency to choose ~M and thereby bring about world D – a world which, at least by most consequentialist theories, we have a moral obligation to avoid if we are able to. The remainder of the paper will hence be devoted to examining what solutions might be offered by the same game-theoretic framework used above.

Starting with the heroes, and those Assurance Game players whose utility values and observed probabilities result in an optimal strategy of M, no action is required. Player 3, however, is doing much the opposite, but this is due to false beliefs. To bring about action M, the player’s actual preferences and their actions’ actual consequences must first be revealed. This may require improving science education in schools, public awareness campaigns or legal restrictions on misinformation propagated by individuals and media organisations on the topic. All of these approaches may potentially help but, still, this class of players is only a small minority and, by themselves, they are unlikely to solve the problem (particularly if correcting their beliefs simply shifts their preferences into a Prisoner’s Dilemma).

For player 1 and the unfortunately-weighted player 2, a more substantial approach may be required. Neither are acting under false beliefs regarding the consequences of their actions, yet both are rationally selecting actions which, collectively, bring about a far less desirable world. I will assume that these players remain rational[7], that we do not alter or restrict the possible actions themselves, and that we cannot simply change the value of Penough(M) without first changing individual actions. Given this, in order to change their behaviour we must effect some reordering of their preferences by altering the worlds which result from their actions. Player 2’s choice might also be changed merely by reweighting their preferences or by convincing them that Penough(M) is sufficiently large, but neither of these two solutions will work for player 1, and we would like a general solution for both. Also, for player 1, reordering world A to be preferred above C may not be sufficient, as this is the exact situation of player 2, but setting both U(A)>U(C) and U(B)>U(D) will be, as this produces player 4’s situation which we examined above. U(B)>U(D) will also change the dominant strategy of player 2 and thereby constitute a general solution for both classes of player[8].

In order to bring about this reordering, there are only two possible options. One is to make worlds A and B far better, such that enough community members prefer them to C and D. The other is to make C and D far worse. The difference between the two pairs of worlds is the individual player’s own action, and so any changes to those worlds will be a direct response to their own actions rather than to the general state of the world – that is, the change must be a reward and/or punishment given to the individual based on their action (though to give both rewards and punishments would be inefficient).

The option of rewarding those who take action M is a problematic one. To avoid the world of catastrophic climate change, we need a very large portion of the community to take this action. To give financial or material rewards to all those who do so – rewards which must be considered valuable enough to make up for the individual costs incurred – would be an extremely costly exercise. This is particularly so if it were an individual or small group attempting to provide the rewards, but even a wealthy government would likely find it difficult, at least if the individual difference in utility between the pairs of worlds were great. Given that the financial costs alone, for those taking action M, would be in the order of thousands of dollars per person per year, rewarding the majority of the population in such a way as to outweigh this cost would be an enormous and perhaps insurmountable obstacle to whoever was to provide these rewards, government or otherwise.

The second option matches the solution given by Hardin (1968) – that of coercion. It may perhaps involve attempting to instil shame in individuals that choose ~M, making worlds in which they are guilty less desirable. It may also involve more direct coercion, in which each individual taking action ~M is punished by a fine or incarceration, or perhaps execution. The punishment need only be harsh enough to make worlds in which they choose ~M less desirable than the corresponding worlds in which they don’t (i.e. U(A)>U(C) for player 1 and U(B)>U(D) for both classes of players). How harsh those punishments need be is hence contingent on the exact cardinal preferences of individuals in each community. Also, for practical reasons, such penal action would be more effective when carried out by a higher authority or government rather than some ambitious individual[9]. This mirrors Hobbes’s justification of government authority, here cast in game-theoretic terms, though we need not proceed all the way to Hobbes’s advocacy of absolute sovereign authority.
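The required severity can be sketched in code: attach a disutility X to choosing ~M (i.e. subtract it from worlds C and D), and the reordering succeeds once X exceeds both U(C) − U(A) and U(D) − U(B). The function and utilities below are my own illustration, not a calculation from the text:

```python
def minimal_punishment(u, epsilon=1e-9):
    """Smallest disutility X that, when subtracted from the ~M worlds C and D,
    makes M dominant: we need u['C'] - X < u['A'] and u['D'] - X < u['B']."""
    return max(u["C"] - u["A"], u["D"] - u["B"], 0) + epsilon

# Assumed Prisoner's Dilemma utilities for player 1.
u1 = {"A": 3, "B": 0, "C": 4, "D": 1}
x = minimal_punishment(u1)
assert u1["C"] - x < u1["A"] and u1["D"] - x < u1["B"]  # M now dominates
```

The same function applies to player 2’s Assurance Game, where only the U(D) − U(B) gap binds; under the model, any punishment harsher than this minimum buys nothing further.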

Both of these options are problematic in some way. The rewards-based approach would inevitably be extremely costly and, in practice, perhaps impossible to implement. The punishment-based approach, however, raises significant ethical questions. There is a hypothetical imperative that such an approach be used (particularly if the alternative is so impractical) – if policy-makers want to prevent the realisation of world D, they should intervene on those utility values with harsh punishments – however this may well conflict with a moral imperative. From a merely hypothetical perspective, the harsher the punishment the more likely it is that world A is realised, and so the approach which should be taken is that which applies the harshest possible punishment – execution perhaps, or brutal torture followed by execution, or the torture and execution of the culprit’s entire family. From a moral perspective, such punishments might be seen as entirely disproportionate.

An examination of every possible moral theory and their verdicts on such punishments is beyond the scope of this paper, however there are some which are very clear on how governments and policy-makers ought to act. Desert-based theories of justice, for instance, would struggle to justify any such punishment, whether harsh or not. Individual culprits who have taken action ~M have not themselves harmed anyone. After all, their cooperation is not necessary for enough of the population to take action M and avoid worlds B and D, and so it is difficult to say that they deserve any level of punishment for their actions or for any harm done to others. There is also the difficulty that the harm produced is done mainly to those who do not yet exist and who might not exist otherwise, which leads us to the even greater problem of non-identity (see Parfit, 1984). Likewise, more widely, moral theories of duty and of rights would, for the most part, condemn such punishments.

Theories of deterrence, and the consequentialist theories on which they often rest, might seem more amenable to such an approach. After all, reordering preferences is essentially aimed at deterring individuals from action ~M, and it is the consequent state of the world with which we are concerned. Most such theories of deterrence or of consequentialism, though, would prescribe that our punishments be as mild as possible while still retaining their deterrent effect and while still successfully avoiding worlds B and D, or at least that the full punishment be applied as rarely as possible. However, there is an additional problem under consequentialist theories. To solve the dilemmas of both classes of players, we need to reorder world B as preferred over world D. To do this, those who take action ~M must be punished in world D even though this punishment will not do any good or diminish human suffering in any way. There is hence a well-founded consequentialist reason not to impose such punishments. This perhaps might be overcome if punishments were delivered throughout the multiple decades over which the future state of the climate is determined but, still, if catastrophe is not averted in the end then those punishments will have been counter to consequentialist morality. Thus, if such punishments are to be imposed, they must guarantee with high probability that catastrophe is averted. The harsher the punishments, the stronger the guarantee, but this conflicts with the claim that the suffering inflicted on culprits should be as little as possible. This complexity is perhaps best left to empirical inquiry, as it is difficult to ascertain what level of punishment will inflict the least suffering while still effectively deterring a given behaviour.

In summation, individual contributions to greenhouse gas emissions can be modelled as Prisoner’s Dilemmas and Assurance Games for large portions of the population. This explains the apparent paradox of individual inaction despite clear collective benefit, and suggests that this inaction can be averted by reordering preferences through the imposition of rewards or punishments by some higher authority. From a practical standpoint, punishment may be the only realistic solution, but it encounters various ethical challenges and it is unclear exactly what level of punishment would be best justified.

Hayden Wilkinson has just completed honours in philosophy at UQ after a dual bachelor of arts and science last year.


[1]    Of course, in practice, few individuals take every possible action to minimise emissions, and very few individuals take every possible action to maximise them, but this is one of the model’s several flaws which I will discuss later.

[2]    Note that utility values are cardinal rather than ordinal, for the purpose of considering mixed or non-pure strategies.

[3]    Optimal here in the sense that the individual player’s utility is maximised, rather than Pareto-optimal.

[4]    With U2(World) the utility value of a world for player 2 and, in general Ui(World) the utility value of a world for player i.

[5]    There are, of course, other possible orderings of preferences and other possible explanations for behaviour. Some individuals may, for instance, accept the scientific consensus but take pleasure in the consequences of environmental turmoil (they may be rather sadistic or they may just highly value their own enjoyment of a more temperate winter). Others still might value having hero status in world B as they may enjoy taunting others, but not value that same status in world A as there is no one to taunt. These variations are all possible, but I would claim that they are difficult to empathise with and hence not likely to constitute a considerable portion of the population, so I will omit them from the discussion.

[6]    See Felkins (1995) for details of continuous vs. step inputs and outputs, as well as a much more comprehensive examination of social dilemmas.

[7]    Undermining their rationality in some way, for instance by brainwashing them or controlling their movements through some invasive technology, would be ethically controversial for a great many people. Therefore I will focus on alternative methods.

[8]    We could, at the same time, attempt to convince player 2 that Penough(M) is sufficiently large, or take some other approach specific to that class, but this is inefficient as the reordering of worlds A and B to be preferred over worlds C and D is sufficient (and indeed the simplest option to change player 1’s actions).

[9]    This would perhaps lead us to the same Prisoner’s Dilemma or Assurance Game on an international level, as it is not in a national government’s interest to impose such a punishment when the number of other nations doing the same is not sufficient to avert catastrophe. The solution, then, is to have another authority or agreement above this, applying to all nations and hence not subject to any social dilemma on an even higher level.


Australian Bureau of Statistics 2011, Environmental Issues: Energy Use and Conservation, Tables, viewed 22nd April 2015.

Carlsson-Kanyama, A 1998, ‘Climate change and dietary choices—how can emissions of greenhouse gases from food consumption be reduced?’, Food Policy, 23, pp. 277-293.

Dhakal, S, Seto, K et al. 2014, Working Group III Contribution to the IPCC Fifth Assessment Report, Climate Change 2014: Mitigation of Climate Change, IPCC, Stockholm.

Felkins, L 1995, The Voter’s Paradox, viewed 22nd April 2015.

Hardin, G 1968, ‘The tragedy of the commons’, Science, 162, pp. 1243-1248.

Kuhn, S 2007, ‘Prisoner’s dilemma’, Stanford Encyclopedia of Philosophy, viewed 22nd April 2015.

Lloyd, SA & Sreedhar, S 2002, ‘Hobbes’s moral and political philosophy’, Stanford Encyclopedia of Philosophy, viewed 22nd April 2015.

Pacuit, E & Roy, O 2012, ‘Epistemic foundations of game theory’, Stanford Encyclopedia of Philosophy, viewed 22nd April 2015.

Parfit, D 1984, Reasons and Persons, Oxford University Press, Oxford.

Ross, D 2001, ‘Game theory’, Stanford Encyclopedia of Philosophy, viewed 22nd April 2015.

Skyrms, B 2004, The Stag Hunt and the Evolution of Social Structure, Cambridge University Press, Cambridge.

Stefanova, K 2013, Climate of the Nation 2013: Australian Attitudes on Climate Change, The Climate Institute, Sydney.

Stocker, T, Dahe, Q & Plattner, G 2013, Working Group I Contribution to the IPCC Fifth Assessment Report, Climate Change 2013: The Physical Science Basis, IPCC, Stockholm.

U.S. Department of Energy 2014, How Energy-Efficient Light Bulbs Compare with Traditional Incandescents, viewed 22nd April 2015.

Verbeek, B & Morris, C 2010, ‘Game theory and ethics’, Stanford Encyclopedia of Philosophy, viewed 22nd April 2015.

