Discouraging Pool Block Withholding Attacks in Bitcoins

06.03.2025
Discouraging Pool Block Withholding Attacks in Bitcoins

Abstract The arisen of Bitcoin has led to much enthusiasm for blockchain research and block mining, and the extensive existence of mining pools helps its participants (i.e., miners) gain reward more frequently. Recently, the mining pools are proved to be vulnerable for several possible attacks, and pool block withholding attack is one of them: one strategic pool manager sends some of her miners to other pools and these miners pretend to work on the puzzles but actually do nothing. And these miners still get reward since the pool manager cannot recognize these malicious miners.

In this work, we revisit the game-theoretic model for pool block withholding attacks and propose a revised approach to reallocate the reward to the miners. Fortunately, in the new model, the pool managers have strong incentive to not launch such attacks. We show that for any number of mining pools, no-pool-attacks is always a Nash equilibrium. Moreover, with only two minority mining pools participating, no-pool-attacks is actually the unique Nash equilibrium.

Keywords Blockchain · Mining Pool · Block Withholding Attack · Game Theory

1 Introduction

Bitcoin is a decentralized crypto currency originally proposed by Satoshi Nakamoto in 2008 [15]. Since then, it has attracted wide attention in the world both academically and commercially because of its soaring price and the characteristics of decentralization. Bitcoin realizes its decentralization by blockchain, which is a global ledger maintained by distributed system. This ledger can record historical transactions and other information.

One main difference between Bitcoin and most existing currencies is the way of currency issuance and the method of bookkeeping. Bitcoin is implemented on a P2P network and everyone is able to join or leave the network without any permission. The participants in generating new blocks are called miners, and their tasks are to verify the legitimacy of undetermined transactions and pack those legal transactions into a block. In order to validate the block, every miner needs to work on a cryptographic puzzle, which needs large amount of computational resources (like electricity and hardware). To motivate the miners to generate new blocks, the first miner who solves the puzzle correctly, will get rewarded. Roughly speaking, the probability that a miner can propose a block successfully is proportional to her computational power among all miners. The above protocol of identifying the block generator is called proof of work and the process of solving puzzles is called mining. More detailed of Bitcoin system description can be referred to the Bitcoin white paper [15] and wiki(^1).

To control the frequency of block generation, the difficulty of puzzles is adjusted dynamically in Bitcoin system, which is about every ten minutes in expectation. With the development of Bitcoin, the computational power in the system is extremely large nowadays. Thus, it is very difficult for a solo miner to propose a block successfully. This means the miner may spend several years to obtain (huge) rewards which is of course unacceptable. To pursue stable reward, a lot of miners join in the mining pools which are gatherings of individual computational powers [16]. Miners in the same mining pool work together on the proof of work protocol, thus a pool can obtain reward much more frequently compared to mining individually. After obtaining reward, the mining pool will distribute the reward among miners in this pool based on the computational power of each miner. For a solo miner, attending a mining pool cannot increase the expected reward, but the variance is improved significantly. Nowadays, there are more than 12 Bitcoin mining pools and more than 90% of Bitcoin mining is done by those pools(^2).

However, it is shown in [7] that the permissionless mining pools have strong incentives to launch the pool block withholding attacks (PBW attacks) on each other: one strategic pool manager sends some of her miners to other pools and these miners pretend to work on the puzzles but actually do nothing. In Bitcoin or any decentralized system, the pool managers are not able to recognize such


  1. https://wikipedia.org/wiki/Bitcoin
  2. Bitcoin hashrate distribution. https://blockchain.info/pools. Accessed June 14, 2019.

malicious miners, thus these miners can still obtain the reward from mining pool proportional to their computing powers. Eyal [7] proved that although infiltrating the other pools consumes some of her computational power, the reward from infiltrating miners may still increase the pool’s total utility. PBW attack is also demonstrated in [13] with a different reward function.

Eyal [7] modeled the above scenario as pool block withholding games (PBW games) and showed that with any number of pools, no-pool-attacks is not a Nash equilibrium. When there are two pools, the situation faced by the two managers is similar to prisoner’s dilemma, which is called miner’s dilemma: in an equilibrium, both managers launch the PBW attack and accordingly earn less rewards compared to when they mine honestly.

Alkalay-Houlihan and Shah [1] further studied the miner’s dilemma between two strategic mining pools and obtained the bound of the social loss due to noncooperation, i.e., price of anarchy. They showed that the pure Nash equilibrium always exists, and the pure price of anarchy is at most 3 in this game. They also conjectured the tight bound should be 2 and demonstrated this in some special cases.

It is disappointing that launching PBW attacks to be always profitable for the pools, and would dramatically decrease the social welfare. Such bad news is derived from the inaction of the malicious Bitcoin pools. Actually, in both [7] and [1], the authors assume that all the rewards are proportionally distributed to the miners. Accordingly, a natural question to ask would be

\textit{Is there any other approach to reallocate the rewards such that PBW attacks can be avoided or discouraged.}

1.1 Main Contributions

Briefly, we resolve the above question affirmatively by refining the PBW games and show that for any number of mining pools, no-pool-attacks is always a Nash equilibrium under a proper way to award the successful miner. Moreover, with only two minority pools participating, no-pool-attacks is the unique Nash equilibrium. We summarize our contributions as follows.

– We refine the PBW game by allowing all pool managers to deduct (different) percentages of their rewards before they distribute the reward to the miners, where the deducted reward is used to award the successful miner. We call the refined model DPBW game. We believe such a new reward distribution rule in DPBW games is more realistic than the proportional way, as, intuitively, the former provides enough motivation for the miners to work hard on mining. Moreover, it should also be careful that the manager cannot deduct too much from the total reward since the more they deduct, the less incentives are left to the miners to join the pools.

– We prove that for arbitrary number of pools, no-pool-attacks is always a Nash equilibrium of DPBW games with reasonable deductions. Thus the price of stability (PoS) of DPBW games is 1. This is intriguing for the situations in which there is some authority that can influence the managers a bit and help them converge to the optimal solution that is also stable.

– Particularly, for two minority pools participating, the unique Nash equilibrium of DPBW games with reasonable deductions is no-pool-attacks. This result is more exciting for the special case: even without such authorities, every manager is automatically willing to not launch PBW attacks.

Thus in the present work, we improve the negative results proved in [7] and [1] by designing a new way to allocate the reward to the miners and award the successful miner such that all the computing power will be used to mining.

1.2 Related Works

Recent years have seen a number of studies on security issues of blockchain, such as the selfish mining attack [8], the Eclipse attack [10] and the distributed denial-of-service attack [11]. A closely relevant attack to our study is the Block Withholding Attack (BW attack) which is first defined by Rosenfeld in [16]. In this attack, a miner who has found a legal block chooses not to submit it to the mining pool immediately but rather delays to submit or even directly abandons it. Many follow-up studies focus on the simulation and countermeasures of BW attacks.

To resist BW attacks, Schrijvers et al. [17] designed a new incentive compatible reward function and showed proportional mining rewards are not incentive compatible. It was shown in [4] that by giving extra reward to a miner who actually finds the winning block on behalf of the pool, it is possible to discourage BW attacks. Tosh et al. [18] modeled the BW attacks in a blockchain cloud and demonstrates that attacker’s access to extra computational power could disrupt the honest mining operation. Very recently, Wu et al. [19] constructed a generalized model where two participants can choose to either cooperate with each other or employ a BW attack. They showed that increasing the information asymmetry by utilizing information conceal mechanisms could lower the occurrence of BW attacks. More studies in this line can be found in [3,14,9].

The Pool Block Withholding Attack (PBW attack) is a variation of BW attacks, the difference is that the PBW attack is launched by a pool manager rather than a miner. Eyal [7] describes PBW attacks from the perspective of game theory in which the players are pool managers and the strategy of each manager is allocating some powers to launch the block withholding attack. Eyal also proves that with any number of pools, all pools mining honestly and not attacking each other is never a Nash equilibrium. Subsequently, [1] follows Eyal’s study on the case of two pools attacking each other. They show that this game always admits a pure Nash equilibrium, and its pure price of anarchy is at most 3.

Finally, there are also many other game theoretic studies about Bitcoin and other crypto currencies. For example, [2] studies how to incentivize the users to propagate each transaction in a tree network; [5] shows that with only transaction fees (and negligible block rewards), the miners have strong incentive to fork a block and generate a greater reward; [12] views Bitcoin from a cooperative game theoretic perspective and show under high transaction loads, it is difficult for pool managers to distribute rewards in a stable way; [6] adopts an axiomatic approach to investigate how the reward should be distributed among the miners.

2 Model

In this section, we formally define the Discouraging pool block withholding games (DPBW games). Similar with [7] and [1], it is assumed that each miner exactly joins one pool and is totally operated by that pool’s manager. Thus the players in a DPBW game are the managers of ( n ) mining pools, denoted by ( N ). Let ( m_i \in \mathbb{R}^+ ) be the mining power of manager ( i \in N ). Assume ( m ) is the total mining power in the worldwide Bitcoin system and ( m \geq \sum_{i \in N} m_i ). By ( m > \sum_{i \in N} m_i ), we mean there are extra mining power outside of the studied pools ( N ), due to any solo miners or inaccessible mining pools.

In a DPBW game, each player ( i ) might only allocate ( \alpha_i ) fraction of the total reward to her miners proportionally to their mining power and award the left to the successful miner. She might also use a fraction of its mining power to infiltrate another pool ( j ). Such mining power does not actually work for pool ( j ), but gets a fraction of the total reward from ( j ). It is assumed that the mining powers are continuous and can be arbitrarily divided. In this work, we study how to select these ( \alpha )’s for the players so that all of them do not want to infiltrate others. Thus player ( i )’s strategy space is all possible ( x_i = (x_{ij}){j \in N \setminus {i}} ) such that ( \sum{j=1}^{n} x_{ij} \leq m_i ) and ( x_{ij} \geq 0 ) for all ( j \in N \setminus {i} ). Each ( x_{ij} ) with ( j \neq i ) represents the amount of mining power that ( i ) wants to infiltrate pool ( j ). Denote by ( x = (x_1, \cdots, x_n) ) a full strategy profile.

To make us focus on the pool block withholding behaviors, we assume the reward for each block is fixed, thus the selection of the transactions does not matter. Given a strategy profile ( x ), the players’ utilities are defined as follows: Assume each player ( i ) first gets a total reward of ( r_i(x) ) by mining and infiltrating other pools. Then she deducts ((1 – \alpha_i)) fraction from ( r_i(x) ) used for award and obviously only her honest miners can get this. The remaining reward ( \alpha_i r_i(x) ) is proportionally allocated to all her miners including both her honest and the infiltrated ones from the other pools. Thus the pool’s true utility is the total reward allocated to her honest miners.

More precisely, each player ( i )’s reward ( r_i(x) ) consists of two parts: direct reward and infiltrating reward. Since every player ( i ) only uses ( m_i – \sum_{l \neq i} x_{il} ) mining power for honest mining, player ( i )’s direct reward is proportional to the fraction of the honest mining power contributed by her pool, denoted by

[ DR_i(x) = \frac{m_i – \sum_{l \neq i} x_{il}}{m – \sum_{j=1}^{n} \sum_{l \neq j} x_{jl}}. ]

since the pools cannot distinguish infiltrating miners from honest miners, player (i)’s infiltrating reward from every other pool (j) is proportional to the fraction of her infiltrating mining power to (j), denoted by [ IR_i(x) = \sum_{j \in N \setminus {i}} \frac{\alpha_j \cdot r_j(x) \cdot x_i}{m_j + \sum_{l \neq j} x_{lj}}. ] Note that both the direct reward and the infiltrating reward will be allocated to all miners. Accordingly, player (i)’s total reward is [ r_i(x) = DR_i(x) + IR_i(x), ] and her utility is [ U_i(x) = (1 – \alpha_i + \frac{m_i \alpha_i}{m_i + \sum_{j \neq i} x_{ji}}) r_i(x). ] (1)

For any vector (v = (v_1, \ldots, v_n)) and a particular (1 \leq i \leq n), denote by (v_{-i}) the resulting vector of (v) when element (v_i) is omitted. A strategy profile (x) is called a Nash equilibrium if for every player (i) and every possible strategy (x’_i), [ U_i(x) \geq U_i(x’i, x{-i}). ]

In a non-cooperative game, the price of anarchy (PoA) is defined as the ratio between the optimal social welfare and the worst social welfare of any possible Nash equilibria; while the price of stability (PoS) is defined as the ratio between the optimal social welfare and the best social welfare of any possible Nash equilibria. In DPBW games, the social welfare is defined as the total mining power that is used to honest mining.

3 The PoS of DPBW Games is 1

In this section, we prove that for most reasonable (\alpha_i)’s, no-pool-attacks is always a Nash equilibrium of DPBW games for any number of mining pools, that is, the PoS of DPBW games is always 1. Formally,

Theorem 1 Strategy profile ((0, \ldots, 0)) is a Nash equilibrium for any DPBW game if (\forall i, \alpha_i \leq 1 – \frac{m_{\text{max}}}{m}), where (m_{\text{max}} = \max_{i \in N} m_i).

Proof Let (x = (x_1, \ldots, x_n)) be the strategy profile such that (x_{ij} = 0) for any (i \in N) and (j \in N \setminus {i}). Note that, in Equation (1), the coefficient before (r_i(x)) is always 0 as (x_{ji} = 0) for all (j \neq i). Thus, in the rest proof we show [ r_i(x’i, x{-i}) \geq r_i(x)] instead of [ U_i(x’i, x{-i}) \geq U_i(x_i), ] where (x’_i) is any feasible strategy of player (i).

Arbitrarily fix a player (i) and it is easy to see (r_i(x) = \frac{m_i}{m}). To prove the theorem, it suffices to show that for any feasible strategy (x’i) with all (j \neq i) and (x’{ij} \geq 0), [ r_i(x’i, x{-i}) \leq \frac{m_i}{m}. ] Denote ( i )’s total infiltrating mining power under strategy ( \mathbf{x}’i ) by ( x’i = \sum{j \neq i} x’{ij} ).

Then [ r_i(\mathbf{x}’i, \mathbf{x}{-i}) = \frac{m_i – x’i}{m – x’i} + \sum{j \in N \setminus {i}} \frac{\alpha_j x’{ij} m_j}{(m – x’i)(m_j + x’{ij})} ] [ = \frac{m_i}{m} \left( 1 + \frac{x’i}{m_i} \cdot \frac{m_i – m}{m – x’i} + \sum{j \in N \setminus {i}} \frac{m_i}{m_i} \cdot \frac{\alpha_j x’{ij} m_j}{(m – x’i)(m_j + x’{ij})} \right) ] [ = \frac{m_i}{m} \left( 1 + \sum_{j \in N \setminus {i}} \frac{x’{ij}}{m_i} \cdot \frac{m_i – m}{m – x’i} + \sum{j \in N \setminus {i}} \frac{m_i}{m_i} \cdot \frac{\alpha_j x’{ij} m_j}{(m – x’i)(m_j + x’{ij})} \right). ]

Now we claim the following inequality: [ \frac{x’_{ij}}{m_i} \frac{m_i – m}{m – x’i} + \frac{m_i}{m_i} \cdot \frac{\alpha_j x’{ij} m_j}{(m – x’i)(m_j + x’{ij})} \leq 0. ] \hspace{1cm} (2)

Note that Inequality (2) implies ( r_i(\mathbf{x}’i, \mathbf{x}{-i}) \leq \frac{m_i}{m} = r_i(\mathbf{x}) ), which completes the proof of Theorem 1.

The remaining is dedicated to show the correctness of Inequality (2). By rearranging the terms, the lefthand of Inequality (2) equals [ \frac{x’{ij}[(m_i – m)(m_j + x’{ij}) + mm_j \alpha_j]}{m_i(m – x’i)(m_j + x’{ij})}. ]

As the denominator is always positive, we only need to consider the sign of its numerator: [ (m_i – m)(m_j + x’{ij}) + mm_j \alpha_j \leq (m_i – m)m_j + mm_j \alpha_j \ \leq (m_i – m)m_j + mm_j \left(1 – \frac{m{\text{max}}}{m}\right) \ = m_im_j – m_{\text{max}}m_j \leq 0, ]

where the first inequality is because ( x’{ij} \geq 0 ) and the second is because ( \alpha_i \leq 1 – \frac{m{\text{max}}}{m} ).

Remark. Note that ( \frac{m_{\text{max}}}{m} ) fraction of the reward cannot be too large for any decentralized system, thus we believe the requirement of ( \alpha_i \leq 1 – \frac{m_{\text{max}}}{m} ) in Theorem 1 is a reasonable tradeoff between complementing the maintenance of a pool and incentivizing the miners to join it.

4 The Uniqueness of Nash equilibrium for Two-Pool Case

In this section, we study a special case of the DPBW game when only two pools are included, which is exactly the same setting with the previous work [1]. However, as will be proved, in PBWA+ game, it is possible for the pool managers to deduct a small fraction from the reward, so that no-pool-attacks is a unique Nash equilibrium.

Theorem 2 For PBWA+ Game with two players, by setting ( \alpha_1 \leq 1 – \frac{m_2}{m} ), ( \alpha_2 \leq 1 – \frac{m_1}{m} ) and ( m > 3(m_1 + m_2) ), the game has a unique Nash equilibrium where both players do not infiltrate the other pool.

We believe the requirement of ( m > 3(m_1 + m_2) ) in the theorem is reasonable as the statistic website(^3) shows that the largest two pools have roughly a third of the total computational power.

4.1 Notations and Proof Ideas of Theorem 2

Before we prove Theorem 2, we first simplify our notions to ease our representation. Since there are only two players, we simplify our notions as follows. Let ((x_1, x_2)) be a strategy profile, where (x_i \in [0, m_i]) means how much player (i) infiltrates player (j = 3 – i). Thus each player (i)’s direct reward (r_i(x_1, x_2)) is

[ DR_i(x_1, x_2) = \frac{m_i – x_i}{m – x_1 – x_2}; ]

her total reward is

[ r_i(x_1, x_2) = R_i(x_1, x_2) + \frac{\alpha_j r_j(x_1, x_2) x_i}{m_j + x_i} = \frac{m_i – x_i}{m – x_1 – x_2} + \frac{\alpha_j r_j(x_1, x_2) x_i}{m_j + x_i}, ]

and her utility is

[ U_i(x_1, x_2) = (1 – \alpha_i + \frac{m_i \alpha_i}{m_i + x_j}) r_i(x_1, x_2). ]

Here note that in (U_i(x_1, x_2)), the first coefficient (1 – \alpha_i + \frac{m_i \alpha_i}{m_i + x_j}) only depends on (x_j), thus when we analyze player (i)’s utility gain by unilateral deviation, without loss of generality, we just ignore this coefficient and only consider her total reward (r_i(x_1, x_2)).

Solving the reward system of equations in (r_i(x_1, x_2)) for (i = 1, 2), we get the closed forms for the reward functions:

[ r_1(x_1, x_2) = \frac{(m_1 + x_2)[(m_1 – x_1)(m_2 + x_1) + \alpha_2 x_1(m_2 – x_2)]}{(m – x_1 – x_2)[(m_1 + x_2)(m_2 + x_1) – \alpha_1 \alpha_2 x_1 x_2]} ]

and

[ r_2(x_1, x_2) = \frac{(m_2 + x_1)[(m_2 – x_2)(m_1 + x_2) + \alpha_1 x_2(m_1 – x_1)]}{(m – x_2 – x_1)[(m_2 + x_1)(m_1 + x_2) – \alpha_1 \alpha_2 x_1 x_2]}. ]


(^3)https://btc.com/stats/pool?pool_mode=month

Let ( x_i + \Delta x_i ) be player ( i )’s deviation, where ( \Delta x_i \in [-x_i, m_i – x_i] ). Denote by [ f_1(\Delta x_1) = r_1(x_1 + \Delta x_1, x_2) – r_1(x_1, x_2) ] and [ f_2(\Delta x_2) = r_2(x_1, x_2 + \Delta x_2) – r_2(x_1, x_2) ] the reward gain by unilateral deviation. By definition, the necessary and sufficient condition for strategy profile ((x_1, x_2)) to be a Nash equilibrium is for any ( \Delta x_i \in [-x_i, m_i – x_i] ), [ \begin{cases} f_1(\Delta x_1) \leq 0 \ f_2(\Delta x_2) \leq 0. \end{cases} ] Thus the problem of finding all possible Nash equilibria becomes finding all such ((x_1, x_2)) strategy profiles.

Since players 1 and 2 are symmetric, in the following, without loss of generality, we often use player 1 for illustration. The formula of ( f_1(\Delta x_1) ) can be simplified as a quadratic function of ( \Delta x_1 ): [ f_1(\Delta x_1) = A_1(x_1, x_2)(\Delta x_1)^2 + B_1(x_1, x_2)\Delta x_1 + C_1(x_1, x_2), ] where the formulas of ( A_1(x_1, x_2), B_1(x_1, x_2) ) and ( C_1(x_1, x_2) ) are shown in Table 1.

[ \begin{align*} A_1(x_1, x_2) &= -(m_1 + x_2) + r_1(x_1, x_2)(m_1 + x_2 – \alpha_1 \alpha_2 x_2) \ B_1(x_1, x_2) &= (m_1 + x_2)(m_1 – m_2 – 2x_1 + \alpha_2(m_2 – x_2)) \ &\quad + \frac{(m_1 + m_2)[(m_1 – x_1)(m_2 + x_1) + \alpha_2 x_1(m_2 – x_2)]}{m – x_1 – x_2} \ &\quad – \frac{(m_2 – x_1)(m_2 + x_1)(m_2 + x_1 + \alpha_2 x_2(m_2 – x_2))}{m_1 + x_2(m_2 + x_1) – \alpha_1 \alpha_2 x_1 x_2} \ C_1(x_1, x_2) &= 0. \end{align*} ]

Note that since ( r_1(x_1, x_2) \leq 1 ), ( A_1(x_1, x_2) \leq 0 ). Thus, for any strategy profile ((x_1, x_2)), in order to make ( f_1(\Delta x_1) \leq 0 ) for any ( \Delta x_1 \in [-x_1, m_1 – x_1] ), there are three possible cases, as shown in Figures 1a, 1c and 1b:

Case 1. ( x_1 = 0 ) and ( B_1(x_1, x_2) \leq 0 ); Case 2. ( 0 < x_1 < m_1 ) and ( B_1(x_1, x_2) = 0 ); and Case 3. ( x_1 = m_1 ) and ( B_1(x_1, x_2) \geq 0 ).

Symmetrically, we have all the corresponding definitions of ( f_2(\Delta x_2), A_2(x_1, x_2), B_2(x_1, x_2) ) and ( C_2(x_1, x_2) ) for player 2.

Therefore, combining with players 1’s and 2’s strategies, we have nine kinds of possible Nash equilibria. By Theorem 1, we already know that ((0, 0)) is a Nash equilibrium. In the following, we will show that ((0, 0)) is actually the only possible Nash equilibrium for among the nine situations. Thus in any DPBW game with two players, both the PoA and the PoS are 1.

Fig. 1: Three possible strategies for player 1

4.2 Proof of Theorem 2

In the remaining of this section, we prove Theorem 2 by showing that all the eight situations, except (0, 0), are not possible to be Nash equilibria.

4.3 Case 2 + Case 2: $B_1(x_1, x_2) = B_2(x_1, x_2) = 0$

To simplify our analysis, we first multiply $B_1(x_1, x_2)$ by

$$\frac{(m – x_1 – x_2)[(m_1 + m_2)(m_2 + m_1) – \alpha_1\alpha_2x_1x_2]}{m_1 + x_2}$$

to eliminate the denominator, denoting the resulting polynomial by $Q_1(x_1, x_2)$. By rearranging every monomials, $Q_1(x_1, x_2)$ can be denoted as a quadratic function of $x_1$:

$$Q_1(x_1, x_2) = a_1(x_2)x_1^2 + b_1(x_2)x_1 + c_1(x_2),$$

where $a_1(x_2)$, $b_1(x_2)$ and $c_1(x_2)$ are independent of $x_1$ and their formulas are shown in Table 2. Similarly, we also have all the corresponding definitions of $Q_2(x_1, x_2)$, $a_2(x_1)$, $b_2(x_1)$, and $c_2(x_1)$ for player 2.

To show the impossibility of Case 1, it suffices to prove the following lemma.

Lemma 1 For any strategy profile $(x_1, x_2)$, $Q_1(x_1, x_2)$ and $Q_2(x_1, x_2)$ cannot be 0 simultaneously.

Before we prove Lemma 1, we first prove the following claims (Claim 4.3, Claim 4.3 and Claim 4.3).

Claim $c_1(x_2) \leq 0$ for any $x_2 \in [0, m_2]$.


Useful information for enthusiasts:

Contact me via Telegram: @ExploitDarlenePRO