
Due to the open-source nature of the blockchain ecosystem, it is common for new blockchains to fork or partially reuse the code of classic blockchains. For example, the popular Dogecoin, Litecoin, Binance BSC, and Polygon are all variants of Bitcoin/Ethereum. These “forked” blockchains could thus encounter similar vulnerabilities that are propagated from Bitcoin/Ethereum during forking or subsequent commit fetching. In this paper, we conduct a systematic study of detecting and investigating the propagated vulnerabilities in forked blockchain projects. To facilitate this study, we propose BlockScope, a novel tool that can effectively and efficiently detect multiple types of cloned vulnerabilities given an input of existing Bitcoin/Ethereum security patches. Specifically, BlockScope adopts similarity-based code matching and designs a new way of calculating code similarity to cover all the syntax-wide variant (i.e., Type-1, Type-2, and Type-3) clones. Moreover, BlockScope automatically extracts and leverages the contexts of patch code to narrow down the search scope and locate only potentially relevant code for comparison.
Our evaluation shows that BlockScope achieves good precision and high recall, both at 91.8% (1.8 times the recall of the state-of-the-art ReDeBug, with comparable precision). BlockScope allows us to discover 101 previously unknown vulnerabilities in 13 out of the 16 forked projects of Bitcoin and Ethereum, including 16 from Dogecoin, 6 from Litecoin, 1 from Binance BSC, and 4 from Optimism. We have reported all the vulnerabilities to their developers; 40 of them have been patched or accepted, 66 were acknowledged or are pending, and only 4 were rejected. We further investigate the propagation and patching processes of the discovered vulnerabilities, and reveal three types of vulnerability propagation from source to forked projects, as well as the long delay (mostly over 200 days) for releasing patches in Bitcoin forks (vs. \textasciitilde100 days for Ethereum forks).
I. INTRODUCTION
Blockchain \cite{67} and DeFi (Decentralized Finance) \cite{79} have been emerging in recent years. A good development in the blockchain ecosystem is that many projects are open-source. This is particularly true for public blockchains like Bitcoin and Ethereum. As a result, new blockchains can fork or partially reuse the code of classic blockchains to speed up development. Notably, Bitcoin is the one with the most forked projects: the popular Dogecoin, Litecoin, Dash, Zcash, and Bitcoin Cash/SV are all variants of Bitcoin. In recent years, Ethereum was also forked by a number of EVM (Ethereum Virtual Machine)-compatible chains, such as Binance Smart Chain (BSC), Polygon, Avalanche Contract Chain, and Optimism (Ethereum’s Layer-2 rollup network).
However, “forked” blockchains could encounter similar vulnerabilities that appeared in the code of Bitcoin and Ethereum. Specifically, a vulnerability could be propagated from Bitcoin/Ethereum to the forked projects during the initial fork or subsequently when updated commits are fetched from Bitcoin/Ethereum. In this paper, we aim to systematically detect cloned vulnerabilities in forked blockchain projects and investigate how they are propagated and patched.
To facilitate this study and future analysis, we propose BlockScope, a novel tool that can not only automatically detect vulnerable clones but also pinpoint the cases already fixed and their patching process information. To achieve effective and efficient detection of all the syntax-wide cloned vulnerabilities (i.e., Type-1, Type-2, and Type-3 clones, as defined in Sec. II-C), BlockScope has two unique designs compared with typical code clone detection tools, e.g., \cite{43}, \cite{47}, \cite{50}, \cite{57}, \cite{68}, \cite{82}. First, we adopt similarity-based code matching, instead of the hash-based exact matching in ReDeBug \cite{43}, VUDDY \cite{50}, and MVP \cite{82}, so that BlockScope is more tolerant of code lines with no exact “abstracted” hashes. Moreover, we design a new way of calculating code similarity to better handle code fragments with inserted/deleted/reordered code lines. According to our evaluation against the state-of-the-art ReDeBug tool, our new design greatly reduces false negatives while only slightly increasing false positives for our problem. Second, BlockScope automatically extracts and leverages patch code contexts to locate only potentially relevant code for comparison. This not only dramatically improves the running performance for large projects, e.g., 15.4 times faster than ReDeBug in analyzing Ethereum’s forked projects with more lines of code (LOC), but also enhances the detection precision because the context similarity is also considered.
To evaluate BlockScope, we collect a dataset of 38 security patches: 32 of them are taken directly from Bitcoin’s repository because there were only four CVEs in the past five years, and the remaining six are CVEs of Ethereum reported in the last three years. With this input, we apply BlockScope and ReDeBug to test the 11 most popular forked projects of Bitcoin and 5 of Ethereum (identified from nearly the top 100 cryptocurrencies), with 4.2M C/C++ LOC and 3.5M Go LOC, respectively. The evaluation shows that BlockScope detects 101 true vulnerabilities in 13 of the forked projects (the other three projects, Qtum, Avalanche, and Polygon, do not contain any of the tested vulnerabilities), whereas ReDeBug detects only 57 vulnerabilities in 11 forked projects. By performing a thorough code review of all the raw detection results, we find that BlockScope achieves good precision and high recall, both at 91.8%, whereas ReDeBug’s recall is only 51.8% despite its precision at 95%. Among the 101 vulnerabilities automatically detected by BlockScope, we are able to identify serious ones from the top blockchains like Dogecoin, Litecoin, Bitcoin SV, Binance BSC, and Optimism. This demonstrates the real-world impact of our work.\footnote{Binance acknowledged our vulnerability report with a bug bounty reward.}
We further investigate how the discovered vulnerabilities\footnote{Besides 101 automatically detected cases, we also analyzed 9 that were false negatives in BlockScope but manually identified during the evaluation.} are propagated from Bitcoin/Ethereum to their forked projects and understand the patching processes of the 138 cases that were already fixed in forked projects before our detection. Specifically, we reveal three types of vulnerability propagation from Bitcoin/Ethereum to their forked projects: cases directly forked in the beginning, cases fetched from vulnerable commits, and cases infected with no explicitly vulnerable commits. Besides vulnerability propagation, we additionally identify three other types of propagation that caused false positives and negatives in BlockScope; details are in Sec. V-B. As for patch delays, we find that only DigiByte, among the six forked projects of Bitcoin with enough patched cases, can keep up with Bitcoin’s patch release schedule. The patch delays for the other five are typically long, mostly over 200 days. Compared with Bitcoin’s forks, the result for Ethereum’s forked projects is relatively acceptable, with half of the patches released within 100 days.
Contributions. To sum up, we make the following major contributions in this paper:
- (Methodology) We propose novel patch-based clone detection for vulnerable code clones in forked projects, in which we design (i) a context-based search with similarity measurement to efficiently locate candidate code clones and (ii) a new way of calculating the similarity between two code fragments that is immune to Type-1/2/3 clones.
- (Detection) We apply this methodology to detect 101 previously unknown vulnerabilities in the forked projects of Bitcoin and Ethereum with high precision and recall.
- (Investigation) We further conduct a deep investigation of the vulnerability propagation and patching processes of the discovered vulnerabilities, and reveal new findings.
Ethics. To conduct this research ethically, and as one contribution of this paper, we have spent significant effort reporting all the 110 vulnerabilities (including nine false negatives manually identified during the evaluation). The details are available in Sec. V-D and this GitHub repository, \url{https://github.com/VPRLab/BlkVulnReport}.
Roadmap. The rest of this paper is organized as follows. After explaining different blockchain projects and code clone types in Sec. II, we first propose the BlockScope tool in Sec. III to effectively detect the propagated vulnerabilities in the forked blockchains. We then evaluate the accuracy and performance of BlockScope and leverage it to discover previously unknown vulnerabilities in Sec. IV. We further analyze how the discovered vulnerabilities are propagated from Bitcoin and Ethereum to the forked projects and understand their patching processes in Sec. V. We then discuss some insights and implications in Sec. VI. Lastly, Sec. VII reviews the related work and Sec. VIII concludes the paper.
II. BACKGROUND
In this section, we first introduce the background of Bitcoin, Ethereum, and their popular forked projects in Sec. II-A and Sec. II-B and then provide the definition of different code clone types in Sec. II-C.
A. Bitcoin and its Forked Projects
Bitcoin (BTC) \cite{61} is the first cryptocurrency that introduced blockchain technology to the world. Bitcoin leverages blockchain as a distributed ledger to guarantee the consensus between different peers. Currently, Bitcoin is, without doubt, the dominant cryptocurrency, with a market capitalization that accounts for around 40% of the whole market. Since Bitcoin is open-source, it has nourished many blockchain projects. Specifically, among the top 100 cryptocurrencies on CoinMarketCap \cite{15} as of 7 September 2021, we identified 11 projects that directly fork or partially reuse the code of Bitcoin. We list them in Table Ia and refer to them as Bitcoin’s forked projects in this paper.
\begin{table}[h]
\centering
\begin{tabular}{|l|l|l|l|l|}
\hline
\textbf{Name} & \textbf{Code} & \textbf{Market Cap} & \textbf{Repository} & \textbf{Star} \\
\hline
Bitcoin & BTC & \$749.70B & bitcoin/bitcoin & 60.3K \\
Dogecoin & DOGE & \$42.55B & dogecoin/dogecoin & 13.6K \\
Bitcoin Cash & BCH & \$12.02B & Bitcoin-ABC/bitcoin-abc & 1.1K \\
Litecoin & LTC & \$11.88B & litecoin-project/litecoin & 4K \\
Bitcoin SV & BSV & \$3.24B & bitcoin-sv/bitcoin-sv & 520 \\
Dash & DASH & \$1.79B & dashpay/dash & 1.4K \\
Zcash & ZEC & \$1.64B & zcash/zcash & 4.5K \\
Bitcoin Gold & BTG & \$1.04B & BTCGPUB/TBTCP & 611 \\
Horizen & ZEN & \$935.27M & HorizenOfficial/zen & 202 \\
Qtum & QTUM & \$923.88M & qtumproject/qtum & 1.1K \\
DigiByte & DGB & \$686.91M & dgbbyte/dgbbyte & 361 \\
Ravencoin & RVN & \$693.34M & RavenProject/Ravencoin & 932 \\
\hline
\end{tabular}
\caption{The basic information of Bitcoin, Ethereum, and their popular forked projects.}
\end{table}
\begin{table}[h]
\centering
\begin{tabular}{|l|l|l|l|l|l|}
\hline
\textbf{\#} & \textbf{Name} & \textbf{Code} & \textbf{Market Cap} & \textbf{Repository} & \textbf{Star} \\
\hline
1 & Ethereum & ETH & \$229.87B & ethereum/go-ethereum & 37.7K \\
5 & Binance & BNB & \$50.69B & bnb-chain/bsc & 1.6K \\
14 & Avalanche & AVAX & \$7.65B & ava-labs/subnet-evm & 1.6K \\
17 & Polygon & MATIC & \$5.15B & maticnetwork/bor & 400 \\
78 & Celo & CELO & \$604.02M & celo-org/celo-blockchain & 382 \\
199 & Optimism & OP & \$263.36M & ethereum-optimism/optimism & 1.2K \\
\hline
\end{tabular}
\caption{Ethereum and its forked projects (as of 6 June 2022).}
\end{table}
Most forked projects forked only the Bitcoin code, whereas Bitcoin Cash (BCH), Bitcoin SV (BSV), and Bitcoin Gold (BTG) also forked Bitcoin’s blockchain, i.e., copying its transaction history, as the basis for their own blockchain\footnote{Their code is available in their GitHub repositories.}. They are known as the “hard forks” of Bitcoin, as each of them creates a permanent fork of the original Bitcoin’s blockchain. We present the relationship between Bitcoin and these three projects in Fig. 1. As we can see, Bitcoin Cash is the earliest fork, which aims to reduce the transaction fee and improve the transaction speed of the original Bitcoin. To this end, it extends the maximum block size to 32MB, while the original Bitcoin’s block size limit is 1MB. Bitcoin SV further extends this limit to 128MB, which leads to another hard fork. Bitcoin Gold, on the other hand, claims to solve the original Bitcoin’s monopolized mining problem. Specifically, its developers hope that enabling mining on commonly available GPUs instead of specialized ASICs can democratize and decentralize mining.
Litecoin gets its name from “the light version of Bitcoin”. Its goal is to provide faster transactions than Bitcoin. Notably, instead of using Bitcoin’s SHA-256, Litecoin adopts Scrypt [23] as the hash function, which offers a less compute-intensive but more memory-intensive mining process [33]. Dogecoin also leverages Scrypt as the hash function. Indeed, it copies both Bitcoin’s and Litecoin’s code. Although Dogecoin reached a market capitalization of over 40 billion USD, it was initially created as a meme cryptocurrency with an unlimited total supply [16]. DigiByte is another fork of Litecoin’s code. Besides SHA-256 and Scrypt, it can work with three more mining algorithms [25].
Dash is not only a cryptocurrency but also a decentralized autonomous organization run by a subset of its users called “masternodes”. Specifically, anyone with 1,000 Dash can become a masternode in the Dash network and share the block reward. Besides the standard node functions, the masternodes can vote on proposals to improve the ecosystem and provide two additional kinds of transactions, i.e., “InstantSend” and “PrivateSend” for instant transactions and private transactions, respectively [19].
Zcash and Horizen are designed to enhance the privacy for their users. As the original Bitcoin is pseudo-anonymous, it is possible to decipher the patterns and connections involved, which may expose all information related to the sender and the receiver [62]. To tackle this problem, Zcash applies Zero-Knowledge proof algorithms (called zk-SNARKs) to “shield” the transactions so that it will not disclose the information about the coin holders. Similarly, Horizen (formerly known as ZenCash) is a derivative of Zcash. On top of the zk-SNARKs system, Horizen adopts a different funding model, which shares the block reward among miners, developers, and secure/super node operators, while Zcash just rewards miners and developers [62].
Qtum is a hybrid blockchain that combines the characteristics of Bitcoin and Ethereum. It introduces an Account Abstraction Layer to integrate Bitcoin’s Unspent Transaction Output model with the Ethereum Virtual Machine so that smart contracts can operate [74]. Besides, Qtum adopts the Proof-of-Stake (PoS) consensus mechanism instead of Bitcoin’s Proof-of-Work (PoW) to simplify the mining process, since PoW is resource-intensive, i.e., it consumes enormous amounts of electricity on mining coins [37].
Ravencoin is unique in that it was designed for users to tokenize assets on-chain and transfer ownership via blockchain transactions [34]. Such assets can be physical or digital, including gold, in-game items, copyrights, etc. [71].
B. Ethereum and its Forked Projects
Ethereum [80] is the first blockchain system capable of constructing Turing-complete smart contracts, which contain a set of pre-defined rules and regulations for self-execution. Ether (ETH) is the native cryptocurrency for maintaining the operations on Ethereum, and is the second largest cryptocurrency with a market capitalization of around 230 billion USD as of June 2022. As an open-source project, Ethereum has also nourished many blockchain projects. Specifically, we analyzed all the projects listed on Blockscan [10] and selected five of the most popular projects that directly fork or partially reuse the code of Ethereum. Table Ib presents the basic information of these forked projects as of 6 June 2022.
Binance is the largest cryptocurrency exchange in the world. As of 27 July 2022, its 24-hour trading volume reaches 11.7 billion USD [13]. Originally, Binance developed Binance Chain to provide a marketplace for trading cryptocurrency in a decentralized manner, with BNB being the native token. However, as Binance Chain is not EVM-compatible, users cannot develop decentralized applications (DApps) using smart contracts [11]. Binance initiated Binance Smart Chain (BSC) with EVM compatibility to solve this problem. On February 15, 2022, Binance Chain and Binance Smart Chain united into BNB Chain [18]. Currently, BNB Chain holds around 3.4 million transactions daily, with 2.0 million active wallets [14].
Avalanche aims to solve Ethereum’s issues regarding transaction fee, scalability, and programmability, by leveraging a multi-chain approach [40]. Specifically, Avalanche combines three separate blockchain networks, i.e., X-Chain: for issuing digital assets, C-Chain: for converting Ethereum’s DApps to Avalanche, and P-Chain: for validating the states of subnets. Celo is also EVM-compatible. Notably, it provides a client designed for mobile phone users. Moreover, while the transaction fee is paid with the native asset (ETH) on Ethereum, Celo allows users to pay transaction fees with the native asset (CELO) and stable coins (cUSD and cEUR) [35].
Polygon and Optimism are Ethereum’s layer-2 networks, which also target Ethereum’s scalability and transaction fee issues. Layer-2 solutions refer to infrastructures or simple protocols built on top of the Ethereum main chain [21], i.e., layer-1. Typically, they handle off-chain transactions and send only compact data to layer-1. Polygon is technically a sidechain of Ethereum, as it uses its own consensus algorithms and runs in parallel with the main chain. In contrast to sidechains, Optimism uses Optimistic Rollups [20] to interact with the main chain and uses smart contracts that reside within Ethereum [24].
C. Definition of Code Clone Types
Due to the nature of open-source projects, it is common for projects to reuse parts of code from others. However, vulnerabilities are often reintroduced through casual code reuse, namely code clones. While code clone detection has been widely studied for well-known open-source projects, e.g., the Linux kernel, the detection of cloned vulnerabilities in forked blockchain projects is much less explored.
In this study, it is essential to analyze the cloned code among the forked blockchain projects. Therefore, we adopt the type definitions of code clones from [59] as follows:
- Type-1 clones refer to two identical code fragments with variations in whitespaces, layouts, and comments.
- Type-2 clones include Type-1 clones and extend the variations to identifiers, literals, and types, e.g., variable renaming.
- Type-3 clones further extend these variations to syntactically similar code with inserted, deleted, or updated statements.
- Type-4 clones refer to semantically equivalent code fragments but syntactically different, which is out of the scope of this paper.
In this paper, we focus on the detection of Type-1, Type-2, and Type-3 code clones. Detecting Type-4 code clones requires code semantic learning or understanding, which is out of the scope of typical clone detection tools including BlockScope.
III. BLOCKSCOPE
A. Design Choices and System Overview
To detect the propagated vulnerabilities from the existing security patches of Bitcoin/Ethereum, we design BlockScope as a patch-based code clone detection tool. This makes BlockScope, by nature, more similar to security-oriented clone detection tools (e.g., ReDeBug [43], VUDDY [50], MVP [82], and VGraph [29]) than to the traditional clone detection tools (e.g., COPE [47], CPMiner [57], DECKARD [44], and SourcererCC [68]) that do not differentiate vulnerable and patched code inputs. Moreover, since we aim to test all different blockchain projects, we design BlockScope to be language-agnostic, similar to ReDeBug. As a result, we do not perform “program analysis-alike” preprocessing, such as variable/type/function abstraction in VUDDY, program slicing in MVP, and code property graph [83] in VGraph, before the similarity measurement between source and target code.
Besides the choices above, BlockScope offers two unique designs that are also the major novelty of our methodology:
- Leveraging patch code contexts to search and locate only potentially relevant code. Since our detection targets are the propagated vulnerabilities in the forked projects, it is reasonable to assume that they have similar contexts as the original patch code in the source repositories. BlockScope thus leverages the extracted patch code contexts to search for potentially relevant code in the target repositories and employs code similarity to finalize the contexts of candidate code clones. This not only helps BlockScope avoid the whole-repository analysis as in typical code clone detection tools but also improves the precision because the context similarity is also being considered.
- Adopting similarity-based code match for being more tolerant to variant code clones. To cover all the syntax-wide Type-1, Type-2, and Type-3 clones, we adopt similarity-based code match, instead of the hash-based exact code match in ReDeBug [43], VUDDY [50], and MVP [82]. This allows BlockScope to be more tolerant to the code lines with no exact “abstracted” hashes (i.e., Type-2 clones). Moreover, we design a new way of calculating code similarity to better handle the code fragments with inserted/deleted/reordered code lines (i.e., Type-3 clones).
Fig. 2 presents the overall workflow of BlockScope in five major steps. Firstly, Sec. III-B describes how the Extractor component (or Extractor\footnote{We describe different BlockScope components using their names, e.g., Extractor, hereafter.}) extracts the code contexts from patches in the source repositories. Secondly, in Sec. III-C, Searcher leverages the extracted patch contexts to search for candidate contexts in the target repositories. Thirdly, Fetcher in Sec. III-D retrieves the patch and candidate code hunks in the source and target repositories, respectively. Fourthly, Comparator in Sec. III-E employs a new similarity-based code matching technique to determine the propagated vulnerabilities from Fetcher’s outputs. Lastly, for the vulnerabilities already patched, Calculator in Sec. III-F measures their patch delays in the target repositories.
B. Extracting Patch Contexts from the Source Repositories
Given a security patch from the source project or code repository (e.g., Bitcoin/Ethereum), BlockScope first extracts its code context. In this paper, we provide an Extractor component to automatically extract the contexts of patch code and use its output for system evaluation. In practice, BlockScope also supports manually crafted code contexts from security experts for better accuracy. To distinguish the context of patch code from that of target code, we call the former “patch context” and the latter “candidate context”, as shown in Fig. 2.
in Sec. III-C. As a result, we do not require each extracted keyword to be precise because as long as one of the context keywords can find the correct candidate context, context similarity measurement (in Sec. III-C) will automatically exclude the search results of other incorrect context keywords.
We use the left patch code of Fig. 3 to illustrate the process of extracting context keywords. After normalizing and tokenizing each patch code line, Extractor uses the following heuristics to automatically recognize at most one context keyword per code line. Specifically, we consider the tokens with both lower and upper case letters (including some special characters like “.”) and select the longest one as the most important variable or function name of one code line. In this way, BlockScope automatically selects nine context keywords, as highlighted in red color, from the patch code context in Fig. 3. As mentioned above, we do not require each extracted keyword to be precise, and according to our evaluation in Sec. IV, this simple strategy of automatically extracting context keywords works well for our problem.
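The keyword heuristic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not BlockScope's actual Extractor: the tokenizer regex, the whitespace-only normalization, and the function names `extract_keyword` and `extract_context_keywords` are assumptions introduced here.

```python
import re

# Assumed tokenizer: identifiers that may contain "." (e.g., "state.DoS").
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def extract_keyword(line):
    """Return the longest token with both lower- and upper-case letters, or None."""
    tokens = TOKEN_RE.findall(line.strip())
    mixed = [
        tok for tok in tokens
        if any(ch.islower() for ch in tok) and any(ch.isupper() for ch in tok)
    ]
    # max() keeps the first token on ties, so the result is deterministic.
    return max(mixed, key=len) if mixed else None


def extract_context_keywords(context_lines):
    """Collect at most one keyword per code line of a patch context."""
    keywords = []
    for line in context_lines:
        keyword = extract_keyword(line)
        if keyword is not None:
            keywords.append(keyword)
    return keywords
```

For instance, `extract_keyword("if (!CheckBlock(block, state)) return false;")` yields `CheckBlock`, while a line with no mixed-case token contributes no keyword. Because later steps filter candidates by context similarity, an occasional imprecise keyword is tolerable.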
C. Searching for Candidate Contexts in the Target Repositories
The Searcher component of BlockScope then uses the extracted context keywords to search for candidate contexts in the target repositories. The basic idea is to first search for the key statements in target code (via patch context keywords), then recover the corresponding boundary of each potential code context, and finally determine the candidate contexts via the similarity measurement with the original patch context. To illustrate this context-based search process, we use Bitcoin’s patch of checking corrupted blocks and its vulnerable clone in Dogecoin as a running example. As shown in Fig. 4, the left-hand side is the patch code hunk (commit 0e7c52dc) from Bitcoin, while the right-hand side shows the cloned version in Dogecoin. It also illustrates the following three steps.
- Searching for the key statements. The first step is to find the key statements ($k_s$), i.e., the code statements in the target code that contain the searched context keywords. Specifically, Searcher first leverages \texttt{git grep} to search for all the code statements that contain the patch context keyword(s) in the target repositories, and then finalizes the search result by measuring the similarity between each searched $k_s$ and the original $k_s$. If the measured similarity is higher than the threshold configured in BlockScope, we consider it one potential candidate $k_s$. To minimize misses and avoid causing false negatives in the subsequent steps, this step uses a relatively low threshold (0.25) based on the Normalized Levenshtein metric, i.e., \texttt{strsim()} used in equation (1).
$$p, q = \arg \max_{1 \leq i \leq m,\ 1 \leq j \leq n} \text{strsim}(s_i, s'_{i,j}) \tag{1}$$
Moreover, in the course of implementing the candidate context search, we adopt three automatic optimizations to further improve BlockScope’s context search precision and avoid unnecessary analysis in the subsequent steps. First, BlockScope excludes search results in comments and test code. Second, it excludes search results whose file type differs from the patch’s file type, e.g., the patch in Fig. 3 is a C/C++ source code file, based on which BlockScope excludes C/C++ header files and non-C/C++ source code files from the search result. Third, BlockScope excludes search results with different statement types. For example, since line 5 in Fig. 3 is an assignment statement, any search result that does not match the same statement type will be automatically discarded.
- Determining the boundary of candidate contexts. Once the candidate $k_s$ is identified, the next step of Searcher is to retrieve the code statements surrounding it and determine their boundary. Specifically, we need to expand the one-line candidate $k_s$ into the multi-line candidate context that has the corresponding boundary as the original patch context. To do so, we first fetch the same number of nearby code statements from the target code as that in the patch context, represented as $C_{\text{LINES}}$. For example, in Fig. 3, if we set $C_{\text{LINES}} = 5$, Searcher fetches lines 1 to 5 and lines 7 to 11 for the candidate $\text{UP}$ and $\text{DOWN}$ contexts in Dogecoin, respectively. Then, starting from the $k_s$ (i.e., lines 5 and 9 of Dogecoin), Searcher compares each code statement upwards and downwards with the start statement ($ss$) and end statement ($es$) in the patch context, respectively. It then selects the ones with the highest similarity that also exceed the aforementioned threshold (0.25) as the boundary $ss$ and $es$ in the candidate context, e.g., line 3 and line 5 for Dogecoin’s $\text{UP}$ context.
- Finalizing the candidate contexts via similarity measurement. It is worth noting that $ss$ and $es$ only define the boundary of the candidate context, while the code statements in between remain unchecked. As illustrated in Step 3 of Fig. 3, we thus further check whether the entire candidate context is indeed similar to the patch context via the same multi-line code similarity measurement that will be introduced in Sec. III-E. If the measured similarity between the candidate context $C$ and the patch context $P$ exceeds a threshold, we consider $C$ as the context of a candidate clone for further processing; otherwise, we discard this candidate context. Note that since multiple candidate contexts’ similarity could exceed the threshold, all of these candidate contexts will be further processed.
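The first two search steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not BlockScope's implementation: `strsim` is assumed to be one minus the Levenshtein distance divided by the longer string's length, the target file is passed in as a list of lines instead of being searched with `git grep`, and the helper names `find_key_statements` and `find_boundary` are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def strsim(a, b):
    """Assumed normalized Levenshtein similarity in [0, 1]."""
    a, b = a.strip(), b.strip()
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def find_key_statements(target_lines, keyword, patch_ks, threshold=0.25):
    """Step 1: indices of lines that contain the keyword and resemble the patch key statement."""
    return [
        idx for idx, line in enumerate(target_lines)
        if keyword in line and strsim(line, patch_ks) > threshold
    ]


def find_boundary(target_lines, ks_index, patch_stmt, c_lines, direction, threshold=0.25):
    """Step 2: scan up to c_lines statements from the key statement.

    direction is -1 (upwards, matching the patch start statement) or +1
    (downwards, matching the patch end statement). Returns the index of the
    most similar statement above the threshold, or None if there is none.
    """
    best_idx, best_sim = None, threshold
    for step in range(c_lines):
        idx = ks_index + direction * step
        if not 0 <= idx < len(target_lines):
            break
        sim = strsim(target_lines[idx], patch_stmt)
        if sim > best_sim:
            best_idx, best_sim = idx, sim
    return best_idx
```

The third step would then compare the whole candidate context between the two boundaries against the patch context with the multi-line similarity of Sec. III-E.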
D. Fetching Patch and Candidate Code Hunks from the Source and Target Repositories
With the determined candidate context(s), we leverage Fetcher to retrieve the patch code from the source repository and the candidate code from the target repository, respectively. Note that Fetcher is also used by the earlier Searcher component to retrieve the context of a patch/candidate code hunk. Specifically, a typical code hunk consists of three code fragments, the $\text{UP}$ context, the $\text{DOWN}$ context, and the middle patch/candidate code, as previously shown in Fig. 3.
For the patch code hunk, Fetcher directly fetches its patch code from the commit history and selects the nearby code statements upwards and downwards (with the line number specified by $C_{\text{LINES}}$) as the $\text{UP}$ and $\text{DOWN}$ contexts, respectively. For the candidate code hunk, we fetch its code statements according to the candidate context determined in Sec. III-C and also the original patch context. Specifically, if the original patch contains both $\text{UP}$ and $\text{DOWN}$ contexts, we regard the code statements between the corresponding candidate contexts as the candidate code. As a result, line 6 of Dogecoin is fetched as the candidate code in Fig. 3. If the patch context contains only the $\text{UP}$ context, we regard the code statements below it as the candidate code; if it contains only the $\text{DOWN}$ context, we regard the code statements above it as the candidate code. Note that for the last two situations, the candidate code is fetched with the same number of code statements as the patch code.
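The fetching rules for the candidate code can be illustrated with a short sketch. It assumes that the candidate code lies directly below an UP-only context and directly above a DOWN-only context; the function name `fetch_candidate_code` and the representation of a context as an inclusive `(start, end)` pair of 0-based line indices are assumptions made for this example.

```python
def fetch_candidate_code(target_lines, up_ctx, down_ctx, patch_len):
    """Return the candidate code lines for a located candidate context.

    up_ctx / down_ctx are inclusive (start, end) index pairs or None.
    patch_len is the number of statements in the patch code; it is used
    only when one of the two contexts is missing.
    """
    if up_ctx is not None and down_ctx is not None:
        # Both contexts exist: take everything between them.
        return target_lines[up_ctx[1] + 1:down_ctx[0]]
    if up_ctx is not None:
        # Only UP exists: take patch_len statements right below it (assumed).
        start = up_ctx[1] + 1
        return target_lines[start:start + patch_len]
    if down_ctx is not None:
        # Only DOWN exists: take patch_len statements right above it (assumed).
        end = down_ctx[0]
        return target_lines[max(0, end - patch_len):end]
    return []
```

With both contexts present the candidate code may be shorter or longer than the patch code, which is exactly the Type-3 situation that the similarity measure of Sec. III-E has to tolerate.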
E. Measuring the Similarity between Patch and Candidate Code
With the fetched patch and candidate code, Comparator measures the similarity between the two code fragments and, if the candidate code is not vulnerable, also determines whether the target repository has fixed the vulnerability. As mentioned in Sec. III-A, we need a new way of calculating the code similarity that is immune to Type-1/2/3 clones.
We first abstract the code similarity problem in this form: given a source code fragment $S$ with $p$ code statements and a target code fragment $T$ with $q$ code statements, we need to design an appropriate measure to determine their similarity. Intuitively, we can compute the similarity between $S$ and $T$ by first adding up the similarity of each pair of code statements at the same position in $S$ and $T$ and then normalizing it into $[0, 1]$, i.e., $\frac{1}{p} \sum_{i=1}^{p} \text{strsim}(S_i, T_i)$. While this can handle Type-2 clones because it does not use the hash-based exact match per code line, it is still not applicable to measuring Type-3 clones for two reasons. First, as Type-3 clones involve inserted/deleted statements, i.e., $p \neq q$, the extra code statements will not be measured in this way. Second, because of the inserted/deleted statements, the ordering of the same code statement in $S$ and $T$ might also be different.
To solve the problems above, we determine two principles: (i) all the code statements in $S$ and $T$ should be considered; and (ii) the influence of the ordering issue should be adjustable. For the first principle, we identify the most similar code statement in $T$ for every code statement in $S$, i.e., for each code statement $S_i \in S$, we find $T_j \in T$, s.t., $j = \arg \max_k \text{strsim}(S_i, T_k)$. For the second principle, we first define the indices $i$ and $j$ as the relative positions of the code statements in $S$ and $T$ if $S_i$’s most similar statement is $T_j$. The basic idea is that the greater the difference between $i$ and $j$ is, the lower the similarity between $S_i$ and $T_j$ should be. Therefore, we introduce a parameter $r \in [0, 1]$ and use $r^{|i-j|}$ to indicate the reward of the similarity between $S_i$ and $T_j$. By multiplying this reward by the original similarity, we can adjust the ordering issue’s influence on code similarity. To illustrate the impact of $r$ on the similarity measurement, we calculate the similarities of all the patch and candidate code pairs under different $r$. We present the result in Appendix A. In this paper, we set 0.95 as the default value of $r$. Once we finish calculating such similarity for every code statement in $S$, we sum them up and normalize the result into $[0, 1]$, as shown in the following equation (2).
$$\text{SIMILARITY}(S, T) = \frac{1}{p} \sum_{i=1}^{p} \text{strsim}(S_i, T_j) \cdot r^{|i-j|} \quad \text{s.t. } j = \arg \max_{1 \leq k \leq q} \text{strsim}(S_i, T_k) \tag{2}$$
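Equation (2) can be sketched in a few lines of Python. This is a minimal illustration rather than BlockScope's actual implementation: the text here does not specify how strsim is computed, so the sketch assumes the ratio from Python's `difflib.SequenceMatcher` as a stand-in, and the function names `strsim` and `similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def strsim(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1] (assumption: difflib ratio)."""
    return SequenceMatcher(None, a.strip(), b.strip()).ratio()


def similarity(S, T, r=0.95):
    """Order-aware similarity of code fragment S against T, following Equation (2).

    Each statement S_i is paired with its most similar statement T_j, and that
    similarity is discounted by r ** |i - j| to penalize position shifts.
    """
    if not S or not T:
        return 0.0
    total = 0.0
    for i, s in enumerate(S):
        scores = [strsim(s, t) for t in T]
        # j = argmax_k strsim(S_i, T_k); the first maximum wins on ties.
        j = max(range(len(T)), key=scores.__getitem__)
        total += scores[j] * r ** abs(i - j)
    return total / len(S)


# A Type-3 style variant: one statement inserted in the middle of T.
S = ["int a = 1;", "return a;"]
T = ["int a = 1;", "log(a);", "return a;"]
print(similarity(S, T))          # second statement is shifted by one position
print(similarity(S, T, r=1.0))   # r = 1 ignores ordering entirely
```

With r = 1 the ordering penalty disappears, while smaller values of r make shifted statements count for less.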
While the method above provides a new way of measuring the similarity between two code fragments, we still need to determine whether the target repository has applied a patch or not. Specifically, given the candidate code ( C ) of the target repository, we compare it with the patch code ( P ). Note that there are three types of ( P ): (i) ( \text{DEL}-\text{type} ): contains only the deleted lines, i.e., ( P = [\text{dp}] ); (ii) ( \text{ADD}-\text{type} ): contains only the added lines, i.e., ( P = [\text{ap}] ); and (iii) ( \text{CHA}-\text{type} ): contains both deleted and added lines, i.e., ( P = [\text{dp}, \text{ap}] ). We thus determine the comparison logic as follows (where ( t ) is the threshold):
- For type (i), if SIMILARITY($C, dp$) $\geq t$, we determine that $C$ did not apply $P$; otherwise, we determine that $C$ has applied $P$.
- For type (ii), if SIMILARITY($C, ap$) $\geq t$, we determine that $C$ has applied $P$; otherwise, we determine that $C$ did not apply $P$.
- For type (iii), if SIMILARITY($C, dp$) $\geq t$ and SIMILARITY($C, ap$) $\geq t$ and SIMILARITY($C, dp$) $\geq$ SIMILARITY($C, ap$), we determine that $C$ did not apply $P$; otherwise, if SIMILARITY($C, dp$) $\geq t$ and SIMILARITY($C, ap$) $\geq t$ and SIMILARITY($C, dp$) $<$ SIMILARITY($C, ap$), we determine that $C$ has applied $P$.
TABLE II: An example of the output of git blame.
Commit | Line | Code |
---|---|---|
202d853b | 203 | } |
a2714a5c | 204 | static int qt_argc = 1; |
797fef7b | 205 | static const char* qt_argv = “qtum-qt”; |
a2714a5c | 206 | QApplication(qt_argc, const_cast<char**>(…)), |
a2714a5c | 207 | QtGui::Application::createApplication(…); |
a2714a5c | 208 | QApplication(qt_argc, const_cast<char**>(…)), |
9096276e | 209 | coreThread(nullptr), |
71e98908 | 210 | m_node(node), |
9096276e | 211 | optionsModel(nullptr), |
Moreover, Searcher may return multiple candidate contexts in the target repository, leading to multiple pieces of candidate code, i.e., $C_i \in [C_1, C_2, \ldots, C_n]$. For each $C_i$, we calculate $s_i = \text{SIMILARITY}(C_i, P)$ and determine its patch-applying status $f_{v_i} \in \{0, 1\}$, where $f_{v_i} = 1$ ($= 0$) indicates that $C_i$ has (not) applied $P$. We then introduce a factor $conf_i$ to measure the confidence of $f_{v_i}$ on $C_i$ by $conf_i = s_i - t$, i.e., the more $s_i$ exceeds $t$, the more confident $f_{v_i}$ is on $C_i$. Finally, we determine the status of $P$ in the target repository by the most confident $f_{v_i}$, i.e., $i = \arg\max_j conf_j$. If the target repository did not apply $P$, we consider it a vulnerability; otherwise, we consider the vulnerability fixed.
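The three comparison rules and the confidence-based selection can be sketched as follows. This is an illustrative sketch under stated assumptions, not the tool's code: the functions take precomputed similarity scores, the threshold default of 0.8 is a hypothetical value, a CHA-type case matching neither stated rule is reported as undetermined (`None`), and for CHA-type patches $s_i$ is assumed to be the larger of the two similarities, which the text does not spell out.

```python
from typing import Optional, Sequence, Tuple


def patch_applied(kind: str, sim_dp: float = 0.0, sim_ap: float = 0.0,
                  t: float = 0.8) -> Optional[bool]:
    """Decide whether one candidate C has applied patch P.

    kind is "DEL", "ADD", or "CHA"; sim_dp and sim_ap are SIMILARITY(C, dp)
    and SIMILARITY(C, ap). Returns True (applied), False (not applied), or
    None when no stated rule covers the case (assumption of this sketch).
    """
    if kind == "DEL":
        return not (sim_dp >= t)      # still resembles the deleted lines
    if kind == "ADD":
        return sim_ap >= t            # resembles the added lines
    if kind == "CHA":
        if sim_dp >= t and sim_ap >= t:
            return sim_dp < sim_ap    # closer to the added than the deleted lines
        return None
    raise ValueError(f"unknown patch type: {kind}")


def repository_status(kind: str, candidates: Sequence[Tuple[float, float]],
                      t: float = 0.8) -> Optional[bool]:
    """Pick the verdict of the most confident candidate, conf_i = s_i - t."""
    best = None
    for sim_dp, sim_ap in candidates:
        verdict = patch_applied(kind, sim_dp, sim_ap, t)
        if verdict is None:
            continue
        # Assumption: s_i is the similarity relevant to the patch type.
        s = {"DEL": sim_dp, "ADD": sim_ap, "CHA": max(sim_dp, sim_ap)}[kind]
        conf = s - t
        if best is None or conf > best[0]:
            best = (conf, verdict)
    return None if best is None else best[1]


# Two candidate contexts for an ADD-type patch; the second is the confident one.
print(repository_status("ADD", [(0.0, 0.60), (0.0, 0.92)]))
```

A result of `False` from `repository_status` corresponds to a reported vulnerability, and `True` to a fixed case.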
F. Determining Patch Delays for the Vulnerabilities Already Patched in the Target Repositories
For the vulnerabilities already patched in the target repositories, we further leverage Calculator to automatically measure their patch delays. We define the patch delay as the interval between the patch’s commit date in the source project and the patch’s release date in the target project because eventually, the release date is the actual time when a patch is available to the blockchain node operators and end users.
Upon receiving a candidate code that is determined as fixed, Calculator leverages git blame to retrieve the commit that patched the code. Table II illustrates an example output of git blame, where the left column shows the commit hash (SHA) and the middle column shows the line number of the code statements on the right in Qtum’s src/qt/bitcoin.cpp file. The code from line 204 to line 208 is actually Qtum’s patch for fixing the cloned CVE-2021-3401 [12] in its project. It was added by two commits, a2714a5c69 and 797fef7bee, where 797fef7bee only modified line 205. Hence, we still need to determine which commit is the true fix. In the Qtum example, after checking both commits, we identify that line 205 in Table II was originally added by a2714a5c69 on 10 August 2019 as static const char* qt_argv = “bitcoin-qt”; where “bitcoin-qt” was later replaced by “qtum-qt” in 797fef7bee on 26 June 2020. As a result, if multiple commits modify the candidate code, we consider the earliest one to be the true fix commit.
Moreover, we need to scrape the release information from GitHub because the local git repository does not contain such information. By analyzing a commit’s GitHub webpage, Calculator can retrieve all of its release versions and determine the earliest date when the commit was first released. In the Qtum example, the patch commit a2714a5c69 was first released in the version mainnet-ignition-v0.19.0 on 22 February 2020, which was delayed from the original Bitcoin commit by 197 days.
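The two steps above, choosing the earliest commit as the true fix and measuring the delay up to the first release, reduce to simple date arithmetic. The sketch below is an assumption-laden illustration: the function names are hypothetical, the commit dates for the Qtum example come from the text, and the dates passed to `patch_delay_days` are made up purely to exercise the calculation.

```python
from datetime import date


def earliest_fix_commit(commit_dates):
    """Return the hash of the earliest commit among those that touched the patched lines."""
    return min(commit_dates, key=commit_dates.get)


def patch_delay_days(source_commit_date, target_release_date):
    """Days between the source project's patch commit and the fork's first release containing it."""
    return (target_release_date - source_commit_date).days


# Commits that git blame attributes to the patched lines in the Qtum example.
qtum_commits = {
    "a2714a5c69": date(2019, 8, 10),
    "797fef7bee": date(2020, 6, 26),
}
true_fix = earliest_fix_commit(qtum_commits)
print(true_fix)

# Hypothetical dates, not taken from any project.
delay = patch_delay_days(date(2021, 1, 1), date(2021, 4, 11))
print(delay)
```

In practice the commit dates would come from git blame and git log, and the release date from the scraped release information.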
IV. DETECTING THE VULNERABILITIES PROPAGATED TO FORKED PROJECTS
In this section, we aim to detect the vulnerabilities that are propagated from Bitcoin and Ethereum to their forked blockchain projects using BlockScope. To this end, we first benchmark the accuracy and performance of BlockScope (Sec. IV-B) using an experimental setup introduced in Sec. IV-A. We then present the detected vulnerabilities in Sec. IV-C. Finally, we conduct ethical vulnerability reporting and summarize vendors’ response/actions in Sec. IV-D.
A. Experimental Setup
To make sure that BlockScope’s vulnerability detection results are reliable, we not only run BlockScope in our experiment but also compare it with the open-source state-of-the-art ReDeBug [43] using the same dataset and environment below. Note that we also considered other clone detection tools (e.g., [47], [50], [68], [82]) for more comparison but eventually did not choose them for two reasons. First, MVP [82] was not open-source and it does not support the Go language. While VUDDY [50] released its signature generating scripts, its most important vulnerability search engine was not available. Indeed, we contacted the VUDDY team and confirmed that their cloud version currently supports only one CVE in our dataset. Second, CCFinder [47] and SourcererCC [68] are pure code clone detection tools and are not able to perform patch-based detection in our problem without adjustment.
Dataset. As illustrated in Fig. 2, BlockScope requires two sets of input, the target blockchain code repositories and the security patches of a reference blockchain (i.e., Bitcoin and Ethereum in this paper). As a result, we collect these two sets of data as our dataset. Specifically, for code repositories, we select all the 11 forked projects of Bitcoin from the top 100 cryptocurrencies (based on the market capitalization on CoinMarketCap) and five popular forked projects of Ethereum (picked from Blockscan) as our target blockchains, as previously introduced in Sec. II. The total market capitalization of these 16 blockchains was around 142 billion USD. To build a reproducible dataset, we kept a local copy of the latest version of code repositories at the time of our research on 7 September 2021 and 6 June 2022 for Bitcoin forks and Ethereum forks, respectively. On the other hand, for security patches, an intuitive idea is to use the CVE (Common Vulnerabilities and Exposures) information; however, we found…
Table III: The experimental result of BlockScope.
(a) The accuracy and performance comparison between BlockScope and ReDeBug.
Forked Project | LOC | BlockScope TP/FN/TN/FP | BlockScope Time |
---|---|---|---|
Dogecoin | 326.9K | 16/15/1 | 7.6s |
Bitcoin Cash | 607.1K | 1/-/30/1 | 10.5s |
Litecoin | 423.3K | 6/-/26/1 | 8.3s |
Bitcoin SV | 221.1K | 11/-/18/2 | 10.6s |
Dash | 380.3K | 9/-1/22/2 | 13.9s |
Zcash | 199.4K | 9/-/2/19/2 | 8.4s |
Bitcoin Gold | 381.7K | 10/-1/21/1 | 8.8s |
Horizen | 178.9K | 9/-2/20/1 | 7.7s |
Qtum | 569.0K | -/-/31/1 | 12.0s |
DigiByte | 416.3K | 10/-1/21/1 | 10.7s |
Ravencoin | 504.2K | 14/-1/16/1 | 11.4s |
Sum | 4.2M | 95/9/239/9 | 109.9s |
*: the numbers in (.) of these cells represent the average LOC per project.
(b) The fixed cases detected by BlockScope.
Forked Project | # Fixed Cases: Detected/Truth (Err) |
---|---|
Dogecoin | 1/1/1 |
Bitcoin Cash | 23/25 (2;-) |
Litecoin | 22/22 |
Bitcoin SV | 1/1/1 |
Dash | 11/10 (-;1) |
Zcash | 2/1 (-;1) |
Bitcoin Gold | 14/14 |
Horizen | 1 (-;1) |
Qtum | 28/28 (1;1) |
DigiByte | 14/14 |
Ravencoin | 3/3 |
Err, shown in parentheses, represents (the number of missed cases; the number of mistaken cases).
Environment and tool configuration. We evaluate BlockScope and ReDeBug on the same virtual machine running Ubuntu 18.04 with 4GB of memory configured, while the host machine is a MacBook Pro with a 3.5GHz dual-core Intel Core i7 CPU and 16GB of memory. Note that ReDeBug requires an n-gram parameter that sets the number of lines used as context code. While the default is four, we tried values from one to ten and found that ReDeBug achieves its best result on our dataset when n-gram=3.
B. Accuracy and Performance
After running BlockScope and ReDeBug on the dataset in Sec. IV-A (i.e., using 32 Bitcoin patches and six Ethereum patches to test the 16 forked projects) and performing a thorough code review of all the raw detection results (including the cases that produce no output), we are able to precisely obtain the accuracy and performance data for both tools. Overall, BlockScope detects 101 true vulnerabilities in 13 forked projects (Qtum, Avalanche, and Polygon do not contain any vulnerability in our dataset, as we manually checked), whereas ReDeBug detects only 57 vulnerabilities in ten forked projects, which makes BlockScope’s recall 1.8 times higher than that of ReDeBug. For performance, BlockScope is also 1.7 times faster than ReDeBug in analyzing Bitcoin’s forked projects and even 15.4 times faster in analyzing Ethereum’s forked projects, which have more code per project.
Table IIIa shows a breakdown of the detailed accuracy and performance results of BlockScope and ReDeBug, where TP, FN, TN, and FP represent true positive, false negative, true negative, and false positive, respectively. According to this table, we can calculate the precision via ( TP/(TP+FP) ) and the recall via ( TP/(TP+FN) ). We find that BlockScope achieves good precision and high recall, both at 91.8%. In contrast, while ReDeBug has only three false positives in our dataset (mainly because it uses the exact match per code line), its recall is as low as 51.8%. In other words, ReDeBug fails to detect many of the vulnerabilities covered by BlockScope. Since we aim to perform a thorough investigation of forked blockchains’ vulnerabilities, BlockScope achieves the high recall we need while introducing a low false alarm rate of only 8.18%. Moreover, among the 13 forked projects with vulnerabilities (i.e., excluding Qtum, Avalanche, and Polygon), BlockScope detects vulnerabilities in all of them, while ReDeBug fully misses the results for two projects, Bitcoin Cash and Binance. In particular, BlockScope successfully detects all the
that there are only 12 CVEs about Bitcoin with explicit patch code, and eight of them are older than five years. In other words, we could select only four to test if we just used the public CVE information.
To address this problem, we select bug issues/pull requests with notable security impacts (i.e., vulnerabilities) and their patch commits (i.e., patches) directly from Bitcoin’s GitHub repository according to three simple principles: (i) the patches should have been released within the recent five years, since older patches had already been applied to Bitcoin before it was forked; (ii) the patches that cover different vulnerability types should have a higher chance of being picked so that we can evaluate the generality of BlockScope; and (iii) the patches should be applicable to most forked projects, i.e., not specific to one particular Bitcoin component or one fork. As a result, we are able to select 32 patches of Bitcoin from June 2017 to March 2020, including four CVEs. For Ethereum, since its forks are relatively new, we select six CVEs of Ethereum since November 2020 as the patches. These 38 patches involve multiple vulnerability types, including denial-of-service, race conditions, privacy leakage, etc. While the number of Bitcoin and Ethereum vulnerabilities here is not large, we have to be selective to make sure they are actually vulnerabilities. Indeed, Bitcoin and Ethereum have had a limited number of vulnerabilities over the years. For example, the VUDDY dataset included only 9 CVEs of Bitcoin, with 8 of them dated before 2013 and only one after 2018. Moreover, we have 16 popular forked projects of Bitcoin and Ethereum to test, which multiplies the total number of test cases to 382 (32 × 11 + 6 × 5).
TABLE IV: # of different vulnerability types in each project.
Forked Project | Type-1: T (B;R) | Type-2: T (B;R) | Type-3: T (B;R) | Sum: T (B;R) |
---|---|---|---|---|
Dogecoin | 6 (6;4) | – | 10 (10;3) | 16 (16;7) |
Bitcoin Cash | 1 (1;-) | – | – | 1 (1;-) |
Litecoin | 5 (5;5) | – | 1 (1;-) | 6 (6;5) |
Bitcoin SV | 1 (1;-) | – | 11 (10;2) | 12 (11;2) |
Dash | 7 (7;7) | – | 3 (2;-) | 10 (9;7) |
Zcash | 1 (1;-) | 2 (1;-) | 8 (7;1) | 11 (9;1) |
Bitcoin Gold | 9 (9;8) | – | 2 (1;2) | 11 (10;10) |
Horizen | – | 2 (2;-) | 9 (7;1) | 11 (9;1) |
Qtum | – | – | – | – |
DigiByte | 7 (7;7) | 1 (1;-) | 3 (2;3) | 11 (10;10) |
Ravencoin | 7 (7;7) | – | 8 (7;3) | 15 (14;10) |
Sum | 44 (44;38) | 5 (4;-) | 55 (47;15) | 104 (95;53) |
T, B, and R represent the total number of vulnerabilities of each clone type, the number of vulnerabilities detected by BlockScope, and the number of vulnerabilities detected by ReDeBug, respectively.
vulnerabilities in Dogecoin, Bitcoin Cash, Litecoin, Binance, Celo, and Optimism with zero false negative.
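The accuracy figures quoted in this subsection follow directly from the reported counts. The short sketch below recomputes them as a sanity check; the helper names are hypothetical, and the counts are the ones stated in the evaluation (101 true positives, 9 false negatives, and 9 false positives for BlockScope; 57 true positives and 3 false positives for ReDeBug, with its false negatives taken as the remaining 53 of the 110 vulnerabilities).

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of reported vulnerabilities that are real: TP / (TP + FP)."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of real vulnerabilities that are reported: TP / (TP + FN)."""
    return tp / (tp + fn)


blockscope = {"tp": 101, "fn": 9, "fp": 9}
redebug = {"tp": 57, "fn": 110 - 57, "fp": 3}

for name, c in (("BlockScope", blockscope), ("ReDeBug", redebug)):
    p = precision(c["tp"], c["fp"])
    r = recall(c["tp"], c["fn"])
    print(f"{name}: precision={p:.1%} recall={r:.1%}")

# False alarm rate as used in the text: share of reports that are false positives.
false_alarm = blockscope["fp"] / (blockscope["tp"] + blockscope["fp"])
print(f"BlockScope false alarm rate: {false_alarm:.2%}")
```

The ratio of the two recall values, 101/57, is the source of the "1.8 times higher recall" comparison.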
We further explore why BlockScope achieves much better detection effectiveness than ReDeBug by analyzing the detailed results of detecting different clone types. Although ReDeBug claims that it can handle Type-1 and Type-3 clones, its accuracy on each clone type may vary. As shown in Table IV, among the 110 (TP + FN) vulnerabilities in the forked projects of our dataset, 95.5% are Type-1 and Type-3 clones, with the number of Type-3 clones slightly higher than that of Type-1 clones. For these cases, ReDeBug achieves an accuracy of 85.7% for Type-1 clones, but its detection rate for Type-2 and Type-3 clones drops to 0% and 26.8%, respectively. This explains why ReDeBug performs better on six particular projects: the number of Type-1 clones in those six projects (i.e., Litecoin, Dash, Bitcoin Gold, DigiByte, Celo, and Optimism) is larger than that of Type-3 clones. Indeed, if a forked project has more Type-1 clones, it is more similar to the original project. In contrast, BlockScope does not have this limitation. It is able to detect all the Type-1 clones, and misses only one and eight cases for the more complicated Type-2 and Type-3 clones, respectively. This indicates that BlockScope still reaches a high detection rate of 80% for Type-2 clones and 85.7% for Type-3 clones.
For performance, BlockScope runs much faster than ReDeBug on all the projects. In particular, BlockScope can finish the analysis of 10 forked projects within ten seconds, while ReDeBug finishes only one project (i.e., Bitcoin SV) within ten seconds. We further analyze whether the project’s LOC affects the performance of BlockScope and ReDeBug. For BlockScope, we notice that it takes almost the same time (10.5s vs. 10.6s) to analyze Bitcoin Cash and Bitcoin SV, even though the LOC of Bitcoin Cash is 2.7 times that of Bitcoin SV (607.1K vs. 221.1K). In contrast, the processing time of ReDeBug for the same two projects is 22.2s and 9.9s, respectively. The difference of 2.2 times is close to the ratio of those two projects’ LOC. This indicates that the project’s LOC does not noticeably affect the processing time of BlockScope, while it has a significant effect on ReDeBug’s performance.
Indeed, when we compare the performance of BlockScope between Bitcoin forks (with fewer LOC) and Ethereum forks (with more LOC), we notice that BlockScope can finish the analysis of Ethereum forks even faster. This suggests that for BlockScope, the number of target patches (32 for Bitcoin vs. 6 for Ethereum) has a more noticeable impact on its performance than LOC. ReDeBug, on the other hand, is the opposite, with LOC having much more impact than the number of target patches on its performance. For example, for Qtum and Binance, which have almost the same LOC, the analysis time of ReDeBug is also almost the same (33.5s vs. 30.2s). As we mentioned earlier, typical code clone detection tools like ReDeBug perform a whole-project analysis, so LOC dominates their performance, while BlockScope leverages patch code contexts to search and locate only potentially relevant code for comparison, so LOC has a much more limited effect.
C. Analysis of the Detected Vulnerabilities
Since BlockScope detects not only the cloned vulnerabilities but also whether a patch is applied, we analyze both the detected vulnerabilities and the fixed cases in this subsection. We present a deeper investigation of individual vulnerabilities later in Sec. V.
As shown in Table IIIa, Bitcoin’s forked projects have a total of 104 vulnerabilities. Among the 11 projects, only Bitcoin Cash and Qtum have few vulnerabilities, while eight projects have at least 10 vulnerabilities each out of the 32 patches investigated. In particular, Dogecoin and Ravencoin did not patch around half of the total 32 vulnerabilities. On the contrary, Ethereum’s forks present a better result, with only Optimism having four vulnerabilities out of the six patches investigated. The other four projects have at most one vulnerability each, with Avalanche and Polygon fully patched.
For the fixed cases, the forked projects of Bitcoin and Ethereum have fixed a total of 138 vulnerabilities (119 for Bitcoin and 19 for Ethereum). While Bitcoin’s 11 forked projects have fixed 119 vulnerabilities, five of them, Dogecoin, Bitcoin SV, Zcash, Horizen, and Ravencoin, fixed only six vulnerabilities in total. Three projects, Qtum, Bitcoin Cash, and Litecoin, account for 63% of all the fixed cases. Similar to the result above regarding the vulnerable cases, Ethereum’s forked projects also perform better in the fixed cases. While Optimism fixed only one vulnerability, the other four projects have fixed at least half of the investigated patches. Indeed, when comparing the ratio of the fixed/vulnerable cases between Bitcoin’s and Ethereum’s forked projects (119/104 vs. 19/6), we notice that Ethereum’s forks are more active in fixing propagated vulnerabilities. Another aspect for measuring a project’s activeness in patching vulnerabilities is the patch delay, for which we provide a detailed analysis in Sec. V-C.
D. Vulnerability Reporting and Response
As part of ethical research and as one contribution of this paper, we have spent significant effort reporting all the 110 discovered vulnerabilities (including 101 TP automatically detected by BlockScope and 9 FN manually identified by us during evaluation) to the developers of the affected forked projects via multiple channels. In Table V, we summarize the developers’ latest responses and actions regarding our vulnerability reports.
TABLE V: Developers’ response to our vulnerability reports.
Forked Project | Fixed | Accepted | ACK | Pending | Reject | Sum |
---|---|---|---|---|---|---|
Dogecoin | 11 | 3 | 2 | – | – | 16 |
Bitcoin Cash | – | – | – | 1 | – | 1 |
Litecoin | 2 | – | 3 | 1 | – | 6 |
Bitcoin SV | – | – | 8 | 2 | 2 | 12 |
Dash | 1 | 5 | 3 | 1 | – | 10 |
Zcash | – | – | 9 | 1 | 1 | 11 |
Bitcoin Gold | 7 | – | 1 | 3 | – | 11 |
Horizen | – | – | 4 | 7 | – | 11 |
Qtum | – | – | – | – | – | – |
DigiByte | – | – | – | 11 | – | 11 |
Ravencoin | 9 | 1 | 3 | 1 | 1 | 15 |
Sum | 30 | 9 | 33 | 28 | 4 | 104 |
As of 26 July 2022. Specifically, “Fixed” means that the vendor has adopted our reports to fix the issues, “Accepted” means that the developers accepted our reports and were exploring appropriate patch migration, “ACK” means that the vendor has acknowledged our reports but did not explicitly indicate that it would fix the issues, “Pending” means that we have not received any response yet, and lastly, “Reject” means that the vendor has refused to fix the vulnerabilities. We can see that around 74 of our 110 vulnerability reports received a positive response, which demonstrates the impact of our work. We further classify developers’ responses into three categories:
Positive/Active Response. Among the 13 forked projects with vulnerabilities, around half responded to our vulnerability reports positively, namely Dogecoin, Ravencoin, Dash, Bitcoin Gold, Litecoin, and Binance. Specifically, Dogecoin acknowledged all of our reports and quickly fixed 11 serious vulnerabilities, while the others are scheduled or under community discussion for appropriate patch migration. Meanwhile, Ravencoin accepted nearly all the reports. The developers fixed nine of them and acknowledged three, with one rejected and one pending due to compatibility considerations. Similarly, the developers of Dash approved nearly all the reports and informed us that they had fixed five vulnerabilities in the development branch, which will be merged into a new release in the future. Bitcoin Gold also fixed seven vulnerabilities in one release around four months after receiving our reports, with another one acknowledged and three pending, while Litecoin fixed two of the vulnerabilities and stated that they had noticed the other three. Lastly, Binance immediately acknowledged our report on BSC and awarded us a bug bounty with the promise of fixing it. During this reporting process, we found that developers are more likely to fix a vulnerability with authoritative proof, especially those with CVE numbers. For instance, the Dogecoin developers quickly released a new version of the Dogecoin core after they fixed CVE-2021-3401 and CVE-2019-15947. However, for the other vulnerabilities with no CVE assigned, they just acknowledged them and kept them on the to-do list.
Neutral Response. In this category, developers also accepted our reports but have not yet shown any intention to fix them. Specifically, Bitcoin SV’s developers quickly acknowledged 8 of the 12 reports, and Zcash similarly acknowledged 9 of the 11 reports. However, both rejected a few (2 for Bitcoin SV and 1 for Zcash) due to incompatibility, and we have not received further updates from them. Meanwhile, Horizen acknowledged four vulnerability reports with the other seven still pending, and Celo acknowledged its only report.
Negative/Inactive Response. Unfortunately, the remaining three projects were inactive, which is worrisome. Specifically, Bitcoin Cash, DigiByte, and Optimism did not respond to any of our reports. The worst case is DigiByte because it ignored 11 vulnerabilities, including some critical ones like CVE-2021-3401 and CVE-2019-15947.
V. INVESTIGATING THE PROPAGATION AND PATCHING PROCESSES OF DISCOVERED VULNERABILITIES
In this section, we conduct a deep investigation of the vulnerabilities discovered in Sec. IV. Specifically, in Sec. V-A, we aim to understand how these vulnerabilities are propagated from Bitcoin and Ethereum to their forked projects. Furthermore, in Sec. V-B, we diagnose some other propagation that caused our detection to fail (both FP and FN). Lastly, we perform a patch delay analysis in Sec. V-C to understand the patching processes of the cases that were already fixed in forked projects before our detection.
A. Revealing the Vulnerability Propagation from Bitcoin/Ethereum to Their Forked Projects
To reveal how a vulnerability is propagated from Bitcoin and Ethereum to the forked projects, we manually check all the 110 vulnerabilities, including 104 from Bitcoin forks and 6 from Ethereum forks, and categorize them into three types, as shown in Fig. 4. To simplify the description in this section, we use “Bitcoin” to refer to both Bitcoin and Ethereum, unless explicitly specified. The first type, as illustrated in Fig. 4a, refers to the vulnerabilities that were introduced when the project was initially forked from Bitcoin. For better understanding and simplicity, we call it the fork type. The second type, as depicted in Fig. 4b, is similar to the first type except that it fetched and merged vulnerable commits.