
Bitcoin is a decentralized, pseudonymous cryptocurrency that is one of the most used digital assets to date. Its unregulated nature and inherent anonymity of users have led to a dramatic increase in its use for illicit activities. This calls for the development of novel methods capable of characterizing different entities in the Bitcoin network.
In this paper, a method to attack Bitcoin anonymity is presented, leveraging a novel cascading machine learning approach that requires only a few features directly extracted from Bitcoin blockchain data. Cascading, used to enrich entities information with data from previous classifications, led to considerably improved multi-class classification performance with excellent values of Precision close to 1.0 for each considered class. Final models were implemented and compared using different machine learning models and showed significantly higher accuracy compared to their baseline implementation. Our approach can contribute to the development of effective tools for Bitcoin entity characterization, which may assist in uncovering illegal activities.
Index Terms—Bitcoin analysis, Bitcoin anonymity, cascading classifiers, entities classification, graph model, blockchain
I. INTRODUCTION
Bitcoin was launched in 2009, and since then its value and popularity have increased rapidly; it is currently the most used and most highly priced cryptocurrency of all. Bitcoin is a pure peer-to-peer cryptocurrency [25] where all transactions are stored in a public shared ledger called the blockchain, which cannot be manipulated or changed [6]. Bitcoin is decentralized, which means that it is not controlled by any financial institution but is regulated by everyone in the Bitcoin network: its blockchain architecture maintains the system without ambiguity [26].
While transactions within the Bitcoin network are openly available, Bitcoin user identity is non-transparent and protected by anonymity. This circumstance, combined with the unregulated nature of the Bitcoin market, has attracted many new actors who use the cryptocurrency for illicit operations. Approximately one-quarter of Bitcoin users and half of all Bitcoin transactions are associated with illegal activity [9], accounting for an annual amount of around $72 billion (as reported in 2018).
Conventional law-enforcement strategies tackling illegal financial operations such as money laundering or transactions funding criminal operations are typically based on complete knowledge of each actor’s identity, while details about financial transactions are controlled by banks and thus unknown [24]. Within the Bitcoin network, these circumstances are reversed – incomplete knowledge of identities restricts traceability and transparency of operations, in turn promoting further increase of illegal activities. This calls for novel methods to attack anonymity within the Bitcoin network, aiming to uncover Bitcoin entity categories.
Among the most active categories of entities is the exchange, a digital marketplace where traders can buy and sell cryptocurrencies using fiat currencies (money made legal tender by government decree) or other digital currencies. Exchanges thus constitute the “front and exit doors” to the cryptocurrency world and are ideal for hiding illicit operations, as documented in [22]. Another category is the darknet market, an e-commerce platform where users can find drugs, weapons and other goods or services that are illegal in most countries. These cryptomarkets use electronic currencies to facilitate licit and illicit transactions among their users [5]. Further, so-called mixers are services that allow users to obscure their operations, as presented in [23]. While mixed transactions increase the privacy of users, they can also be used to launder illegal funds.
Being able to classify anonymous Bitcoin entities according to such categories would increase transparency and would facilitate linking blockchain information with real actors to uncover illegal activities. Current techniques attacking anonymity often try to cluster addresses and apply heuristic assumptions combined with labelled data from external sources like markets, forums or social media in order to determine address owners in the real world [20]. However, gathering external data and combining them with Bitcoin information is tedious and could be limited due to privacy restrictions. This motivates the implementation of a model able to characterize different behaviours in the Bitcoin network by analyzing pure blockchain information only, that is, by extracting transactions and recognizing patterns using machine learning approaches.
In this paper, we present a novel approach to decrease Bitcoin anonymity based on a cascading machine learning model, using entity, address and motif data as inputs. We apply a "cascade" of classifiers, performing a first entity classification based on address, 1_motif, and 2_motif data, which is then used as input for a second classification step, which combines those classification results with entity information from the blockchain. Notably, our approach only requires a few features that can be directly extracted from Bitcoin blockchain data.
In order to compare benefits and limits of the proposed approach, two experiments are presented: firstly, a simple classifier is trained based on pure entity information gathered from the blockchain. In the second experiment, a final classifier is trained using the enriched data set generated by our cascading approach. We aimed to detect six different types of Bitcoin entity behaviours. Overall, three classifier models are tested and compared: Adaboost, Random Forest and Gradient Boosting.
The rest of the paper is organized as follows. Section II describes the related work. After that, Section III presents the graph model used and Section IV shows an overview of the used data sets. Section V describes the implemented machine learning models and Section VI presents the obtained results. Finally, in Section VII, we draw conclusions and provide guidelines for future work.
II. RELATED WORK
User anonymity has probably been the key factor for the success of cryptocurrencies and has promoted illegal activities within the Bitcoin network. Yet, several studies conclude that current measures adopted by the Bitcoin protocol are not sufficient to protect the privacy of its users [19], [1], opening up possibilities to attack Bitcoin anonymity. One of the first transaction analyses is documented in [30], where typical behaviors of Bitcoin users are detected based on how they spend cryptocurrencies, how they keep the balance in their accounts, and how they move Bitcoins between their various accounts. Herrera-Joancomartí [12] presents a review on Bitcoin anonymity, concluding that anonymity can be reduced by address clustering or by gathering information from various peer-to-peer networks. This technique is also advocated in [15], where conservative constraints (patterns) are applied for address clustering, and in [17], where information gathered from online forums is used to characterize CryptoLocker, a family of ransomware. Similarly, in [8], information scraped from online forums and social media is key to simulating an attacker and to summarizing the activity of both known and unknown Bitcoin users. In [3], a generic method to deanonymize a significant fraction of Bitcoin users by correlating their pseudonyms with public IP addresses is described. Reid et al. [29] demonstrate how it is possible to associate many public keys with each other, using a map of the topological network and external identifying information in order to investigate a large theft of Bitcoins.
Several recent studies have exploited machine learning algorithms for Bitcoin analysis. In [13], an unsupervised learning model is presented with the aim to identify atypical transactions related to money laundering. Monamo et al. [21] introduce a k-means classifier for object clustering and fraudulent activity detection in Bitcoin transactions. Another study on detection of anomalous behavior, suspicious users and transactions is presented in [27], where three unsupervised learning methods are applied to two graphs generated by the Bitcoin transaction network. Further, a supervised machine learning algorithm is used by [11] to uncover Bitcoin anonymity using a method for predicting the type of yet-unidentified entities. In [2], data mining techniques are used to implement and train a classifier to identify Ponzi schemes in the Bitcoin blockchain and in [18] a Bayesian optimized recurrent neural network (RNN) and a Long Short Term Memory (LSTM) are implemented to predict the direction of Bitcoin price in USD.
Recently, an interesting approach was presented in [28], where the concept of motifs is introduced to blockchain analysis. The authors analyzed the directed transaction hypergraph in order to identify several distinct statistical properties of exchange addresses. They were able to predict whether an address is owned by an exchange with > 80% accuracy. The introduction of hypergraphs (or dirhypergraphs) proved beneficial due to their significant advantages over the complex graph structure typically derived from Bitcoin networks. In [14], the motif concept is further developed and combined with multiple features (entity, address, temporal, centrality) to obtain a comprehensive entity classification into five categories: Exchange, Service, Gambling, Mining Pool and DarkNet marketplace. Using a total of 315 features, a global accuracy of 0.92 could be achieved.
Inspired by the good classification results presented in [14], we present here a novel machine-learning-based approach to attack Bitcoin anonymity, making use of motifs as introduced by Ranshous et al. and allowing for multi-class classification of Bitcoin entities as in [14], yet aiming to provide a straightforward methodology that relies on fewer, well-defined features. To achieve this, we introduce a novel cascading machine learning model for Bitcoin data analysis. The main idea is to implement a cascade of classifiers, so that outgoing classification results can be joined and can be used to enrich a final classification.
III. GRAPH MODEL
A. Blockchain Graph Model
Bitcoin transactions have a natural graph structure, with a fundamental example being the address-transaction graph (Figure 1). This graph is directly obtained from the information gathered from the blockchain and provides an estimation of the flow of Bitcoins linking public key addresses over time. The vertices represent the addresses $(a_1, a_2, \dots, a_N)$ and the transactions $(tx_1, tx_2, \dots, tx_M)$. The directed edges (arrows) between addresses and transactions indicate the incoming relations, while directed edges between transactions and addresses correspond to outgoing relations. Each directed edge can also include additional features such as values, time-stamps, etc.
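As a minimal sketch of this structure, the snippet below stores the address-transaction graph as a list of directed, attributed edges. The transaction records, field names and values are hypothetical toy data chosen for illustration; they are not taken from the data set used in this paper.

```python
# Hypothetical toy transactions: input addresses fund the transaction,
# output addresses receive from it. All identifiers and values are invented.
transactions = [
    {"txid": "tx_1", "time": 1, "inputs": {"a_1": 1.0, "a_2": 0.5}, "outputs": {"a_3": 1.4}},
    {"txid": "tx_2", "time": 2, "inputs": {"a_3": 1.4}, "outputs": {"a_4": 1.0, "a_1": 0.3}},
]


def build_address_transaction_graph(txs):
    """Return the directed edges (source, target, attributes) of the graph."""
    edges = []
    for tx in txs:
        for addr, value in tx["inputs"].items():
            # address -> transaction edge (incoming relation)
            edges.append((addr, tx["txid"], {"value": value, "time": tx["time"]}))
        for addr, value in tx["outputs"].items():
            # transaction -> address edge (outgoing relation)
            edges.append((tx["txid"], addr, {"value": value, "time": tx["time"]}))
    return edges


edges = build_address_transaction_graph(transactions)
```

Each edge carries the value and time-stamp as attributes, mirroring the additional edge features mentioned above.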
To improve anonymity in the network, users are encouraged to generate a new Bitcoin address for each new transaction, which is common advice for the correct usage of Bitcoin¹. Due to this procedure, several addresses belong to the same logical user, so that a simplification is possible by introducing the concept of entities. An entity is defined as a person or organization that controls or can control multiple public key addresses. This definition allows us to transform the address-transaction graph into the entity-transaction graph (Figure 2).
¹https://bitcoin.org/en/protect-your-privacy
The new graph is obtained by grouping addresses belonging to the same user into entities (address clustering). This operation is not trivial; however, several heuristic properties have already been presented with the aim of helping the clustering process, for example in [1], [15] and [7]. In the obtained graph, vertices represent the entities $(e_1, e_2, \dots, e_K)$ and the transactions $(tx_1, tx_2, \dots, tx_M)$. Similar to the address-transaction graph, directed edges between entities and transactions indicate the incoming relations, while directed edges between transactions and entities correspond to outgoing relations. The entity-transaction graph (Figure 2) summarizes the network well and constitutes an easily understandable representation of the money flow within the network.
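The clustering step can be illustrated with a short sketch. It assumes one commonly used heuristic, namely that all input addresses of a transaction are controlled by the same entity; the text does not state which heuristics were actually applied, so this choice, the toy transactions and all function names are illustrative assumptions.

```python
# Hypothetical toy transactions (identifiers invented for illustration).
transactions = [
    {"txid": "tx_1", "inputs": ["a_1", "a_2"], "outputs": ["a_3"]},
    {"txid": "tx_2", "inputs": ["a_3"], "outputs": ["a_4", "a_1"]},
]


def cluster_addresses(txs):
    """Map each address to an entity id using union-find over co-spent inputs.

    Assumption: addresses spent together in one transaction share an owner.
    """
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving keeps trees shallow
            a = parent[a]
        return a

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for tx in txs:
        for addr in tx["inputs"] + tx["outputs"]:
            find(addr)  # register every address, even if never co-spent
        for other in tx["inputs"][1:]:
            union(tx["inputs"][0], other)

    return {addr: find(addr) for addr in list(parent)}


def entity_transaction_edges(txs, entity_of):
    """Collapse address vertices into entity vertices."""
    edges = set()
    for tx in txs:
        for addr in tx["inputs"]:
            edges.add((entity_of[addr], tx["txid"]))
        for addr in tx["outputs"]:
            edges.add((tx["txid"], entity_of[addr]))
    return edges


entity_of = cluster_addresses(transactions)
entity_edges = entity_transaction_edges(transactions, entity_of)
```

In this toy example `a_1` and `a_2` are merged into one entity because they are spent together in `tx_1`, while `a_3` and `a_4` remain separate entities.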
B. Motifs Graph Model
Graph motifs were introduced in [16] and were motivated by applications in bioinformatics, specifically in metabolic network analysis. However, as shown in Section II, prior studies such as [28] have introduced the concept of motifs to Bitcoin analysis. In this paper, a definition of $N_{motif}$ is used, starting from the generalized concept introduced in [14].
Definition 1: An $N_{motif}$ is a path from the entity-transaction graph with length $2N$ that starts and ends with an entity. Let $(e_1, \dots, e_M) \in E$ be a class of entities and $(t_1, \dots, t_N) \in T$ be a class of transactions, with $M \leq N + 1$, then:
$$N_{motif} = (e_1, t_1, \dots, t_N, e_M)$$
in which at least one output from each transaction must be an input to the next transaction.
The term branch is used here to refer to a path in the motif graph that begins and ends with an entity passing through exactly one transaction. If a single branch of the graph has the same entity as input and output ($e_j = e_{j+1}$), the branch is called Direct Loop, otherwise it is called Direct Distinct.
From the motif definition it is clear that all transactions are ordered in time, which means that $\tau(t_1) < \tau(t_2) < \dots < \tau(t_N)$, where $\tau(t)$ denotes the time of transaction $t$.
Here, we use the $1_{motif}$ and $2_{motif}$ concepts. The $1_{motif}$ represents the relation between two entities (at least one distinct), while the $2_{motif}$ is the relation between three entities (at least one distinct) involved in two consecutive transactions.
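The following sketch enumerates $1_{motif}$ and $2_{motif}$ paths on a toy entity-transaction graph. It works at entity level, so the requirement that an output of one transaction feeds the next is approximated by requiring that the receiving entity of the first transaction is a sending entity of a later one; this simplification, the toy data and the function names are assumptions made for illustration.

```python
# Hypothetical entity-level transactions (identifiers and times invented).
transactions = [
    {"txid": "t_1", "time": 1, "inputs": ["e_1"], "outputs": ["e_2", "e_1"]},
    {"txid": "t_2", "time": 2, "inputs": ["e_2"], "outputs": ["e_3"]},
]


def one_motifs(txs):
    """All branches (e_in, t, e_out): paths of length 2 through one transaction."""
    return [
        (e_in, tx["txid"], e_out)
        for tx in txs
        for e_in in tx["inputs"]
        for e_out in tx["outputs"]
    ]


def two_motifs(txs):
    """All paths (e_1, t_1, e_2, t_2, e_3) with t_1 strictly earlier than t_2."""
    time_of = {tx["txid"]: tx["time"] for tx in txs}
    branches = one_motifs(txs)
    return [
        (e1, t1, e2, t2, e3)
        for (e1, t1, e2) in branches
        for (e2_in, t2, e3) in branches
        if e2_in == e2 and time_of[t1] < time_of[t2]
    ]


def branch_type(e_in, e_out):
    """Label a branch as Direct Loop (same entity at both ends) or Direct Distinct."""
    return "Direct Loop" if e_in == e_out else "Direct Distinct"


m1 = one_motifs(transactions)
m2 = two_motifs(transactions)
```

Here the branch `("e_1", "t_1", "e_1")` is a Direct Loop (for example, change returned to the sender), and the only $2_{motif}$ is the path from `e_1` through `t_1`, `e_2` and `t_2` to `e_3`.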
IV. DATA OVERVIEW
We considered the whole Bitcoin blockchain data created until February 5th 2019, 08:13:31 AM, corresponding to 561,620 blocks, which contain about 380,000,000 transactions and involve more than 1,000,000,000 addresses. This data was then combined with information available on the WalletExplorer², a benchmark platform for entities detection, which represents a collection of information about different known entities that have been detected until today. The data set is thus composed of 311 different samples, divided into six classes (see Table I):
- Exchange: entities that allow their customers to trade fiat currencies for Bitcoins (or vice versa)
- Service: entities that offer Bitcoin payment methods as solutions to their business (financial services, trading, lending, etc.)
- Gambling: entities that offer gambling services (casino, betting, roulette, etc.)
- Mining Pool: entities composed of a group of miners that work together sharing their resources in order to reduce the volatility of their returns
- Mixer: entities that offer a service to obscure the traceability of their clients’ transactions
- Marketplace: entities that allow users to buy, paying with Bitcoin, goods or services that are illegal in most countries
As shown in Table I, Exchange is the most represented class with more than 60% of the addresses, while the Mining Pool class is the least represented with just 0.47% (even though it has more distinct entities than the Marketplace and Service classes).
Class | Abbreviation | # Entities | # Addresses | % Addresses |
---|---|---|---|---|
Exchange | Ex | 137 | 9,943,512 | 61.63 |
Gambling | Gmb | 76 | 3,054,238 | 18.93 |
Marketplace | Mrk | 20 | 2,349,210 | 14.56 |
Mining Pool | Pool | 25 | 76,104 | 0.47 |
Mixer | Mxr | 37 | 475,714 | 2.95 |
Service | Serv | 16 | 235,629 | 1.46 |
Total | | 311 | 16,134,407 | 100 |
TABLE I: Overview of WalletExplorer data used for this study
²https://www.walletexplorer.com/
Cross-references between Bitcoin blockchain data and labelled data from the WalletExplorer allow us to re-size the original data set by removing all the unlabelled and unusable data. As such, we focus our analysis on known entities only. From this new data set, four dataframes (2-dimensional labelled data structure or data table with samples as rows and extracted features as columns) were extracted for the proposed analysis:
- Entity dataframe contains all features related to an entity that can be directly extracted from the blockchain. They are: the amount of BTC received/sent, the balance of the entity, the number of transactions in which this entity is the receiver/sender, and the number of addresses belonging to this entity used for receiving/sending money. (This dataframe was composed of 311 samples and 7 features)
- Address dataframe contains all features related to Bitcoin addresses. Features are: the number of transactions in which a given address appears as receiver/sender, the amount of BTC received/sent from/to this address, the balance, uniqueness (whether the address is used in only one transaction) and siblings. (This dataframe was composed of 16,134,407 samples and 7 features)
- 1_motif dataframe contains the information directly extracted from the 1-motif graph. In this case, each row contains: the amount received/sent in the transaction, the number of distinct addresses used for receiving/sending money, the number of similar received/sent transactions between the entities in the branch, the fee, and whether the branch realizes a Direct Loop or Direct Distinct path. (This dataframe was composed of 58,076,963 samples and 9 features)
- 2_motif dataframe contains information gathered from the 2-motif graph. The features analyzed are: the number of addresses as input/output for the first and second branch in the 2-motif graph, the amount received/sent in the first and second branch, the fee of both considered transactions, the number of similar sent transactions between the entities in the first and second branch, Direct Loop or Direct Distinct path for the first and the second branch, and Direct Loop or Direct Distinct path considering the whole 2-motif path (see Figure 3). (This dataframe was composed of 83,443,055 samples and 18 features)
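To make the seven entity features concrete, the sketch below derives them for one entity from a toy transaction list. The record layout, the integer amounts and the definition of balance as received minus sent are assumptions made for illustration; the text does not specify how each feature was computed.

```python
# Hypothetical transactions: inputs/outputs are (entity, address, amount) triples.
# Amounts are integers (e.g. satoshi) to avoid floating-point noise.
transactions = [
    {"txid": "tx_0", "inputs": [], "outputs": [("e_1", "a_1", 100), ("e_1", "a_2", 50)]},
    {"txid": "tx_1", "inputs": [("e_1", "a_1", 100), ("e_1", "a_2", 50)],
     "outputs": [("e_2", "a_3", 140)]},
    {"txid": "tx_2", "inputs": [("e_2", "a_3", 140)],
     "outputs": [("e_3", "a_4", 100), ("e_1", "a_5", 30)]},
]


def entity_features(entity, txs):
    """Compute the seven entity-level features for one entity."""
    received = sent = 0
    n_tx_received = n_tx_sent = 0
    receiving_addresses, sending_addresses = set(), set()
    for tx in txs:
        ins = [(a, v) for (e, a, v) in tx["inputs"] if e == entity]
        outs = [(a, v) for (e, a, v) in tx["outputs"] if e == entity]
        if ins:  # the entity is a sender in this transaction
            n_tx_sent += 1
            sent += sum(v for _, v in ins)
            sending_addresses.update(a for a, _ in ins)
        if outs:  # the entity is a receiver in this transaction
            n_tx_received += 1
            received += sum(v for _, v in outs)
            receiving_addresses.update(a for a, _ in outs)
    return {
        "amount_received": received,
        "amount_sent": sent,
        "balance": received - sent,  # assumed definition
        "n_tx_received": n_tx_received,
        "n_tx_sent": n_tx_sent,
        "n_receiving_addresses": len(receiving_addresses),
        "n_sending_addresses": len(sending_addresses),
    }


features_e1 = entity_features("e_1", transactions)
```

Applying the function to every known entity would yield one row per entity, analogous to the 311-row entity dataframe described above.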
V. MACHINE LEARNING
A. Classifier Models
To demonstrate benefits and limits of our approach, we conducted two different experiments. Firstly, we created a simple classifier, called $C_{\text{entity}}$ (Figure 4), merely based on the samples stored in the entity dataframe, containing (seven) entity-related features that can be directly extracted from the blockchain. This classifier was evaluated via a cross-validation process (see Section V-B). Results from cross-validation were considered as our baseline classification. The simple classifier was implemented in three versions applying Adaboost, Random Forest and Gradient Boosting models as those previously yielded good classification results for Bitcoin data [28].
In the second experiment, prior to entity classification according to the six classes (Table I), we built three separate classifiers, based on the additionally available address, $1_{\text{motif}}$, and $2_{\text{motif}}$ dataframes and their respective features ($7 + 9 + 18 = 34$ features). Outgoing information from these classifications was processed, as shown in Figure 6, in order to create a set of six new features for each classifier, which were then used to enrich (extend) the entity dataframe. Finally, a new classifier $C_{\text{final}}$ was generated to obtain final entity classification based on this enriched entity dataframe and its 25 features ($7$ belonged to the entity dataframe and $6 \times 3$ were generated from the three classifiers $C_{\text{address}}, C_{\text{motif1}}, C_{\text{motif2}}$). With this cascading approach, new entity-related characteristics were added to the entity dataframe, ultimately improving the classification as demonstrated in the following sections.
The first step was to split the address, $1_{\text{motif}}$ and $2_{\text{motif}}$ dataframes into two parts called A-data set (for training) and B-data set (for testing) with a proportion of $70/30$. The A-data set was used to compute cross-validation of the three $C_{\text{address}}, C_{\text{motif1}}, C_{\text{motif2}}$ classifier models (Figure 5). After that, the B-data set was used as input for the trained classifiers $C_{\text{address}}, C_{\text{motif1}}, C_{\text{motif2}}$ in order to obtain classification results based on completely new, unseen data.
Classification results essentially assign one of the six possible output classes to each entry in the input dataframe. As each entry has its original (ground truth) label obtained from the WalletExplorer, we can join input label and computed output class and perform a group-by and count operation as illustrated in Figure 6: we count how many times a sample belonging to a particular entity has been detected in each of the considered classes. This value is then normalized as indicated in the following formula:
$$\forall \xi \in E: \quad \frac{|P_\xi|_j}{\sum_{i=1}^{N} |P_\xi|_i} \times 100 \quad \text{with} \quad j = 1, \dots, N$$
where $E$ is the set of entities and $N$ represents the number of considered classes ($N = 6$ in this study). The term $|P_\xi|_j$ represents how many times a sample originally labelled with entity $\xi$ generates a prediction belonging to class $j$, while the term $\sum_{i=1}^{N} |P_\xi|_i$ counts all the predictions generated from samples whose input label belongs to entity $\xi$.
These normalized values form a dataframe containing 311 samples (one for each known entity as in the entity dataframe) and six new features, representing the percentage of being classified as belonging to one of the six classes. These features were added to the entity dataframe for data enrichment, constituting our cascading machine learning system. The elements of the enriched entity dataframe were used to implement and evaluate the final classifier, called $C_{\text{final}}$, and a cross-validation process (Section V-B) was applied to compute its performance.
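The group-by, count and normalization steps can be sketched in a few lines. The entity identifiers and predictions below are invented toy values, and the function name is hypothetical; only the six class names and the percentage normalization follow the description above.

```python
from collections import Counter, defaultdict

CLASSES = ["Exchange", "Gambling", "Marketplace", "Mining Pool", "Mixer", "Service"]


def prediction_percentages(entity_ids, predicted_classes, classes=CLASSES):
    """For each entity, return the percentage of its samples predicted as each class."""
    counts = defaultdict(Counter)
    for entity, predicted in zip(entity_ids, predicted_classes):
        counts[entity][predicted] += 1  # group by entity, count per predicted class
    features = {}
    for entity, counter in counts.items():
        total = sum(counter.values())  # all predictions for this entity
        features[entity] = [100.0 * counter[c] / total for c in classes]
    return features


# Toy example: four address samples of entity e_1 and two of entity e_2.
entity_ids = ["e_1", "e_1", "e_1", "e_1", "e_2", "e_2"]
predictions = ["Exchange", "Exchange", "Exchange", "Mixer", "Gambling", "Gambling"]
new_features = prediction_percentages(entity_ids, predictions)
```

Running this once per classifier ($C_{\text{address}}$, $C_{\text{motif1}}$, $C_{\text{motif2}}$) gives three six-value vectors per entity, which together with the seven original entity features account for the 25 features of the enriched dataframe.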
To allow for better comparison between experiments, we implemented all classifier models ($C_{\text{address}}$, $C_{\text{motif1}}$, $C_{\text{motif2}}$, and $C_{\text{final}}$) with Adaboost, Random Forest and Gradient Boosting models. Specifically, all Adaboost classifiers were generated with the number of estimators set to 50 and the learning rate set to 1. All Random Forest models were implemented with the number of estimators set to 10, a Gini function to measure the quality of the split and without a maximum depth of the tree. All Gradient Boosting models were implemented with the number of estimators set to 100, the learning rate set to 0.1 and the maximum depth for limiting the number of nodes set to 3.
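In Scikit-learn, these settings could be expressed as in the sketch below. The hyperparameter values are the ones listed above; the `make_models` helper and the `random_state` argument are assumptions added for reproducibility and are not mentioned in the text.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)


def make_models(random_state=0):
    """Build the three model types with the hyperparameters stated in the text.

    random_state is an assumption added here; the text does not specify a seed.
    """
    return {
        "Adaboost": AdaBoostClassifier(
            n_estimators=50, learning_rate=1.0, random_state=random_state
        ),
        "Random Forest": RandomForestClassifier(
            n_estimators=10, criterion="gini", max_depth=None,
            random_state=random_state,
        ),
        "Gradient Boosting": GradientBoostingClassifier(
            n_estimators=100, learning_rate=0.1, max_depth=3,
            random_state=random_state,
        ),
    }


models = make_models()
```

A fresh set of models would be created for each classifier in the cascade, so that no fitted state is shared between them.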
B. Evaluation Metrics
All classification models were evaluated by extracting and comparing classification metrics via a cross-validation process. The goal of cross-validation is to analyze the prediction capabilities of the model in order to detect problems such as over-fitting or selection bias [4]. Here, we used stratified K-fold cross-validation, with a value of K equal to 5. This method involves dividing the whole data set into K equal partitions or folds.
Each fold is composed of data ensuring a good representative sample of the whole population by keeping the same proportion of classes present in the original data set (stratification). Then, K-1 folds are used to train the model and the one left-out fold is used to evaluate the predictions obtained by the trained model. The entire process is repeated K times, until each fold has been left out once, testing all possible combinations. During this process, the following metrics were computed:
- Accuracy or Score is defined as the number of correct predictions divided by the total number of predictions and is given as percentage
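- Precision is the number of samples correctly predicted as belonging to a class divided by the total number of samples predicted as belonging to that class. It represents a measure of a classifier’s exactness given as a value between 0 and 1, with 1 relating to high precision
- Recall is the number of samples correctly predicted as belonging to a class divided by the total number of samples that actually belong to that class. It represents a measure of a classifier’s completeness given as a value between 0 and 1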
- F1-score is the harmonic mean of Precision and Recall. It takes values between 0 and 1, with 1 relating to perfect Precision and Recall
$$F_{1\text{-score}} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
- Matthews Correlation Coefficient (MCC) is a metric yielding easy comparison with respect to a random baseline, suitable for unbalanced classes. It takes values between $-1$ and $+1$. A coefficient of $+1$ represents a perfect prediction, 0 an average random prediction and $-1$ an inverse prediction. As shown in [10], let $K$ be the number of classes and $C$ be a confusion matrix with dimensions $K \times K$, the MCC can be calculated as:
$$MCC_{\text{part1}} = \sqrt{\sum_{k} \Big(\sum_{l} C_{kl}\Big) \Big(\sum_{f,g \mid f \neq k} C_{fg}\Big)}$$
$$MCC_{\text{part2}} = \sqrt{\sum_{k} \Big(\sum_{l} C_{lk}\Big) \Big(\sum_{f,g \mid f \neq k} C_{gf}\Big)}$$
$$MCC = \frac{\sum_{k} \sum_{l} \sum_{m} \left(C_{kk} C_{lm} - C_{kl} C_{mk}\right)}{MCC_{\text{part1}} \times MCC_{\text{part2}}}$$
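The metrics listed above can be computed directly from a confusion matrix, as in the following sketch. It assumes that rows of $C$ hold the true classes and columns the predicted classes, and it evaluates the MCC through an algebraically equivalent closed form based on row sums, column sums and the trace; the function names and the example matrices are illustrative only.

```python
import math


def accuracy(C):
    """Correct predictions divided by all predictions, as a percentage."""
    return 100.0 * sum(C[k][k] for k in range(len(C))) / sum(map(sum, C))


def per_class_metrics(C):
    """Return a (precision, recall, f1) tuple for each class."""
    K = len(C)
    results = []
    for k in range(K):
        tp = C[k][k]
        predicted_k = sum(C[i][k] for i in range(K))  # column sum
        actual_k = sum(C[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append((precision, recall, f1))
    return results


def mcc(C):
    """Multi-class MCC from a K x K confusion matrix."""
    K = len(C)
    s = sum(map(sum, C))                                     # number of samples
    c = sum(C[k][k] for k in range(K))                       # correct predictions
    t = [sum(C[k]) for k in range(K)]                        # true count per class
    p = [sum(C[i][k] for i in range(K)) for k in range(K)]   # predicted count per class
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = math.sqrt(
        (s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t))
    )
    return numerator / denominator if denominator else 0.0


perfect = [[2, 0], [0, 3]]
chance = [[1, 1], [1, 1]]
mixed = [[3, 1], [0, 4]]
```

For the perfectly diagonal matrix the MCC is $+1$, while the uniform matrix gives 0, in line with the interpretation of the coefficient given above.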
In Section VI results for the baseline model ($C_{entity}$) and for the final model obtained after cross-validation using the enriched dataframe ($C_{final}$) are presented and compared. We report global metric values for Accuracy/Score and MCC averaged over the $K=5$ cross-validation runs and per-class values for Precision, Recall and F1-score when evaluating the final models.
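The stratified 5-fold procedure described in this section can be sketched without any library support. The version below deals the samples of each class round-robin into the folds and does not shuffle, which is a simplification; the function name and the toy labels are hypothetical.

```python
from collections import defaultdict


def stratified_k_fold(labels, k=5):
    """Yield (train_indices, test_indices) pairs that preserve class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # spread each class over all folds

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            index for j, fold in enumerate(folds) if j != i for index in fold
        )
        yield train, test


# Toy labels: ten Exchange entities and five Mixer entities.
labels = ["Ex"] * 10 + ["Mxr"] * 5
splits = list(stratified_k_fold(labels, k=5))
```

Each of the five test folds then contains two `Ex` samples and one `Mxr` sample, and the reported global metrics would be the average of the per-fold values.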
C. Hardware and Software Configuration
All analyses were run on a cluster of three virtual machines, each with 16 Intel(R) Xeon(R) Silver 4114 CPUs @ 2.20 GHz, 64 GB of DDR4 RAM at 2,666 MHz, and a 500 GB SATA hard disk. Apache Spark³ v2.4.0, set in cluster mode, was used to manage stored data using Apache Hadoop⁴. The various classifier models were implemented and evaluated using Python’s Scikit-learn⁵ library. All scripts were executed within the Jupyter-notebook⁶ environment.
³https://spark.apache.org/
⁴https://hadoop.apache.org/
⁵https://scikit-learn.org/
⁶https://jupyter.org/
VI. RESULTS
Considering the simple classifier $C_{entity}$ from the first experiment, the Gradient Boosting model yielded a better average score (61.90% accuracy) and MCC (0.44) than Random Forest and Adaboost classifiers, as shown in Table II (upper section).