Decoding Illicit Bitcoin Transactions: A Multi-Methodological Approach for Anti-Money Laundering and Fraud Detection in Cryptocurrencies PDF Free Download

1 / 147
0 views147 pages

Decoding Illicit Bitcoin Transactions: A Multi-Methodological Approach for Anti-Money Laundering and Fraud Detection in Cryptocurrencies PDF Free Download

Decoding Illicit Bitcoin Transactions: A Multi-Methodological Approach for Anti-Money Laundering and Fraud Detection in Cryptocurrencies PDF free Download. Think more deeply and widely.

Decoding Illicit Bitcoin Transactions: A Multi-Methodological Approach for
Anti-Money Laundering and Fraud Detection in Cryptocurrencies
by
Ardeshir Shojaeinasab
A Dissertation Submitted in Partial Fulfillment of the
Requirements for the Degree of
DOCTOR OF PHILOSOPHY
in the Department of Electrical and Computer Engineering
©Ardeshir Shojaeinasab, 2024
University of Victoria
All rights reserved. This dissertation may not be reproduced in whole or in part, by
photocopying or other means, without the permission of the author.
ii
Decoding Illicit Bitcoin Transactions: A Multi-Methodological Approach for
Anti-Money Laundering and Fraud Detection in Cryptocurrencies
by
Ardeshir Shojaeinasab
Supervisory Committee
Dr. Homayoun Najjaran, Supervisor
(Department of Electrical and Computer Engineering)
Dr. Riham ALTawy, Departmental Member
(Department of Electrical and Computer Engineering)
Dr. Yun Lu, Outside Member
(Department of Computer Sicence)
iii
ABSTRACT
This dissertation examines the challenges of detecting illicit activities in cryptocur-
rency transactions, with a focus on Bitcoin. It begins by analyzing cryptocurrency
mixing services and their obfuscation techniques. The research then provides a com-
prehensive evaluation framework for these services, conducting an assessment of all
available services and academic proposals. Following this, the study introduces a
novel framework that uses statistical patterns to identify potential money laundering
and clustering cryptocurrency addresses that can reveal real-world identities involved
in illicit transactions.
The study then leverages the Elliptic dataset, a graph representation of Bitcoin
transactions, to classify illicit activities. While classical machine learning methods
struggled with the imbalanced nature of financial fraud data, Graph Neural Networks
(GNNs) - specifically Graph Convolutional Networks and Graph Attention Networks
- proved more effective. By considering the graph topology and connections between
nodes, GNNs significantly reduced false negative rates in detecting illicit transactions.
To enhance transparency, the research employs Explainable AI techniques, partic-
ularly SHAP values, to interpret the decision-making process of GNN models. This
approach not only improves model trustworthiness but also provides insights into the
key features and graph structures that contribute to illicit activity detection.
The thesis concludes by presenting a comprehensive toolkit for combating digital
financial crimes. It demonstrates that despite the perceived anonymity of blockchain
technology, effective methods exist to unveil illicit activities, thus enhancing the secu-
rity and integrity of cryptocurrency transactions. This work bridges the gap between
technological advancement and regulatory compliance, establishing a new standard
in the fight against cryptocurrency-based crime.
iv
Contents
Supervisory Committee ii
Abstract iii
Table of Contents iv
List of Tables viii
List of Figures x
Acknowledgements xii
Dedication xv
1 Introduction 1
1.1 Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Research Objectives and Questions . . . . . . . . . . . . . . . . . . . 5
1.3 Scope and Limitations of the Research . . . . . . . . . . . . . . . . . 6
2 Background 7
2.1 GeneralDenitions............................ 7
2.1.1 Cryptocurrency Mixers . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Obfuscating Techniques . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 All the Proposed Mixing Frameworks . . . . . . . . . . . . . . . . . . 15
2.4 Academic Proposed Mixing Approaches . . . . . . . . . . . . . . . . . 15
2.4.1 Decentralized Mixing Frameworks . . . . . . . . . . . . . . . . 16
2.4.2 Centralized Mixing Frameworks . . . . . . . . . . . . . . . . . 20
2.5 Real-World Mixing Approaches . . . . . . . . . . . . . . . . . . . . . 24
2.6 Comparison Between Academic and Real-World Services . . . . . . . 34
2.7 Possible Attacks on Mixing Services . . . . . . . . . . . . . . . . . . . 36
v
2.8 Provision of an Evaluation Framework to Assess Mixing Services . . . 38
3 Money Laundry Identification in Bitcoin 47
3.1 DataGathering.............................. 47
3.1.1 Retrieving Blockchain Data . . . . . . . . . . . . . . . . . . . 47
3.1.2 Collecting Labels for Mixing Transactions . . . . . . . . . . . 48
3.2 Extracting Statistical Pattern . . . . . . . . . . . . . . . . . . . . . . 49
3.3 First Phase of Detection; Uncovering Transaction-Level Patterns . . . 50
3.3.1 Inputs and Output Counts Pattern . . . . . . . . . . . . . . . 50
3.3.2 Address Types Pattern . . . . . . . . . . . . . . . . . . . . . . 51
3.3.3 Pattern of Amount Fraction in Transfer . . . . . . . . . . . . . 51
3.4 Second Phase of Detection; Uncovering Chain-Level Patterns . . . . . 52
3.4.1 Deposit Transactions Footprint in the Mixing Services . . . . 52
3.4.2 Examining Sweeper Transactions . . . . . . . . . . . . . . . . 52
3.4.3 Defining the Transaction Chain . . . . . . . . . . . . . . . . . 53
3.4.4 Implementing Algorithm of Identifying Mixing Transaction Chain 55
3.4.5 Identifying the Start of the Mixing Chain . . . . . . . . . . . . 56
3.4.6 Investigating the Transaction Chain Started from Sweeper Trans-
action ............................... 56
3.4.7 Transaction Chain Length Pattern . . . . . . . . . . . . . . . 57
3.4.8 Transaction Chain Time Pattern . . . . . . . . . . . . . . . . 57
3.5 Identification of Mixing Service Addresses in the Network . . . . . . . 57
3.6 Identification of Addresses Associated with Mixing Service Users . . . 58
3.6.1 Clean-Side Money Addresses . . . . . . . . . . . . . . . . . . . 58
3.6.2 Dirty-Side Money Addresses . . . . . . . . . . . . . . . . . . . 58
3.6.3 Analyzing the Possible Causes of the Observed Patterns . . . 59
3.7 Expanding Identified Addresses . . . . . . . . . . . . . . . . . . . . . 61
3.7.1 Address Clustering . . . . . . . . . . . . . . . . . . . . . . . . 61
3.8 Labeling Detected Clusters . . . . . . . . . . . . . . . . . . . . . . . . 61
3.8.1 Mapping IP Addresses to Network Address Groups . . . . . . 61
3.9 Two-Phase Detection Algorithm . . . . . . . . . . . . . . . . . . . . . 62
3.9.1 Suspected Transactions Identifier Implementation . . . . . . . 62
3.9.2 Implementation of a Two-Phase Algorithm for Detecting Mix-
ing Transactions with High Accuracy . . . . . . . . . . . . . . 63
3.10Evaluation................................. 65
vi
4 Bitcoin Illicit Transactions Identification 67
4.1 TheEllipticDataSet........................... 67
4.1.1 Graph Construction . . . . . . . . . . . . . . . . . . . . . . . 67
4.1.2 Notes on Feature Construction . . . . . . . . . . . . . . . . . . 69
4.1.3 Research and Practical Implications . . . . . . . . . . . . . . . 71
4.2 Fraud Activity Classification . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.1 Autoencoders for Feature Extraction . . . . . . . . . . . . . . 71
4.2.2 Classical Machine Learning Classifiers . . . . . . . . . . . . . . 72
4.3 Model Training and Evaluation . . . . . . . . . . . . . . . . . . . . . 75
4.3.1 TrainingProcess ......................... 75
4.3.2 ModelEvaluation......................... 76
4.3.3 Interpretation of Results . . . . . . . . . . . . . . . . . . . . . 77
4.3.4 Comparative Analysis . . . . . . . . . . . . . . . . . . . . . . 79
4.4 Insights and Interpretations . . . . . . . . . . . . . . . . . . . . . . . 80
4.5 Graph Neural Network Based Approaches . . . . . . . . . . . . . . . 82
4.5.1 Graph Convolutional Networks (GCNs) . . . . . . . . . . . . . 82
4.5.2 GCN Implementation for Bitcoin Transaction Classification . . 83
4.5.3 Community Detection Using the Louvain Algorithm . . . . . . 83
4.5.4 Graph Attention Network (GAT) . . . . . . . . . . . . . . . . 86
4.5.5 Comparative Analysis of GCN and GAT Models . . . . . . . . 91
5 Information Unveiling of the Elliptic Dataset and Enhanced Tech-
niques for Identifying Illicit Activities 93
5.1 Elliptic Dataset Deanonymization Attack . . . . . . . . . . . . . . . . 93
5.1.1 Feature Correlation Analysis . . . . . . . . . . . . . . . . . . . 93
5.1.2 Normalization and Scaling . . . . . . . . . . . . . . . . . . . . 95
5.1.3 Derivation of min diff feature 6 . . . . . . . . . . . . . . . . . 95
5.1.4 Linear Transformation . . . . . . . . . . . . . . . . . . . . . . 95
5.1.5 Application of the Transformation . . . . . . . . . . . . . . . . 96
5.1.6 Formulas for Inputs and Outputs . . . . . . . . . . . . . . . . 96
5.1.7 Consistency of Feature Mapping . . . . . . . . . . . . . . . . . 96
5.2 Deanonymization Algorithm . . . . . . . . . . . . . . . . . . . . . . . 97
5.3 DateRevelation.............................. 98
5.4 Additional Transaction Characteristics . . . . . . . . . . . . . . . . . 98
5.5 Results and Validation . . . . . . . . . . . . . . . . . . . . . . . . . . 98
vii
5.6 Enhanced Fraud Activity Identification on the Deanonymized Elliptic
Dataset .................................. 99
5.6.1 Application of the Two-Phase Mixing Detection Algorithm . . 99
5.6.2 First Phase: Candidate Transaction Identification . . . . . . . 99
5.6.3 Second Phase: Chain Analysis and Feature Extraction . . . . 99
5.7 Integration of Chain Features into Graph Neural Networks . . . . . . 101
5.7.1 Evaluation of Extracted Transactions . . . . . . . . . . . . . . 101
5.8 Assessment of Enhanced Dataset Across Entire Transaction Set . . . 103
5.9 Model Interpretability and Explainability . . . . . . . . . . . . . . . . 104
5.9.1 ShapleyValues .......................... 104
5.9.2 SHapley Additive exPlanation (SHAP) Values . . . . . . . . . 105
5.9.3 Feature Importance Analysis using SHAP . . . . . . . . . . . 105
5.9.4 Feature Importance Analysis using GNNShap . . . . . . . . . 106
5.9.5 Application of Explainable AI Techniques in Our Use Case . . 107
5.10 Elliptic Classification Explainability . . . . . . . . . . . . . . . . . . . 108
5.10.1 Global Feature Importance . . . . . . . . . . . . . . . . . . . . 108
5.10.2 Feature Value Impact . . . . . . . . . . . . . . . . . . . . . . . 109
5.10.3 CaseStudies............................ 110
5.10.4 Graph Structure Importance (GNNShap) . . . . . . . . . . . . 112
5.10.5 Insights and Implications . . . . . . . . . . . . . . . . . . . . . 113
5.11Discussions ................................ 114
5.11.1 Distinctive Aspects of Mixing Transactions . . . . . . . . . . . 115
5.11.2 Importance of Peeling Chain Features . . . . . . . . . . . . . . 116
5.11.3 Global Feature Importance . . . . . . . . . . . . . . . . . . . . 116
6 Conclusions 117
6.1 Summary of Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.2 Contributions to the Field . . . . . . . . . . . . . . . . . . . . . . . . 118
6.3 Implications for Practice . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.4 Limitations & Future Research Directions . . . . . . . . . . . . . . . 120
Bibliography 122
viii
List of Tables
Table 2.1 Functionalities employed by all mixing solutions proposed in academia. 23
Table 2.2 Functionalities employed by all mixing services available in market 33
Table 2.3 The evaluation of all mixing solutions proposed in academia and
market. Rows colors stand as Blue: Decentralized Academic,
Orange: Centralized Academic, Green: centralized Real-World,
and Pink: Decentralized Real-World . . . . . . . . . . . . . . . . 45
Table 3.1 Data gathered from mixing services in 2019 for training phase . 49
Table 3.2 Method evaluation using generated dataset during time . . . . . 66
Table 4.1 Architecture of the autoencoder used for feature extraction. . . . 72
Table 4.2 Classification Report for Logistic Regression . . . . . . . . . . . 76
Table 4.3 Classification Report for Decision Tree . . . . . . . . . . . . . . 77
Table 4.4 Classification Report for Random Forest . . . . . . . . . . . . . 77
Table 4.5 Classification Report for Support Vector Classifier . . . . . . . . 77
Table 4.6 Classification Report for CNN . . . . . . . . . . . . . . . . . . . 78
Table 4.7 Comparative Analysis of Classification Models . . . . . . . . . . 79
Table 4.8 GCN Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Table 4.9 Performance metrics of the GCN classifier. . . . . . . . . . . . . 86
Table 4.10GAT Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Table 4.11Performance metrics of the GAT classifier. . . . . . . . . . . . . 90
Table 4.12Performance comparison of GCN and GAT models . . . . . . . 91
Table 5.1 Significant Correlations with In-degree and Out-degree . . . . . 94
Table 5.2 Derivation of min diff feature 6 . . . . . . . . . . . . . . . . . . 95
Table 5.3 Performance Metrics of GCN and GAT Models Trained and Eval-
uated on the Enhanced Model and the Mixing Transactions Fil-
teredDataset ............................ 102
ix
Table 5.4 Performance Metrics of GCN and GAT Models Trained and Eval-
uated on the Enhanced and the All the Dataset Transactions . . 104
x
List of Figures
Figure 1.1 Overview of the Thesis Structure and Key Components and Con-
tributions............................... 4
Figure 2.1 Overview of swapping process [95] . . . . . . . . . . . . . . . . 12
Figure 2.2 Sample of an Aggregating address [95] . . . . . . . . . . . . . . 15
Figure 2.3 Overview of Peeling Chain schema . . . . . . . . . . . . . . . . 15
Figure 2.4 Overview of CoinShuffle’s mixing process [90] . . . . . . . . . . 17
Figure 2.5 Overview of TumbleBit’s mixing graph [61]. Tumbler’s view on
mixing associations is blinded. . . . . . . . . . . . . . . . . . . 22
Figure 2.6 Sample of created communities (aggregation addresses) in Bit-
coinFog[77]............................. 43
Figure 2.7 Overview of ShapeShift’s mixing process [103] . . . . . . . . . . 44
Figure 2.8 Daily transactions in TC ETH pools. There was a panic exit
when the OFAC sanctions were announced [100] . . . . . . . . . 44
Figure 3.1 Pattern of suspected mixing withdrawal transaction in first phase 50
Figure 3.2 Sample withdrawal transaction of MixTum Service . . . . . . . 50
Figure 3.3 Schematic view of sweeper transaction . . . . . . . . . . . . . . 52
Figure 3.4 Tree view of transaction chain of an address we got out cleaned
money from a mixing service . . . . . . . . . . . . . . . . . . . 53
Figure 3.5 Schematic view of the transaction chain . . . . . . . . . . . . . 54
Figure 3.6 Sample real sweeper transaction beginning of transaction chain 56
Figure 3.7 Dirty-Side Money and Connection of It to the Transaction Chain 60
Figure 3.8 A full mixing transaction chain concluded by the third algorithm 65
Figure 4.1 TSNE Visualization of Fraud and Non-Fraud Transactions Con-
veyed from Their Features . . . . . . . . . . . . . . . . . . . . . 68
Figure 4.2 Transaction Frequency in Elliptic Categorized by Label over each
TimeStep.............................. 70
xi
Figure 4.3 TSNE plot showing fraudulent and non-fraudulent transactions
post Autoencoder latent space transformation . . . . . . . . . . 73
Figure 4.4 Confusion Matrices for Various Classifiers . . . . . . . . . . . . 78
Figure 4.5 Confusion Matrices for CNN . . . . . . . . . . . . . . . . . . . 80
Figure 4.6 ROC and Precision-Recall Curves for Various Classifiers . . . . 80
Figure 4.7 Confusion matrix of the GCN classifier. . . . . . . . . . . . . . 87
Figure 4.8 Confusion matrix of the GAT classifier. . . . . . . . . . . . . . 90
Figure 5.1 Significant Correlations with the In-degree and Out-degree Columns
Generated by Constructing the Graph . . . . . . . . . . . . . . 94
Figure 5.2 Confusion matrices for GCN and GAT models trained and eval-
uated on the Enhanced Model and the Mixing Transactions Fil-
tered Dataset. Left: GCN model; Right: GAT model. . . . . . 102
Figure 5.3 Confusion Matrix of GCN and GAT Models Trained and Eval-
uated on the Enhanced and the All the Dataset Transactions.
Left: GCN model; Right: GAT model. . . . . . . . . . . . . . . 103
Figure 5.4 Global feature importance based on mean —SHAP value— . . 109
Figure 5.5 Impact of feature values on model predictions . . . . . . . . . . 110
Figure 5.6 SHAP waterfall plot for an illicit transaction prediction . . . . . 111
Figure 5.7 SHAP waterfall plot for a correctly classified licit transaction . 112
Figure 5.8 GNNShap visualization of graph topology importance . . . . . 113
Figure 5.9 GNNShap bar plot of edge importance . . . . . . . . . . . . . . 114
xii
ACKNOWLEDGEMENTS
I would like to express my deepest gratitude to my supervisor, Dr. Homayoun
Najjaran, for his unwavering support, wise advice, and invaluable guidance through-
out my PhD journey. His mentorship has been instrumental in the completion of this
dissertation.
I am deeply grateful to Dr. Behnam Bahrak and Dr. Masoud Jalayer
for their unwavering support and guidance in research. Their invaluable insights and
encouragement have been essential. Additionally, I can always rely on them as friends.
I also extend my sincere thanks to Nick Smart from Crystal Blockchain and
Soroujit Bhaduri from Mastercard Co. for their continuous ideas and support.
Special thanks to my significant co-authors: AmirPasha Motamed,Todd
Charter, and Alireza Arbabi, whose collaboration has enriched my research.
On a more personal note, I am grateful for the emotional and friendly support
I received at the start of my PhD journey from Abtin and Pouya, who helped me
settle into my new situation. Masoud and Arghavan, as well as my ex-housemate
Khaled, also provided immense support.
My best friends Saleh,Mehrshid,Niloofar,Alireza, and Behdad have been
a source of remote support and encouragement, for which I am deeply thankful.
I am fortunate to have supportive friends such as Mehran G.,Ali,Mahya K.,
Mehran F.,Navid,Amir,Assef, and Mahya R.. Special thanks to Maryam for
her unconditional support, care, and consideration throughout this journey.
A heartfelt thanks to my parents and my brother for always believing in me
and providing unwavering support and encouragement from my childhood to this
very moment. Your faith in me, along with your constant love and sacrifice, has been
the bedrock of my journey. I am eternally grateful for the nurturing environment you
created, allowing me to pursue my dreams and achieve my goals. This accomplishment
is as much yours as it is mine.
I am also grateful for the financial and industry support from NTWIST and
Mastercard Co., and for the software support from Crystal Blockchain.
Lastly, I would like to thank my committee members for their guidance and con-
structive feedback.
xiii
PREFACE
This dissertation encapsulates my research journey and contributions to the field of
Artificial Intelligence, Data Augmentation, Blockchain Analysis, and Explainable
AI. Over the course of my PhD studies, I have authored several significant
publications that have been instrumental in shaping this work. I would like to
highlight some of my key publications:
Shojaeinasab, A., Charter, T., Jalayer, M., Khadivi, M., Ogunfowora, O.,
Raiyani, N., Yaghoubi, M., & Najjaran, H. (2022). Intelligent manufacturing
execution systems: A systematic review. Journal of Manufacturing Systems,
62, 503-522. Elsevier.[93] - This paper is a highly cited paper(over a hundred)
and first research I conducted in my PhD journey.
Shojaeinasab, A., Motamed, A. P., & Bahrak, B. (2023). Mixing detection on
bitcoin transactions using statistical patterns. IET Blockchain, 3 (3), 136-148.
Wiley Online Library.[95] - This work is completely used in Chapter 3
Arbabi, A., Shojaeinasab, A., Bahrak, B., & Najjaran, H. (2023). Mixing So-
lutions in Bitcoin and Ethereum Ecosystems: A Review and Tutorial. arXiv
preprint arXiv:2310.04899.[28] - As the equally contributed to the first author,
this work is completely used in Chapter 2
Shojaeinasab, A., Jalayer, M., Baniasadi, A., & Najjaran, H. (2024). Unveiling
the black box: a unified XAI framework for signal-based deep learning models.
Machines, 12 (2), 121. MDPI.[94] - This work and the implementation for that
is used in a part of chapter 5
Jalayer, M., Shojaeinasab, A., & Najjaran, H. (2023). A Model Identification
Forensics Approach for Signal-Based Condition Monitoring. International Con-
ference On Flexible Automation and Intelligent Manufacturing, 12-19. Springer.[67]
- This work and the implementation for that is used in a part of chapter 5
Ahang, M., Jalayer, M., Shojaeinasab, A., Ogunfowora, O., Charter, T., & Na-
jjaran, H. (2022). Synthesizing rolling bearing fault samples in new conditions:
A framework based on a modified CGAN. Sensors, 22 (14), 5413. MDPI.[24]
xiv
Khadivi, M., Charter, T., Yaghoubi, M., Jalayer, M., Ahang, M., Shojaeinasab,
A., & Najjaran, H. (2023). Deep reinforcement learning for machine schedul-
ing: Methodology, the state-of-the-art, and future directions. arXiv preprint
arXiv:2310.03195.[68]
Each of these works has contributed to the advancement of knowledge in their
respective areas and has played a crucial role in the development of this dissertation.
I am deeply grateful to my supervisor, Dr. Homayoun Najjaran, for their continuous
support and guidance. I also extend my heartfelt thanks to all of my co-authors.
xv
DEDICATION
Dedicated to my parents and my brother.
Chapter 1
Introduction
The advent of blockchain technology has revolutionized the financial sector by intro-
ducing cryptocurrencies as decentralized digital assets [78]. These digital currencies
promise enhanced privacy and security; however, the inherent transparency and im-
mutability of blockchain ledgers pose significant challenges to maintaining individual
privacy [96, 58]. Despite this transparency, illicit actors increasingly utilize cryptocur-
rencies for money laundering due to their cross-border nature, speed, and generally
low transaction costs, making them an attractive option for concealing ill-gotten gains.
Cryptocurrencies, while offering new opportunities, have also become a tool for
money laundering by criminals seeking to conceal illicit funds. This activity extends
beyond crimes typically associated with the digital world, such as darknet markets and
ransomware, to encompass a wide range of illegal activities like drug trafficking and
fraud. By 2024, money laundering using cryptocurrencies is projected to encompass
all forms of crime, according to Chainalysis[46].
This development poses a challenge for law enforcement, requiring expanded ex-
pertise in cryptocurrency investigation beyond specialized cybercrime units. The
global nature of cryptocurrencies necessitates a comprehensive understanding of both
blockchain analysis techniques and traditional money laundering methods. However,
the inherent transparency of blockchain technology presents a significant advantage
for investigators. With the right tools and data, blockchain analysis can provide valu-
able intelligence, enabling proactive identification of criminal activity and providing
concrete evidence of illicit financial flows. This transparency allows for the break-
ing down of sophisticated money laundering networks that might otherwise remain
undetected.
Traditionally, money laundering involves concealing the origins of illegally ob-
2
tained funds to make them appear legitimate. This process typically involves three
stages: placement, layering, and integration. Placement refers to the initial introduc-
tion of illicit funds into the financial system. Layering involves conducting multiple
transactions to obscure the path of funds. Finally, integration reintroduces the laun-
dered money into the legitimate economy, masking its illegal origin.
Chainalysis, in its annual Crypto Crime Reports[23], has shed light on the flow
of funds from identified illicit wallets associated with crypto-native crimes (e.g., ex-
change hacks, scams, darknet markets) to various conversion services. These services,
including centralized exchanges, DeFi platforms, gambling sites, mixers, and bridges,
facilitate the conversion of cryptocurrencies into fiat currency, other cryptocurren-
cies, or provide other services. This on-chain activity, termed ”crypto-native money
laundering,” can be tracked and analyzed with greater precision and speed compared
to traditional financial systems because of the inherent transparency of blockchain
technology.
Balancing transparency with privacy remains a challenge in the cryptocurrency
space. Techniques such as network modeling, profiling, and network-based detection
facilitate the monitoring of fund transfers between cryptocurrency addresses [102].
Consequently, any association of a user’s address with a suspicious address can lead
to the flagging of both the user’s address and the associated funds as suspicious,
thereby compromising privacy and anonymity.
In response to these challenges, cryptocurrency mixing services have emerged,
designed to obfuscate the origins and destinations of transactions and enhance user
privacy. However, these services also pose significant risks, as they can be exploited
for money laundering, evading sanctions, transferring ransomware proceeds, and other
illicit activities. For example, SilkRoad extensively used crypto mixers to disguise its
transactions [50], and ransomware such as WannaCry utilized these services to hide
illicit funds [38]. Additionally, groups like the Lazarus Group have exploited these
services to obscure over $991 million in stolen funds [64, 23].
The pressing need for a sophisticated understanding of cryptocurrency mixing
services, given their growing use and the complex ethical and technical issues they
introduce, motivates this dissertation. This research contributes to the field by first
conducting a detailed literature review of mixing solutions used both in academia
and industry, providing a comprehensive overview of existing methodologies and their
applications (Chapter 2). Additionally, we develop an evaluation framework for as-
sessing these mixing services, aiding in the systematic analysis of their effectiveness
3
and risks.
Building on this foundational understanding, I introduce a novel approach to de-
tecting mixing activities aimed at enhancing anti-money laundering efforts in Bitcoin
(Chapter 3). This approach involves identifying statistical patterns and clustering
addresses to link them to real-world identities, thereby improving the traceability
and accountability of transactions.
Chapter 4 presents the Elliptic dataset, a publicly available anonymized dataset
on illicit Bitcoin transactions. We explore the dataset and its statistical properties,
initially employing classical machine learning solutions while ignoring the graph as-
pect of the data. Due to the dataset’s imbalance towards licit labels, these methods
were insufficient. Consequently, we applied various Graph Neural Network (GNN)
models to better handle this skewness.
In Chapter 5, we aim to deanonymize the Elliptic dataset to apply our mixing
detection algorithm introduced in Chapter 3. The idea is to deanonymize the data,
apply our unsupervised mixing detection algorithm, extract features from it, and en-
hance these features to improve our GNN-based methods. We assess our unsupervised
mixing detection algorithm using the labels from the Elliptic dataset. Additionally,
we integrate explainable AI techniques, particularly SHAP, to interpret the decision-
making processes of the GNN, ensuring transparency in model determination.
In summary, this dissertation enhances the understanding of mixing services and
proposes new methods for their detection. It also contributes to the broader discourse
on balancing privacy and transparency in blockchain technologies. The conclusion
(Chapter 6) reflects on the findings and discusses potential avenues for future research.
Furthermore, Figure 1.1 provides a block diagram that illustrates the structure and
flow of the thesis, detailing how each chapter contributes to the overall research
objectives.
1.1 Statement of the Problem
The proliferation of cryptocurrencies has introduced a paradigm shift in financial
transactions, emphasizing decentralization and anonymity. While this evolution of-
fers numerous benefits, it has also inadvertently created a haven for illicit activities,
particularly money laundering. The inherent pseudonymity of blockchain technology
poses significant challenges in tracing and identifying these illegal transactions.
Traditional financial monitoring systems are often ill-equipped to penetrate the
4
Chapter 2
General Definitions
Literature Review
Provision of an
Evaluation
Framework
Considering the
Possible Attacks to
the Services
Evaluating the Mixing
Services
Chapter 3
Mixing Data
Acquisition
Unsupervised Bitcoin
Mixing Detection
Algorithm
Evaluation and
Discussion
Chapter 4
Model Evaluation and
Comparision
Elliptic Dataset
Investigation
Feature Selection
Performing Classical
Machine Learning
Classification for Bitcoin
Fraud Detection
GNN-based
Classification
Chapter 5
Elliptic
Deanonymization
Attack
Performing the Mixing
Detection Algorithm on
the Deanonymized
Elliptic
Extracting Features
from the Mixing
Detection Algorithm
Performing GNN-
based Classifications
on the Enhanced
Dataset
Model Evaluation and
Comparision
Theory of SHAP and
GNNShap
Providing a
Comprehensive
Model Explainability,
ExAI
Chapter 6
Key Findings and
Practical
Contributions
Limitations
Future Directions
Figure 1.1: Overview of the Thesis Structure and Key Components and
Contributions.
5
complex layers of blockchain transactions. Consequently, there is a critical need
for innovative methods to effectively detect and analyze fraudulent activities within
digital currencies, especially on platforms like Bitcoin and Ethereum. This thesis
addresses the gap in existing methodologies by introducing advanced deanonymiza-
tion techniques and leveraging graph neural networks to uncover hidden patterns in
blockchain transactions.
1.2 Research Objectives and Questions
The primary objective of this research is to develop and validate effective strategies for
detecting and analyzing money laundering and criminal activities in the blockchain.
To achieve this, the study focuses on:
Examining existing methods of money laundering in cryptocurrencies and cur-
rent deanonymization efforts.
Developing a novel methodology for detecting mixing transactions in Bitcoin in
an unsupervised manner.
Utilizing graph neural networks to detect illicit transactions in a supervised
manner.
Addressing the imbalanced nature of financial fraud data, which is predomi-
nantly skewed towards licit activities.
Exploring the applicability of Explainable AI techniques, particularly SHAP
values, to enhance the interpretability and transparency of the developed mod-
els.
Evaluating the performance of our developed mixing detection algorithm on
public anonymized datasets without prior knowledge of feature names.
The guiding research questions are:
How can advanced statistical and neural network methodologies be applied to
identify and analyze illicit transactions in cryptocurrencies?
In what ways can Explainable AI contribute to the understanding and validation
of models used in detecting blockchain-based money laundering?
6
1.3 Scope and Limitations of the Research
The scope of this research is primarily focused on the application of statistical algo-
rithms and graph neural networks to detect and analyze money laundering and fraud
in Bitcoin and Ethereum blockchain transactions. It specifically concentrates on un-
earthing the hidden transactions associated with mixing services like Tornado Cash.
While the research makes significant strides in blockchain forensics, it is important to
acknowledge its limitations. The complexity and ever-evolving nature of blockchain
technology and cryptocurrency practices mean that the methodologies developed may
need continuous adaptation and refinement. Additionally, the study’s focus on Bit-
coin and Ethereum may not directly translate to other cryptocurrencies with different
structures or privacy mechanisms. The reliance on available data and the evolving
nature of blockchain anonymity techniques also pose challenges to the universality
and long-term applicability of the findings. Despite these limitations, the research
provides a foundational framework for future studies and practical applications in the
field of cryptocurrency forensics and anti-money laundering efforts.
7
Chapter 2
Background
2.1 General Definitions
The introduction of mixers marked a significant transformation in cryptocurrencies,
directly addressing the challenges of trust, security, privacy, and efficiency. Essential
to the operation and growing acceptance of these blockchain-based mixers are the
foundational technologies and principles. We will provide a brief overview of some of
these key concepts below before going over how the mixers operate in detail:
Blockchain. Introduced in 2008 by the pseudonymous Satoshi Nakamoto for
the cryptocurrency Bitcoin, a blockchain is a decentralized and distributed digital
ledger technology used to record transactions across multiple computers in a way that
ensures the data can only be modified once it’s been recorded. Once a block of data
has been added to the blockchain, it becomes virtually immutable, protected from
alteration without altering all subsequent blocks and the consensus of the network.
This characteristic ensures data integrity and transparency. Each block in the chain
contains a number of transactions, and every time a new transaction occurs on the
blockchain, a record of that transaction is added to every participant’s ledger. This
decentralized nature of the system ensures that no single entity has control over the
entire blockchain, and all transactions are publicly recorded, ensuring transparency
and trustworthiness [57].
Transactions. At the heart of Bitcoin or any other cryptocurrency function-
ing is its transaction-based public ledger. Transactions here are a play of Unspent
Transaction Outputs (UTXOs), which are used wholly in transactions. Given the
improbability of a UTXO matching an exact spending amount, most Bitcoin transac-
8
tions result in two outputs. One is received by the intended recipient, while the change
is sent back to the sender at a new address. Additionally, transaction metadata en-
compasses public keys, UTXOs, transaction size, and its unique hash. With inputs
signed using the sender’s private key, validity is readily verifiable via the sender’s
public key [78].
Consensus Mechanisms. One of the revolutionary aspects of blockchain tech-
nology is the elimination of central authority, ensuring transactions’ validation through
consensus mechanisms. These algorithms ensure all nodes in the network agree upon
the truth. Bitcoin, for example, utilizes the Proof-of-Work (PoW) mechanism, where
participants (miners) solve cryptographic puzzles to validate transactions and add
new blocks. However, concerns regarding energy consumption and scalability have
led to the exploration of alternative consensus mechanisms. Proof-of-Stake (PoS) is
one such alternative where validators are chosen to create new blocks based on the
number of coins they hold and are willing to ”stake” as collateral. Each consen-
sus mechanism has its trade-offs in terms of security, decentralization, and efficiency
[78, 70].
Cryptographic Hash Functions. A cornerstone of blockchain technology, cryp-
tographic hash functions, ensure data integrity and security. These functions take an
input and produce a fixed-size string of bytes, typically a digest that is unique to each
unique input. The SHA-256, predominantly used in Bitcoin, is one such algorithm.
These functions are crucial for generating public and private keys, forming blocks,
and maintaining data consistency and integrity across the decentralized network [80].
Privacy in Cryptocurrencies. While cryptocurrencies offer a robust mech-
anism for transaction security, privacy remains a nuanced challenge. Traditional
banking systems provide privacy by restricting access to transactional information.
Cryptocurrencies, on the other hand, announce all transactions publicly. However,
privacy is attempted by keeping public keys anonymous, akin to stock exchanges. A
key challenge arises when key owners are revealed, potentially unveiling other linked
transactions [78]. Furthermore, users’ identities can be deanonymized by associating
pseudonyms with IP addresses or by abusing Bitcoin’s anti-DoS measures [34].
Smart Contracts. Beyond traditional transactions, the blockchain ecosystem
has given rise to smart contracts. These are digital protocols designed to autonomously
execute tasks when predefined conditions are met. Such contracts eliminate third-
party interventions and can engender automatic payments, quality controls, and es-
tablish trust amongst stakeholders [71]. Not limited to Bitcoin, these contracts have
9
been enhanced on other platforms due to the introduction of features like Turing-
completeness and blockchain-awareness [44].
Decentralized Finance (DeFi). DeFi represents a conglomerate of decentral-
ized applications aiming to recreate or enhance traditional financial systems (like
lending, borrowing, and derivatives) without intermediaries using blockchain technol-
ogy. Predominantly built upon Ethereum, these platforms leverage smart contracts
to ensure transparency, openness, and global accessibility. As DeFi platforms grow,
they offer the potential for a more inclusive financial system and present challenges
and complexities related to security and regulatory oversight [91].
Layer-2 Solutions. As blockchain networks like Bitcoin and Ethereum became
more popular, they faced scalability issues, with transaction speeds being a primary
concern. Layer-2 solutions are protocols built on top of a primary blockchain (Layer-1)
to increase the transaction throughput. One notable example is Bitcoin’s Lightning
Network, which facilitates off-chain transactions by opening bilateral channels be-
tween parties, thus providing faster and cheaper transactions. Similarly, Ethereum is
exploring various Layer-2 solutions, like rollups, to alleviate its scalability constraints
[86].
Zero Knowledge Proof (ZKP). An intriguing cryptographic tool, ZKP allows
for claim verification by an individual without revealing any supportive information.
This concept, which emerged in the 1980s, was pioneered by Goldwasser, Micali, and
Rackoff from MIT [56].
In summation, the landscape of cryptocurrencies is underpinned by a melange of
technical innovations and cryptographic techniques, which collectively aim to create
a secure, transparent, and efficient digital monetary ecosystem.
2.1.1 Cryptocurrency Mixers
In order to analyze how mixing services provide anonymity for their users, we will
first explain the definition of mixing. Then, we will take a detailed look at the
various obfuscation methods and techniques used by academic and real-world services.
Additionally, we will evaluate these methods based on their assessed functionalities.
Mixing Definition. The mixing process in cryptocurrencies, often referred
to as coin mixing or coin tumbling, is a technique employed to enhance the privacy
and anonymity of transactions within blockchain networks. It involves the pooling
of multiple cryptocurrency transactions from various sources and then redistributing
10
them in a manner that obscures their origin and destination. The process typically
operates through specialized mixing services or protocols that amalgamate the funds,
making it challenging to trace specific coins back to their original owners[55]. By
breaking the transactional linkage between addresses and transactions, coin mixing
provides a degree of confidentiality, ensuring that the flow of funds remains private
and reduces the ability to associate specific transactions with identifiable individuals
or entities. This process plays a crucial role in preserving the privacy and anonymity
of cryptocurrencies in an increasingly surveilled financial landscape [51]. A high-level
schema of the mixing process is depicted in Figure 2.1.
Before discussing the mixing solutions, it’s important to introduce two key terms
that will be used later: Anonymity Set and Taint Analysis. Anonymity Set refers
to the number of participants involved in the mixing process, and Taint refers to
the amount of cryptocurrency in an account that came from another account. Taint
rates between different addresses can be extracted by tracing back from an address
to the origins of its received funds. Taint analysis involves traversing the transaction
graph to identify possible connections and relations among different graph vertices
(cryptocurrency addresses). This type of analysis helps to deanonymize the activities
and connections behind the blockchain.
2.2 Obfuscating Techniques
The objective of this subsection is to present a comprehensive overview of the ob-
fuscating techniques used by mixing services for Turing-complete and non-Turing-
complete cryptocurrencies. Subsequently, we will examine how frequently each method
is used in both academic solutions and real-world services.
Swapping. Swapping stands as a prominently employed methodology for ob-
scuring the association between senders and recipients on a cryptocurrency network
[103]. The fundamental concept revolves around interchanging the inputs and out-
puts among diverse participants to keep their relationships secret, as depicted in
Figure 2.1. By swapping different coins among participants, an intricate meshwork
of transactions is created, making it tough to track where the original money came
from.
Aggregating Funds. Chang and colleagues discovered specific repetitive trans-
action patterns in the Bitcoin network, including what they referred to as ”sweeper
11
transactions” [47]. These transactions, depicted in Figure 2.2, involve a large number
of input addresses and one or two output addresses referred to as ”Aggregation Ad-
dresses”. In essence, this method consolidates funds from various addresses into one
or two aggregation addresses, resulting in a substantial balance in the newly formed
address. By centralizing all funds into a single input and sending them to the recipient
from there, the aggregation address functions as a tool to obscure transactions, which
breaks the traceable links between senders and recipients by creating a many-to-one,
then one-to-many association.
Peeling Chain. A peeling chain is a set of transactions generated by mixing
services that form a chain to distribute outputs. The unique property of the peeling
chain is that transactions in the chain are similar to normal user transactions with one
input and two outputs [60]. Therefore, utilizing peeling chains makes mixing trans-
actions more indistinguishable from a normal user. The schematic of peeling chains
is depicted in Figure 2.3. A recent research by Gong et al. discussed the technical
dissection of Bitcoin transactions to flag those affiliated with mixing services with a
focus on peeling chains as one of the major repetitive patterns in mixing services,
a granular analysis is focused on several transaction parameters. The transaction
version field designates protocol features, while the lock time parameter signals the
earliest possible block height or time a transaction may be added to the blockchain.
A critical element, the sequence number within each input yet a lower value impli-
cates the transaction in the Replace-by-Fee policy or ties its finalization to lock time.
Parsing these values for uniformity across a dataset can unearth patterns indicative
of automated mixer activities[59].
Fund Splitting. Fund Splitting refers to a process in which users’ input coins are
divided into smaller denominations and sent through various complex paths, involv-
ing multiple participants. Randomization and adding delays in the fund distribution
process further complicate the transaction history, making it challenging to link the
source and destination of coins.
Chain Hopping. Chain hopping is a relatively new and promising technology
applied by cryptocurrency mixers to boost privacy and anonymity. The idea behind
chain hopping is to add an extra layer of security to the mixing process by utilizing
multiple blockchain networks and switching between them. This approach makes it
12
Figure 2.1: Overview of swapping process [95]
difficult to trace funds, even if the original source of the cryptocurrency is known.
Randomized Fee. In most mixing services, particularly those that are central-
ized, users are required to pay fees to use the mixing service. Employing a constant
fee structure could lead to a detectable usage pattern, potentially compromising the
mixing process graph. To counteract this concern, Bonneau et al. [41] propose the
adoption of randomized fees. Implementing a stochastic fee characterized by a contin-
uous range spanning from 0 to a predetermined mixing fee value could eliminate any
discernible pattern associated with mixing fees within the mixing transaction graph.
Randomized Delays. One of the mixing detection patterns, especially in multi-
round and multi-output mixing is looking for time-based patterns in committed trans-
actions to the blockchain. If a mixer consistently uses a fixed time delay in all mixing
transactions, mixing patterns can be inferred from this property. To tackle this prob-
lem, mixers can set random delays for broadcasting transactions. This way, any
predictable timing patterns are avoided, making mixing detection more difficult.
Third-Party Blinding. In centralized mixing setups, a central entity undertakes
the mixing process on behalf of participants, affording it comprehensive insight into
the connections between senders and recipients. This arrangement raises concerns
about potential information leakage, as users must rely on a third party’s honest ex-
ecution of mixing without disclosing those details. To address this challenge, certain
prior studies [99, 61] have proposed mixing techniques that aim to shield the mixer
entity from knowledge about the involved users. In simpler terms, these mechanisms
are structured such that the mixer lacks awareness of the associations between specific
inputs and corresponding outputs.
Blindcoin [99] attained this objective through the utilization of the blind signature
13
scheme outlined by Chaum [99], effectively concealing linkage details from the mixer.
Heilman et al. [61] introduced a novel approach where both senders and recipients
interact with the mixer using cryptographic RSA-based puzzles, ensuring that the
mixer entity remains oblivious to the linking information. The high-level schema of
TumbleBit is depicted in Figure 2.5. Further elaboration on these methods is pro-
vided in section 3.2.
Off-Chain Transactions. Some mixing services employ off-chain transactions
to diminish the duration and expenses of the mixing process while enhancing privacy.
Due to the inherent untraceability of many off-chain transactions, external observers
cannot discern mixing patterns as some transactions are hidden from them.
Fungibility of Inputs. Fungibility, within the context of mixing, signifies the
requirement that all inputs involved in the mixing process possess equal values. This
criterion obfuscates the linkage between senders and receivers in transactions, mak-
ing the tracing of transactions challenging due to the uniformity of transaction values.
Disconnected Fund Flow. The term ”Being Disconnected” denotes severing
the complete link between a sender and receiver, preventing any verifiable connection
through the transaction graph of the mixer using the maximum-flow algorithm (taint
analysis). This can be achieved through swapping methods or by directly deliver-
ing cleaned funds to the recipient from an independent address that holds adequate
funds. One of the main advantages of maintaining a disconnected mixing graph is
the increasing difficulty for a third party to detect mixing transactions since linking
the disconnected transactions to each other is a hard task for any entities outside of
the mixing process [95].
Address Freshness. The freshness of addresses means that every time mixing
happens, both the mix’s escrow addresses and recipients’ output addresses should
be fresh addresses created specifically for that mixing process. This is required to
prevent connecting the current mixing to other addresses that are not related. More-
over, those mixing services which gives mixing guarantees to the participated users
should use fresh and unique addresses in order to create reliable mixing guarantees
[41]. Therefore, both parties should pick addresses with no other possible source of
incomes. This helps maintaining the privacy and isolation of the mixing process.
14
It’s important to note that this feature is specifically relevant to address-based cryp-
tocurrencies, such as Bitcoin, and isn’t applicable within account-based blockchain
networks like Ethereum, since the interactions with mixers are based on the mixers’
contracts.
Trusted-Execution-Environment. A Trusted Execution Environment (TEE)
is a secure and separate space within a computer’s main working area, which keeps
sensitive information and tasks safe from potential threats that could affect the reg-
ular parts of the computer. TEEs make sure that only trusted programs can access
in the protected space, making it a shielded area for sensitive activities [104]. A TEE
could host the mixing process, ensuring that the transaction mixing occurs in a pro-
tected and isolated environment with remote attestation (i.e., a third party can verify
the correctness of the mixer’s operations, like Intel SGX [27, 75] and ARM TrustZone
[29]). In this setup, the TEE securely manages the mixing process, safeguards private
keys, and shields transaction details. It also establishes secure communication with
the mixer’s server, enhancing protection against potential threats. By employing a
TEE, users benefit from safer and more private cryptocurrency transactions within
the mixer.
Zero-Knowledge proofs. Zero-knowledge mixers enable a crucial element of
trustless mixing by allowing participants to prove the validity of their transactions
without revealing sensitive information, such as the sender, recipient, or transaction
amount. Through ZKPs, users can mathematically confirm that they possess the nec-
essary information to spend their funds, ensuring the integrity of the mixing process
while maintaining confidentiality. This approach adds an extra layer of confidentiality
and security to the mixing process.
Other techniques. Researchers have designed various academic mixing methods
primarily leveraging cryptography, distinct from the mentioned functionalities. These
methods mostly involve using cryptographic techniques like RSA-based puzzles (in
TumbleBit[61] and BSC[62] papers), Ring Signatures (in obius [76] paper, notably
in Monero cryptocurrency [26]), fair-exchange protocol [32], and several other tech-
niques. These methods will be discussed comprehensively in the forthcoming sections.
Khalilov et al. proposed a literature on mixing functionalities based on cryptographic
concepts [69].
15
Figure 2.2: Sample of an Aggregating
address [95]
30
28
25.5 18
17.6 16.2
2
2.5
7.5 0.4
1.4
Mixer Address
Recipient Address
Figure 2.3: Overview of Peeling
Chain schema
2.3 All the Proposed Mixing Frameworks
Following a comprehensive investigation of the concealment techniques used by many
mixing services to ensure funds remain untraceable, this section delves into an exam-
ination of the prominent mixing frameworks proposed in the literature. Additionally,
we evaluate the practical mixing services that are popular among cryptocurrency
users. For our analysis of the mixing frameworks, our focus remains strictly on the
foundational papers associated with each, and the discussions in the BitcoinTalk com-
munity. On the other hand, when it comes to the widely used practical services, we
assess various documented attacks on these platforms and consider the results and
analysis found in previous studies.
2.4 Academic Proposed Mixing Approaches
In addressing the anonymity concerns within cryptocurrencies, various mixing frame-
works have been proposed in academic literature. These efforts, whether decentralized
or centralized, have prioritized users anonymity and privacy rather than maximizing
profits for the mixer entities. In this section, we delve into the architecture of 8
16
decentralized and 5 centralized noteworthy mixing frameworks.
2.4.1 Decentralized Mixing Frameworks
The core philosophy behind the creation of Bitcoin and other cryptocurrencies was to
remove the need for relying on third parties for financial transactions. Consequently,
the cryptocurrency community generally seeks to steer away from centralized ser-
vices, including mixing services. This has led researchers to suggest decentralized
mixing frameworks, aligning with the fundamental aims of cryptocurrencies, to en-
hance users’ privacy and uphold the primary purpose behind their creation. However,
decentralized frameworks generally suffer from a slow mixing process, the need for
honest participants majority, and limited scalability.
CoinJoin. CoinJoin is a privacy-enhancing technique that allows multiple users
to combine their transactions into a single joint transaction [74]. This process makes
it challenging for outside observers to link the original input and output addresses,
thus increasing the privacy of participants. CoinJoin is not a complete mixing pro-
tocol because it neither describes how participants should be selected, nor how the
swapping transaction is actually formed.
CoinShuffle. CoinShuffle was introduced as an enhancement to CoinJoin, pre-
senting it as a fully decentralized system [90]. The protocol initiates with the par-
ticipation of nusers in the mixing procedure. Subsequent to this initiation, each
participant generates a new Bitcoin address, intended to serve as their output ad-
dress in the mixing transaction.
The ensuing step involves the participants shuffling the newly formed output ad-
dresses. This shuffling is performed in a manner analogous to a decryption mix
network, ensuring that the identities of the address creators remain undisclosed. This
procedure unfolds over ndistinct shuffling rounds.
In the initial round, a participant, referred to as Alice, signs her transaction using
her private key. She then encrypts this transaction employing the public keys of the
remaining n1 participants. The resultant encrypted transaction is then relayed to
the subsequent participant, Bob. Upon receipt, Bob decrypts the transaction from
Alice using his public key. Simultaneously, he signs his personal transaction with his
private key, and encrypts it with the public keys of the remaining n2 participants.
Both transactions (the decrypted one from Alice and his own) are then passed to the
17
A’A’
B’
Alice
BobCharlie
B’
A’
C’
Output
Addresses
B’
C’
A’
Input
Addresses
A
B
C
Alice
Bob
Charlie
Announcement Shuffling
Figure 2.4: Overview of CoinShuffle’s mixing process [90]
third participant.
This sequence progresses iteratively, culminating when the final participant re-
ceives n1 encrypted transactions. This last participant decrypts all received trans-
actions using her private key, appends her own transaction to this collection, and then
broadcasts the consolidated list. The schema of this protocol is depicted in Figure 2.4.
The protocol’s success hinges on mutual verification. If participants recognize
their respective transactions within the final aggregated list, they will endorse the
transaction with their signature. If not, they abstain. However, the main drawback
of both CoinJoin and CoinShuffle frameworks is that, since their protocol puts all
inputs and outputs in a single transaction, the anonymity set size is limited to the
transaction size. Also, the absence of mixing fees makes both Sybil and Dos attacks
easy for an attacker (All possible attacks and the resilience of each framework to them
are discussed in section 2.7 and 2.8).
18
Xim. Bissias et al. put forth Xim as a decentralized P2P mixer that employs
a swapping mechanism [37]. Participants aiming to mix their funds are paired at
random and swap their transaction outputs with each other. To mitigate Sybil and
DoS attacks, Xim integrates mixing fees. Participation in a mixing action necessi-
tates expenditure from both parties, rendering such attacks prohibitively costly for
potential attackers.
CoinParty. CoinParty was conceptualized by Ziegeldorf and his colleagues as
a mixing strategy that functions through the collaboration of multiple third-party
mixing peers [105]. Initially, with nparticipating users, nunique escrow addresses
are collaboratively formulated by the mixing peers for users’ funds collection. Af-
terwards, This is followed by the shuffling stage, which is orchestrated similarly to
CoinShuffle. Subsequently, the mixing peers sign the shuffled transactions to finalize
the mixing process. The jointly generated escrow addresses are the same as the Bit-
coin ordinary transactions, resulting in increasing the mixing anonymity. Nonetheless,
the absence of mixing fees in this approach diminishes the incentive for mixing peers
to work honestly, making the method vulnerable to potential Join-and-Abort attacks.
SecureCoin. Ibrahim introduced SecureCoin as a mixing technique that in-
corporates the Joint-Secret-Sharing protocol [66] to decrease transaction size while
preventing sabotage from mixing peers [63]. In this design, all senders collaboratively
establish a unified escrow address for fund aggregation. Subsequently, output ad-
dresses are rearranged in a manner similar to the CoinShuffle protocol. Finally, the
senders collectively provide their signatures for the ultimate transaction, enabling the
transfer of funds from the aggregation address to the designated outputs.
obius. obius is an Ethereum-based mixing approach within the framework
of smart contracts [76]. This method relies on the Ring-Signature [89] scheme and
stealth addresses to obfuscate the senders and recipients associations like the strategy
observed in Monero [26]. A ring signature is a type of cryptographic digital signature
that enables a user to sign a message on behalf of a group (or ”ring”) without dis-
closing which individual member’s private key was used to create the signature. In
terms of mixing, a recipient can withdraw her money from a group of transactions
without revealing the corresponding sender of the withdrawn money.
Within the context of obius, the process unfolds as follows: Senders furnish both
19
their funds and stealth keys to a designated smart contract that orchestrates the mix-
ing of the provided inputs. Subsequently, each recipient can create a ring signature to
withdraw their funds from the contract and transfer them to an ephemeral address,
culminating in the completion of the mixing procedure. Notably, the deterministic
nature of the Ethereum Virtual Machine (EVM) ensures that the mixing works in a
tamper-resistant manner as the entire process occurs on the blockchain, precluding
the involvement of any central authority capable of disrupting the integrity of the
mixing mechanism.
AMR. AMR is a mixer based on zk-SNARKs, aimed at disrupting the traceabil-
ity connection between coins deposited and withdrawn by a user within a blockchain
governed by smart contracts [72]. The AMR configuration involves participants de-
positing a predetermined quantity of coins into a smart contract. Subsequently, this
contract establishes a Merkle tree structure over the deposits, generating a commit-
ment that represents the deposited transactions. When a recipient wishes to retrieve
their funds from the contract, they are required to validate their familiarity with
the committed values associated with specific pre-existing deposit commitments that
contribute to the computation of the Merkle tree root. This validation is achieved
through the employment of zk-SNARK proofs. Additionally, AMR leverages lending
platforms (Like Aave [1] and Compound [2]) to provide users with the opportunity to
receive interest on their deposited assets. This particular strategy serves as an added
motivation for users to retain their funds within the ecosystem.
MixEth. Seres et al. introduced MixEth as a decentralized coin mixing solution
for Turing-complete blockchains [92]. The core concept of MixEth revolves around
the integration of Neff’s verifiable shuffles [81] within the framework of coin mixing.
In the MixEth protocol, participants within the mixer engage in a multi-round shuf-
fling process of their public keys. In each round of shuffling, a shuffler permutes all
public keys using Neff’s method, then commits them to the mixer along with a Zero-
Knowledge proof to prove that the permutation was performed correctly. Afterwards,
as soon as recipients are confident in the adequacy of the shuffling iterations, they
gain the capability to retrieve their assets from the mixing service.
Zether. Zether [42] represents an innovative payment solution tailored for en-
hancing privacy and confidentiality within account-based blockchain systems, like
20
the Ethereum ecosystem. In zether, users who wishes to obfuscate their coins should
establish a new Zether address and deposit their funds into the Zether Smart Con-
tract (ZSC). Subsequently, Zether generates an equivalent amount of Zether tokens
(ZTH) mirroring the deposited funds. With this accomplished, users gain the ability
to send their ZTH tokens with a high degree of anonymity through the Zether smart
contract.
To achieve privacy, the transferring tokens are obfuscated by encrypting the amount
being sent, and the sender’s identity is concealed by selecting a group of other Zether
accounts by the sender to form an anonymity set and being covered among them (sim-
ilar to ring signatures[89] in Monero [26]). This set effectively conceals the sender’s
account among its peers, making it exceedingly challenging for any observer to dis-
cern the origin of the transaction. Zether uses a new type of Zero-Knowledge Proofs
called Σ-Bullets for anonymously transferring funds through its transactions. Due to
the space limitations, we refer the readers to the detailed description provided in [42].
2.4.2 Centralized Mixing Frameworks
While decentralization stands as a core objective within the cryptocurrency com-
munity, its practical realization is not always optimally efficient. When it comes to
mixing, centralized services shows more scalability, speed, and user-friendliness, com-
pared to decentralized ones. Therefore, several researchers have introduced central-
ized frameworks aimed at enhancing user privacy and anonymity within the services,
and mitigate the risk of fund thefts by malicious mixer entities.
MixCoin. Bonneau et al. presented MixCoin as the first academic centralized
mixer [41]. Within this framework, a central mixer undertakes the mixing tasks for
all participants. When a user deposit their coins into a designated escrow address,
the mixer issues a digitally signed message as a guarantee. Should the mixer act ma-
liciously or misappropriate user funds, this guarantee can be publicly disclosed. By
leveraging the mixer’s public key, the veracity of this guarantee can be ascertained.
Consequently, the reputation of a mixer is derived from its track record of honest mix-
ing and user satisfaction. Furthermore, To obfuscate discernible patterns associated
with mixing fees, MixCoin adopts a randomized fee strategy. This tactic is especially
potent during sequential mixing when a user sequentially engages multiple mixers.
21
Integrating randomized mixing fees with varied inter-mixing intervals heightens the
level of anonymity.
BlindCoin. Building upon MixCoin’s foundation, Valenta et al. put forth Blind-
Coin, which leverages chaumian blind signatures [99] and a public ledger to curtail a
centralized mixer’s insight into the linkage between senders and recipients [49]. This
design substantially reduces the susceptibility to permutation leaks. However, it’s
pivotal to acknowledge that, akin to MixCoin, BlindCoin doesn’t offer safeguards
against potential fraudulent activities or coin stealing by the mixers.
TumbleBit. TumbleBit was proposed as a centralized mixing approach aimed
at preventing connections between senders and receivers [61]. In this method, the
sender (Alice) and the receiver (Bob) engage with a central entity called a tumbler
τ. First, Alice and Bob establish payment channels with τ. Then, Alice creates a
2-of-2 escrow transaction denoted as Tescr(A,τ), allowing αbitcoins to be accessed
through a joint signature from both Alice and τ. Similarly, the tumbler forms an-
other shared transaction denoted as Tescr(τ,B), and these transactions should be
committed to the blockchain. Afterwards, Alice and Bob interact with the tumbler
off-chain, employing RSA-based puzzles. These interactions facilitate Bob for receiv-
ing a signature from τfor his transaction Tescr(τ, B), and help τreceiving Alice’s
signature of her transaction Tescr(A,τ). Finally, τmake a transaction from Tescr(A,
τ) to his fresh generated address, and Bob makes another transaction from Tescr(τ,
B) to his own fresh generated address, and the mixing process finished. Additionally,
TumbleBit employs a chaumian blind signature [49] scheme to safeguard the tumbler
from learning about participant associations, as it shown in Figure 2.5.
BSC. Heilman and colleagues introduced Blindly Signed Contracts (BSC) as a
mixing approach that capitalizes on blind signatures principles akin to those in Tum-
bleBit. This method is designed to establish unlinkability in both on-chain and
off-chain contexts. In the on-chain process, the confirmation of all mixing transac-
tions on the blockchain results in a comparatively slow mixing speed. On the other
hand, the off-chain strategy leverages micro-payment channel networks [87, 53] to
keep some mixing transactions off the primary blockchain. This is advantageous as
off-chain transactions exhibit enhanced speed, scalability, and the potential for re-
duced fees compared to the conventional on-chain transactions.
22
Figure 2.5: Overview of TumbleBit’s mixing graph [61]. Tumbler’s view on mixing
associations is blinded.
23
Obscuro. Obscuro is a mixing protocol that capitalizes on the capabilities of
a Trusted Execution Environment (TEE), elaborated upon in section 3.1. This
paradigm allows participants to verify the isolated functionalities through remote
attestation, assuring them of a robust mixing set size [98]. The protocol’s efficacy
was evaluated within the Bitcoin Testnet by its authors, and its adaptability was
demonstrated through a smart contract implementation on the Ethereum network
[45].
Table 2.1: Functionalities employed by all mixing solutions proposed in academia.
Swapping
Aggregating
Peeling Chain
Splitting
Chain Hopping
Random Fee
Random Delay
Third-Party Blinding
Off-Chain
Input Fungibility
Disconnected Fund Flow
Address Freshness
TEE
ZKP
Others
CoinJoin no fee a.an.a. a. a.
CoinShuffle no fee a. n.a. a.
Xim a. n.a. a. a.
CoinParty no fee a. n.a. a. a.
SecureCoin a. a. n.a. a. a.
obius n.a. a.bn.a. a. n.a.
AMR n.a. a.bn.a. a. n.a.
MixEth n.a. a.bn.a. a. n.a.
Zether n.a. a.bn.a. a. n.a.
Mixcoin a. a. a. a. a. a. a.
BlindCoin a. a. a. a. a. a. a.
TumbleBit a. a. a. a.
BSC a. a. a. a.
Obscuro a. a. a. a. a. a. a. a. a.
aIn the following cells of all tables, ”a.” stands for ”applicable”, ”n.a.” stands for ”not applicable”,
and ”-” means that there is not adequate information to determine the corresponding value.
bSince withdrawing money from Ethereum-based mixing services can be done whenever users
choose to (due to the account-based nature of the Ethereum network), users can add random delays
by their own.
0Although mixing fee is donation-based in ChipMixer, its is resilient to these types
24
of attack due to its high liquidity.
Table 2.1 outlines the functionalities utilized by academic mixers to obfuscate
transactions and facilitate the mixing process. Notably, several academic solutions
incorporates swapping methods by permuting the associations of inputs and outputs,
in contrast with aggregating and peeling chain, which weren’t signified in academic
approaches. On the other hand, certain methodologies such as obius, TumbleBit,
and BSC utilize cryptographic puzzles such as ring signatures and RSA-based puz-
zles. Additionally, Obscuro showcased an innovative application of a Trusted Exe-
cution Environment, establishing itself as a pioneering approach within this domain.
AMR and MixEth also leveraged ZKPs in smart-contract-based cryptocurrencies. It’s
important to note that the majority of additional obfuscation techniques, such as ran-
domizing fees/delays and employing fresh addresses, can be implemented within these
mixing frameworks, even though they were not explicitly mentioned in the papers.
Notably, academic frameworks have not extensively addressed Cross-Chain mixing,
despite its significance in the evolving world of decentralized finance.
2.5 Real-World Mixing Approaches
In this section, we will conduct a comprehensive examination of eighteen commonly
employed mixing services accessible within the market. While certain services dis-
cussed below may have become outdated by the time of this paper’s publication, they
merit analysis due to the significant volume of mixing-related data associated with
them.
Regarding centralization and decentralization in practical services, a significant
portion of them lean towards centralization, especially within the Bitcoin ecosys-
tem. However, this term is completely different in the realm of smart-contract-based
cryptocurrencies like Ethereum, Owing to their capacity to execute Turing-Complete
algorithms. To maintain focus on the core subject of this section, discussions related
to this aspect have been moved to Section 2.6. Within this section, the first sixteen
services are centralized, and the last two services (Join-Market and Tornado Cash)
are decentralized. Given that other decentralized services either lack the necessary
robustness or lack adequate reputations, we only discuss those two remarkable ser-
vices. To the date of this publication, we have not found any reputable service that
25
has more powerful functionalities compared to the mentioned services.
BitcoinFog. Bitcoin Fog stands as one of the earliest centralized bitcoin mixing
services, which was exclusively accessible through the Tor network. This service per-
mits the creation of a maximum of five addresses for depositing bitcoins and takes a
(random) fee between 1–3% of the transaction value. The withdrawal process allows
for dispersing bitcoins to a maximum of twenty addresses, spanning a timeframe of
6 to 96 hours. In a study conducted by Moser [77], an evaluation of this mixer was
undertaken by initiating transactions through it and subsequently analysing the re-
sulting transaction graph created by the mixing service.
The graph generated from this evaluation is visualized in Figure 3. Bitcoin Fog
aggregates incoming transactions into distinct aggregation addresses referred to as
”communities”. Notably, some of these communities contain as much as 50,000 BTC.
Following this aggregation, output transactions are derived from these communities
and directed toward their designated recipients. Noteworthy is Bitcoin Fog’s uti-
lization of randomized fees and varying mixing delays, strategically implemented to
hinder external attempts at detecting the mixing graph by analyzing fee structures
and transaction timing patterns.
It’s important to note that BitcoinFog does not mandate participants to send a prede-
termined sum of money to the mixing service. Consequently, if a user transfers funds
that significantly deviate from the average of incoming funds to the mixer, there is a
heightened likelihood of detection due to the small anonymity set provided.
Notably, the operator of this service was arrested in 2021 for allegedly running this
service, which was considered the longest-running bitcoin money laundering service
on the darknet [83].
BitLaundry. BitLaundry represents an uncomplicated mixing service that dif-
fers from Bitcoin Fog in its operational methodology. Unlike Bitcoin Fog, BitLaundry
doesn’t facilitate the deposition of bitcoins into a virtual wallet. Instead, the service
necessitates the specification of destination addresses, the count of outgoing transac-
tions, and a designated time span. This prompts the generation of a one-time-use
address, to which a minimum threshold of coins must be sent by the user. The fee
structure of BitLaundry’s mixing service is split into to part. The first part consti-
tutes 2.49% of the total amount, while the second part involves a charge of a static
amount for each outgoing transaction.
Based on Moser’s evaluation in 2013 [77], it was observed that BitLaundry employed
26
fund aggregation as a technique to obscure associations. However, Moser’s analysis
indicated that due to the limited number of participants within the mixing service,
certain inputs were directly linked to corresponding outputs. Consequently, the mix-
ing process exhibited a reduced level of anonymity.
Blockchain.com. Blockchain.com, a renowned cryptocurrency-focused website
and platform, predominantly offers an array of services pertaining to digital curren-
cies, with a primary emphasis on Bitcoin. Notably, within the Blockchain.com wallet’s
historical features, there existed a service known as ”Send Shared”. This service op-
erates by facilitating Bitcoin exchanges among various users through a shared wallet
structure. During its operation in 2013, it levied a modest mixing fee of 0.5%. At
that time, it stood out as an economical option in the landscape of cryptocurrency
services.
Moser’s examination [77] revealed that this service employed an aggregation technique
to obscure the mixing process. Notably, some of the aggregating addresses contained
substantial sums, approximately 2,000 BTC each. In their examination, they made
eleven transactions to this service and detected eight individual clusters of aggregat-
ing addresses, which shows that the mixer is resilient to the taint analysis attack.
Following the generation of these aggregating addresses, funds underwent a process
of division into several smaller transactions, and these were subsequently subdivided
until they reached their designated destination addresses.
DarkLaunder, Bitlaunder and CoinMixer According to the evaluation con-
ducted by Balthasar et al., in their 2017 study, these particular services were identified
as the least effective mixing services at the time of assessment[52]. The author chose
to examine them together due to their belief that these services were likely under com-
mon ownership and displayed striking similarities in their operations and attributes.
Consequently, an analysis of these services can offer valuable insights into the pitfalls
that mixing services should steer clear of in order to attain a higher level of anonymity.
First problem of these mixers comes from that they require users to create account for
having access to the mixer. Furthermore, they stored this user information alongside
the IP addresses associated with transaction requests, as well as comprehensive trans-
action details in their database. What exacerbates this concern is the vulnerability
of their websites, making it relatively simple for potential attackers to gain access to
this stored data.
27
Moving on to their obfuscating techniques, they employed an aggregation approach
akin to several other services. However, a significant drawback was the aggregation
addresses were static for a long time, which led to a concentration of the mixing
process and made it discernible to potential attackers. Additionally, these services
encountered a scarcity of participants, resulting in notably small anonymity set size
and poor obfuscating permutations. In summary, these services exhibited exceedingly
low levels of security and privacy, rendering them illustrative examples of common
pitfalls to be noticed when implementing a secure mixer.
Helix. Helix, a cryptocurrency tumbler, gained prominence within the dark web
community due to its affiliation with the creators of the renowned dark web search
engine, Grams. The developers assert that Helix utilizes innovative proprietary tech-
nology to not only cleanse Bitcoins but also generate entirely new ones untouched by
the darknet’s transactions. This service was introduced to the public in June 2014.
It is worth noting, however, that Helix’s founder later pleaded guilty to a conspiracy
charge involving the laundering of approximately $300 million. The potential conse-
quences for this individual may include a twenty-year prison sentence and a fine of
either $500,000 or twice the value of the assets involved in the illicit transaction, as
indicated in the references [21, 88].
Helix is accessible only using Tor, and it utilizes random time delays for mixing to
achieve anonymity. It takes 2.5% of the transaction value as a mixing fee, and only
allows withdrawals of 0.02 BTC or more1. Based on the assessment of Balthasar et
al. [52], Helix also uses an aggregating method to accumulate funds into an address,
and then distribute them to the recipients using a peeling chain. However, their ob-
servation shows that wallets and withdrawals of multiple customers are present on
the same transaction, or very close to each other, thus it is easy to identify them. In
short, although linking senders to recipients is a hard challenge in Helix, it is easy to
verify that their transactions were passed from a mixer, by performing a taint analysis.
Alphabay. AlphaBay was a darknet market operating from the year 2014, a year
after the Silk Road market was busted in 2013. In 2017, this platform was forcefully
terminated, and its principal administrator was arrested [10] . However, this service
was relaunched in August 2021 by the self-described co-founder of the AlphaBay.
Prior to its 2017 shutdown, AlphaBay’s management utilized a mixer to obscure the
1This data was captured in 2016
28
origins of the market’s financial transactions. Since Alphabay had approximately
400,000 users in 2017, a significant proportion of users employed the mixer for their
monetary transactions. Consequently, the mixer had a large mixing anonymity set,
making it an interesting subject for studying with the potential to yield valuable
insights about its operation mechanism. Balthasar et al. [52] evaluated this mixing
service by creating 35 different transactions with this mixer and found three main
groups of aggregated funds, which shows that although this mixer had a multitude
of users, it did not use a large number of clusters and its mixing transactions can be
easily found by performing a taint analysis. This mixer also used other third-party
mixing services like Helix, in order to make the mixing process more anonymous. Ad-
ditionally, it utilizes random delays between creating mixing transactions to prevent
any time-based mixing detection.
MixTum MixTum [15], established in 2018, claims to have a separate pool of Bit-
coin from cryptocurrency stock exchanges such as Binance, OKEex, and DigiFinex,
which is used to obfuscate users’ funds with external coins from exchanges. Sho-
jaeinasab et al. [95] have assessed the mixing mechanism of MixTum, revealing its
similarity to typical mixers in aggregating incoming funds into an aggregation ad-
dress and subsequently distributing them through a peeling chain. Furthermore, this
platform ensures privacy enhancement by avoiding the aggregation of transactions
from a single address into a common aggregation address, and also leveraging fee and
delay randomization.
Additionally, MixTum furnishes users with a PGP-signed guarantee letter containing
details about the mixing process, and its PGP fingerprint is publicly accessible on its
website. This provision enables users to validate any concerns or complaints regarding
improper mixing of funds from the service. This method is similar to MixCoin [41]
guarantee letter, with the distinction that MixCoin signs its guarantee letter using
its Bitcoin private key instead of a PGP fingerprint.
CryptoMixer. In 2016, CryptoMixer [9] was introduced as a Tor-accessible Bit-
coin mixing service. This service functions by aggregating user funds into aggregation
addresses, followed by their distribution to recipients using peeling chains [95]. Ad-
ditionally, CryptoMixer offers a guarantee letter, akin to the concept introduced in
the MixCoin paper [41], signed using the mixer’s private key. This feature allows
users to publish their complaints from the mixer using a valid evidence. Moreover,
29
CryptoMixer provides customizable obfuscation options, including the number of in-
put and output addresses, mixing delay, and service fee. Furthermore, upon sending
funds to the mixer, users are assigned a unique identification code. This code ensures
that a user’s coins are not mixed with those from previous transactions in subsequent
dealings. It is worth noting, however, that there is a lack of documented information
regarding CryptoMixer’s operational details [52].
Blender. In 2017, Blender[12] was introduced as a centralized mixing service,
and it employed several techniques, such as tailored mixing fees, mixing delays, and
distinctive mixing codes, much like CryptoMixer, to establish a level of anonymity.
Furthermore, Blender required that the input address and at least one of its output
addresses be of the P2SH 2type. This mixer’s mixing process is claimed to resemble
MixTum and CryptoMixer, involving the consolidation of input funds into aggre-
gation addresses and the utilization of peeling chains to distribute mixed funds to
recipients [95].
ChipMixer. Established in 2017, ChipMixer quickly gained popularity as a top-
notch mixing service. However, in March 2023, the FBI seized the operation for
processing over $3 billion in illegal transactions [17]. ChipMixer operates by utilizing
units known as ”chips.” Users deposit their funds into the mixer and receive ”chips”
of a pre-specified value, consisting of bitcoins from various people. The “chip” is
basically just BTC in a new address for which the participants are provided the
private key, and chips are always created before a user deposits their funds into the
mixer. Interestingly, Mixing fees are completely donation-based in this system, with
a ”Pay What You Want” strategy, contributing to its popularity as a free mixing
service. Between 2017 and 2020, ChipMixer is estimated to have laundered about
53,000 bitcoins [84].
As the chips are fungible, tracing the funds in this system is a challenging task.
Researchers like Pakki et al. [84] and Wu et al. [103] examined this service by
making several transactions through it. Their findings suggest that the service pools
funds into a few aggregation addresses divides the funds into many interchangeable
addresses (chips), and then employs multi-path mixing to distribute these chips to
users.
2P2SH: Pay-to-Script-Hash
30
Wasabi Wallet. Wasabi Wallet [16] is a publicly available Bitcoin wallet de-
signed for desktop computers, emphasizing user privacy and security. It incorporates
a trustless mixing technique, akin to the CoinJoin method. When users initiate
transactions within this wallet, Wasabi executes a CoinJoin transaction, generating
distinct groups of funds with varying denominations. Subsequently, funds from these
generated groups are transferred to the intended recipients [103]. The transaction
cost for utilizing Wasabi consists of the standard mining fee plus an additional 0.3%
fee.
ShapeShift. ShapeShift [8], founded by Erik Voorhees in 2014, is a cryptocur-
rency exchange platform that emphasizes privacy. It allows users to trade different
cryptocurrencies without revealing their identities, making it suitable for exchang-
ing various cryptocurrencies across different blockchain networks. However, although
this service was not designed for a mixing purpose, it can be used for cross-chain
mixing due to its provided coin exchanging anonymity. To illustrate, as reported by
the Wall Street Journal, this service has been linked to facilitating at least $9 mil-
lion in money laundering activities over several years [14]. Consequently, in response
to mounting pressure from governments and media outlets, ShapeShift applied the
Know-Your-Customer (KYC) policy and it mandates account creation for accessing
its services. Also, ShapeShift stated that they utilize anti-money laundering methods
in their system, and rejected the mentioned accused report.
Figure 2.7 illustrates an example of using this service as a mixer. First, Alice sends 3
BTC to ShapeShift and receives 127.11 Ether in Ethereum later. Since these chains
are totally different from each other, there is not any direct link between senders and
receivers, and Alice can use the exchanged Ethers as a mixed fund. Afterwards, the
Bitcoin sent by Alice will be organized as a peeling chain to distribute Bitcoins to
other users (e.g., Bob and Charlie in this figure) who swap other cryptocurrencies for
Bitcoin [103].
Bitmix. Bitmix.bz [11] was introduced in 2017, and currently serves mixing for
Bitcoin, Dash, and Litecoin. This service claimed to utilize several methods like
dust-attack prevention, randomized fees and delays, and guarantee letters in order
to achieve anonymity. The mixing mechanism of this service is almost like most of
the previous described services; aggregating funds into some addresses, then using a
peeling chain for funds distribution [103].
31
Sudoku Wallet. Sudoku Wallet was announced in 2019 within the bitcointalk
forum. The service is a single-use wallet that outputs private keys rather than on-
blockchain transactions. As the creators described, the mixing fees are randomized
from 0.5% to 1% plus the CoinJoin fee and there is not any database and no log is
saved in their system. Furthermore, they asserted that this service performs CoinJoin
transactions using its incoming funds and sends them to newly generated addresses.
Afterwards, the private keys of the corresponding addresses are given to the users to
spend them [19].
Although the creators of this service explained their mixing protocol, the interac-
tion of Pakki et al. [84] with this mixer showed different results from the creators’
claim. After transacting 3 times with the mixer and analyzing transactions using the
Chainalysis tool, they did not identify any evidence of CoinJoin transactions. Also,
the mixing fees of their transactions were inconsistent, as one of them had a mixer fee
of 0 BTC while another had a fee of 90%. Consequently, it can be inferred that this
mixer does not offer enough safety and reliability, besides its poor implementation.
JoinMarket. JoinMarket [13] is a decentralized peer-to-peer (P2P) mixing ser-
vice designed to connect users seeking to engage in CoinJoin transactions. This
platform consisted of two primary participant categories: Market Makers and Mar-
ket Takers. Market Makers are responsible for crafting CoinJoin transactions, while
Market Takers are individuals seeking to mix their funds. Takers pay a fee to makers
which works as an incentive, and notably, all actions within the system occur without
reliance on any centralized authority. JoinMarket is remarkable within the Bitcoin
network as one of the scarce decentralized mixing services, widely adopted by users
seeking to obfuscate the origins of their funds.
Tornado Cash. Tornado Cash is an open-source software project on the Ethereum
Network launched in December 2019 [85], stands as the predominant mixer within
the Ethereum ecosystem. When a user wishes to blend their assets, they transfer
their tokens into the Tornado Cash smart contract. The funds are secured using a
Merkle tree cryptographic commitment scheme, hiding both the amount and source
of the funds. Later, when taking funds out of the Tornado pool, a withdrawer proves
the ownership of a commitment without revealing which specific commitment is be-
ing spent by using zero-knowledge proofs. This provides unlinkability between the
32
deposited and withdrawn funds.
In August 2022, the United States Department of the Treasury’s Office of Foreign
Assets Control (OFAC) imposed sanctions on Tornado Cash due to its involvement
in laundering a sum exceeding $7 billion [22]. After that, OFAC added the Tornado
Cash website and addresses used this mixer to the ”Specially Designated Nationals
And Blocked Persons” list, thereby prohibiting U.S. citizens from engaging in any
transactions involving the Tornado Cash website or addresses listed in the sanctions.
Afterwards, the rate of using Tornado Cash dropped significantly for more than 77%,
as depicted in Figure 2.8. It is noteworthy that a subset of Tornado Cash users
have tried to diffuse banning from OFAC to other addresses by performing a dusting
attack. Wang et al. conducted an assessment of the OFAC’s sanctions’ effects on
ZKP-based mixers [100].
Post-Tornado Cash Services. After the time OFAC imposed sanctions against
Tornado Cash, several new services appeared as alternative mixing solutions for Tor-
nado Cash, like RailGun [4], Cyclone [3], zkBob [7], and Typhoon Cash [5]. However,
these services mostly suffer from insufficient liquidity, and as of the time of this
research paper, there has been no Ethereum mixing service capable of matching Tor-
nado Cash’s pre-sanctions liquidity. Given their limited popularity and infrequent
technical improvements in comparison to Zether and Tornado Cash, we do not delve
into a detailed discussion of these alternatives in this paper. Interested readers are
encouraged to refer to the Elliptic report on Tornado Cash alternative solutions[65]
for more information. Additionally, Barbereau et al. analyzed the taxonomy of
Ethereum mixing services, with regards to post-sanctions options like RailGun and
zkBob, as documented in their work [33].
Table 2.2 outlines the functionalities utilized by real-world mixing services, and as
depicted, most services are centralized. A repetitive pattern in most services involves
the use of aggregation techniques to accumulate funds at a designated address and
using a peeling chain to distribute these funds to individual users. This approach is
favored over alternatives such as swapping, diverging from what is typically observed
in academic frameworks. In addition, third-party blinding has not been used by
services, and it is clear that mixer entities do not want to blind themselves from the
mixing information of their users. Moreover, since forcing users to deposit coins only
in a specific denomination to achieve fungibility is not desirable for most users, this
functionality is not used by any service. However, a solution for users to increase
33
Table 2.2: Functionalities employed by all mixing services available in market
Birth Date
Active?
Centralized
Swapping
Aggregating
Peeling Chain
Splitting
Chain Hopping
Random Fee
Random Delay
3rd-Party Blinding
Off-Chain
Input Fungibility
Disconnected Fund Flow
Address Freshness
TEE
ZKP
BitcoinFog 2013 ----
BitLaundry 2013 - - --
Blockchain.com 2013 - ---
DarkLaunder 2015 -
Bitlaunder 2015 -
CoinMixer - - -
Helix 2017 - - - -
Alphabay 2017 - - -
MixTum 2018 -
CryptoMixer 2016 - -
Blender 2017 - -
ChipMixer 2017 -a. -
Wasabi Wallet 2018 -- -
ShapeShift 2014 - - -
Bitmix.bz 2017 -
Sudoku Wallet 2019 - - - - -- -
Join-Market 2017 a. a.an.a. a.
Tornado Cash 2019 a.an.a. n.a.
aSince performing a CoinJoin transaction in Join-market and withdrawing money from Tornado
Cash can be done whenever users choose to, users can add random delays by their own.
34
the obfuscation of their mixed funds is to fragment their output transactions to align
with the average funds sent to the mixer by all users [77]. In addition, it is worth
noting that several services may use off-chain transactions in platforms like payment
channel networks, but it is a challenging task for an external observer to ascertain the
utilization of off-chain transactions, as they occur in a separate layer distinct from
the main Bitcoin network transactions.
2.6 Comparison Between Academic and Real-World
Services
By examining the data in Tables 2.1 and Table 2.2, it is clear that academia and
the market display considerable differences, which are mostly attributable to their
different goals for the creation of mixing services.
In the context of Bitcoin mixing services, since the Bitcoin scripting language
is not Turing-Complete, it is necessary to construct mixers in a higher layer of the
Bitcoin network. In particular, most academic endeavors have focused on developing
decentralized mixing approaches to achieve the fundamental objective of cryptocur-
rencies, which is to minimize regulatory control exerted by centralized entities. Var-
ious techniques such as multi-round digital signing (e.g., CoinJoin [74], CoinShuffle
[90]) and Joint-Address-Creation (e.g., CoinParty [105], SecureCoin [63]) have been
applied within decentralized mixing frameworks to achieve this objective.
Even in centralized approaches, there have been novel innovations for blinding
the centralized mixer party from having any information about associations between
inputs to outputs, by leveraging cryptographic schemes like Blind Signature which
were used in TumbleBit [61] and BSC [62]. Moreover, Obscuro [98] suggested utilizing
Trusted-Execution Environment (explained in Section 2.1.1) to monitor the mixing
workflow which is performed by a mixer entity.
However, in the practical mixing services found in real-world applications, most
of the methods discussed in academic papers have not been fully implemented. As
outlined above, these services generally aggregate funds into an address and then
distribute them using a peeling chain [95, 77, 52, 103]. This approach is commonly
used perhaps because it is a well-established and straightforward technique to imple-
ment. Furthermore, there has not been any significant attack on this method that
can link inputs and outputs to each other. Additionally, a mixer can easily add its
35
own supply funds to the aggregation addresses, enhancing the anonymity pool. These
steps are easy to develop by the services, and there is no need for complex imple-
mentations, resulting in a widespread and common method to use in mixing services.
Also, centralized services tend to avoid blinding themselves from the mixing data,
and we haven’t come across any service employing blinding techniques to obscure
sender-recipient relationships.
Nevertheless, despite this gap, several services have adopted academic methods
to ensure their users of their reliability and enhance the security of their mixing
process. Specifically, the MixCoin guarantee letter [41] has been incorporated into
multiple mixers, and some services have integrated features such as fee and delay
randomization to prevent attackers from identifying mixing transactions based on fee
or time patterns.
To talk about other services that use alternative methods (rather than aggre-
gating), some services like Wasabi Wallet [16] act as an intermediary responsible for
collecting mixing transactions from users and executing a CoinJoin transaction (which
is decentralized in its nature) in a centralized manner. Conversely, Join-Market [13]
introduced a decentralized P2P network, enabling participants to generate CoinJoin
transactions in a decentralized way. Additionally, leveraging anonymous cryptocur-
rency exchanges for cross-chain mixing shows promise, leaving no trace of mixing
association due to the isolation among different chains.
Regarding all these observations, in our investigation, it was evident that a ma-
jority of Bitcoin users opted for centralized mixers as opposed to decentralized al-
ternatives, primarily due to their user-friendliness, intuitive interface, streamlined
operations, and the absence of a requirement for advanced technical knowledge.
Besides all these inferences, things are totally different when it comes to Turing-
complete-based cryptocurrencies. In such cryptocurrencies like Ethereum, there is
no need to setup a mixing framework at a higher layer of the cryptocurrency net-
work, and mixing algorithms can be directly coded and executed in a smart contract,
without the need for any centralized intermediary. This feature gives smart contract
developers a wider range of options to create untraceable mixing solutions, result-
ing in introducing advanced mixing methods like Zero-Knowledge proofs in Tornado
Cash [85] or ring signatures in obius [76], all fully decentralized. To the date of
this publication, Tornado Cash remains the most popular service of its kind, though
its usage has been significantly dropped by more than 77% after OFAC sanctions
enforced [23, 22, 100]. Given these features of Turing-Complete cryptocurrencies,
36
employing centralized mixing frameworks in these networks seems almost pointless.
2.7 Possible Attacks on Mixing Services
Mixing services exist to ensure transaction anonymity, aiming to protect both senders
and receivers from being identified at any point in the mixing process. Here we
introduce 6 most common attacks within the context of mixing services, which can
be performed by a mixer, mixing participants, or any third party who stands outside
of the mixing process:
Sybil Attack. A common method to undermine mixing protocols is for an
attacker to represent a significant portion of the participants. The larger this
deceptive Sybil group, the higher the likelihood that the chosen mixing parties
will include the attacker. Essentially, the attacker aims to dominate the system
to decrease the anonymity set for others. In pairwise mixing protocols, a Sybil
attacker can determine the fund’s destination paired with any address. Proto-
cols such as DarkWallet, SharedCoin, and CoinShuffle are vulnerable to cost-free
Sybil attacks since they don’t charge participants, and creating addresses (or
Sybil identities) in Bitcoin and similar virtual currencies is free.
DoS Attack. DarkWallet, SharedCoin, and CoinShuffle don’t charge users to
join their mix pool, making them vulnerable to denial-of-service (DoS) attacks.
An attacker can create multiple identities, each with a unique IP and Bitcoin
address, and add them to the mix pool for free. If chosen, the attacker simply
doesn’t sign the transaction, disrupting the mix. If they create enough identities,
they can hinder most transactions without any loss. CoinJoin and its variants,
including CoinShuffle, conduct the entire mix in one step and can’t impose
an upfront fee. Any potential fee would require a different, yet-unproposed
protocol. Likewise, Barber et al.’s fair exchange protocol [32] doesn’t have
participation fees, allowing an attacker to disrupt the exchange without any
cost.
Coin Stealing Attack. This type of attack predominantly occurs through
untrusted centralized mixing services that steal user-submitted coins. While
some services attempt to mitigate this risk by employing guarantee letters to
reassure users about their trustworthiness (like the MixCoin guarantee letter,
37
introduced in chapter 4.1.2), this approach does not provide an absolute guar-
antee of protection against such attacks.
Trojan Attack. This attack mirrors the Sybil attack, with the attacker ac-
tively participating in the mixing process. By transacting with the services,
they can trace the mixing pattern and unveil the mixing procedure. Possessing
both the sender and receiver addresses allows the attacker to easily discern the
transaction pattern. This attack poses the most substantial threat to mixing
services as it can expose the entire mixing process, pinpoint mixing transactions,
identify related addresses, and uncover those utilizing the mixing services. Sev-
eral studies have leveraged this attack method to demystify the operations of
leading mixing services in the industry [95, 84, 52, 82]
Intersection Attack. The intersection attack focuses on tracking frequent
activity by specific wallets either owned by one participant or exhibiting dis-
tinct patterns. For instance, in systems like CoinJoin and CoinShuffle, part-
ner peer information is logged on the public blockchain. This allows even
non-participating attackers to exploit the vulnerability. In terms of parallel-
ing anonymous communication methods, every attacker in these protocols acts
as a global passive observer.
ML-based Attacks. In the realm of mixing transactions, a comprehensive
and trusted dataset remains elusive. Nonetheless, the WalletExplorer platform
offers categorized data for various transaction types, notably for mixing ser-
vices such as BitcoinFog, Helix, and Bitlaunder[6]. This platform sources its
labels predominantly from user submissions and extensive forum analyses, sub-
sequently expanding label classifications based on shared input addresses heuris-
tic deanonymization attacks. In a separate endeavor, Blockstream.info has dis-
seminated a dataset, structured utilizing their proprietary CoinJoin detection
methodology and an established method for SharedCoin detection[39]. The re-
liability of these datasets, however, remains a subject of debate. Various studies
have embarked on experimental pursuits and deep learning-based attacks, yet
many confront challenges rooted in the lack of verifiable data.[79, 101, 97]
38
2.8 Provision of an Evaluation Framework to As-
sess Mixing Services
This thesis identifies the most significant functionalities that mixing services utilize to
conceal themselves behind blockchain transactions and maintain anonymity. In this
section, we introduce an evaluation mechanism, along with specific metrics and crite-
ria, to assess various mixing services and explore their vulnerabilities. To this end, the
author examine various attacks designed to reveal information behind cryptocurrency
transactions and undermine anonymity.
In the wake of examining the major challenges posed by attacks on mixing ser-
vices, we can establish a set of criteria for robust mixing services. Our assessment
emphasizes both resilience to attacks and the capacity to obscure cryptocurrency
transactions. After introducing various evaluation criteria, table 2.3 presents an as-
sessment of both academic and real-world mixing solutions. The assessment is based
on their operational mechanisms, which are discussed in Section 2.3.
Prior to this study, Feng et al. conducted a thorough evaluation of mixing services,
employing a comprehensive methodology that examined the effectiveness of protocols
in fully anonymizing user identities, the involvement of centralized authorities in
the mixing process, and the application of service fees[55]. Nevertheless, while their
criteria established a robust framework for evaluating the effectiveness and security of
mixing services in preserving user anonymity and protecting transactions, it did not
encompass all facets of mixing services. The criteria set out in this document aim to
provide an all-encompassing assessment of mixing services, as discussed in academic
literature and observed in real-world applications, across both Ethereum and Bitcoin
platforms.
Centralization Concerns: Centralized tumblers are potential targets for
cyber-attacks, legal inquiries, or subpoenas. Compromised records from these
services can lead to transaction de-anonymization.
Unlinkability: For the highest standard of privacy, it’s essential that the con-
nection between the sender and the receiver be entirely severed. A fully un-
linkable service ensures that taint analysis or similar techniques can not be
traced back to either party, solidifying anonymity. Within this context, we can
categorize services into 2 different types, Fully Disconnected and Correlated
to Pool Size. The second type means that the higher pool size, resulted in a
39
higher anonymization ratio, while in the first type, senders and receivers are
completely disconnected from each other.
Liquidity: High liquidity ensures that there are always sufficient funds for mix-
ing, which can prevent delays and improve the anonymity set. Low liquidity
mixers might pose the risk of having transactions stalled or linked due to insuf-
ficient participants. As the liquidity can be measured only in real-world mixing
services, and some of the available data from mixers’ liquidities are from sev-
eral years ago, we mark those services that have enough funds to avoid linking
senders to recipients as ”High”, and others as ”Low”.
Anonymity Set Size: The size of an anonymity set is pivotal when assessing
the resilience of a mixing service, representing the aggregate of participants or
inputs in a specific mixing event. An expansive anonymity set obfuscates the
discernment of a coin’s flow, while a limited set delineates transaction pathways,
potentially undermining user privacy. This metric retains its significance across
both traditional and contemporary mixing services. Prominent traditional ser-
vices like CoinShuffle and CoinJoin fundamentally depend on the integrity of
their anonymity sets. In the domain of modern mixers, certain attack vectors
have illuminated a paradox. Even as these services generate addresses randomly
for each mixing request, there remains a necessity to consolidate funds into a
restricted number of addresses to facilitate mixing operations. The inputs tied
to these consolidation, or ”sweeper,” transactions inherently influence the size
of the anonymity set.
Log Storing: Certain centralized mixing services retain mixing logs within
their database, which leads to a security risk as sensitive mixing data, such as
sender details, recipients, and transaction amounts, could potentially be exposed
from the mixing service database. As there is not any reliable data about
this metric on services, and nothing can be concluded without accessing to the
implementation behind each tumbler, we mark those services who require users
to signup before mixing as Signup Required (Signup Req.), since it can be an
evidence that they store users’ information in their database, which is a security
risk. However, it is recommended to use TOR browser for mixing, in order to
prevent mixer server from linking the IP addresses to the corresponding crypto
addresses.
40
Cost of Mixing (Mixing fee): The economic viability of using a mixer is
crucial for widespread adoption. Services that offer competitive fees while main-
taining robust security and privacy features are likely to be preferred. Moreover,
an unpredictable fee structure might aid in obfuscation.
Auditability: Some users may prefer services that have undergone third-party
audits, ensuring that the underlying code and mechanisms are sound and free
of vulnerabilities.
Address Reusing: The redirection of mixed coins to initial addresses or ad-
dress clusters linked to a user’s identity can vitiate the mixing process, thereby
reintroducing traceability. Note that this metric is not applicable for smart-
contract-based cryptocurrencies, due to their account-based nature.
Cross-Chain Mixing: As the cryptocurrency ecosystem grows, the ability
for a mixer to support transactions between different blockchains (cross-chain
transactions) can be a valuable feature. It can help users obscure their tracks
even further by moving funds between different cryptocurrencies.
Sybil / Dos Attack Resistance: In scenarios where an adversary populates a
tumbler with a majority of its own addresses, the integrity of the anonymity set
is compromised, thereby facilitating the tracing of coins(Sybil Attack). More-
over, the efficacy of a tumbler can be compromised if a significant portion of
its users are colluding or have established identities. In most cases, leveraging
mixing fees can be a good obstacle for both Sybil and Dos attacks.
Amount Distinctiveness: Unique transaction amounts can be potential mark-
ers, facilitating traceability post-mixing. Mixing distinct amounts, like 1.23456789
BTC, can inadvertently act as identifiers.
* Malicious Punishment: Having an accountable system in place ensures
that if any malicious activity or faulty mixing occurs, the entity or entities
responsible can be identified. This introduces a level of trust and security for
users of the service.
Coin Stealing Attack Resistance: When a third-party takes the responsi-
bility to perform mixing, there is a great potential for them to steal the given
funds by avoiding sending them the corresponding recipients. This scenario is
41
more likely to happen in terms of centralized services, and several mixers have
used mixing guarantee letters, introduced by MixCoin paper [41].
Trojan / Intersection Attack Resistance: A mixer’s effectiveness is com-
promised when it uses predictable or discernible transaction mixing patterns.
Such patterns, like uniform mixing amounts, and timely patterns, are vulner-
able to taint analysis. This vulnerability becomes particularly evident when
adversaries deploy Trojan or Intersection attacks, potentially exposing the mix-
ing pattern. Several studies have highlighted successful attacks on prominent
Bitcoin services[37, 77, 82, 95, 84]. We assess two important factors in terms of
pattern obfuscating: Fee and delay randomization, as discussed in chapter 2.1.1.
In terms of mixing structure, since the structure of most real-world services has
been revealed by prior attacks, we do not discuss them in the table. However,
it is vital for a service to be resistant to these types of attacks, and employing
dynamic mixing methods can be an innovative way to reach this milestone.
Regulatory Compliance: Depending on the jurisdiction, mixing services
might be subject to regulations. Ensuring compliance can prevent potential
legal complications for both the service providers and their users. As and exam-
ple, ShapeShift [8] employed a Know-Your-Customer policy after being accused
of money laundering, which resulted in lower anonymity and more traceability
for its users. Due to the lack of reliable data about this metric for most services
(except ShapeShift), we don’t depict this metric in the table. Nevertheless,
Those services that require users to sign up before mixing, are suspected to
regulatory compliance.
Front-Running Resistance: Front-running is one of the main challenges in
terms of smart-contract-based cryptocurrencies like Ethereum. When talking
about mixing, some actions like depositing funds alter the state of the mixing
contract, while other actions like withdrawing funds have to rely on the state of
the contract to form the cryptographic proofs, which grant them to withdraw
their funds. Therefore, if there are multiple concurrent deposit actions issued
to the contract, some withdrawing actions will get invalidated by those trans-
actions that modify the state of the contract, and this process can be happened
unintentionally by ordinary users, or deliberately by malicious ones [43, 54, 72].
Discussing mentioned smart-contract based methods (M¨obius, AMR, MixEth,
42
and Tornado Cash), front-running is not applicable in obius because of the
nature of ring signature scheme [89], not applicable in MixEth due to its shuf-
fling mechanism, and also not cost-efficient in AMR and Tornado Cash [85, 72].
Also, Zether places conflicting transactions in a pending state, and execute them
only after resolving the conflicts to prevent potential front-running situations.
43
Figure 2.6: Sample of created communities (aggregation addresses) in BitcoinFog
[77]
44
Ethereum
Blockchain
Bitcoin Blockchain
Address
3Pdj5g
Address
0x70faa28
Temp
Address #1
Alices Input
3 BTC
Bob’s Output
2 BTC
Charlies Output
0.5 BTC
Temp
Address #1
Alices Output
127.11 ETH
Figure 2.7: Overview of ShapeShift’s
mixing process [103]
Figure 2.8: Daily transactions in TC
ETH pools. There was a panic exit
when the OFAC sanctions were
announced [100]
45
Table 2.3: The evaluation of all mixing solutions proposed in academia and market. Rows colors stand as Blue: Decentralized
Academic, Orange: Centralized Academic, Green: centralized Real-World, and Pink: Decentralized Real-World
Centralization
Unlinkability
Liquidity
Anonymity Set Size
Log Storing
Random Delay
Random Fee
Mixing Fee?
Auditable
Address Reusing?
Cross-Chain Mixing
Sybil/Dos Resistance
Coin Stealing
Resistance
Front-Running
Resistance
CoinJoin Cor. Pool Size - Limited to Tx Size a. no fee n.a.
CoinShuffle Cor. Pool Size - Limited to Tx Size a. no fee n.a.
Xim Fully Disconnected - Participants Count a. n.a.
CoinParty Cor. Pool Size - Participants Count a. no fee n.a.
SecureCoin Cor. Pool Size - Participants Count a. a. n.a.
obius Cor. Pool Size - Max 24 users a. n.a. n.a.
AMR Cor. Pool Size - Up to Merkle Tree size a. n.a. n.a.
MixEth Cor. Pool Size - Participants Count a. n.a. n.a.
Zether Cor. Pool Size - Anonymity Set Count a. n.a. n.a.
Mixcoin Dep. Implementaion - Participants Count a. Guarantee Letter n.a.
BlindCoin Dep. Implementaion - Participants Count a. Guarantee Letter n.a.
TumbleBit Fully Disconnected - Participants Count a. a. n.a.
BSC Fully Disconnected - Participants Count a. a. n.a.
Obscuro Dep. Implementaion - Participants Count a. a. n.a.
BitcoinFog Cor. Pool Size High High Participants No Signup Req. - n.a.
BitLaundry Cor. Pool Size Low Low Participants No Signup Req. n.a.
Blockchain.com Cor. Pool Size High High Participants Signup Req. - -n.a.
DarkLaunder Cor. Pool Size Low Low Participants No Signup Req. n.a.
Bitlaunder Cor. Pool Size Low Low Participants No Signup Req. n.a.
CoinMixer Cor. Pool Size Low Low Participants No Signup Req. n.a.
Helix Cor. Pool Size High High Participants No Signup Req. -n.a.
Alphabay Cor. Pool Size High High Participants No Signup Req. -n.a.
MixTum Cor. Pool Size High High Participants No Signup Req. Guarantee Letter n.a.
CryptoMixer Cor. Pool Size High High Participants No Signup Req. Guarantee Letter n.a.
Blender Cor. Pool Size High High Participants No Signup Req. n.a.
ChipMixer Cor. Pool Size High High Participants No Signup Req. a. 1n.a.
Wasabi Wallet Cor. Pool Size High Limited to Tx Size No Signup Req. - -n.a.
ShapeShift Fully Disconnected High n.a. Signup Req. - - n.a.
Bitmix.bz Fully Disconnected - High Participants No Signup Req. Guarantee Letter n.a.
Sudoku Wallet - Low Low Participants No Signup Req. - n.a.
Join-Market Cor. Pool Size High Limited to Tx Size No Signup Req. a. a. n.a.
Tornado Cash Cor. Pool Size High Max 220 *No Signup Req. a. n.a.
1Although mixing fee is donation-based in ChipMixer, it is resilient to these types of attack due to its high liquidity. *This value is corresponded to
the maximum size of the Tornado Cash’s merkle tree [20].
46
As illustrated in the evaluation table 2.3, our analysis is based on pre-defined
criteria and offers a comprehensive overview of the current mixing service landscape.
Contrary to real-world applications where centralized services dominate, academic
research mainly focuses on decentralized solutions. This academic inclination matches
user preferences for enhanced privacy and reduced risk from regulatory intervention.
Centralized services come with their own set of inherent risks, including the po-
tential for fraudulent activities and theft of funds, as highlighted in the ’Coin Stealing
Resistance’ criterion of our evaluation table. On the flip side, most academic solutions
are largely immune to such vulnerabilities, with CoinParty being a notable exception.
A significant security vulnerability emerges when a mixing service links multi-
ple peeling chains together through aggregation transactions. In a peeling chain,
transactions are divided into smaller parts, which are then distributed to their final
destinations. However, as transactions near the end of the peeling chain, the remain-
ing amounts often become too small to be further divided. Some services input these
residual amounts into an aggregation transaction, which serves as the starting point
for the next peeling chain[95].
By doing so, a traceable link is created between the two separate peeling chains.
This linkage could potentially allow an attacker to correlate all transactions related
to a particular service, thereby undermining the anonymity of users. In essence,
transactions from different chains could be combined into one large cluster, which
could be analyzed collectively to identify patterns or isolate individual users.
Another crucial issue revolves around the transparency of the underlying algo-
rithm. Services that disclose their algorithms not only gain more user trust but also
invite community contributions. Those solutions using Turing-complete smart con-
tract technology are naturally more credible, due to their auditable and decentralized
nature. Comparing early assessments, like the one by Novetta in 2015, with recent
reports from CypherTrace and Chainalysis, reveals a noticeable increase in the adop-
tion of mixing services[82, 64, 23]. However, sanctions on services like Tornado Cash
have led to a shift towards other centralized alternatives[23].
Clearly, each service has its own strengths and drawbacks, and a user who wishes
to mix her funds should choose the most appropriate mixer based on her important
evaluation metrics. Also, an innovative approach to achieve higher anonymity is to
utilize various mixing services in several consecutive rounds, which is referred to as
multi-round mixing.
47
Chapter 3
Money Laundry Identification in
Bitcoin
In this chapter of the thesis, the methodology to identify the Bitcoin mixing services
is introduced. This methodology includes the collection of required data and the iden-
tification of recurring patterns in the mixing transactions. Ultimately, this procedure
leads to the creation of an algorithm that can detect these transactions.
3.1 Data Gathering
To investigate mixing transactions, we need labels for transactions that participated
in mixing services processes, and we need access to the full blockchain transactions
information.
3.1.1 Retrieving Blockchain Data
In order to obtain the transaction data stored in the blockchain, we utilized a complete
node Bitcoin client and extracted the necessary information by making use of its
JSON-RPC APIs. By executing the Bitcoin client and leveraging its implemented
RPC functions, we were able to access details pertaining to blocks and transactions.
This includes data such as block hash, input and output addresses of transactions,
transferred values, timestamps, and other essential information encompassed within
the blockchain.
48
3.1.2 Collecting Labels for Mixing Transactions
Mixing services are used by individuals with different intentions, for example, to
conceal their financial transactions for tax evasion, criminal purposes, or privacy con-
cerns. What is clear is that no matter what their intentions are, individuals are
reluctant to disclose their mixing transactions, and as expected, there is no dataset of
transactions related to mixing services. Although some labels for money laundering
service addresses are available, errors may occur due to clustering methods and heuris-
tics that label other addresses according to the previously labeled addresses. During
this process, ordinary addresses may be attributed to money laundering services, and
studying their transactions will be misleading. Due to the lack of an intended dataset,
we needed to collect accurate data independently.
In order to assemble a dataset of mixing transactions, we selected three of the most
recognized mixing services, which have been extensively discussed on platforms such
as BitcoinTalk and Reddit. Utilizing these services, we engaged in their process by
depositing money into their specified addresses and subsequently having it returned
to us, via their laundering process, to our recipient address. Each engagement with
these services resulted in the creation of two mixing transactions: a deposit and a
withdrawal. The chosen mixing services, along with the quantity of mixing trans-
actions we generated for each during the pattern recognition phase, are detailed in
Table 3.1.
The creation of more comprehensive data sufficient for deep learning solutions,
however, proved to be a costly endeavor. This cost is two-fold: it is both time-
consuming and financially demanding. In order to capture both sides of the trans-
actions the deposit and the withdrawal we were required to wait until the
deposited transaction was utilized and exited from the UTXOs. This process of-
ten entailed significant delays. Additionally, the financial burden of accumulating a
sizable dataset was considerable, due to the service fees involved.
For these reasons, the process of gathering a larger dataset presented considerable
challenges. However, the dataset we were able to compile provided us with a sufficient
basis to utilize statistical patterns as a viable solution.
49
Table 3.1: Data gathered from mixing services in 2019 for training phase
Name URL Number of Transactions
MixTum mixtum.io 22
Blender bitcoin-blender.io 16
CryptoMixer cryptomixer.io 10
3.2 Extracting Statistical Pattern
The collection of a comprehensive dataset of mixing transactions can be both time-
consuming and financially demanding. To address these constraints, we opted to
extract the mixing pattern using simple statistical methods that are appropriate for
small data sets.
Traditional methods of money laundering in Bitcoin provided some degree of
anonymity in the transfer of funds. However, these methods had significant weak-
nesses that led to the development of modern mixing services. The most significant
limitation of the traditional approach was the service time, which required individuals
to wait for N-1 other individuals to mix their money. To achieve a higher degree of
anonymity in the network, individuals had to mix their money with a larger num-
ber of mates, which corresponded to a higher value of N (the anonymity parameter).
However, this approach had the disadvantage of making it easier to track the transfer
of funds, as the transaction was not fully cleared due to the limited number of mixing
mates.
Modern mixing services, on the other hand, incorporate advanced parameters
that make them more sophisticated. One of these parameters is a random time delay,
where most services offer a random delay of 1 to 1440 minutes between receiving and
sending transactions. This makes it more difficult to track the money flow based on
time patterns. Another parameter is the limit for the number of recipient addresses,
which is typically infinite in modern mixing services. This allows for an unlimited
number of recipient addresses, thereby making it challenging to trace the transfer of
funds as the money is divided among multiple destinations.
When there is no direct money transfer from the first stage of the money launder-
ing process to the subsequent stages, it becomes practically impossible to track the
money path and establish a link between the stages. This makes it highly challenging
to determine the origin of the funds and the complete money laundering process.
50
Figure 3.1: Pattern of suspected mixing withdrawal transaction in first phase
3.3 First Phase of Detection; Uncovering Transaction-
Level Patterns
To uncover the recurring patterns in mixing transactions, we investigated mixing
transactions we created. These patterns were found in withdrawal transactions. At
the transaction level, we could not find any pattern in the deposit transactions be-
cause the structure of deposit transactions was typical in the whole blockchain, and
senders created them by using their wallet application to deposit money to the mixing
service. Nevertheless, the withdrawal transactions created by the mixing service were
discriminative and followed a specific pattern that we will explain.
In figure 3.2, we show one of the transactions received from the Mixtum mixing
service due to using its service.
Figure 3.2: Sample withdrawal transaction of MixTum Service
3.3.1 Inputs and Output Counts Pattern
Through a detailed examination of the transactions, we found a series of common
patterns that can be transformed into detection algorithms. In all the transactions
received from the mixing services under study, the withdrawal transactions followed
51
a 1 : 2 structure, with one input and two outputs. The first output address was the
destination of the laundered money, and the second was the change address of the
mixing service, which deposited the remaining funds at a new address.
3.3.2 Address Types Pattern
We also uncovered a pattern in the type of addresses used within the 1 : 2 structure
of the candidate transactions. Based on our analysis of the withdrawal transactions
from the Blender and MixTum services, we inferred the following two characteristics:
Input address type is P2SH.
At least the type of one of the two output addresses is P2SH.
It is important to note that this pattern was not fully present in the transactions
from the CryptoMixer service.
3.3.3 Pattern of Amount Fraction in Transfer
Another pattern observed in the withdrawal transactions of mixing services was the
fraction of the amounts transferred in each output. In the 1 : 2 transactions, the
input had a high value of Bitcoin, with a threshold of more than 1 Bitcoin considered
an appropriate value. Furthermore, in 97% of cases, the amount of the P2SH output
was more than five times greater than the amount of the non-P2SH output.
In figure 3.1 the paper presents the structure and characteristics of mixing with-
drawal transactions in the schematic, which is used in the first phase filter of trans-
actions.
The aforementioned patterns in withdrawal transactions were found consistently
across multiple mixing services and provide strong evidence for the presence of mixing
activity. The use of statistical techniques and a thorough examination of transaction
patterns enabled us to identify these characteristics and develop algorithms to detect
mixing transactions in a more accurate and efficient manner. However, to further
enhance the accuracy of our proposed empirical method, additional investigations of
transaction patterns at a higher level of abstraction within the blockchain have been
conducted.
52
Figure 3.3: Schematic view of sweeper transaction
3.4 Second Phase of Detection; Uncovering Chain-
Level Patterns
The results from the initial phase of detection, which involved filtering transactions
at the transaction level, showed a high rate of false positives. This is due to the
prevalence of similar patterns among exchanges and wallets, which leads to misiden-
tification of transactions as mixing transactions. To address this issue, a two-phase
filter algorithm was developed to improve accuracy.
3.4.1 Deposit Transactions Footprint in the Mixing Services
In the first phase of detection, we focused solely on withdrawal transactions from
mixing services, neglecting the potential information that could be gleaned from de-
posit transactions. In this section, we aim to determine whether analyzing deposit
transactions can increase the accuracy, reduce false positives, and improve the recall
of our detection method.
3.4.2 Examining Sweeper Transactions
Chang et al. identified several repetitive transaction patterns in the Bitcoin network,
including what they referred to as ”sweeper transactions” [48]. These transactions,
depicted in Figure 3.3, involve a large number of input addresses and one or two
53
output addresses. In essence, they combine scattered funds from multiple addresses
into one or two new addresses, resulting in a high balance in the newly created address.
The structure of these transactions is typically many-to-one.
3.4.3 Defining the Transaction Chain
To better understand the second phase of the filter, it is necessary to introduce the
concept of the transaction chain. The sequence t1,t2,t3,..,tnis a transactions chain
where tiis a transaction and OUT (ti) is the set of outputs of transactions tiand
IN(ti) is the set of inputs of transactions tiand for each i(0 in1):
a:aOUT (ti)aIN (ti+ 1) (3.1)
The transaction chain is a sequence of transactions in which at least one of the
outputs of each transaction is used as an input in the subsequent transaction. Figure
3.4 provides a visual representation of the transaction chain as a tree, with each node
corresponding to an address on the Bitcoin network, edges representing the transfer
of funds, and edge values indicating the amount transferred between addresses in the
chain. Figure 3.5 provides a schematic view of the structure of a mixing transaction
chain, based on our current understanding.
Figure 3.4: Tree view of transaction chain of an address we got out cleaned money
from a mixing service
54
Figure 3.5: Schematic view of the transaction chain
55
3.4.4 Implementing Algorithm of Identifying Mixing Trans-
action Chain
In this section, we will implement the algorithm for obtaining a transaction chain of
mixing service from a candidate mixing transaction.
Algorithm 1 Detecting transaction chain of mixing service based on candidate trans-
actions
Input: A candidate intermediate mixing transaction
Output: Full Chain of Mixing Transactions , Anomalies De-
tected
1: AnomalyThreshold 4
2: AnomalyDetected 0
3: while Meet a sweeper transaction, Go Backward do
4: if AnomalyDetected == AnomalyThreshold then
5: Break
6: end if
7: if Algorithm3(transaction) == False then
8: AnomalyDetected++
9: end if
10: Append transaction hash to the chain list
11: end while
12: while While greater output is spent, Go Forward do
13: if AnomalyDetected == AnomalyThreshold then
14: Break
15: end if
16: if Algorithm3(transaction) == False then
17: AnomalyDetected++
18: end if
19: Append transaction hash to the chain list
20: end while
The process of identifying a mixing transaction chain involves tracing the flow of
funds from the deposit transaction to the eventual mixing transaction. Algorithm
1 outlines this process by moving backward from a feed transaction until reaching a
sweeper transaction and then moving forward until reaching the end of the transaction
56
chain or an anomaly that exceeds the established threshold.
3.4.5 Identifying the Start of the Mixing Chain
A critical aspect of the mixing process is that the deposit money to the mixing service
remains unspent for a considerable amount of time. However, when this money is
eventually spent, it is often seen as the input of a sweeper transaction. This is
because these transactions aggregate the money from various addresses belonging to
the mixing service into one or two addresses for easier access and use during the
intermediate stage of the transaction chain. This process can be seen in Figure 3.6
which depicts the collection of money at various addresses belonging to the mixing
service and its transfer to two addresses via a sweeper transaction.
Figure 3.6: Sample real sweeper transaction beginning of transaction chain
3.4.6 Investigating the Transaction Chain Started from Sweeper
Transaction
As previously described, the deposit transaction of our funds to the mixing service led
us to a sweeper transaction. The inputs of this sweeper transaction were addresses
belonging to the mixing service with smaller amounts that were aggregated into one
address in the output. From this sweeper transaction, we constructed a forward
transaction chain. The subsequent transactions in the chain were found to have the
mixing transaction pattern that we previously described in algorithm 3. The resulting
chain was a sequence of mixing transactions that led to the delivery of dirty funds
to the final destination in a clear and separate manner. This method of constructing
a transaction chain reduces the time complexity of searching for transactions and
increases the recall of our algorithm. Without the transaction chain, we would have
had to randomly search through each transaction in the blockchain to find the mixing
transactions. However, with the transaction chain, it is sufficient to identify a suspect
57
mixing transaction and then expand the chain both forward and backward using the
pattern of the transaction chain and the formation described in algorithms 3 and 1.
This allows us to quickly locate other mixing transactions.
The issue of false positives remains unresolved, even with the construction of the
transaction chain. In the next phase, we will implement a two-phase filtering method
to reduce false positives further.
3.4.7 Transaction Chain Length Pattern
After conducting a statistical analysis of the transaction chains of mixing services,
we found that the median chain length is 52, which is significantly shorter than the
median length of 10 for wallet transactions with similar patterns. This heuristic can
effectively reduce false positives associated with wallet transactions. However, it is
not applicable to exchanges with similar patterns as those of the transaction chains
of mixing services.
3.4.8 Transaction Chain Time Pattern
Our statistical analysis of the transaction chains of the mixing services also led to a
pattern that reduces false positives related to the exchange services. The average time
interval between transactions in the transaction chain related to the mixer service
is approximately 32 minutes, which is significantly higher than the average of 10
minutes for exchange transactions and more than 100 hours for wallet transactions.
This heuristic can be used to detect and eliminate a significant portion of transaction
chains that are not related to mixing services.
3.5 Identification of Mixing Service Addresses in
the Network
Based on the analysis conducted in previous sections, we were able to identify and
tag certain transactions that are associated with mixing services or money laundering
activities. We developed the following patterns to help us identify the addresses
involved:
1. For intermediate transactions in the extracted transaction chains, the input
address belongs to the mixing service.
58
2. In the intermediate transactions of the extracted transaction chains, the output
address, which is in the form of P2SH and has a higher value, is attributed to
the mixing service (if both addresses are P2SH, we use the amount as a criterion
for labeling).
3. In the starting transaction of the extracted transaction chains (sweeper trans-
action), all addresses are considered to belong to the mixing service.
3.6 Identification of Addresses Associated with Mix-
ing Service Users
Having tagged the addresses of the mixing service in the detected transactions, the
next step is to identify and tag the addresses of the individuals using the service.
3.6.1 Clean-Side Money Addresses
The intermediate transactions in the extracted transaction chains from the algorithm
3 correspond to the transfer of laundered money from the mixing service to a destina-
tion address. Based on the patterns and analysis we have presented, we can identify
the destination addresses of the individuals using the mixing service as the address
with a lower amount in the output of the intermediate transactions, which is not in
the form of P2SH. If both addresses are in the form of P2SH, we use the amount as a
criterion. These addresses, as mentioned previously, correspond to the clean money
side of the money laundering cycle.
3.6.2 Dirty-Side Money Addresses
Another address that can be linked to individuals using the mixing service is the
address of dirty money (incoming funds), which can be traced by tracking back the
input transactions of the sweeper transaction at the start of the transaction chain
related to the mixing service. The algorithm for identifying the addresses of individ-
uals using the mixing service (money launderers) is presented in Algorithm 2, which
shows the dirty money side of the transaction chain.
59
Algorithm 2 Identifying dirty addresses of mixing service user
Input: A Sweeper Transaction
Output: Address of the Sender (Dirty Money Addresses)
1: while Tx Has Unchecked Input do
2: address pop address from input of tx
3: transaction previous transaction of address(redeemed transaction)
4: Label input addresses of transaction as Dirty Addresses
5: end while
To aid in better understanding, we provide a schematic view of the mechanism
for detecting the dirty side of Bitcoin transactions in Figure 3.7. In this figure, we
illustrate how a sweeper transaction collects all of the dirty money that users transfer
to the mixing service. When a user requests money laundering, the service generates
an exclusive address for that user. The user then transfers the dirty money to the
generated address, which is one step behind the sweeper transaction. Thus, the user
addresses are those inputs of the transactions that are one step before a sweeper
transaction. Algorithm 2 identifies the sweeper transaction and for all of its inputs,
finds their previous transaction (which they redeemed) and marks the input of those
transactions as addresses of users of that mixer.
3.6.3 Analyzing the Possible Causes of the Observed Pat-
terns
We can make informed guesses about the reasons for the existence of the obtained
patterns by taking a closer look at them. In the early days of Bitcoin, MultiSig ad-
dresses were commonly used when people wanted to own bitcoins jointly. However,
since 2012, when P2SH addresses were added to the official Bitcoin client, their pop-
ularity increased for multi-owner assets. The use of P2SH addresses is associated
with smaller transactions, reduced transaction fees, and greater ease of operation,
compared to MultiSig addresses. Therefore, it is possible to analyze the pattern of
address types, assuming that mixing services often have more than one owner, and
that the addresses related to these services are often P2SH.
Another pattern is the one-to-two structural pattern of intermediate transactions.
As we have already explained, today’s mixing services transfer the amount to the
outgoing address(es) from another address where they have much money, in a separate
60
Figure 3.7: Dirty-Side Money and Connection of It to the Transaction Chain
61
path from the sender’s money (this is the reason for the amount pattern). Usually,
a single outgoing address is considered, and another address is the change address of
the remaining Bitcoin amount of the mixing service.
3.7 Expanding Identified Addresses
As discussed in the research background, previous efforts to identify addresses have
focused on categorizing them and linking them to specific identities. We leveraged
these efforts to expand our dataset and uncover additional information.
3.7.1 Address Clustering
As previously noted, address clustering is a common approach to deanonymizing Bit-
coin transactions. This method involves grouping addresses based on the assumption
that they belong to the same entity or individual.
We applied this technique to the addresses associated with both money launderers
and money laundering services, with the aim of expanding our dataset as much as
possible. By clustering related addresses, we were able to identify additional entities
and individuals involved in money laundering.
3.8 Labeling Detected Clusters
While the primary objective of this study is to group and analyze Bitcoin addresses
and transactions, it is also important to accurately label and associate real-world
information with these addresses. To achieve this, we have leveraged previous research
in our own analysis to obtain additional information and labels for various address
groups.
3.8.1 Mapping IP Addresses to Network Address Groups
Another method previously described in the literature is to personalize the Bitcoin
client to record the IP address of the transaction sender on the network. To implement
this approach with enhanced security, we customized the client and deployed it on
five servers located in different regions of the world, including Australia, China, Iran,
62
Russia, and the United States. By collecting IP addresses from these servers, we were
able to label various address clusters.
However, it should be noted that this method may not be accurate in cases where
TOR and similar services are utilized. Nonetheless, if multiple addresses within a
cluster share the same IP address, this strengthens the reliability of our results.
3.9 Two-Phase Detection Algorithm
In this section, we propose a two-step algorithm for identifying mixing transactions
based on the recurring patterns we discovered at the transaction and chain levels in the
previous section. The first step involves checking transactions at the transaction level
to identify suspicious transactions, while the second step examines these transactions
in more detail at the chain level and considers them as mixing transactions if certain
conditions are met. Our algorithm also determines the addresses of the mixing service
and its users according to specified heuristics.
3.9.1 Suspected Transactions Identifier Implementation
In this section, we present an algorithm to identify suspected transactions related to
withdrawal transactions in mixing services based on identified patterns. Although
Algorithm 3 represents this implementation, its accuracy alone is insufficient, and we
will discuss its limitations and extensions in later sections. The primary objective
of this algorithm is to rapidly identify potential mixing transactions within a block
using a transaction-level filter.
63
Algorithm 3 Finding suspected transactions of a block as a candidate mixing trans-
action
Input: A Block of Transactions
Output: Intermediate Mixing Transactions List of the
Block
1: repeat
2: Ensure transaction has exactly 1 input
3: Ensure transaction has exactly 2 outputs
4: Ensure address types XOR equals 1
5: Ensure one of outputs is 5 times or more greater than another one
6: Ensure value of input is more than 1 BTC
7: If all above conditions met Then CandidateMixingTranactions[]
transaction
8: until All transactions have been checked
3.9.2 Implementation of a Two-Phase Algorithm for Detect-
ing Mixing Transactions with High Accuracy
To improve the accuracy and recall in identifying mixing transactions, this paper
proposes a two-phase algorithm that integrates the transaction-level filter presented
in Algorithm 3 and the higher-level abstraction achieved by introducing transaction
chains in Algorithm 1. The resulting algorithm is a finalized approach that leverages
both two-phase filtration and chain expansion, leading to enhanced accuracy and
recall.
64
Algorithm 4 Final two-phase algorithm
Input: A Block As a Seed of Algorithm
Output: Labled Mixing Addresses , Labled Dirty Money Addresses, Labled Cleaned
Money Addresses
1: onceFilteredTxs Algorithm3(Input Block)
2: while Exist onceFilteredTxs do
3: Potential Tx Pop Tx
4: transactionChain , anomaliesDetected Algorithm1(Potential Tx)
5: if anomaliesDetected >5then
6: Continue
7: end if
8: if Length(transactionChain) <53 then
9: Continue
10: end if
11: timeMedian MEDIAN( timeDifference(transactionChain) )
12: if timeMedian >80 OR timeMedian <20 then
13: Continue
14: end if
15: Append transactions of chain to the database
16: Label input addresses as Mixing Service Address
17: Label greater P2SH output address as Mixing Service Address
18: Label non-P2SH output address as clean recipient ad-
dress(Destination)
19: Algorithm2 ( transactionChain [0] )
20: end while
Algorithm 4 takes a block as input and returns mixing transactions with high
confidence. In Algorithms 1 and 4, various thresholds were set for exceptions in the
pattern. For example, relay transactions are rarely seen among intermediate chain
transactions that do not fit into our pattern. The presented algorithm utilizes all
patterns and significantly reduces false positives, while the execution time is more
efficient than using Algorithm 3 alone. Based on the patterns identified at the trans-
action and chain levels, this algorithm combines the previously introduced algorithms
to identify mixing transactions and their related addresses in a given block, which is
the final goal of this research.
65
Figure 3.8: A full mixing transaction chain concluded by the third algorithm
In figure 3.8, we provided a schematic of the final result of algorithm 4 as all
mixing service transactions we detect.
3.10 Evaluation
Assessing the efficacy of a detection algorithm for deanonymization tasks has consis-
tently posed a significant challenge, often equalling, if not surpassing, the complexity
involved in proposing the actual detection algorithm and framework. This issue is evi-
dent in notable deanonymization research at the network level, which aims to identify
the IP addresses of real identities concealed behind Bitcoin addresses and clusters.
Biryukov et al. often encountered issues due to an absence of data corresponding
to clusters external to their own. Consequently, they produced and tested their al-
66
Table 3.2: Method evaluation using generated dataset during time
Total TXs
Examined
#Known Non-mixing
TXs #Known Mixing TXs #Known Mixing TXs
Identified
#Known Non-mixing TXs
Identified
#Total TXs
Identified
Experiment #1 104449 45 15 (5 each) 15 0 9324
Experiment #2 98273 45 15 (5 each) 15 0 9057
Experiment #3 112832 45 15 (5 each) 15 0 12448
Total 315554 135 45 45 0 30829
gorithm using a self-generated dataset. However, the team candidly acknowledged
the limitations of the anonymity metric utilized, which failed to consider transactions
from clusters not including their own. Biryukov et al., therefore, proposed that fu-
ture research should aim to develop a more encompassing anonymity metric, one that
more accurately mirrors the realities of extensive networks [35, 36].
Our research, focused on the detection of mixing services, confronts similar chal-
lenges due to its deanonymization nature. Given that the primary objective of this
investigation is to identify money laundering activities and consequently unveil illicit
actions, there is a distinct absence of readily available and labeled datasets for mixing
services. In order to evaluate the algorithm’s performance, we found it necessary to
execute additional transactions, specifically aiming to identify those that were not
implicated in the pattern recognition phase. To further ensure the validity of our
model, we conducted a series of random non-mixing transactions, affirming that the
model does not falsely label all transactions, irrespective of their mixing status. This
approach also ensured the efficient performance of our two-phase detection filtering
system.
Table3.2 illustrates the performance of our algorithm. Notably, transactions iden-
tified as using mixing services constitute 9.7% of all transactions. Furthermore, all
recognized mixing transactions were accurately identified by our algorithm, highlight-
ing the effectiveness of our approach. The trends within the identified transactions
can be contextualized by examining the data provided in Chainanalysis’ crime re-
ports for 2022 and 2023. As indicated in these reports, illicit activities in 2020 were
fewer than those in 2019, but experienced an uptick in 2021[30, 31]. Additionally,
the Novetta white paper disclosed a rising trend in the usage of mixing services [82].
This trend is also discernible in the data represented in table3.2.
67
Chapter 4
Bitcoin Illicit Transactions
Identification
In this chapter, we explore one of the most reliable datasets publicly available, devel-
oped through a collaborative effort by IBM, MIT, and Elliptic Co. This exploration
is aimed at understanding how we can effectively classify illicit transactions using var-
ious methodologies. The dataset provides a comprehensive framework for evaluating
the efficacy of different transaction classification techniques and offers a robust basis
for comparative analysis across multiple models.
4.1 The Elliptic Data Set
Elliptic, a leading cryptocurrency intelligence company, is committed to the mission
of curbing illicit financial activities within the cryptocurrency ecosystem. As part of
its initiative to empower research and anti-money laundering (AML) efforts globally,
Elliptic has released The Elliptic Data Set. This dataset is a graph network of Bitcoin
transactions, distinguished by its richly handcrafted features and extensive labeling,
and is presented as the largest publicly available labeled transaction dataset in the
cryptocurrency domain.
4.1.1 Graph Construction
The Elliptic Data Set has been meticulously constructed to map Bitcoin transac-
tions onto real entities, distinguishing between licit (e.g., exchanges, wallet providers,
miners, legitimate services) and illicit categories (e.g., scams, malware, terrorist orga-
68
Figure 4.1: TSNE Visualization of Fraud and Non-Fraud Transactions Conveyed
from Their Features
nizations, ransomware, Ponzi schemes). This graph represents transactions as nodes
and the flow of Bitcoin currency from one transaction to another as directed edges.
A transaction is categorized as licit or illicit based on the nature of the entity that
controls the private keys for the transaction inputs.
Nodes and Edges
The dataset comprises 203,769 transaction nodes and 234,355 directed payment flows
(edges). Of these, approximately 2% (4,545 nodes) are labeled as illicit (class1),
and 21% (42,019 nodes) as licit (class2). The remaining transactions are unlabeled
concerning licit or illicit nature but include other descriptive features that are crucial
for analysis.
69
Features
Each node within the dataset is associated with 166 features. The first 94 features
provide local information about the transaction—details such as the transaction’s
time step, the number of inputs and outputs, transaction fee, and output volume,
as well as aggregated metrics like average Bitcoin received or spent and the average
number of transactions linked to the inputs or outputs. The next 72 features are
aggregated features, designed by analyzing one-hop backward and forward from
the node. These features capture the maximum, minimum, standard deviation, and
correlation coefficients of transactions within the node’s neighborhood, providing a
deep, contextual understanding of each transaction. Figure 4.1 illustrates the TSNE
of the illicit and licit nodes in red and green, respectively. As it is visible before any
transformation, they look similar in terms of the features.
Temporal Information
Each node is time-stamped, which approximates the confirmation time of the trans-
action by the Bitcoin network. The dataset is organized into 49 distinct time steps,
with each step representing a connected component of transactions that appeared on
the blockchain within a three-hour window. This temporal element allows for the
analysis of transaction patterns over time and can be crucial for identifying emerging
trends in illicit financial activities. Figure 4.2 illustrates the temporal dispersion of
transaction points across different labeling groups within the dataset.
4.1.2 Notes on Feature Construction
The construction of features in The Elliptic Data Set was driven by a heuristic-based
approach, where behaviors like the number of inputs or the reuse of addresses could
indicate potential licit or illicit activities. For example, transactions associated with
legal entities might show patterns of consolidating funds from multiple addresses to
reduce transaction fees, whereas illicit activities might prefer transactions with fewer
inputs to minimize the risk of deanonymization through address clustering.
Challenges in feature construction included managing the vast size of the Bitcoin
blockchain and the diversity in transaction behaviors. Elliptic addressed these chal-
lenges using a high-performance, all-in-memory graph computation engine to process
and extract features efficiently, ensuring comprehensive coverage of the blockchain’s
70
Figure 4.2: Transaction Frequency in Elliptic Categorized by Label over each Time
Step
71
complex structure.
4.1.3 Research and Practical Implications
The availability of The Elliptic Data Set opens new avenues for both academic re-
search and practical applications in the fight against money laundering and other
financial crimes in cryptocurrencies. By providing a rich, well-labeled, and detailed
dataset, Elliptic not only supports the development of advanced machine learning
models, particularly in graph analytics and anomaly detection but also enhances the
capabilities of financial institutions and regulators to monitor and act against illicit
activities effectively.
4.2 Fraud Activity Classification
At this stage, our objective is to develop robust classification models to distinguish
between fraudulent and non-fraudulent transactions. Initially, we will implement
classical machine learning methods to establish a baseline performance. Subsequently,
we will leverage the topological information inherent in the transaction network by
incorporating the graph structure and connections between transactions. This will
involve employing Graph Neural Networks (GNNs) to exploit both the transactional
features and the intricate graph patterns, enhancing our ability to detect fraudulent
activities.
4.2.1 Autoencoders for Feature Extraction
Autoencoders are a type of artificial neural network used for unsupervised learning of
efficient codings. The aim of an autoencoder is to learn a representation (encoding) for
a set of data, typically for the purpose of dimensionality reduction. An autoencoder
consists of two main parts: an encoder and a decoder.
The encoder compresses the input into a latent-space representation, which is a
lower-dimensional space compared to the input. Mathematically, the encoder can be
represented as:
h=f(x) = σ(W x +b),(4.1)
where xis the input, his the hidden representation, Wand bare the weights and
biases of the encoder, respectively, and σis an activation function, such as ReLU or
72
tanh.
The decoder aims to reconstruct the input from the hidden representation. It can
be described as:
x=g(h) = σ(Wh+b),(4.2)
where xis the reconstructed input, and Wand bare the weights and biases of the
decoder.
The autoencoder is trained to minimize the reconstruction error, which is typically
measured by the mean squared error (MSE) between the input xand the reconstructed
input x:
L(x, x) = 1
n
n
X
i=1
(xix
i)2.(4.3)
In this study, an autoencoder was trained exclusively on non-fraudulent transac-
tions. The architecture of the autoencoder used in this work is detailed in Table 4.1.
Layer Type Output Shape Activation Function
Input Layer (166,) -
Dense (Encoder) (100,) Tanh
Dense (Encoder) (50,) ReLU
Dense (Decoder) (50,) Tanh
Dense (Decoder) (100,) Tanh
Output Layer (166,) ReLU
Table 4.1: Architecture of the autoencoder used for feature extraction.
4.2.2 Classical Machine Learning Classifiers
After extracting the latent representations using the autoencoder, several classical
machine learning classifiers were employed to distinguish between fraudulent and
non-fraudulent transactions. The classifiers used include Logistic Regression, Decision
Trees, Random Forest, and Support Vector Classification (SVC). Below is a detailed
description of each classifier:
Logistic Regression Classifier
Logistic regression is a statistical model that in its basic form uses a logistic function
to model a binary dependent variable. In regression analysis, logistic regression (or
73
Figure 4.3: TSNE plot showing fraudulent and non-fraudulent transactions post
Autoencoder latent space transformation
74
logit regression) is estimating the parameters of a logistic model (a form of binary
regression). Mathematically, the logistic function is represented as:
P(y= 1|x) = 1
1 + e(β0+β1x1+β2x2+...+βnxn),(4.4)
where P(y= 1|x) is the probability of the dependent variable ybeing 1 given the
independent variable x, and β0, β1, . . . , βnare the parameters of the model.
The logistic regression model is fitted by maximizing the likelihood function:
L(β) =
n
Y
i=1
P(yi|xi)yi(1 P(yi|xi))1yi.(4.5)
Decision Tree Classifier
Decision Trees are a non-parametric supervised learning method used for classifica-
tion. The model predicts the value of a target variable by learning simple decision
rules inferred from the data features. The decision tree structure includes:
Root Node: The top node that represents the entire dataset.
Internal Nodes: Nodes that represent a ”test” on an attribute.
Branches: Outcomes of the tests.
Leaf Nodes: Nodes that represent class labels.
The model recursively splits the data into subsets based on the value of the at-
tribute that results in the most significant information gain. The information gain is
calculated using entropy or Gini impurity.
Random Forest Classifier
Random Forest is an ensemble learning method that constructs multiple decision trees
during training and outputs the mode of the classes for classification. The algorithm
operates as follows:
1. Bootstrap Sampling: Randomly select subsets of the training data.
2. Tree Construction: Construct a decision tree for each subset.
75
3. Voting: Aggregate the predictions of individual trees by voting.
Random Forest improves predictive accuracy and robustness by averaging the
results of an ensemble of decision trees, reducing overfitting and variance.
Support Vector Classification (SVC)
Support Vector Classification (SVC) is a supervised learning model that uses clas-
sification algorithms for two-group classification problems. The algorithm works by
finding the hyperplane that best separates the classes in the feature space. The key
concepts include:
Support Vectors: Data points that are closest to the hyperplane and influence
its position and orientation.
Hyperplane: A decision boundary that separates the classes.
Margin: The distance between the hyperplane and the nearest data points
from either class.
SVC aims to maximize the margin, creating a robust decision boundary that
generalizes well to unseen data.
4.3 Model Training and Evaluation
In this section, we present the training process and evaluation of classical machine
learning classifiers—Logistic Regression, Decision Tree, Random Forest, Support Vec-
tor Classifier (SVC), and Convolutional Neural Network (CNN)—applied to the Ellip-
tic dataset for classifying licit and illicit Bitcoin transactions. Given the imbalanced
nature of the dataset, with a higher number of licit transactions, special attention
was given to evaluation metrics beyond accuracy, including precision, recall, F1-score,
ROC AUC, and Precision-Recall AUC.
4.3.1 Training Process
1. Data Preprocessing: - The dataset was first cleaned, with features containing
more than 35% missing values being removed. - Remaining missing values were
imputed using the mean for continuous features and the mode for categorical features.
76
- The dataset was then split into training (80%) and testing (20%) sets, ensuring that
the split preserved the original distribution of the classes.
2. Feature Scaling: - Features were normalized using Min-Max scaling to bring
all values into the range [0, 1].
3. Class Balancing: - To address class imbalance, Synthetic Minority Over-
sampling Technique (SMOTE) was applied to the training data to generate synthetic
samples for the minority class (illicit transactions).
4. Model Training: - Each classifier was trained using the preprocessed and
balanced training data. Hyperparameters were tuned using grid search and cross-
validation to optimize model performance.
4.3.2 Model Evaluation
Logistic Regression
Logistic Regression, a baseline linear model, was evaluated for its simplicity and
interpretability. Despite its ease of use, it struggled with class imbalance, showing
high recall for licit transactions but poor recall for illicit ones.
Metric Precision Recall F1-Score Support
Illicit (Class 0) 0.65 0.30 0.41 1015
Licit (Class 1) 0.87 0.97 0.92 5052
Accuracy 0.86
Macro Avg 0.76 0.63 0.66 6067
Weighted Avg 0.84 0.86 0.83 6067
Table 4.2: Classification Report for Logistic Regression
Decision Tree
The Decision Tree model, while providing interpretability, showed limitations in han-
dling the complex decision boundaries required for distinguishing between licit and
illicit transactions.
Random Forest
The Random Forest classifier performed the best among the classical machine learning
models, showing high accuracy and balanced performance across metrics.
77
Metric Precision Recall F1-Score Support
Illicit (Class 0) 0.54 0.54 0.54 1015
Licit (Class 1) 0.91 0.91 0.91 5052
Accuracy 0.85
Macro Avg 0.72 0.72 0.72 6067
Weighted Avg 0.85 0.85 0.85 6067
Table 4.3: Classification Report for Decision Tree
Metric Precision Recall F1-Score Support
Illicit (Class 0) 0.75 0.45 0.57 1015
Licit (Class 1) 0.90 0.97 0.93 5052
Accuracy 0.88
Macro Avg 0.82 0.71 0.75 6067
Weighted Avg 0.87 0.88 0.87 6067
Table 4.4: Classification Report for Random Forest
Support Vector Classifier (SVC)
SVC achieved high recall for licit transactions but was unable to effectively classify
illicit transactions, leading to high false negative rates.
Metric Precision Recall F1-Score Support
Illicit (Class 0) 0.78 0.17 0.28 1015
Licit (Class 1) 0.86 0.99 0.92 5052
Accuracy 0.85
Macro Avg 0.82 0.58 0.60 6067
Weighted Avg 0.84 0.85 0.81 6067
Table 4.5: Classification Report for Support Vector Classifier
Convolutional Neural Network (CNN)
The CNN model showed competitive performance, with high accuracy and excellent
precision-recall AUC, indicating robust performance in handling imbalanced data.
4.3.3 Interpretation of Results
The confusion matrices of the models provide further insights into the performance
of each classifier. High values in the bottom right quadrant indicate a high number
78
Metric Precision Recall F1-Score Support
Illicit (Class 0) 0.66 0.39 0.49 1015
Licit (Class 1) 0.88 0.96 0.92 5052
Accuracy 0.86
Macro Avg 0.77 0.67 0.70 6067
Weighted Avg 0.85 0.86 0.85 6067
Table 4.6: Classification Report for CNN
of true positives (licit transactions correctly identified), while high values in the top
left quadrant indicate a high number of true negatives (illicit transactions correctly
identified).
Figure 4.4: Confusion Matrices for Various Classifiers
Logistic Regression
The logistic regression confusion matrix shows a high number of true positives and a
significant number of false negatives for the illicit class. This indicates that while the
79
model is good at identifying licit transactions, it struggles with detecting illicit ones
due to the imbalance.
Decision Tree
The decision tree confusion matrix displays a better balance between true positives
and true negatives, but still suffers from a considerable number of false negatives,
indicating that the model can miss many illicit transactions.
Random Forest
The random forest confusion matrix highlights a strong performance with a higher
number of true positives and a reduced number of false negatives compared to other
models, making it more reliable for detecting both licit and illicit transactions.
Support Vector Classifier
The SVC confusion matrix reveals a high number of false negatives for illicit transac-
tions, suggesting that the model predominantly predicts licit transactions due to the
class imbalance, resulting in many illicit transactions going undetected.
Convolutional Neural Network
The CNN confusion matrix in figure 4.5 shows a competitive performance with a
relatively high number of true positives and a moderate number of false negatives
for illicit transactions, indicating better handling of the class imbalance compared to
other models.
4.3.4 Comparative Analysis
The following table provides a summary of the evaluation metrics for all the classifiers:
Model Accuracy Prec. (0) Recall (0) F1 (0) Prec. (1) Recall (1) F1 (1)
Logistic Regression 0.86 0.65 0.30 0.41 0.87 0.97 0.92
Decision Tree 0.85 0.54 0.54 0.54 0.91 0.91 0.91
Random Forest 0.88 0.75 0.45 0.57 0.90 0.97 0.93
SVC 0.85 0.78 0.17 0.28 0.86 0.99 0.92
CNN 0.86 0.66 0.39 0.49 0.88 0.96 0.92
Table 4.7: Comparative Analysis of Classification Models
80
Figure 4.5: Confusion Matrices for CNN
Figure 4.6: ROC and Precision-Recall Curves for Various Classifiers
4.4 Insights and Interpretations
From the comparative analysis, it is evident that the Random Forest classifier per-
forms the best among the classical machine learning models, closely followed by the
CNN. Despite their strong performance in terms of accuracy and precision for licit
81
transactions, all models suffer from a significant number of false negatives for illicit
transactions. This issue stems from the class imbalance in the dataset, where the
majority class (licit transactions) overwhelms the minority class (illicit transactions).
Logistic Regression performs well in identifying licit transactions but falls short
in detecting illicit transactions due to the dataset’s imbalance. Its high false negative
rate for illicit transactions indicates a need for improved handling of class imbalance.
The Decision Tree model provides a more balanced performance between precision and
recall but still struggles with high false negatives for the illicit class. This limitation
reduces its applicability in real-world scenarios where detecting illicit transactions is
critical.
The Random Forest classifier stands out with high accuracy, precision, and recall
for both classes. Its ability to reduce false negatives for illicit transactions makes
it a more reliable choice for practical applications. However, it still falls short of
ideal performance, highlighting the need for further improvements. The Support
Vector Classifier (SVC) shows high precision and recall for licit transactions but has
a significant drawback in detecting illicit transactions, resulting in a high number of
false negatives. This makes it less suitable for applications requiring high sensitivity
to illicit activities.
The CNN model demonstrates competitive performance, handling class imbalance
better than other models. Its relatively high recall for illicit transactions makes it a
strong candidate for detecting fraudulent activities in Bitcoin transactions. However,
it is still not immune to the challenges posed by class imbalance, and like the Random
Forest, it shows that even the best-performing models need enhancement to effectively
detect illicit transactions.
Analyzing the ROC and Precision-Recall curves in Figure 4.6 further reinforces
these findings. The ROC curve shows that the Random Forest and CNN have the
highest area under the curve (AUC), indicating better overall performance in dis-
tinguishing between licit and illicit transactions. The Precision-Recall curve, which
is more informative for imbalanced datasets, shows that Random Forest and CNN
maintain higher precision and recall across different thresholds compared to other
models. This demonstrates their robustness in handling the imbalanced nature of the
dataset.
Despite these strengths, the high number of false negatives for illicit transactions
in all models reveals a critical weakness. This indicates that while the models are
effective at identifying licit transactions, they are less capable of detecting illicit ac-
82
tivities, which are the primary target for fraud detection systems. Therefore, future
work should focus on developing more sophisticated models and techniques, such as
Graph Neural Networks (GNNs), that can better leverage the topological and tem-
poral information inherent in the transaction network to improve the identification
of illicit activities.
In conclusion, while classical machine learning techniques provide a strong base-
line, they fall short in effectively classifying illicit Bitcoin transactions due to the
imbalanced nature of the dataset. Advanced methods are required to achieve more
reliable and accurate fraud detection in cryptocurrency transactions, ensuring that
illicit activities are detected and mitigated more effectively.
4.5 Graph Neural Network Based Approaches
Graph Neural Networks (GNNs) are a class of neural networks designed to perform
inference on data structured as graphs. Unlike traditional neural networks, which
operate on fixed-size input arrays, GNNs are capable of capturing the dependencies
and interactions among the entities represented as nodes and edges in a graph. This
capability makes GNNs particularly suitable for applications such as social network
analysis, molecular chemistry, and financial transaction networks.
The fundamental concept behind GNNs is the aggregation of node features from
local neighborhoods. During training, each node aggregates information from its
neighbors, allowing the network to learn hierarchical representations that encode
both the local structure and the global context of the graph.
4.5.1 Graph Convolutional Networks (GCNs)
Graph Convolutional Networks (GCNs) are a type of GNN that extends the concept
of convolution from grid-like data structures, such as images, to graph-structured
data. The core idea of GCNs is to generalize the convolution operation to graph
domains, enabling the network to aggregate and transform feature information from
a node’s local neighborhood.
The propagation rule of a GCN layer can be mathematically expressed as:
H(l+1) =σ˜
D1
2˜
A˜
D1
2H(l)W(l),(4.6)
83
where:
H(l)is the matrix of activations at layer l,
˜
A=A+Iis the adjacency matrix with added self-loops,
˜
Dis the degree matrix of ˜
A,
W(l)is the trainable weight matrix of layer l,
σis a non-linear activation function such as ReLU.
The multiplication by ˜
D1
2˜
A˜
D1
2serves to normalize the adjacency matrix, en-
suring that the feature aggregation accounts for the varying degrees of nodes.
4.5.2 GCN Implementation for Bitcoin Transaction Classifi-
cation
In this study, we applied GCNs to the problem of classifying Bitcoin transactions as
either licit or illicit. To do so, we constructed a graph where each node represents a
transaction and each directed edge represents the flow of Bitcoin from one transaction
to another. We enhanced the feature set by including in-degree and out-degree for
each node to retain information about the connectivity of transactions, especially
after the removal of unlabeled nodes. Also, we have used a community detection
algorithm
4.5.3 Community Detection Using the Louvain Algorithm
Community detection in networks is a fundamental technique used to discover the
modular structure of various systems, which is essential for understanding and ana-
lyzing the relationships within large datasets. One of the most efficient and popular
methods for community detection is the Louvain algorithm, developed by Blondel et
al[40]. This algorithm is particularly noted for its ability to handle large networks
with speed and accuracy.
The Louvain algorithm operates on the principle of optimizing modularity, a scalar
value between -1 and 1 that measures the density of links inside communities as
compared to links between communities. The algorithm proceeds in two distinct
phases that are repeated iteratively:
84
1. Local optimization: Each node in the network is initially considered as its
own community. The algorithm then iteratively reassigns nodes to different
communities, each time choosing the reassignment that results in the highest
gain in modularity. This local optimization is performed greedily, meaning
nodes are moved to the community that provides the most significant increase
in modularity until no further improvement can be made.
2. Aggregation: After achieving a local maximum in modularity, an aggregated
network is created. In this new network, nodes are the communities found dur-
ing the local optimization phase. Links between these new nodes are weighted
by the sum of the weights of the links between nodes in the corresponding
two original communities. This aggregation reduces the size of the network,
simplifying further iterations of the algorithm.
This iterative process continues until the modularity is maximized, indicating a
stable community structure. In this study, we applied the Louvain algorithm to
the Elliptic dataset, which contains features derived from Bitcoin transactions. By
integrating the community structure as an additional feature, we aimed to enrich the
dataset, providing deeper insights into transaction patterns and potentially enhancing
the predictive performance of subsequent analyses. The inclusion of the community
component is hypothesized to capture collective behaviors of transactions that are
not apparent when analyzing transactions in isolation. However, a significant part of
these data might be visible through connections as our model is GNN and considers
connections and topology.
Feature Construction Methodology
The features for each node were derived from the original dataset and augmented
with degree information:
Original Features: The first 166 features include transaction-specific details
and aggregated metrics from one-hop neighborhoods.
Degree Features: Two additional features were added: in-degree and out-
degree, representing the number of incoming and outgoing edges for each trans-
action node.
85
Community Component: Following the steps outlined in Section 4.5.3, we
executed the community detection algorithm and incorporated the resulting
community components as an additional feature.
This augmentation was necessary to preserve important structural information
that could be lost due to the removal of unlabeled transactions, which comprise 77%
of the dataset.
GCN Model Architecture
The GCN model presented in the table 4.8 utilizes a sequence of layers tailored to
process graph data by focusing on extracting localized node features and effectively
classifying nodes based on their structural and attribute-related context. Initially,
a GCN convolutional layer (GCNConv) takes input node features and transforms
them into a reduced dimensionality while employing the ReLU activation function to
introduce non-linearity, assisting in the capture of complex patterns. This is followed
by a batch normalization layer, which normalizes the layer outputs to facilitate stable
training dynamics and mitigate the internal covariate shift.
Subsequent to this, a dropout layer is introduced to minimize overfitting by ran-
domly deactivating some of the outputs during training, ensuring the model does not
become overly dependent on any particular node’s feature. Additional GCN convolu-
tional layers continue to refine the node features, each paired with batch normalization
to ensure continued effective learning rates and stability across training epochs. These
layers gradually refine the features, culminating in a final GCN layer that diminishes
the feature space down to the number of target classes. A linear residual connection
from the initial input to the final output layer enhances the model’s capability to
preserve important initial information throughout the network and potentially mit-
igates the issue of vanishing gradients. The architecture concludes with a softmax
output layer, which translates the logits into a probability distribution, apt for classi-
fication purposes. This GCN setup is strategically designed to effectively leverage the
inherent graph structure and node features, leading to strong performance in node
classification scenarios.
Training and Evaluation
The GCN model was trained using the PyTorch Geometric library. The graph data,
including features and edge indices, was fed into the model. The training process
86
Table 4.8: GCN Architecture
Layer Type Input Features Output Features Activation
conv1 GCNConv 169 32 ReLU
bn1 BatchNorm 32 32 -
Dropout Dropout - - -
conv2 GCNConv 32 64 ReLU
bn2 BatchNorm 64 64 -
conv3 GCNConv 64 2 -
residual Linear 169 2 -
Output Softmax - - Log Softmax
involved minimizing the cross-entropy loss between the predicted and true labels of
the transactions. The model was evaluated on a held-out test set using accuracy,
precision, recall, F1-score, and ROC AUC score as metrics.
The performance of the GCN model is summarized in Table 4.9.
Metric Value
Accuracy 95.19%
Recall 79.39%
Precision 91.19%
F1 Score 84.05%
Sensitivity 99%
Specificity 60%
Table 4.9: Performance metrics of the GCN classifier.
The confusion matrix for the GCN model, illustrated in Figure 4.7, sheds light
on the classification performance. The model excels in accurately identifying 99% of
licit transactions. However, it correctly recognizes only 59.78% of illicit transactions,
leading to a misclassification rate of 40.22% as licit. This performance discrepancy
likely stems from the imbalanced dataset, containing significantly more licit examples
than illicit ones. In summary, while the GCN shows potential for this task, enhance-
ments are necessary, particularly in addressing the class imbalance and reducing false
negatives for illicit transactions.
4.5.4 Graph Attention Network (GAT)
Graph Attention Networks (GATs) are an advanced type of Graph Neural Networks
(GNNs) that incorporate attention mechanisms to assign different importances to
87
Figure 4.7: Confusion matrix of the GCN classifier.
nodes within a neighborhood. The primary advantage of GATs over traditional GCNs
is their ability to focus on the most relevant neighbors during the feature aggregation
process, which can improve the model’s performance on tasks involving complex graph
structures.
The attention mechanism in GATs allows the model to learn which neighbors are
more critical for each node’s representation. This is achieved by computing attention
coefficients for each pair of neighboring nodes and using these coefficients to weigh
the influence of each neighbor’s features.
GAT Architecture and Attention Mechanism
The core component of a GAT layer is the attention mechanism, which can be de-
scribed as follows:
eij = LeakyReLU aT·[W·hi||W·hj],(4.7)
where:
hiand hjare the feature vectors of nodes iand j,
Wis the weight matrix,
88
ais the attention vector,
|| denotes concatenation,
eij is the attention coefficient for edge (i, j).
The attention coefficients are then normalized using the softmax function:
αij =exp(eij )
Pk∈N (i)exp(eik),(4.8)
where N(i) is the set of neighbors of node i.
The node features are updated by computing a weighted sum of the neighbors’
features:
h
i=σ
X
j∈N (i)
αij ·W·hj
,(4.9)
where σis a non-linear activation function such as ELU.
GAT Implementation for Bitcoin Transaction Classification
Integrating attention strategies into graph neural networks, exemplified by GATs, can
greatly improve the model’s capacity to flexibly learn and emphasize features from
intricate input data. In contrast to GCNs that treat all neighboring nodes uniformly
during the feature aggregation step, GATs introduce a mechanism that allocates
different levels of importance to various nodes in a node’s vicinity. This attention-
based method is especially beneficial for managing heterogeneous and irregular graph
topologies, where node relationships are not consistent. For example, in domains
such as social networks, molecular configurations, or technological networks, where
node influence varies widely, GATs can dynamically assign weights to neighbor nodes,
thereby discerning more subtle patterns in the data.
Given these strengths, the decision to explore GATs in addition to GCNs in this
research is driven by the need for a more refined analysis that can leverage the variable
significance of node connections inherent in complex graphs. The adaptive nature of
GATs allows for a more precise and insightful understanding of node relationships,
which is crucial for the tasks at hand, such as node classification or link prediction in
sparse and complex graphs. This approach not only aids in achieving higher accuracy
but also provides deeper insights into the structural and feature-based importance of
nodes within the graph, potentially leading to more robust and interpretable models.
89
GAT Model Architecture
The GAT model outlined in the table 4.10 leverages a series of graph-based convo-
lutional layers to process features from nodes within a graph structure, emphasizing
important relationships through an attention mechanism. The model initiates with
a dropout layer to mitigate overfitting by randomly deactivating a portion of the
nodes during training, followed by the first GAT convolutional layer which expands
the input features into a higher-dimensional feature space, utilizing multiple attention
heads to capture diverse patterns within the graph data. This layer is followed by
normalization to ensure stability and faster convergence.
Subsequent layers include additional dropout and GAT convolutional stages to
further refine the feature representation, each followed by normalization to maintain
training stability. The model uses decreasing numbers of attention heads in deeper
layers, focusing the network’s capacity on salient features as it prepares for classifica-
tion. A linear residual connection is introduced parallel to the convolutional layers to
aid in gradient flow and preserve initial feature information, enhancing the model’s
ability to learn complex patterns without losing critical data. The architecture culmi-
nates in a softmax output layer which applies a log softmax function, transforming the
final features into normalized probabilities, ready for classification. This architecture
effectively addresses the challenges of graph-structured data by focusing on significant
node relationships and maintaining robustness through dropout and normalization.
Table 4.10: GAT Architecture
Layer Type Input Features Output Features Heads Activation/Dropout
Dropout1 Dropout - - - p=0.6 during training
conv1 GATConv 169 64 8 ELU, Dropout p=0.6
norm1 LayerNorm 64 64 - -
Dropout2 Dropout - - - p=0.6 during training
conv2 GATConv 64 64 4 ELU, Dropout p=0.6
norm2 LayerNorm 64 64 - -
conv3 GATConv 64 2 1 -
residual Linear 169 2 - -
Output Softmax - - - Log Softmax
Training and Evaluation
The GAT model was trained using the PyTorch Geometric library. The training
process involved minimizing the cross-entropy loss between the predicted and true
90
labels of the transactions. The model was evaluated on a held-out test set using
accuracy, precision, recall, F1-score, and ROC AUC score as metrics.
The performance of the GAT model is summarized in Table 4.11.
Metric Value
Accuracy 95.32%
Recall 81.19%
Precision 90.26%
F1 Score 84.99%
Sensitivity 99%
Specificity 64%
Table 4.11: Performance metrics of the GAT classifier.
Confusion Matrix
The confusion matrix for the GAT model, depicted in Figure 4.8, provides further
insight into the classification performance, particularly highlighting its high specificity
and sensitivity.
Figure 4.8: Confusion matrix of the GAT classifier.
The results demonstrate that the GAT model effectively leverages the structural
91
and feature information of the transaction network, achieving superior accuracy and
F1-score, making it an exceptionally powerful tool for detecting illicit Bitcoin trans-
actions.
4.5.5 Comparative Analysis of GCN and GAT Models
This section presents a comparative analysis of the Graph Convolutional Network
(GCN) and Graph Attention Network (GAT) models applied to the task of Bitcoin
transaction classification. Both models demonstrate robust performance, with the
GAT model showing marginal improvements in several key metrics.
Performance Metrics Comparison
Table 4.12 presents the primary performance metrics for both models.
Metric GCN GAT
Accuracy 95.19% 95.32%
Recall 79.39% 81.19%
Precision 91.19% 90.26%
F1 Score 84.05% 84.99%
Sensitivity 99% 99%
Specificity 60% 64%
Table 4.12: Performance comparison of GCN and GAT models
The GAT model demonstrates a slight edge in overall accuracy (95.32% vs 95.19%).
More notably, it shows superior recall (81.19% vs 79.39%) and specificity (64% vs
60%) compared to the GCN. This improved performance in detecting illicit transac-
tions is particularly significant given the imbalanced nature of the dataset.
Confusion Matrix Analysis
The confusion matrices provide deeper insights into each model’s classification per-
formance:
GCN: "541 364
84 8324#GAT: "576 329
107 8301#(4.10)
The GAT model correctly identified 576 illicit transactions compared to GCN’s
541, representing a 6.47% increase in the true positive rate for illicit transactions.
92
However, the GCN demonstrated marginally better performance in correctly classi-
fying licit transactions (8324 vs 8301).
Statistical Significance
To assess the statistical significance of the performance difference, we employed Mc-
Nemar’s test. The test statistic is calculated as:
χ2=(|bc| 1)2
b+c(4.11)
where band care the discordant predictions between the two models. In our case:
b= 364 + 107 = 471 (GCN correct, GAT incorrect)
c= 329 + 84 = 413 (GAT correct, GCN incorrect)
This yields χ2= 3.89, corresponding to a p-value of approximately 0.0485. This
result suggests that the difference in performance between GCN and GAT is statisti-
cally significant at the 0.05 level, although marginally.
Discussion
The GAT model’s improved performance, particularly in identifying illicit transac-
tions, can be attributed to its attention mechanism. This allows the model to assign
different weights to neighboring nodes, potentially capturing more nuanced patterns
in transaction networks. The trade-off between precision and recall is reflected in
the marginally higher F1 score of the GAT model (84.99% vs 84.05%), suggesting a
better overall balance.
While both models demonstrate strong performance, the GAT’s improvements
make it a promising approach for this classification task, especially when dealing with
imbalanced datasets in financial transaction analysis. Future work could explore the
integration of these models in ensemble methods or investigate the specific transaction
features that contribute most significantly to the GAT’s improved performance in
detecting illicit activities.
In what follows, we will explore further improvements in feature engineering,
model tuning, and the application of other advanced GNN architectures to enhance
classification accuracy and robustness.
93
Chapter 5
Information Unveiling of the
Elliptic Dataset and Enhanced
Techniques for Identifying Illicit
Activities
In this chapter, our goal is to improve identification efficiency by incorporating addi-
tional features into the model following the deanonymization of the Elliptic dataset.
5.1 Elliptic Dataset Deanonymization Attack
The Elliptic dataset consists of a transaction graph collected from the Bitcoin blockchain.
Each node in the graph represents a transaction, and each edge represents a flow of
Bitcoins between transactions. Every node has 166 features and is labeled as ei-
ther ”licit,” ”illicit,” or ”unknown.” The dataset contains 203,769 nodes and 234,355
edges. This chapter details the method used to deanonymize the dataset by esti-
mating the number of inputs and outputs for each transaction. The concept of this
deanonymization method was initially mentioned abstractly in a Russian blog post
[18].
5.1.1 Feature Correlation Analysis
Initially, we calculate the in-degree and out-degree of each transaction using the
transaction graph. We then compute the correlation between each of the 165 features
94
and the in-degree and out-degree. This analysis revealed significant correlations,
particularly with features feature 6,feature 8,feature 7, and feature 16. The
significant correlations are summarized in Table 5.1.
Table 5.1: Significant Correlations with In-degree and Out-degree
Feature Degree Type Correlation
feature 4 In-degree 0.376710
feature 6 In-degree 0.508025
feature 8 In-degree 0.529497
feature 7 Out-degree 0.636120
feature 16 Out-degree 0.641267
The figure below illustrates the most significant correlations of these features with
the in-degree and out-degree.
Figure 5.1: Significant Correlations with the In-degree and Out-degree Columns
Generated by Constructing the Graph
Additionally, there is a significant correlation between feature 4 and in degree.
Further analysis and the subsequent steps of the algorithm reveal that feature 6
95
represents the transaction fee. This correlation is logical since transactions with a
greater number of inputs tend to have higher transaction fees.
5.1.2 Normalization and Scaling
To transform the feature 6 values into in-degree counts, we utilize the smallest non-
zero difference between consecutive sorted feature 6 values, denoted as min diff feature 6.
The formula used for this transformation is:
inputs count feature 6 = feature 6 min(feature 6)
min diff feature 6 + 1
5.1.3 Derivation of min diff feature 6
The smallest non-zero difference, min diff feature 6, is calculated by sorting the feature 6
values and finding the differences between consecutive values. The minimum of these
differences is used for scaling:
Table 5.2: Derivation of min diff feature 6
feature 6 Values Differences
0.075038 -
0.150075 0.075037
0.225113 0.075038
0.300150 0.075037
min diff feature 6 0.075038
5.1.4 Linear Transformation
We also fit a linear regression model to derive coefficients for a more refined transfor-
mation. The regression model is:
inputs count feature 6 = α·feature 6 + β
where αand βare the coefficients obtained from the regression analysis. For our
dataset, these coefficients are approximately α= 13.33 and β= 2.67.
96
5.1.5 Application of the Transformation
Using the derived formula and coefficients, we can estimate the in-degree for each
transaction. For example:
For feature 6 = 0.150075
inputs count feature 6 = 0.150075 0.075038
0.075038 + 1 2
inputs count feature 6 = 13.33 ·0.150075 + 2.67 4.624
5.1.6 Formulas for Inputs and Outputs
Similar analysis for other features resulted in the following formulas:
inputs count feature 6 = 13.3266685112665 ·feature 6 + 2.62544842444139
inputs unique count feature 8 = 11.9243179897452·feature 8+2.34747189219164
outputs count feature 7 = 50.3777694891647 ·feature 7 + 4.21030186142152
outputs unique count feature 16 = 49.3957564403755·feature 16+4.121809499973
Here, inputs count feature 6 is the total number of inputs to the transaction,
inputs unique count feature 8 is the number of unique inputs to the transaction,
outputs count feature 7 is the number of outputs from the transaction, and
outputs unique count feature 16 is the number of unique outputs.
5.1.7 Consistency of Feature Mapping
The question may arise - why is inputs count mapped to feature 6 and not feature 8?
The answer is straightforward: always inputs count inputs unique count. By sub-
stituting the values of feature 6 and feature 8 from the dataset into the indicated
formulas, we can verify that the inequality:
13.3266685112665·feature 6+2.62544842444139 11.9243179897452·feature 8+2.34747189219164
is always satisfied, allowing us to correctly relate feature 6,feature 8, inputs count,
and inputs unique count. The same logic applies for outputs count and outputs unique count.
97
5.2 Deanonymization Algorithm
We developed an algorithm to deanonymize the dataset by estimating the number of
inputs and outputs for each transaction and revealing the transaction date.
Algorithm 5 Deanonymization of the Elliptic Dataset
Require: Dataset with features and graph information
Ensure: Deanonymized transactions
1: Calculate the in-degree and out-degree for each transaction
2: Identify the features feature 6 and feature 8 for in-degree and out-degree re-
spectively
3: Compute min diff feature 6 from the sorted feature 6 values
4: Apply the transformation:
inputs count feature 6 = feature 6 min(feature 6)
min diff feature 6 + 1
5: Fit a linear regression model to feature 6 and in-degree
STATE Apply the regression formula:
inputs count feature 6 = α·feature 6 + β
6: Repeat the steps for feature 7 and feature 8 to find outputs
7: Use the following formulas for inputs and outputs:
inputs count feature 6 = 13.33 ·feature 6 + 2.67
inputs unique count feature 8 = 11.92 ·feature 8 + 2.35
outputs count feature 7 = 50.38 ·feature 7 + 4.21
outputs unique count feature 16 = 49.40 ·feature 16 + 4.12
8: Estimate the transaction date using:
approx time = 1450468509.80488 + 1155672.19512195 ·elliptic time
9: Combine the results to deanonymize the transactions
98
5.3 Date Revelation
Once we have found the number of inputs and outputs, as well as the number of unique
inputs and unique outputs, we can find transactions where these combinations occur
only once in the entire history of Bitcoin (before the Elliptic dataset was published).
This allows us to find unique transactions and the approximate time for each elliptic
time value:
approx time = 1450468509.80488 + 1155672.19512195 ·elliptic time
Here, approx time is an approximate Unix time for each elliptic time value. Since
elliptic time varies from 1 to 49, the approx time date changes from 2016-01-01 to
2017-10-04, in increments of 1155672.19512195
3600×24 = 13.37584 days.
5.4 Additional Transaction Characteristics
We can also determine other transaction characteristics, such as fee, as well as the
total input (total in value) and output (total out value) volume in satoshis. The
following formulas are derived from correlation and linear regression:
fee feature 4 = 81341.4537626213 + 386323.710952989 ·feature 4
total out value feature 3 = 2742460603.92287 + 15853961614.9796 ·feature 3
The total input value is found from the equality:
total in value = total out value + fee
5.5 Results and Validation
The correlation between feature 6 and the in-degree is calculated, and a high cor-
relation coefficient indicates a strong linear relationship as illustrated above.
The transformation formulas and the deanonymization algorithm are statistically
justified by the high correlation between the features and the properties of the transac-
tion graph. The linear relationship confirmed by regression analysis and the consistent
scaling provided by min diff feature 6 effectively estimate the number of inputs and
99
outputs for each transaction. This method has successfully deanonymized 202,804 of
the transactions in the Elliptic dataset.
5.6 Enhanced Fraud Activity Identification on the
Deanonymized Elliptic Dataset
Following the deanonymization of the Elliptic dataset, we gained access to actual
features and transaction hashes for each transaction. This breakthrough allowed us
to implement and refine our algorithms for fraud activity identification on the Elliptic
dataset, leveraging the complete transaction information and labels now available.
5.6.1 Application of the Two-Phase Mixing Detection Algo-
rithm
We applied our two-phase mixing detection algorithm, previously introduced in Chap-
ter 3, to the deanonymized Elliptic dataset. This algorithm consists of two main
phases:
1. Candidate Transaction Identification (Algorithm 1)
2. Chain Analysis and Filtering (Algorithm 4)
5.6.2 First Phase: Candidate Transaction Identification
In the first phase, we employed Algorithm 1 to identify candidate transactions that
exhibit specific patterns indicative of potential mixing activities. This initial screening
resulted in the identification of 25,703 candidate transactions out of the total 203,769
transactions in the dataset, representing approximately 12.6% of all transactions.
5.6.3 Second Phase: Chain Analysis and Feature Extraction
For the second phase, we performed a detailed chain analysis on all 25,703 candidate
transactions identified in the first phase. This analysis involved examining the peeling
chain associated with each transaction, as illustrated in Figure 2.3. We extracted the
following features from each peeling chain:
100
1. Median Time (minutes): The median time between consecutive transaction
pairs in a peeling chain.
2. Backward Chain Length: The length of the peeling chain from the sweeper
transaction to the candidate transaction.
3. Backward Anomalies: The number of anomalies observed during backward
traversal (up to two tolerated).
4. Backward Patterns: The combination of P2SH and P2KH address patterns
observed in backward traversal.
5. Forward Chain Length: The length of the peeling chain from the candidate
transaction to either the end of the pattern or the next sweeper transaction.
6. Forward Anomalies: The number of anomalies observed during forward
traversal (up to two tolerated).
7. Forward Patterns: The frequency of each pattern (combinations of P2SH to
P2KH addresses and vice versa) observed in forward traversal.
8. Total Chain Length: The overall length of the entire chain.
9. Detection Status: A binary indicator of whether the algorithm identified the
candidate transaction as a mixing transaction after the second phase.
10. Sweeper Transaction ID: The hash of the identified sweeper transaction.
11. Sweeper Status: A binary indicator of whether the starting transaction of the
chain was detected as a sweeper.
12. Sweeper Input Count: The number of inputs in the first transaction of the
chain (potential sweeper).
13. Sweeper P2SH Output Count: The number of P2SH outputs in the first
transaction of the chain.
14. Non-P2SH Output Count: The number of non-P2SH outputs in the first
transaction of the chain.
The resultant data set that contains the peeling chains and their extracted features
was made publicly accessible through this link1.
1https://drive.google.com/drive/folders/1D9gu4NId8CIEZM0of7kaGZetCIwyhaN7
101
5.7 Integration of Chain Features into Graph Neu-
ral Networks
We incorporated these extracted chain features as additional inputs to our GAT and
GCN models. This integration aimed to enhance the models’ performance, particu-
larly in reducing false negatives, which is crucial given the extreme class imbalance
in the dataset.
5.7.1 Evaluation of Extracted Transactions
In the initial phase of our analysis, we aim to evaluate the effectiveness of our classifier
on the 25,703 transactions identified by our statistical-based algorithm, detailed in
Algorithm 4. These transactions were initially extracted through an unsupervised
method that did not utilize the labels provided by the Elliptic dataset. This approach
allows us to assess the intrinsic patterns within the data without prior knowledge of
their classifications.
Given the labels available from the Elliptic dataset, we now seek to determine
whether integrating these labels with the additional information derived from chain
features and the 18 novel features extracted through our chain peeling technique
can enhance the accuracy of our models. The central question we address is: Can
the incorporation of supervised learning elements into our previously unsupervised
framework lead to more precise classification of these transactions?
Training and Results
Following the training of models with enhanced features exclusively on selected trans-
actions, the results were exceptional. As depicted by the confusion matrices in Figure
5.2, the models demonstrated a robust capability to differentiate between illicit and
licit transactions, achieving classification accuracies of 97.62% for illicit and 97.06%
for licit transactions. These results lead to several important insights:
Despite the Elliptic dataset encompassing various scam types within its illicit
classification, our mixing detection algorithm has confirmed its efficacy, as evi-
denced by the high accuracy rates observed.
The success of the deanonymization attack on the Elliptic dataset is substanti-
ated by the alignment of our mixing detection algorithm’s results with those of
102
the dataset, indicating robust performance.
These findings not only validate the effectiveness of our model in detecting and cat-
egorizing complex transaction patterns but also suggest potential for application in
real-world financial monitoring systems to enhance security measures against illicit
activities. However, while promising, these results are specific to the conditions and
constraints of the Elliptic dataset. Further investigations with diverse datasets are
required to evaluate the generalizability of the models.
Further evaluation metrics are detailed in Table 5.3. Both GCN and GAT models
exhibited near-perfect performance metrics, reinforcing the effectiveness of the feature
enhancement and the selected training subset. The corresponding confusion matrices
are presented in Figure 5.2.
Table 5.3: Performance Metrics of GCN and GAT Models Trained and Evaluated on
the Enhanced Model and the Mixing Transactions Filtered Dataset
Metric GCN GAT
Accuracy 96.71% 97.37%
Recall 96.60% 97.34%
Precision 96.74% 97.34%
F1 Score 96.67% 97.34%
Sensitivity 96% 97%
Specificity 98% 98%
Figure 5.2: Confusion matrices for GCN and GAT models trained and evaluated on
the Enhanced Model and the Mixing Transactions Filtered Dataset. Left: GCN
model; Right: GAT model.
103
5.8 Assessment of Enhanced Dataset Across En-
tire Transaction Set
The previous analyses have demonstrated the efficacy of the mixing detection algo-
rithm applied to the Elliptic dataset. To further assess the impact of feature en-
hancement, we explored the integration of peeling chain features into the dataset.
Specifically, we augmented transactional data with these features, assigning zero val-
ues to transactions not associated with a peeling chain. This approach is predicated
on the hypothesis that such features can provide additional discriminative power to
the models, particularly in distinguishing nuanced patterns of illicit activities.
The results of incorporating these chain-based features were markedly positive,
significantly boosting the performance metrics of our models. Figure 5.3 displays the
confusion matrices for both the GCN and GAT models post-enhancement, evidenc-
ing the improved classification accuracy. Additionally, Table 5.4 provides a detailed
comparative analysis of the performance metrics, further substantiating the benefits
of feature enhancement on the overall model efficacy.
Figure 5.3: Confusion Matrix of GCN and GAT Models Trained and Evaluated on
the Enhanced and the All the Dataset Transactions. Left: GCN model; Right: GAT
model.
Prior to the feature augmentation phase, the GAT model was distinguished as the
top performer among the algorithms evaluated. This is evidenced by its performance
metrics detailed in Table 4.11 and visualized in Figure 4.8, where the GAT achieved
an F1 score of 84.99%. However, following the incorporation of peeling chain features,
a marked improvement in the model’s performance was observed. The F1 score of
104
Table 5.4: Performance Metrics of GCN and GAT Models Trained and Evaluated on
the Enhanced and the All the Dataset Transactions
Metric GCN GAT
Test Loss 0.1466 0.1248
Accuracy 95.81% 96.08%
Recall 83.43% 85.60%
Precision 91.28% 90.99%
F1 Score 86.83% 88.05%
Sensitivity 99% 99%
Specificity 68% 73%
the GAT increased to 88.05%, and the model’s ability to correctly identify illicit
transactions rose from 63.65% to 72.60%.
These enhancements in accuracy and F1 score, alongside a significant reduction in
false negatives, underscore the efficacy of integrating chain analysis features into our
graph-based models. Such modifications enable a more refined and accurate detection
of potentially fraudulent transactions, proving particularly valuable in the challenging
conditions posed by a highly imbalanced dataset.
5.9 Model Interpretability and Explainability
To bridge the gap between model performance and understandability, we aim to
enhance the efficacy and trustworthiness of fraud detection systems in the cryptocur-
rency domain. Our primary objectives in this section are:
1. To quantify the impact of individual features on fraud classification.
2. To interpret the influence of graph structures on model decisions.
5.9.1 Shapley Values
The concept of Shapley values, rooted in cooperative game theory, provides a frame-
work for fair feature attribution in complex models. For a model fand a set of
features F, the Shapley value ϕifor feature iis defined as:
ϕi=X
SF\{i}
|S|!(|F|−|S| 1)!
|F|![fS∪{i}(xS∪{i})fS(xS)],(5.1)
105
where Srepresents all possible subsets of features excluding i, and fS(xS) denotes
the model’s prediction using only the features in S.
5.9.2 SHapley Additive exPlanation (SHAP) Values
SHAP values, introduced by Lundberg and Lee [73], extend the concept of Shapley
values to provide a unified measure of feature importance. SHAP values possess
several desirable properties, including local accuracy, which ensures that the sum
of the SHAP values equals the prediction for a specific instance. They also uphold
the property of missingness, meaning that features not present in the model have no
impact. Additionally, SHAP values maintain consistency, guaranteeing that if a model
changes such that it relies more on a feature, the SHAP value for that feature will
not decrease. These properties ensure that SHAP values provide a fair and consistent
attribution of feature importance across different model types and datasets.
5.9.3 Feature Importance Analysis using SHAP
To quantify the impact of individual features on our model’s output, we employed
SHAP values. Our methodology consisted of the following steps:
1. Training a CNN on Individual Features: We trained a CNN on individual
features, deliberately excluding the graph topology. This isolation allowed us
to determine the intrinsic importance of each feature without the influence of
graph structure.
2. Utilizing TreeExplainer for Efficient Computation: To efficiently com-
pute the impact of nodes and edges on the model output, we employed the
TreeExplainer implementation of SHAP. This tool enabled us to handle the
complexities associated with the hierarchical structure of the data.
3. Aggregating SHAP Values for Global Importance: We aggregated SHAP
values across the entire test set to derive a comprehensive measure of global
feature importance. This aggregation provided insights into both individual
feature importance and the significance of graph-structural elements.
106
5.9.4 Feature Importance Analysis using GNNShap
To quantify the impact of individual features on our model’s output, we employed
SHAP values. Specifically, we used the GNNShap method as described by [25]. The
methodology is outlined as follows:
Overview of GNNShap
GNNShap is a Shapley value-based explanation method for GNNs. It aims to explain
the prediction for a target node by distributing the prediction score among various
players, which can include node features, neighboring nodes, and edges. Formally,
given a computational graph τ2(E) for a node E, the prediction ˆyfor Eis decomposed
among players. These players could be node features, neighbor nodes, or edges,
represented as ˆy=f(τ2(E), X2(E)) = Pn
i=0 ϕi, where nis the number of players, and
ϕiis the Shapley value for player i. In this context, we focus on identifying the edges
that are significant for the target node’s prediction.
The main steps in the GNNShap algorithm are as follows:
1. Pruning the Computational Graph: For a GNN with Llayers, we first
obtain the computational graph and prune redundant edges to simplify the
graph structure. This step ensures that only relevant edges are considered for
further analysis.
2. Coalition Sampling: We sample various subgraphs (coalitions) from the
pruned computational graph. This involves creating a binary mask matrix M
of size s×n, where sis the number of sampled subgraphs and nis the number
of players (edges). The entry M[i, j] = 1 indicates the presence of the j-th edge
in the i-th subgraph. Additionally, a weight vector wis generated to store the
weight of each sample.
3. Model Prediction: Using the sampled subgraphs, we compute the prediction
for the target node for each sample, resulting in a prediction vector ˆy, where
ˆy[i] stores the prediction obtained using the i-th sample. To expedite this step,
batching and parallelization techniques are employed.
4. Shapley Value Computation: Finally, we calculate the Shapley values for all
edges in the computational graph using the equation ϕ= (MTM)1MTˆy. This
107
computation assigns an importance score to each edge, reflecting its contribution
to the target node’s prediction.
The explanation subgraph τex is then constructed by selecting edges with the high-
est Shapley values, either using a top-kselection method or a predefined threshold.
It is important to note that Shapley values can be both positive and negative, so the
absolute values of the scores are used to determine importance.
By leveraging GNNShap, we can derive meaningful insights into the contributions
of different edges and node features in the GNN, enhancing the interpretability and
transparency of our models.
5.9.5 Application of Explainable AI Techniques in Our Use
Case
Now that we understand how GNNShap and SHAP work, we can utilize these tech-
niques in our use case to enhance the interpretability of our models. In our analysis,
GNNShap is used to demonstrate how the graph topology impacts model decisions,
while SHAP is employed to explain the influence of individual node features.
Interpreting Graph Topology with GNNShap
GNNShap helps us understand the contributions of different edges and node features
in our Graph Neural Network (GNN) models. It allows us to see how the topology
of the graph influences the model’s predictions. By identifying the significant edges
and their Shapley values, we can gain insights into the structural dependencies of the
model.
Analyzing Node Features with SHAP
SHAP values provide a clear picture of how individual node features impact the
model’s output. By training a Convolutional Neural Network (CNN) on the indi-
vidual features and using SHAP for feature attribution, we can isolate the intrinsic
importance of each feature. This method highlights the role of specific features in
fraud classification and reveals their direct influence on the model’s decisions.
108
Combining Insights from GNNShap and SHAP
In the following sections, we present examples that demonstrate the interpretability
achieved using both GNNShap and SHAP. We explain our GNN-based models on
the Elliptic dataset and show how our feature enhancement impacts model decisions.
By combining the insights from GNNShap and SHAP, we provide a comprehensive
understanding of both structural and feature-based influences on the model’s predic-
tions.
5.10 Elliptic Classification Explainability
Understanding the decision-making process of our GNN model is crucial for its prac-
tical application in detecting illicit Bitcoin transactions. We employ two complemen-
tary approaches: SHAP for individual feature importance and GNNSHap for explain-
ing the impact of graph structure. This comprehensive analysis provides insights into
both the feature-level and topological factors influencing the model’s predictions.
5.10.1 Global Feature Importance
We begin by examining the overall importance of features across the entire dataset
filtered to entities with non-zero chain features.
The top five most influential features, in descending order, are IsDetected?,For-
ward Anomalies,Forward Chain Length,trans feat 5, and IsSweeper.
The feature IsDetected? indicates whether a transaction is detected as licit or
illicit, where a value of zero signifies an illicit transaction and a value of one signifies a
licit transaction. The descriptions of the other features have been provided previously.
It is important to note that these rankings are derived from cases where chain
features are present. If we consider a dataset containing transactions without chain
features, the global feature importance might differ.
This ranking provides valuable insights into the model’s general decision-making
process. Notably, direct indicators of suspicious activity (e.g., ”IsDetected?”, ”For-
ward Anomalies”) have the highest impact, followed by transaction characteristics
and network structure features.
109
Figure 5.4: Global feature importance based on mean —SHAP value—
5.10.2 Feature Value Impact
To understand how specific feature values affect predictions, we analyze their SHAP
value distributions.
Key observations from Figure 5.5 include:
”IsDetected?” shows a clear binary pattern, with detected transactions strongly
associated with illicit predictions and it shows a negative correlation.
”Forward Anomalies” exhibits a negative correlation with illicit predictions as
its value increases the shape value becomes less and negative and the less sharp
110
Figure 5.5: Impact of feature values on model predictions
value is towards not classifying to the target class(illicit).
”Forward Chain Length” demonstrates a more complex relationship, with both
very low and very high values associated with illicit predictions. But it is
definitely positive correlation as the longer length shows more shap values and
more towards illicit identidication.
”IsSweeper” shows a positive correlation with illicit predictions, suggesting it’s
typically a illicit-indicating feature.
5.10.3 Case Studies
To illustrate the model’s decision-making process at an individual transaction level,
we examine two contrasting cases: an illicit and a licit transaction.
111
Illicit Transaction Case
Figure 5.6: SHAP waterfall plot for an illicit transaction prediction
For the illicit transaction (Figure 5.6):
”Forward Anomalies” is the most significant feature pushing towards illicit clas-
sification (SHAP value: 0.15).
”IsDetected?” and ”trans feat 5” also contribute positively to the illicit classi-
fication (SHAP values: 0.11 each).
Interestingly, ”IsSweeper” has a negative SHAP value (-0.14), and it is pushing
the prediction towards ”licit” but is overridden by other features in this case.
Licit Transaction Case
For the licit transaction (Figure 5.7):
The model’s base value is -0.27, indicating a slight bias towards licit classifica-
tion.
”IsDetected?” is the only feature pushing the prediction towards illicit (SHAP
value: 0.04).
112
Figure 5.7: SHAP waterfall plot for a correctly classified licit transaction
”Forward Chain Length” and ”trans feat 5” have the largest negative SHAP
values (-0.08 each), strongly indicating licit activity.
”IsSweeper” contributes significantly to the licit classification (SHAP value:
-0.06), consistent with our earlier observations.
5.10.4 Graph Structure Importance (GNNShap)
While SHAP values provide insights into feature importance, GNNShap extends this
concept to explain the importance of graph structure in the model’s decisions.
Figure 5.8 illustrates the GNNShap explanation for a specific transaction:
The central node (200) represents the transaction being classified.
Neighboring nodes are color-coded in red, indicating their importance.
Edge thickness represents the strength of influence between nodes.
Blue edges lower the probability of illicit classification, while red edges increase
it.
Figure 5.9 quantifies the impact of each edge:
113
Figure 5.8: GNNShap visualization of graph topology importance
Edges 1457 200 and 2102 200 have the largest magnitude impacts, but in
opposite directions.
The edge 37 200 has a moderate negative impact.
Edges 1451 37 and 350 200 have negligible impacts on this particular
classification.
5.10.5 Insights and Implications
The combined SHAP and GNNShap analyses provide several key insights:
1. Multi-faceted Decision Making: The model effectively balances multiple
features and graph structures to make classifications, with no single factor dom-
inating the process.
2. Contextual Importance: Features like ”IsDetected?” and ”Forward Anoma-
lies” play crucial roles, but their impact can be modulated by other features
and graph structures, demonstrating the model’s sophisticated approach.
114
Figure 5.9: GNNShap bar plot of edge importance
3. Transaction Chain Significance: The importance of chain-related features
(length and anomalies) highlights the value of considering the broader context
of each transaction.
4. Local Neighborhood Influence: GNNShap reveals that the model heavily
weighs the immediate connections of a transaction, suggesting that the charac-
teristics of directly linked transactions are crucial for classification.
5. Directional Impact: Both incoming and outgoing transactions contribute
to the classification, but their impacts can differ significantly, indicating the
model’s ability to distinguish between the roles of source and destination in
potentially illicit activities.
6. Complex Interactions: The interplay between node features and graph struc-
ture reveals complex patterns that the model has learned to recognize, justifying
the use of advanced GNN architectures for this task and the reason they perform
significantly better than classical approaches.
5.11 Discussions
These findings have several implications for both model development and practical
application:
Feature Engineering: Future efforts should focus on enhancing chain-related
115
features, detecting anomalies, and capturing important graph structural prop-
erties.
Model Robustness: The reliance on multiple features and graph structures
suggests good resilience against individual feature manipulation, which is crucial
for fraud detection systems.
Interpretability for Compliance: The SHAP and GNNShap visualizations
provide a means to understand and explain the model’s decisions, potentially
aiding in regulatory compliance and case building.
Investigative Guidance: In practical applications, these explanations could
serve as starting points for financial investigators, highlighting which transaction
features and relationships warrant closer examination.
Model Refinement: Understanding the impact of graph structures could
guide the fine-tuning of GNN architectures, potentially by adjusting the number
of graph convolution layers or attention mechanisms to better capture important
multi-hop relationships.
5.11.1 Distinctive Aspects of Mixing Transactions
It is imperative to acknowledge that while the Elliptic dataset categorizes all illicit fi-
nancial transactions uniformly, our algorithm is specifically tailored to identify mixing
transactions. This distinction is critical for several reasons:
Many illicit transactions involve mixing services, but others may utilize alter-
native anonymity-enhancing methods, such as layer-2 solutions or cross-chain
mechanisms that involve smart contract-enabled blockchains.
Legitimate users sometimes utilize mixing services to enhance their privacy,
despite their transactions being entirely lawful.
Despite these complexities, the high correlation between transactions flagged by
our unsupervised algorithm and those labeled as illicit in the Elliptic dataset strongly
indicates our approach’s effectiveness. This correlation suggests that our algorithm
is adept at pinpointing activities that are potentially fraudulent.
116
5.11.2 Importance of Peeling Chain Features
The marked improvement in our models’ performance following the incorporation of
peeling chain features elucidates a significant insight:
While GNNs inherently leverage adjacency and node feature data, they might
not fully capture the temporal dynamics and multi-hop relationships inherent
in peeling chains.
The peeling chain features that we developed encapsulate multi-level transaction
data and integrate temporal dimensions, offering a richer context not accessi-
ble through immediate adjacency alone in GNNs. Additionally, these features
are particularly valuable because peeling chains may include transactions not
present in the Elliptic dataset or omitted due to being flagged as ’Unknown’.
This inclusion significantly enriches the dataset, allowing our models to learn
from a broader spectrum of transaction patterns and potentially unidentified
fraudulent activities.
This enhancement underscores the critical role of domain-specific feature engineer-
ing as a complement to sophisticated machine learning techniques. By incorporating
peeling chain features, we provide a nuanced view of transaction patterns, enriching
the dataset with temporal and multi-hop details that significantly enhance the GNNs’
capacity to detect and analyze potentially fraudulent transactions.
5.11.3 Global Feature Importance
Figure 5.4 presents the global feature importance across our dataset. The top five
most influential features, in descending order, are IsDetected?,Forward Anomalies,
Forward Chain Length,trans feat 5, and IsSweeper.
IsDetected? is a feature that specifies if a transaction is identified as licit or illicit.
A value of zero indicates an illicit transaction, while a value of one denotes a licit
transaction. The descriptions of the other features have been provided previously.
It is important to note that these rankings are derived from cases where chain
features are present. If we consider a dataset containing transactions without chain
features, the global feature importance might differ.
117
Chapter 6
Conclusions
6.1 Summary of Findings
This dissertation has explored the intricate world of cryptocurrency transactions, fo-
cusing on the challenges of detecting illicit activities within decentralized financial
systems. The primary objectives were to develop and validate innovative method-
ologies for identifying and analyzing money laundering and fraudulent activities in
Bitcoin and Ethereum blockchain transactions. Through a comprehensive review of
existing literature, the introduction of novel analytical frameworks, and the appli-
cation of advanced machine learning techniques, this research has made significant
strides in the field of blockchain forensics.
The key findings of this research are summarized as follows:
1. Literature Review and Framework Development: The study began with
an extensive literature review to understand the current practices in cryptocur-
rency mixing services. This review informed the development of a novel ana-
lytical framework that employs statistical patterns to detect potential money
laundering activities. This framework clusters cryptocurrency addresses to un-
cover real-world identities potentially involved in illicit transactions.
2. Elliptic Dataset Analysis: Utilizing the Elliptic dataset, a graph-based rep-
resentation of Bitcoin transactions, this research applied graph neural networks
(GNNs) for subgraph labeling. This approach innovatively identified unlabeled
clusters potentially involved in suspicious activities, significantly enhancing de-
tection capabilities beyond traditional methods.
118
3. Explainable AI and Model Interpretability: A critical aspect of this re-
search was the exploration of model interpretability. By employing Explainable
AI techniques, particularly SHAP values, the study demystified the operational
dynamics of detection models. This transparency not only bolstered the relia-
bility of the findings but also laid the groundwork for future advancements in
the field.
4. Publicly Available Dataset and Tools: One of the significant outcomes of
this research is the creation of an enhanced dataset that includes community
detection and peeling chain features in addition to the original Elliptic dataset.
Furthermore, the development and publication of tools, including the ExAI
tool for model explainability, provide valuable resources for further research
and practical applications.
5. Practical Implications: The outcomes of this investigation align closely with
leading practices in legal and regulatory frameworks. The insights and tools
developed in this thesis provide invaluable resources for law enforcement and
regulatory bodies dedicated to combating digital financial crimes.
6.2 Contributions to the Field
This dissertation makes several significant contributions to the field of blockchain
forensics and anti-money laundering efforts:
1. Novel Analytical Framework: The development of a new analytical frame-
work that combines statistical analysis with clustering techniques to identify
illicit activities in Bitcoin transactions is a notable advancement. This frame-
work offers a practical and effective tool for detecting money laundering activ-
ities within blockchain networks.
2. Graph Neural Networks for Transaction Analysis: The application of
GNNs to the Elliptic dataset represents a pioneering approach to handling the
complexity and scale of blockchain transaction data. This research demon-
strated the effectiveness of GNNs in identifying suspicious clusters within Bit-
coin transactions.
119
3. Integration of Explainable AI: By integrating Explainable AI techniques,
this research has enhanced the interpretability and transparency of complex
detection models. This contribution is crucial for building trust in AI-driven
AML solutions and ensuring their adoption by regulatory and law enforcement
agencies.
4. Publicly Available Resources: The publication of an enhanced dataset,
which includes community detection and peeling chain features, along with
the development of tools like the ExAI tool for model explainability, provides
valuable resources for the research community and practitioners in the field of
blockchain forensics.
5. Practical Toolkit for Practitioners: The methodologies and tools developed
in this dissertation provide a robust, practical toolkit for practitioners in the
field of blockchain forensics. These tools are designed to bridge the gap between
technological advancements and legal compliance, setting a new benchmark in
the fight against cryptocurrency-based crime.
6.3 Implications for Practice
The findings of this research have significant implications for practice, particularly in
the areas of compliance, regulatory usage, and law enforcement:
1. Enhanced Detection Capabilities: The novel analytical framework and
GNN-based methods developed in this research offer enhanced capabilities for
detecting illicit activities in blockchain transactions. These tools can be inte-
grated into existing AML systems used by financial institutions to improve their
effectiveness in identifying and mitigating money laundering risks.
2. Improved Compliance and Regulatory Oversight: The use of Explain-
able AI techniques ensures that the detection models are transparent and their
decisions understandable. This transparency is critical for compliance with reg-
ulatory requirements, enabling financial institutions to demonstrate the efficacy
and reliability of their AML systems to regulators.
3. Law Enforcement Applications: The insights gained from this research pro-
vide law enforcement agencies with advanced tools for investigating and prose-
cuting financial crimes involving cryptocurrencies. The ability to uncover illicit
120
transaction patterns and trace funds can significantly enhance the effectiveness
of investigations into money laundering and other financial crimes.
4. Guidelines for Mixing Service Evaluation: The evaluation framework de-
veloped for assessing mixing services provides a systematic approach to analyz-
ing their effectiveness and risks. Regulatory bodies can use this framework to
evaluate the security and reliability of mixing services, ensuring that they do
not facilitate illicit activities.
6.4 Limitations & Future Research Directions
The field of blockchain and cryptocurrency continues to evolve at a rapid pace, pre-
senting both challenges and opportunities for enhancing transaction security and
fraud detection. The research presented in this dissertation has laid a solid founda-
tion for understanding and addressing the complexities of illicit transactions within
this modern financial landscape. However, as with any burgeoning technology, there
remains significant room for advancement and refinement. This chapter outlines sev-
eral promising avenues for future research, building on the methodologies and findings
of this study to further the development of more robust, efficient, and transparent
systems. These future directions are designed not only to address the current limita-
tions but also to anticipate and mitigate emerging threats in blockchain technology.
By exploring these paths, we aim to fortify the integrity of digital financial systems
and enhance their resilience against illicit activities. Here, we delve into potential
enhancements and expansions that could significantly advance the field of blockchain
forensics and cryptocurrency fraud detection. Looking ahead, several avenues can be
pursued to expand upon the research conducted in this dissertation:
1. Expansion to Other Cryptocurrencies: Future research could extend the
methodologies developed in this dissertation to other cryptocurrencies beyond
Bitcoin and Ethereum. Each cryptocurrency has unique characteristics and
transaction patterns that may require tailored detection approaches.
2. Real-Time Detection Systems: Developing real-time detection systems that
can analyze blockchain transactions as they occur is a promising area for future
research. Such systems would provide immediate alerts and allow for quicker
responses to suspicious activities.
121
3. Enhanced Privacy-Preserving Techniques: Further research into privacy-
preserving techniques that balance the need for transaction privacy with the
requirements of AML regulations is essential. Developing methods that protect
user privacy while enabling effective detection of illicit activities remains a key
challenge.
4. Cross-Chain Analysis: As blockchain technology evolves, the ability to ana-
lyze transactions across multiple blockchain networks will become increasingly
important. Future research should explore methods for cross-chain transaction
analysis to detect illicit activities spanning different blockchain platforms.
5. Regulatory and Ethical Considerations: Further research into the regula-
tory and ethical implications of using AI in financial surveillance is necessary.
This includes studying the impact of such technologies on privacy rights and
developing frameworks that balance security needs with individual freedoms.
6. Deployment and Validation in Real-World Scenarios: Lastly, deploying
the developed models in real-world scenarios and conducting extensive valida-
tion studies would provide deeper insights into their practical effectiveness and
areas for improvement. This step is crucial for transitioning from theoretical
models to practical applications that can make a tangible impact in combating
cryptocurrency-related crimes.
122
Bibliography
[1] Aave: The money market protocol. https://aave.com/.
[2] Compound. https://compound.finance/.
[3] Cyclone webpage. http://www.cyclone.xyz/. Site was under maintenance at
2023-09-29.
[4] Railgun webpage. https://www.railgun.org/. Accessed: 2023-09-29.
[5] Typhoon cash webpage. https://typhoon.cash/. Accessed: 2023-09-29.
[6] Walletexplorer. https://walletexplorer.com/. Accessed: 2023-09-29.
[7] Zkbob webpage. https://www.zkbob.com/. Accessed: 2023-09-29.
[8] Shapeshift webpage. https://shapeshift.io, 2014. Accessed: 2023-09-29.
[9] Cryptomixer webpage, https://cryptomixer.ltd/. https://cryptomixer.io/,
2016. Accessed: 2023-09-29.
[10] Alphabay crime report. https://bitcointalk.org/index.php?topic=
2040756.0, 2017. Accessed: 2023-09-29.
[11] Bitmix webpage. bitmix.bz, 2017. Accessed: 2023-09-29.
[12] Blender webpage. https://blendor.io/, 2017. Accessed: 2023-09-29.
[13] Joinmarket official repository. https://github.com/JoinMarket-Org/
joinmarket-clientserver, 2017. Accessed: 2023-09-29.
[14] How dirty money disappears into the black hole
of cryptocurrency. https://wsj.com/articles/
how-dirty-money-disappears-into-the-black-hole-of-cryptocurrency-1538149743,
2018. Accessed: 2023-09-29.
123
[15] Mixtum webpage. https://mixtum.io/, 2018. Accessed: 2023-09-29.
[16] Wassabi wallet github page. https://github.com/zkSNACKs/WabiSabi/,
2018. Accessed: 2023-09-29.
[17] Chipmixer seizure report. https://justice.gov/opa/pr/
justice-department-investigation-leads-takedown-darknet-cryptocurrency-mixer-processed-over-3,
2019. Accessed: 2023-09-29.
[18] Deanonymization of elliptic dataset transactions. https://habr.com/ru/
articles/479178/, 2019. Accessed: 2024-02-14.
[19] Soduku wallet publication announcement page. https://bitcointalk.org/
index.php?topic=5194344.0, 2019. Accessed: 2023-09-29.
[20] Tornado cash official repository. https://github.com/tornadocash/
tornado-core, 2019. Accessed: 2023-09-29.
[21] Operator of helix bitcoin ‘mixer’ pleads guilty. https://wsj.com/articles/
operator-of-helix-bitcoin-mixer-pleads-guilty-11629328791, 2021.
Accessed: 2023-09-29.
[22] Tornado cash seizure report. https://home.treasury.gov/news/
press-releases/jy0916, 2022. Accessed: 2023-09-29.
[23] Chainalysis crypto crime report, https://go.chainalysis.com/2023-crypto-crime-
report.html. 2023.
[24] M. Ahang, M. Jalayer, A. Shojaeinasab, O. Ogunfowora, T. Charter, and H. Na-
jjaran. Synthesizing rolling bearing fault samples in new conditions: A frame-
work based on a modified cgan. Sensors, 22:5413, 2022.
[25] Selahattin Akkas and Ariful Azad. Gnnshap: Scalable and accurate gnn expla-
nation using shapley values. In Proceedings of the ACM on Web Conference
2024, pages 827–838, 2024.
[26] Kurt M Alonso et al. Zero to monero. Zero to monero, 2020.
[27] Ittai Anati, Shay Gueron, Simon Johnson, and Vincent Scarlata. Innovative
technology for cpu based attestation and sealing. In Proceedings of the 2nd
124
international workshop on hardware and architectural support for security and
privacy, volume 13. ACM New York, NY, USA, 2013.
[28] Alireza Arbabi, Ardeshir Shojaeinasab, Behnam Bahrak, and Homayoun Na-
jjaran. Mixing solutions in bitcoin and ethereum ecosystems: A review and
tutorial. arXiv preprint arXiv:2310.04899, 2023.
[29] Architecure ARM. Security technology building a secure system using trustzone
technology (white paper). ARM Limited, 2009.
[30] Research Fraud Team at Chainanalysis Co. Everything you need to know about
cryptocurrency-based crime; the 2021 crypto crime report, January 2021.
[31] Research Fraud Team at Chainanalysis Co. Everything you need to know about
cryptocurrency-based crime; the 2022 crypto crime report, February 2022.
[32] Simon Barber, Xavier Boyen, Elaine Shi, and Ersin Uzun. Bitter to better—how
to make bitcoin a better currency. In Financial Cryptography and Data Security:
16th International Conference, FC 2012, Kralendijk, Bonaire, Februray 27-
March 2, 2012, Revised Selected Papers 16, pages 399–414. Springer, 2012.
[33] Tom Josua BARBEREAU, Egor Ermolaev, Martin Brennecke, Eduard
Hartwich, and Johannes Sedlmeir. Beyond a fistful of tumblers: Toward a
taxonomy of ethereum-based mixers. In 44th International Conference on In-
formation Systems. AIS Library, 2023.
[34] A. Biryukov, D. Khovratovich, and I. Pustogarov. Deanonymisation of clients
in bitcoin p2p network. Proceedings of the 2014 ACM SIGSAC Conference on
Computer and Communications Security, 2014.
[35] Alex Biryukov, Dmitry Khovratovich, and Ivan Pustogarov. Deanonymisation
of clients in bitcoin p2p network. In Proceedings of the 2014 ACM SIGSAC
Conference on Computer and Communications Security, CCS ’14, page 15–29,
New York, NY, USA, 2014. Association for Computing Machinery.
[36] Alex Biryukov and Sergei Tikhomirov. Deanonymization and linkability of
cryptocurrency transactions based on network analysis. In 2019 IEEE European
Symposium on Security and Privacy (EuroS P), pages 172–184, 2019.
125
[37] George Bissias, A Pinar Ozisik, Brian N Levine, and Marc Liberatore. Sybil-
resistant mixing for bitcoin. In Proceedings of the 13th Workshop on Privacy
in the Electronic Society, pages 149–158, 2014.
[38] Stefano Bistarelli, Matteo Parroccini, Francesco Santini, et al. Visualizing bit-
coin flows of ransomware: Wannacry one week later. In ITASEC, 2018.
[39] Blockstream.info. Bitcoin coin mixing data set with rule-based label. https://
github.com/sxwxs/BitcoinCoinMixingDataSetWithRuleBasedLabel, 2023.
Dataset contains transactions between block 270,000 (2013-11-16) and block
300,000 (2014-05-10), accessed: 2023-09-29.
[40] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne
Lefebvre. Fast unfolding of communities in large networks. Journal of sta-
tistical mechanics: theory and experiment, 2008(10):P10008, 2008.
[41] Joseph Bonneau, Arvind Narayanan, Andrew Miller, Jeremy Clark, Joshua A
Kroll, and Edward W Felten. Mixcoin: Anonymity for bitcoin with accountable
mixes. In Financial Cryptography and Data Security: 18th International Con-
ference, FC 2014, Christ Church, Barbados, March 3-7, 2014, Revised Selected
Papers 18, pages 486–504. Springer, 2014.
[42] Benedikt B¨unz, Shashank Agrawal, Mahdi Zamani, and Dan Boneh. Zether:
Towards privacy in a smart contract world. In International Conference on
Financial Cryptography and Data Security, pages 423–443. Springer, 2020.
[43] Benedikt B¨unz, Shashank Agrawal, Mahdi Zamani, and Dan Boneh. Zether:
Towards privacy in a smart contract world. In International Conference on
Financial Cryptography and Data Security, pages 423–443. Springer, 2020.
[44] Vitalik Buterin et al. A next-generation smart contract and decentralized ap-
plication platform. white paper, 3(37):2–1, 2014.
[45] James Carlyle, Tudor Malene, Cais Manai, Neal Shah, Gavin Thomas, and
Roger Willis. Obscuro whitepaper. 2022.
[46] C. Chainalysis. Money laundering and cryptocurrency: Trends and new tech-
niques for detection and investigation. Technical report, Chainalysis, New York
City, New York. n.d.
126
[47] Tao-Hung Chang and Davor Svetinovic. Improving bitcoin ownership identi-
fication using transaction patterns analysis. IEEE Transactions on Systems,
Man, and Cybernetics: Systems, 50(1):9–20, 2018.
[48] Tao-Hung Chang and Davor Svetinovic. Improving bitcoin ownership identi-
fication using transaction patterns analysis. IEEE Transactions on Systems,
Man, and Cybernetics: Systems, 50(1):9–20, 2020.
[49] David Chaum. Blind signature system. In Advances in Cryptology: Proceedings
of Crypto 83, pages 153–153. Springer, 1983.
[50] Nicolas Christin. Traveling the silk road: A measurement analysis of a large
anonymous online marketplace. In Proceedings of the 22nd international con-
ference on World Wide Web, pages 213–224, 2013.
[51] Mauro Conti, E Sandeep Kumar, Chhagan Lal, and Sushmita Ruj. A survey
on security and privacy issues of bitcoin. IEEE communications surveys &
tutorials, 20(4):3416–3452, 2018.
[52] Thibault de Balthasar and Julio Hernandez-Castro. An analysis of bitcoin
laundry services. In Secure IT Systems: 22nd Nordic Conference, NordSec
2017, Tartu, Estonia, November 8–10, 2017, Proceedings 22, pages 297–312.
Springer, 2017.
[53] Christian Decker and Roger Wattenhofer. A fast and scalable payment network
with bitcoin duplex micropayment channels. In Stabilization, Safety, and Secu-
rity of Distributed Systems: 17th International Symposium, SSS 2015, Edmon-
ton, AB, Canada, August 18-21, 2015, Proceedings 17, pages 3–18. Springer,
2015.
[54] Shayan Eskandari, Seyedehmahsa Moosavi, and Jeremy Clark. Sok: Transpar-
ent dishonesty: front-running attacks on blockchain. In Financial Cryptography
and Data Security: FC 2019 International Workshops, VOTING and WTSC,
St. Kitts, St. Kitts and Nevis, February 18–22, 2019, Revised Selected Papers
23, pages 170–189. Springer, 2020.
[55] Qi Feng, Debiao He, Sherali Zeadally, Muhammad Khurram Khan, and Neeraj
Kumar. A survey on privacy protection in blockchain system. Journal of net-
work and computer applications, 126:45–58, 2019.
127
[56] Uriel Fiege, Amos Fiat, and Adi Shamir. Zero knowledge proofs of identity. In
Proceedings of the nineteenth annual ACM symposium on Theory of computing,
pages 210–217, 1987.
[57] Akshay Gaikwad. Overview of blockchain. International Journal for Research
in Applied Science and Engineering Technology, 2020.
[58] Arunima Ghosh, Shashank Gupta, Amit Dua, and Neeraj Kumar. Security of
cryptocurrencies in blockchain technology: State-of-art, challenges and future
prospects. Journal of Network and Computer Applications, 163:102635, 2020.
[59] Yanan Gong, Kam Pui Chow, Siu Ming Yiu, and Hing Fung Ting. Analyzing the
peeling chain patterns on the bitcoin blockchain. Forensic Science International:
Digital Investigation, 46:301614, 2023.
[60] Martin Harrigan and Christoph Fretter. The unreasonable effectiveness
of address clustering. In 2016 intl ieee conferences on ubiquitous intel-
ligence & computing, advanced and trusted computing, scalable computing
and communications, cloud and big data computing, internet of people, and
smart world congress (uic/atc/scalcom/cbdcom/iop/smartworld), pages 368–
373. IEEE, 2016.
[61] Ethan Heilman, Leen Alshenibr, Foteini Baldimtsi, Alessandra Scafuro, and
Sharon Goldberg. Tumblebit: An untrusted bitcoin-compatible anonymous
payment hub. In Network and Distributed System Security Symposium, 2017.
[62] Ethan Heilman, Foteini Baldimtsi, and Sharon Goldberg. Blindly signed con-
tracts: Anonymous on-blockchain and off-blockchain bitcoin transactions. In
International conference on financial cryptography and data security, pages 43–
60. Springer, 2016.
[63] Maged Hamada Ibrahim. Securecoin: a robust secure and efficient protocol for
anonymous bitcoin ecosystem.
[64] CypherTrace Inc. Cryptocurrency crime and anti-money laundering report.
Technical report, CypherTrace Inc., March 2023. 14th report since 2018.
[65] Elliptic Inc. Tornado cash alternatives report. Technical report, Elliptic Inc.,
2022.
128
[66] Ingemar Ingemarsson and Gustavus J Simmons. A protocol to set up shared
secret schemes without the assistance of a mutually trusted party. In Workshop
on the Theory and Application of of Cryptographic Techniques, pages 266–282.
Springer, 1990.
[67] M. Jalayer, A. Shojaeinasab, and H. Najjaran. A model identification forensics
approach for signal-based condition monitoring. In International Conference
On Flexible Automation and Intelligent Manufacturing, pages 12–19, Cham,
Switzerland, 2023. Springer Nature.
[68] Maziyar Khadivi, Todd Charter, Marjan Yaghoubi, Masoud Jalayer, Maryam
Ahang, Ardeshir Shojaeinasab, and Homayoun Najjaran. Deep reinforcement
learning for machine scheduling: Methodology, the state-of-the-art, and future
directions. arXiv preprint arXiv:2310.03195, 2023.
[69] Merve Can Kus Khalilov and Albert Levi. A survey on anonymity and privacy
in bitcoin-like digital cash systems. IEEE Communications Surveys & Tutorials,
20(3):2543–2585, 2018.
[70] Sunny King and Scott Nadal. Ppcoin: Peer-to-peer crypto-currency with proof-
of-stake. 2012.
[71] Anupam Kumar, D. Choudhary, M. Raju, D. K. Chaudhary, and R. Sagar.
Combating counterfeit drugs: A quantitative analysis on cracking down the
fake drug industry by using blockchain technology. 2019 9th International Con-
ference on Cloud Computing, Data Science & Engineering (Confluence), pages
174–178, 2019.
[72] Duc V Le and Arthur Gervais. Amr: Autonomous coin mixer with privacy
preserving reward distribution. In Proceedings of the 3rd ACM Conference on
Advances in Financial Technologies, pages 142–155, 2021.
[73] S. M. Lundberg and S. I. Lee. A unified approach to interpreting model pre-
dictions. In Advances in neural information processing systems, volume 30,
2017.
[74] Gregory Maxwell. Coinjoin: Bitcoin privacy for the real world (2013),
https://bitcointalk.org/index.php?topic=279249. 2013.
129
[75] Frank McKeen, Ilya Alexandrovich, Alex Berenzon, Carlos V Rozas, Hisham
Shafi, Vedvyas Shanbhogue, and Uday R Savagaonkar. Innovative instructions
and software model for isolated execution. Hasp@ isca, 10(1), 2013.
[76] Sarah Meiklejohn and Rebekah Mercer. obius: Trustless tumbling for trans-
action privacy. 2018.
[77] Malte Moser. Anonymity of bitcoin transactions. 2013.
[78] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. Decentral-
ized business review, 2008.
[79] Lihao Nan and Dacheng Tao. Bitcoin mixing detection using deep autoencoder.
In 2018 IEEE Third international conference on data science in cyberspace
(DSC), pages 280–287. IEEE, 2018.
[80] National Institute of Standards and Technology. Secure hash standard (shs).
Technical report, 2012.
[81] C Andrew Neff. A verifiable secret shuffle and its application to e-voting. In
Proceedings of the 8th ACM conference on Computer and Communications Se-
curity, pages 116–125, 2001.
[82] Novetta Security Co. Survey of bitcoin mixing services: Tracing anonymous
bitcoins. white paper, 2015.
[83] U.S. Department of Justice. Individual arrested and charged with operating
notorious darknet cryptocurrency “mixer”. https://justice.gov/opa/pr/
individual-arrested-and-charged-operating-notorious-darknet-cryptocurrency-mixer,
2021. Accessed: 2023-09-29.
[84] Jaswant Pakki, Yan Shoshitaishvili, Ruoyu Wang, Tiffany Bao, and Adam
Doup´e. Everything you ever wanted to know about bitcoin mixers (but were
afraid to ask). In Financial Cryptography and Data Security: 25th Interna-
tional Conference, FC 2021, Virtual Event, March 1–5, 2021, Revised Selected
Papers, Part I 25, pages 117–146. Springer, 2021.
[85] Alexey Pertsev, Roman Semenov, and Roman Storm. Tornado cash privacy
solution version 1.4. 2019.
130
[86] Joseph Poon and Thaddeus Dryja. The bitcoin lightning network: Scalable
off-chain instant payments. 2016.
[87] Joseph Poon and Thaddeus Dryja. The bitcoin lightning network: Scalable
off-chain instant payments. 2016.
[88] Madana Prathap. Operator of helix bitcoin ‘mixer’ pleads
guilty. https://businessinsider.in/investment/news/
bitcoin-mixing-service-helix-pleads-guilty-to-laundering-300-million-will-forfeit-over-4400-bitcoins/
articleshow/85482090.cms, 2021. Accessed: 2023-09-29.
[89] Ronald L Rivest, Adi Shamir, and Yael Tauman. How to leak a secret. In
Advances in Cryptology—ASIACRYPT 2001: 7th International Conference on
the Theory and Application of Cryptology and Information Security Gold Coast,
Australia, December 9–13, 2001 Proceedings 7, pages 552–565. Springer, 2001.
[90] Tim Ruffing, Pedro Moreno-Sanchez, and Aniket Kate. Coinshuffle: Prac-
tical decentralized coin mixing for bitcoin. In Computer Security-ESORICS
2014: 19th European Symposium on Research in Computer Security, Wroclaw,
Poland, September 7-11, 2014. Proceedings, Part II 19, pages 345–364. Springer,
2014.
[91] Fabian Sch¨ar. Decentralized finance: On blockchain- and smart contract-based
financial markets. Federal Reserve Bank of St. Louis Review, 103(1):37–54,
2021.
[92] Istv´an Andr´as Seres, aniel A Nagy, Chris Buckland, and eter Burcsi. Mixeth:
efficient, trustless coin mixing service for ethereum. Cryptology ePrint Archive,
2019.
[93] A. Shojaeinasab, T. Charter, M. Jalayer, M. Khadivi, O. Ogunfowora,
N. Raiyani, M. Yaghoubi, and H. Najjaran. Intelligent manufacturing execution
systems: A systematic review. J. Manuf. Syst., 62:503–522, 2022.
[94] Ardeshir Shojaeinasab, Masoud Jalayer, Amirali Baniasadi, and Homayoun Na-
jjaran. Unveiling the black box: a unified xai framework for signal-based deep
learning models. Machines, 12(2):121, 2024.
131
[95] Ardeshir Shojaeinasab, Amir Pasha Motamed, and Behnam Bahrak. Mixing
detection on bitcoin transactions using statistical patterns. IET Blockchain,
2022.
[96] Yonatan Sompolinsky and Aviv Zohar. Secure high-rate transaction processing
in bitcoin. In Financial Cryptography and Data Security: 19th International
Conference, FC 2015, San Juan, Puerto Rico, January 26-30, 2015, Revised
Selected Papers 19, pages 507–527. Springer, 2015.
[97] Xiaowen Sun, Tan Yang, and Bo Hu. Lstm-tc: Bitcoin coin mixing detection
method with a high recall. Applied Intelligence, 52(1):780–793, 2022.
[98] Muoi Tran, Loi Luu, Min Suk Kang, Iddo Bentov, and Prateek Saxena. Ob-
scuro: A bitcoin mixer using trusted execution environments. In Proceedings
of the 34th Annual Computer Security Applications Conference, pages 692–701,
2018.
[99] Luke Valenta and Brendan Rowan. Blindcoin: Blinded, accountable mixes for
bitcoin. In Financial Cryptography and Data Security: FC 2015 International
Workshops, BITCOIN, WAHC, and Wearable, San Juan, Puerto Rico, January
30, 2015, Revised Selected Papers, pages 112–126. Springer, 2015.
[100] Zhipeng Wang, Stefanos Chaliasos, Kaihua Qin, Liyi Zhou, Lifeng Gao, Pascal
Berrang, Benjamin Livshits, and Arthur Gervais. On how zero-knowledge proof
blockchain mixers improve, and worsen user privacy. In Proceedings of the ACM
Web Conference 2023, pages 2022–2032, 2023.
[101] Jiajing Wu, Jieli Liu, Weili Chen, Huawei Huang, Zibin Zheng, and Yan Zhang.
Detecting mixing services via mining bitcoin transaction network with hy-
brid motifs. IEEE Transactions on Systems, Man, and Cybernetics: Systems,
52(4):2237–2249, 2021.
[102] Jiajing Wu, Jieli Liu, Yijing Zhao, and Zibin Zheng. Analysis of cryptocurrency
transactions from a network perspective: An overview. Journal of Network and
Computer Applications, 190:103139, 2021.
[103] Lei Wu, Yufeng Hu, Yajin Zhou, Haoyu Wang, Xiapu Luo, Zhi Wang, Fan
Zhang, and Kui Ren. Towards understanding and demystifying bitcoin mixing
services. In Proceedings of the Web Conference 2021, pages 33–44, 2021.
132
[104] Fengwei Zhang and Hongwei Zhang. Sok: A study of using hardware-assisted
isolated execution environments for security. In Proceedings of the Hardware
and Architectural Support for Security and Privacy 2016, pages 1–8. 2016.
[105] Jan Henrik Ziegeldorf, Fred Grossmann, Martin Henze, Nicolas Inden, and
Klaus Wehrle. Coinparty: Secure multi-party mixing of bitcoins. In Proceedings
of the 5th ACM Conference on Data and Application Security and Privacy,
pages 75–86, 2015.