
Academic Editor: George Drosatos

Received: 28 August 2025

Revised: 27 September 2025

Accepted: 7 October 2025

Published: 9 October 2025

Citation: Ertam, F. Near Real-Time Ethereum Fraud Detection Using Explainable AI in Blockchain Networks. Appl. Sci. 2025, 15, 10841. https://doi.org/10.3390/app151910841

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Article

Near Real-Time Ethereum Fraud Detection Using Explainable AI in Blockchain Networks

Fatih Ertam

Department of Digital Forensics Engineering, Technology Faculty, Firat University, 23200 Elazığ, Türkiye; fatih.ertam@firat.edu.tr

Abstract

Blockchain technologies have profoundly transformed information systems by providing decentralized infrastructures that enhance transparency, security, and traceability. Ethereum, in particular, supports smart contracts and facilitates the development of decentralized finance (DeFi), non-fungible tokens (NFTs), and Web3 applications. However, its openness also enables illicit activities, including fraud and money laundering, through anonymous wallets. Identifying wallets involved in large transfers or abnormal transactional patterns is therefore critical to ecosystem security. This study proposes an AI-based framework employing XGBoost, LightGBM, and CatBoost to detect suspicious Ethereum wallets, achieving test accuracies between 95.83% and 96.46%. The system provides near real-time predictions for individual or recent wallet addresses using a pre-trained XGBoost model. To improve interpretability, SHAP (SHapley Additive exPlanations) visualizations are integrated, highlighting the contribution of each feature. The results demonstrate the effectiveness of AI-driven methods in monitoring and securing Ethereum transactions against fraudulent activities.

Keywords: blockchain; cryptocurrency forensics; Ethereum; explainable AI; fraud detection

1. Introduction

The advent of blockchain technology has precipitated a paradigm shift within numerous industries, including finance, supply chain management, healthcare, and digital identity systems [ ]. This technology introduces a decentralised, transparent, and tamper-resistant framework for the recording and verification of transactions [ ]. In contrast to conventional centralized systems, in which a sole authority is responsible for the maintenance and validation of records, blockchain relies on a distributed ledger that is collectively maintained by a network of nodes. This decentralised consensus mechanism has been demonstrated to mitigate the risk of single points of failure, whilst simultaneously enhancing data integrity and auditability across untrusted environments [ ]. Among the diverse blockchain platforms that have been developed, Ethereum distinguishes itself as a second-generation blockchain that extends beyond simple value transfer by offering a Turing-complete programming environment for deploying smart contracts, that is, self-executing agreements encoded directly onto the blockchain [ ]. Smart contracts facilitate the creation of decentralised applications (dApps), which function without intermediaries and execute complex logic in a deterministic and trustless manner. Consequently, Ethereum has emerged as the foundational infrastructure for a diverse range of decentralised finance (DeFi) protocols, non-fungible token (NFT) ecosystems, and autonomous governance models. This positions it as a pivotal element in shaping the future of internet-based services [ ].

By introducing the concept of a programmable blockchain, Ethereum substantially expanded the functional scope of distributed ledger technology, enabling developers to construct dApps that exceed the limitations of basic peer-to-peer financial transactions [ ]. The integration of a Turing-complete virtual machine, designated as the Ethereum Virtual Machine (EVM), enables the deployment of sophisticated smart contracts capable of executing conditional logic, managing digital assets, and automating multi-step workflows in a trustless and transparent environment [ ]. This paradigm shift has laid the foundation for novel application domains such as DeFi, tokenised assets, supply chain automation, decentralised autonomous organisations (DAOs), and identity management systems [ ]. These domains leverage Ethereum’s programmable infrastructure to eliminate intermediaries, reduce operational costs, and enhance system resilience [ ].

The accelerated growth in the adoption and market capitalisation of Ethereum has not only attracted legitimate innovation but also drawn the attention of malicious actors seeking to exploit the platform for financial gain. As Ethereum continues to serve as the foundational infrastructure for a vast array of decentralised applications, it has become increasingly susceptible to diverse forms of fraud and abuse. Fraudulent activities in the Ethereum ecosystem include phishing attacks, Ponzi schemes, transaction manipulation, and the deployment of counterfeit dApps. Across the blockchain ecosystem, phishing has been identified as the most prevalent and damaging form of attack, accounting for approximately 50% of all malicious incidents [ ]. These fraudulent schemes typically employ deceptive tactics, such as the creation of fake websites, emails, or messaging platforms, with the aim of deceiving users into disclosing private credentials, including seed phrases or private keys. This ultimately results in unauthorised access to their digital wallets and the theft of crypto-assets.

Ethereum, a prominent cryptocurrency platform, was originally developed by Vitalik Buterin. It functions as a decentralised digital asset transfer system, allowing individuals to send cryptocurrency to others for a minimal transaction fee. Irrespective of geographical location or background, Ethereum ensures secure, consistent, and cost-effective participation in digital transactions on a global scale. Its decentralised architecture and the anonymity it affords have made it a highly effective medium for cryptocurrency transactions. This has, in turn, rendered it an attractive tool for criminal networks seeking to conduct money laundering and other illicit financial activities [ ].

The Ethereum ecosystem has recently experienced a notable surge in fraudulent activities, driven by the increasing sophistication of cybercriminal tactics and the integration of advanced technologies. As stated in the 2025 Crypto Crime Report by Chainalysis, the estimated value of illicit cryptocurrency transactions in 2024 was USD 40.9 billion. This figure is predicted to exceed USD 51 billion as more illicit addresses are identified [ ]. Furthermore, the Ethereum network has been subject to advanced phishing techniques, including payload-based transaction phishing (PTXPhish). This method manipulates smart contract interactions through malicious payloads, with the objective of deceiving users. A comprehensive investigation has revealed more than 130,000 PTXPhish transactions on the Ethereum blockchain, resulting in financial losses in excess of USD 341.9 million [ ]. These developments highlight the pressing need for effective detection and mitigation strategies within the Ethereum ecosystem. Advanced security measures, user education, and continuous monitoring are of critical importance in protecting users and maintaining trust in decentralised platforms.


The primary contributions of this study are outlined as follows:

• An artificial intelligence model was developed using labeled data from publicly available blockchain datasets. This model extracts behavioral and transactional features of individual wallet addresses in near real time and subsequently classifies them as either suspicious or benign based on learned patterns of fraudulent activity.

• A near real-time monitoring framework was implemented for the identification and analysis of recently active wallet addresses. The system is designed to ingest on-chain transaction data continuously and to detect newly active wallets. Using a trained model, it then evaluates the likelihood of these wallets being involved in illicit activities.

• In order to enhance the interpretability of the model and thus support trust in automated decision-making processes, explainable artificial intelligence (XAI) techniques were incorporated. These techniques attribute model predictions to specific features or behaviors, thereby providing transparency into the rationale behind the classification of a wallet as suspicious.

The remainder of this paper is organized as follows. Section 2 reviews recent studies on Ethereum-based fraud detection, highlighting methodological advances and existing research gaps. Section 3 describes the materials and methods employed in this study, including dataset construction, feature engineering, and model development procedures. Section 4 presents and discusses the experimental results, emphasizing model performance, feature relevance, and comparative analyses. Section 5 outlines the main limitations of the current study, providing context for result interpretation. Section 6 discusses potential directions for extending this research, such as deployment across multiple blockchain networks and integration with blockchain monitoring systems. Finally, Section 7 concludes the paper by summarizing the key findings and their implications for future blockchain security research.

2. Related Works

Numerous studies have been conducted to detect fraudulent activities within the Ethereum network, employing a variety of machine learning algorithms and classification methods. A selection of significant contributions is outlined below.

2.1. Classical ML Approaches

Aziz et al. [ ] investigated Ethereum fraud detection using various machine learning techniques, including RF, MLP, and ensemble methods, on a dataset with limited attributes. LGBM outperformed other models, achieving 98.60% accuracy, which improved to 99.03% after hyperparameter tuning. Results were also compared with other boosting algorithms such as XGBoost.

Steven et al. [ ] focused on identifying malicious accounts involved in Ethereum transactions. They utilized the XGBoost algorithm and evaluated its performance using tenfold cross-validation. The model achieved a classification accuracy of 96.3%, and the study highlighted the three most influential features contributing to the model’s decision-making process.

Ravindranath et al. [ ] evaluated ensemble learning models for detecting fraud in the Ethereum network. CatBoost and LightGBM showed strong performance, achieving 97–98.42% accuracy with oversampling. High F1 and AUC scores indicated reliable detection without overfitting. Among the tested resampling methods, K-Means SMOTE yielded the best results, with 98.42% accuracy and a 99.82% AUC. These findings highlight the effectiveness of ensemble models and advanced resampling in crypto fraud detection.

Dahiya et al. [ ] proposed a neural network-based model for the detection of fraudulent transactions on the Ethereum blockchain. The performance of the model was benchmarked against several traditional machine learning classifiers, including Logistic Regression, Support Vector Machine (SVM), Gaussian Naive Bayes, and K-Nearest Neighbours. Among all models that were evaluated, the neural network demonstrated the highest accuracy, achieving 97.09%. This result indicates that the neural network possesses a superior capacity to capture and learn complex data patterns. The findings emphasise the efficacy of neural networks in differentiating between authentic and fraudulent Ethereum transactions.

2.2. Self-Supervised and Deep Learning Methods

Teng et al. [ ] proposed a novel method for identifying anomalous smart contracts on the Ethereum platform. Their approach involves extracting transaction patterns through a data slicing technique, followed by training a detection model using LSTM networks. The results demonstrated high precision in distinguishing anomalous contracts from legitimate ones.

Ehsan et al. [ ] aimed to identify malicious actors and categorize attacks based on behavior. They built a dataset from illicit Ethereum activities and applied feature selection methods such as PCA, Information Gain, and Ridge Regression. Classification using LightGBM, XGBoost, and others showed that models with Information Gain and LGBM/XGBoost reached 98% accuracy. XGBoost also completed analysis in 13.72 s. Additionally, the study improved blockchain security by categorizing fraud types, enhancing network reliability.

Liu et al. [ ] introduced S_HGTNs, a framework for detecting anomalies in Ethereum smart contracts, focusing on financial fraud. It builds a Heterogeneous Information Network (HIN) from contract features, learns a relational matrix via a transformer, and classifies using node embeddings. Experiments show that the model outperforms traditional methods with higher accuracy and low variance, confirming its robustness and effectiveness.

2.3. Graph-Based Techniques

Tan et al. [ ] proposed a fraud detection method on Ethereum by analyzing transaction records and using web crawlers to obtain labelled fraudulent addresses. These were used to reconstruct a transaction network, from which features were extracted via an amount-based network embedding. A Graph Convolutional Network (GCN) then classified addresses as legitimate or fraudulent. The system achieved 95% accuracy, demonstrating strong performance in identifying fraud.

Jin et al. [ ] introduced Meta-IFD (Meta-Interaction-based Fraud Detection), an Ethereum fraud detection framework based on meta-interaction concepts. It combines generative and contrastive self-supervision to refine behavioral features and distinguish activity types. Using multi-view feature learning, Meta-IFD captures rich behavioral representations to detect fraud such as Ponzi schemes and phishing. Evaluations on real Ethereum data show its robustness and high accuracy, with the generative module addressing class imbalance and the contrastive module improving profile discrimination.

Tan et al. [ ] proposed a framework for detecting fraudulent Ethereum transactions through analysis of transaction records. Labelled addresses were collected using web crawlers and used to build a transaction network from the public ledger. A network embedding method was employed to extract node features, which were then classified by a Graph Convolutional Network (GCN). The system achieved 96% accuracy, demonstrating its effectiveness in fraud detection on the Ethereum blockchain.

Given the rapid growth of blockchain technology and cryptocurrencies, phishing scams have emerged as a significant threat to transaction security. Existing detection methods frequently fail to capture critical neighbor information and its impact on fraudulent behaviors. To address these limitations, a phishing detection framework based on FAAN-GBM (Feature and Attention Augmented Network with Gradient Boosting Machine) has been proposed. This framework integrates basic, transaction, and interaction features of nodes while leveraging attention mechanisms and autoencoders to enhance feature representation. An experimental evaluation on real Ethereum datasets has demonstrated that the FAAN-GBM model outperforms existing approaches, significantly enhancing the accuracy of phishing fraud node detection [26].

The proliferation of smart contracts within the blockchain ecosystem has heightened the need for effective phishing detection mechanisms. Existing methods frequently fail to capture both global structural patterns in transaction networks and local semantic relationships in transaction data. This limitation restricts their capacity to detect complex phishing behaviors. To address these challenges, a dynamic feature fusion model has been proposed, combining graph-based representation learning with semantic feature extraction. The model constructs global graph representations of account relationships and extracts local contextual features from transactions. These features are then integrated via a dynamic multimodal fusion mechanism. An experimental evaluation on large-scale real-world blockchain datasets has demonstrated that this approach outperforms existing benchmarks in terms of accuracy, F1 score, and recall. This finding underscores the importance of jointly modeling structural and semantic information for effective phishing detection [27].

LMAE4Eth is a multi-view learning framework designed to improve Ethereum fraud account detection by integrating transaction semantics, masked graph embeddings, and expert knowledge. It utilises a transaction token comparative language model (TxCLM) to convert numerical transactions into semantically meaningful representations and a masked account graph autoencoder (MAGAE) focused on reconstructing account node features for advanced node-level detection. Scalability is achieved through layer-wise sampling, and features designed by experts are incorporated to improve model performance. Experimental results demonstrate that LMAE4Eth outperforms 15 baseline methods, achieving over 10% improvement in F1 score across two datasets and proving its effectiveness in detecting fraudulent accounts [ ]. However, these approaches require extensive sequence pre-processing and lack the real-time deployment capabilities demonstrated in our work.

2.4. Hybrid Systems

Li et al. [ ] addressed phishing detection on Ethereum as a graph classification task and proposed PDGNN (Phishing Detection Graph Neural Network), an end-to-end framework. It constructs a lightweight transaction network and extracts subgraphs linked to known phishing accounts. Using a Chebyshev-GCN, the model classifies accounts as phishing or legitimate. Experiments on five datasets show that PDGNN outperforms traditional methods and scales well to large networks.

Pahuja et al. [ ] proposed a fraud detection approach based on the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework for Ethereum transactions. Their method tackled data imbalance using resampling, applied correlation-based feature selection, and used ensemble learning to enhance accuracy. A comparison of ten classifiers showed ensemble models outperformed single ones, with LightGBM achieving the highest accuracy at 99.2%, surpassing other approaches on the same dataset.

2.5. Background on Ethereum and Fraud Typologies

Ethereum is a decentralised blockchain system that facilitates programmable transactions through smart contracts. On the Ethereum blockchain, wallet addresses can be classified as either externally owned or contract-based. Common fraudulent behaviors include phishing, contract abuse, and address laundering, often observable via abnormal transaction frequency, unusually high or low gas usage, or multiple interactions with known blacklisted addresses. An understanding of these behaviors informed the subsequent feature engineering process, which is elaborated in the following section.

3. Materials and Method

Figure 1 presents a graphical overview of the proposed method.

[Figure 1: flowchart of the proposed pipeline. Its five phases and the boxes within each phase are listed below.]

Phase 1: Data Collection & API Integration. Ethereum Network (Mainnet via Infura/MetaMask API); Block Range Selection (last 1000 blocks or 10 most recent active addresses); Transaction Extraction (real-time blockchain data, Web3 API calls).

Phase 2: Feature Engineering & Processing. Raw Transaction Data (block number, hash, from, to, value, gas, timestamp); 17 Behavioral Features (Sent_tnx, Received_tnx, unique addresses (from/to), value statistics (min/max/avg), time differences, contract creations, total Ether balance); Data Preprocessing (Min-Max normalization, missing value handling, outlier detection).

Phase 3: Model Training & Optimization. Training Dataset (9841 labeled samples, 80% training, 20% testing); Gradient Boosting Models (XGBoost, LightGBM, CatBoost); Hyperparameter Tuning (Grid Search CV, 5-fold cross-validation, F1-score optimization); Best Model Selection.

Phase 4: Model Explainability (XAI). SHAP Integration (TreeExplainer for XGBoost, feature contribution analysis); Local Explanations (individual prediction, waterfall plots, feature attributions); Global Explanations (feature importance, summary plots, dependence analysis).

Phase 5: Near Real-Time Deployment. Input Address (single address or batch of addresses); Feature Extraction (< 50 ms processing time); Model Prediction (< 10 ms inference time, probability scores); Final Output (classification: Normal/Suspicious, confidence score, SHAP explanations, feature contributions).

Figure 1. Proposed method.

The pipeline commences with Phase 1, in which blockchain data is retrieved from the Ethereum mainnet via Web3 APIs. This phase extracts transactions from the preceding 1000 blocks or for the 10 most recent active addresses. Phase 2 performs feature engineering, transforming raw transaction data into a set of 17 behavioral features. These features encompass metrics such as transaction counts, value statistics, and temporal patterns. Data preprocessing is then carried out using Min–Max normalization. Phase 3 trains models using three gradient boosting algorithms (XGBoost, LightGBM, CatBoost) with comprehensive hyperparameter tuning through 5-fold cross-validation. Phase 4 integrates SHAP for model interpretability, providing both local explanations for individual predictions and global feature importance analysis. Phase 5 demonstrates the deployment pipeline, achieving sub-50 millisecond feature extraction and sub-10 millisecond inference time, and outputting classification results with confidence scores and SHAP-based explanations.
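To make the idea of SHAP-style feature contributions in Phase 4 concrete, the following is a minimal sketch of exact Shapley values computed by brute force. It is not the SHAP library and not the TreeExplainer used in the study; the function `shapley_values`, the scoring function `toy_score`, and the three-feature input are hypothetical and chosen only for illustration. Features outside a coalition are replaced by a baseline value, which is one common convention and an assumption here.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features absent from a coalition take their baseline value.
    The cost grows as 2^n, so this is only usable for tiny examples.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_score(v):
    """Hypothetical scoring function: one additive term, one interaction."""
    return 0.5 * v[0] + 0.2 * v[1] * v[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The contributions sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that waterfall plots visualise. Tree-specific explainers exploit model structure to avoid the exponential enumeration used in this sketch.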


3.1. Feature Selection and Reference Dataset

In this study, the MetaMask API was employed to establish a secure connection to the Ethereum network. The most recent 1000 blocks were analysed programmatically through the API, enabling the extraction of 17 distinct features associated with a given wallet address. These features capture various behavioural and transactional characteristics of the wallet, including but not limited to transaction frequency, token interaction patterns, and gas usage metrics. Table 1 presents the extracted features and their definitions.

Table 1. Ethereum wallet transaction features.

Feature Name                  Description
Address                       Ethereum wallet address.
Sent_tnx                      Total number of standard (non-contract) transactions sent from the address.
Received_tnx                  Total number of standard (non-contract) transactions received by the address.
NumberofCreated_Contracts     Number of smart contract creation transactions initiated by the account.
UniqueReceivedFrom_Addresses  Count of distinct sender addresses that sent Ether to this account.
UniqueSentTo_Addresses        Count of distinct recipient addresses this account has sent Ether to.
MinValueReceived              The smallest single Ether amount received in a transaction.
MaxValueReceived              The largest single Ether amount received in a transaction.
AvgValueReceived              Average Ether value received across all incoming transactions.
MinValSent                    The smallest single Ether amount sent in a transaction.
MaxValSent                    The largest single Ether amount sent in a transaction.
AvgValSent                    Average Ether value sent across all outgoing transactions.
TotalEtherSent                Cumulative Ether sent from this address across all transactions.
TotalEtherReceived            Cumulative Ether received by this address across all transactions.
TotalEtherBalance             Net Ether balance after all incoming and outgoing transactions.
TotalTransactions             Total count of transactions including normal and contract creation ones.
TimeDiffBetweenFirstandLast   Time duration in minutes between the first and the most recent transaction.
AvgMinBetweenSentTnx          Average time in minutes between two consecutive sent transactions.
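As an illustration of how several of the Table 1 features could be derived from raw transaction records, the sketch below computes a subset of them for one address. The record layout (dicts with `from`, `to`, `value` in Ether, and `timestamp` in Unix seconds, with `to` set to `None` for contract creation) and the function name `wallet_features` are assumptions made for this example; the study's own extraction code may differ, for instance in how contract-creation values or gas costs enter the balance.

```python
from statistics import mean


def wallet_features(address, txs):
    """Compute a subset of the Table 1 features for one address.

    Assumed record layout: {'from', 'to', 'value' (Ether),
    'timestamp' (Unix seconds)}; 'to' is None for contract creation.
    """
    addr = address.lower()
    outgoing = [t for t in txs if t["from"].lower() == addr]
    sent = [t for t in outgoing if t["to"] is not None]
    created = [t for t in outgoing if t["to"] is None]
    received = [t for t in txs
                if t["to"] is not None and t["to"].lower() == addr]

    sent_vals = [t["value"] for t in sent]
    recv_vals = [t["value"] for t in received]
    sent_times = sorted(t["timestamp"] for t in sent)
    all_times = sorted(t["timestamp"] for t in outgoing + received)
    # Gaps between consecutive sent transactions, in minutes.
    gaps = [(b - a) / 60 for a, b in zip(sent_times, sent_times[1:])]

    return {
        "Sent_tnx": len(sent),
        "Received_tnx": len(received),
        "NumberofCreated_Contracts": len(created),
        "UniqueSentTo_Addresses": len({t["to"].lower() for t in sent}),
        "UniqueReceivedFrom_Addresses": len({t["from"].lower() for t in received}),
        "MaxValSent": max(sent_vals, default=0.0),
        "AvgValSent": mean(sent_vals) if sent_vals else 0.0,
        "TotalEtherSent": sum(sent_vals),
        "TotalEtherReceived": sum(recv_vals),
        # Assumption: balance is received minus sent, ignoring gas fees.
        "TotalEtherBalance": sum(recv_vals) - sum(sent_vals),
        "TimeDiffBetweenFirstandLast":
            (all_times[-1] - all_times[0]) / 60 if all_times else 0.0,
        "AvgMinBetweenSentTnx": mean(gaps) if gaps else 0.0,
    }


# Made-up transactions for a placeholder address "0xA".
txs = [
    {"from": "0xA", "to": "0xB", "value": 1.0, "timestamp": 0},
    {"from": "0xA", "to": "0xC", "value": 3.0, "timestamp": 600},
    {"from": "0xB", "to": "0xA", "value": 0.5, "timestamp": 1200},
    {"from": "0xA", "to": None, "value": 0.0, "timestamp": 1800},
]
features = wallet_features("0xa", txs)
print(features)
```

Because every feature is a simple aggregate over the address's own transactions, the computation is linear in the number of records, which is consistent with features that must be extractable in near real time.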

A key challenge in this research is the limited availability of publicly accessible labelled datasets that classify Ethereum addresses as either suspicious or normal. To address this issue, the dataset employed in the study by Aziz et al. [ ] served as the primary dataset. This dataset contains 9841 entries corresponding to transactions on the Ethereum network. Each entry is labelled to indicate whether the behavior is normal (label 0) or suspicious (label 1). Specifically, 7662 records are marked as normal, while the remaining 2179 are identified as suspicious. The original dataset encompasses 49 extracted features pertaining to transactional behavior, account activity, and smart contract interactions. For the purposes of this research, a subset of 17 features was selected from the original 49. These features were determined to be both relevant and technically extractable in real time. This refined feature set was then used to construct a new dataset tailored to the requirements of the proposed detection system. The finalized dataset for this study has been made publicly available via a GitHub repository [31].


3.2. Performance Metrics

In order to evaluate the classification performance on the dataset constructed for this study, several widely accepted performance metrics were employed, including Accuracy, Precision, Recall, and the F1-Score [ ]. Together, these metrics provide a comprehensive understanding of the model’s effectiveness in correctly identifying both suspicious and benign wallet behaviors. The mathematical formulations corresponding to each metric are presented in Equations (1)–(4), where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, with suspicious wallets as the positive class. Accuracy measures overall correctness, calculated as the ratio of instances classified correctly to the total number of instances. Precision is defined as the proportion of correctly predicted suspicious wallets among all wallets predicted as suspicious, thereby reflecting the model’s ability to avoid false positives. Recall, also known as sensitivity, is defined as the proportion of actual suspicious wallets that were correctly identified by the model. This metric highlights the model’s capacity to minimize false negatives. The F1-Score is the harmonic mean of precision and recall, offering a balanced metric that is particularly useful when dealing with imbalanced datasets. Used together, these metrics give a robust evaluation of the model’s classification capabilities.

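$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$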
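The four metrics in Equations (1)–(4) can be computed directly from label vectors, as in the following minimal sketch. The function name `classification_metrics` and the example labels are illustrative only; label 1 is treated as the suspicious (positive) class, matching the dataset's labelling convention.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 3 suspicious and 5 normal wallets.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case accuracy (0.75) is higher than precision, recall, and F1 (each 2/3), which shows why accuracy alone can look favourable on data where the normal class dominates.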

3.3. Classification

In this study, several ensemble-based boosting algorithms were employed for the purpose of classification, including LightGBM (Light Gradient Boosting Machine), XGBoost (Extreme Gradient Boosting), and CatBoost [ ]. The selection of these gradient boosting frameworks was made on the basis of their proven efficiency, scalability, and high predictive performance, particularly in the context of structured tabular data. Each of these algorithms employs decision tree ensembles with optimized boosting strategies, thereby enabling the model to capture complex patterns within the feature space and effectively distinguish between suspicious and normal wallet behaviors. XGBoost is an advanced implementation of gradient boosting machines that incorporates system optimization and algorithmic enhancements to improve efficiency, scalability, and model performance [34].

To optimize the objective function, XGBoost applies a second-order Taylor expansion to approximate the loss at iteration $t$. The approximated loss is given in Equation (5).

$$L^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) \tag{5}$$

where $f_t(x_i)$ is the prediction from the newly added function (typically a regression tree) at iteration $t$, and $\Omega(f_t)$ denotes the regularization term given in Equation (6) that controls the complexity of the model.

$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2 \tag{6}$$

Here $T$ is the number of leaves in the tree and $w_j$ is the weight of leaf $j$. The terms $g_i$ and $h_i$ represent the first- and second-order derivatives of the loss function with respect to the prediction from the previous iteration $\hat{y}^{(t-1)}$, and are defined as follows:

$$g_i = \frac{\partial l(y_i, \hat{y}^{(t-1)})}{\partial \hat{y}^{(t-1)}}, \qquad h_i = \frac{\partial^2 l(y_i, \hat{y}^{(t-1)})}{\partial \left(\hat{y}^{(t-1)}\right)^2} \tag{7}$$

Here, $l(y_i, \hat{y}^{(t-1)})$ denotes the loss function comparing the true label $y_i$ and the predicted value $\hat{y}^{(t-1)}$. The gradient $g_i$ gives the first-order direction of change in the loss (its negative is the direction of steepest descent), while the Hessian $h_i$ provides curvature information, allowing the algorithm to perform more accurate and stable updates during optimization.
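As a worked illustration of the gradient and Hessian terms, the sketch below evaluates them for the logistic (log) loss and uses them to compute the weight of a single leaf. The choice of log loss is an assumption, since the loss function is not specified above. The closed form $w^* = -G / (H + \lambda)$, with $G$ and $H$ the sums of gradients and Hessians in the leaf, is the standard result of minimising the second-order objective with an L2 leaf penalty; the function names are hypothetical, and `reg_lambda=1.5` simply mirrors the value reported for XGBoost in Table 2.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_hess_logloss(y, raw_score):
    """First and second derivatives of log loss w.r.t. the raw score."""
    p = sigmoid(raw_score)
    return p - y, p * (1.0 - p)


def optimal_leaf_weight(labels, raw_scores, reg_lambda=1.5):
    """Leaf weight minimising the second-order objective: -G / (H + lambda)."""
    g_sum = 0.0
    h_sum = 0.0
    for y, z in zip(labels, raw_scores):
        g, h = grad_hess_logloss(y, z)
        g_sum += g
        h_sum += h
    return -g_sum / (h_sum + reg_lambda)


# Four samples falling in one leaf, all with a previous raw score of 0.
labels = [1, 1, 0, 1]
raw_scores = [0.0, 0.0, 0.0, 0.0]
w = optimal_leaf_weight(labels, raw_scores)
print(w)
```

With a raw score of 0 every sample has predicted probability 0.5, so $G = -1.0$ and $H = 1.0$, giving a leaf weight of 0.4. A larger `reg_lambda` shrinks this weight towards zero, which is how the regularization term limits the influence of any single tree.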

LightGBM is a gradient boosting framework based on decision tree algorithms, designed to be distributed and efficient. It introduces techniques such as Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) to reduce computation and memory usage, making it suitable for large-scale and high-dimensional data [ ]. It also uses histogram-based algorithms and grows trees leaf-wise, optimizing computational efficiency. The loss function is given in Equation (8).

$$L^{(t)} = \sum_{i \in A} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] \tag{8}$$

where $A \subset \{1, \ldots, n\}$ is selected using GOSS. The regularized objective can be defined as given in Equation (9).

$$\sum_{i=1}^{n} l(y_i, \hat{y}_i) + \lambda \lVert f \rVert^2 \tag{9}$$
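The selection of the subset $A$ by GOSS can be sketched as follows: keep the samples with the largest absolute gradients, draw a random sample from the remainder, and up-weight the sampled low-gradient instances so that their total contribution is roughly preserved. This is a simplified illustration of the published GOSS idea, not LightGBM's internal code; the function name `goss_sample`, the rates, and the gradient values are made up for the example.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) of the subset A chosen by a GOSS-style rule."""
    n = len(gradients)
    # Rank samples by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    top = order[:n_top]
    rest = order[n_top:]
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))

    # Low-gradient samples are amplified by (1 - a) / b to compensate
    # for having been subsampled.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights


gradients = [0.9, -0.8, 0.1, 0.05, -0.02, 0.3, 0.01, -0.6, 0.04, 0.2]
indices, weights = goss_sample(gradients)
print(indices, weights)
```

Here the two largest-gradient samples (indices 0 and 1) are always retained, and one of the remaining eight is drawn at random with weight 8. The sum in Equation (8) then runs over only three of the ten samples.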

CatBoost is a gradient boosting algorithm speciﬁcally designed to handle categorical

features efﬁciently. It employs techniques such as ordered boosting and target statistics to

reduce overﬁtting and eliminate prediction shift, which commonly arise in the processing

of categorical variables [36].

At boosting iteration

, the prediction is updated as follows, as given in Equation

(10)

y(t)

i=ˆ

y(t−1)

i+ηft(xi)(10)

where

denotes the learning rate, and

is the decision function (typically a decision tree)

added at iteration t.

To prevent target leakage and ensure unbiased gradient estimation, CatBoost intro-

duces the ordered gradient, deﬁned in Equation (11).

g(t)

i=∂ℓ(yi,ˆ

y(t−1)

∂ˆ

y(t−1)



without (xi,yi)

(11)

where the gradient for sample

is calculated excluding the sample itself from the statistics,

thus avoiding prediction shift.

The loss function minimized during training is expressed as shown in Equation (12).

$$L^{(t)} = \sum_{i=1}^{n} \ell\big(y_i, \hat{y}_i^{(t)}\big) \tag{12}$$

where $\ell$ is the chosen loss function (e.g., log loss or squared error).


For the transformation of categorical features, CatBoost computes a smoothed target

statistic as presented in Equation (13).

$$\mathrm{TS}_j = \frac{\sum_{i \in B(x_{ij})} y_i + a \cdot p}{\lvert B(x_{ij}) \rvert + a} \tag{13}$$

where:

• $B(x_{ij})$ is the set of prior samples with the same categorical value as $x_{ij}$,

• $p$ is the prior mean of the target,

• $a$ is a regularization (smoothing) parameter.
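The smoothed target statistic of Equation (13) can be sketched as follows. For simplicity the sketch treats the given row order as the permutation that defines which samples are "prior"; the function name, the toy data, and the choice $a = 1$ are assumptions made for illustration only.

```python
def ordered_target_stats(categories, targets, prior, a=1.0):
    """Encode each row's category using only the rows that precede it."""
    sums, counts = {}, {}
    encoded = []
    for cat, y in zip(categories, targets):
        prev_sum = sums.get(cat, 0.0)      # sum of y_i over B(x_ij)
        prev_count = counts.get(cat, 0)    # |B(x_ij)|
        encoded.append((prev_sum + a * prior) / (prev_count + a))
        # Only now is the current row added, so it never sees its own label.
        sums[cat] = prev_sum + y
        counts[cat] = prev_count + 1
    return encoded


cats = ["A", "B", "A", "A", "B"]
ys = [1, 0, 1, 0, 1]
prior_mean = sum(ys) / len(ys)             # p = 0.6
ts = ordered_target_stats(cats, ys, prior_mean, a=1.0)
print([round(v, 4) for v in ts])           # [0.6, 0.6, 0.8, 0.8667, 0.3]
```

The first occurrence of each category falls back to the prior mean, and no row's encoding depends on its own label, which is how the ordered statistic avoids target leakage.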

This approach enables CatBoost to achieve state-of-the-art performance, particularly

on datasets with high-cardinality categorical variables.

For the purposes of this study, the dataset was divided into a training set and a testing set using an 80/20 split ratio, where 80% of the data was used for training the model and the remaining 20% was reserved for performance evaluation. The classification results obtained from the employed boosting algorithms were compared based on the metrics defined earlier. Table 2 presents a summary of the performance comparison across different classifiers.

Table 2 shows that the boosting algorithms generate comparable outcomes, with mean cross-validation accuracies ranging from 95.83% to 96.46% and test accuracies ranging from 95.84% to 96.34%. The XGBoost-based model was selected for use in this study, and all code was written in Python 3.13.

Table 2. Performance comparison of different classifiers.

Metric                 XGBoost                 LightGBM                CatBoost
Best Hyperparameters   colsample_bytree: 1.0   bagging_fraction: 0.8   depth: 7
                       learning_rate: 0.2      feature_fraction: 0.9   iterations: 300
                       max_depth: 5            learning_rate: 0.2      l2_leaf_reg: 3
                       n_estimators: 300       max_depth: 5            learning_rate: 0.2
                       reg_alpha: 0            n_estimators: 200
                       reg_lambda: 1.5         num_leaves: 31
                       subsample: 1.0
CV Accuracy Mean       0.9588                  0.9646                  0.9583
CV Accuracy Std        0.0090                  0.0071                  0.0066
CV F1 Mean             0.9585                  0.9643                  0.9580
CV F1 Std              0.0090                  0.0071                  0.0066
Test Accuracy          0.9589                  0.9634                  0.9584
Test Precision         0.9586                  0.9633                  0.9582
Test Recall            0.9589                  0.9634                  0.9584
Test F1                0.9581                  0.9628                  0.9575
Test ROC AUC           0.9882                  0.9898                  0.9880
Training Time (s)      833.96                  471.86                  220.65
Model Size (KB)        548.79                  516.15                  639.19
Latency (ms)           0.72                    0.72                    0.72

Table 3 presents the near-real-time performance metrics and requirements.

The performance metrics in the table demonstrate the model’s efficiency. The mean processing time for a single instance was 0.72 milliseconds (ms), well below the specified requirement of 100 ms. The 95th and 99th percentile response times were 0.76 ms and 1.03 ms, respectively, showing that latency remains low even in the slowest cases. The batch throughput was 12,021 samples per second, well above the predefined requirement of 50 samples per second, and the per-sample time in batch mode was 0.08 ms, which supports the system’s suitability for near-real-time applications.

An evaluation of resource usage showed a memory consumption of 410 MB and a model size of 548.8 KB, both below the specified limits of 1000 MB and 1000 KB. These findings indicate that the model is lightweight and portable. Overall, every requirement in Table 3 is met by a wide margin, in terms of both processing speed and resource utilization.

Table 3. Near-real-time performance metrics and requirements.

Metric                     Value                  Requirement
Average Processing Time    0.72 ms                <100 ms
P95 Response Time          0.76 ms                <200 ms
P99 Response Time          1.03 ms                <500 ms
Throughput (10 samples)    12,021.0 samples/s     >50 samples/s
Time per Sample (batch)    0.08 ms                <20 ms
Memory Usage               410.0 MB               <1000 MB
Model Size                 548.8 KB               <1000 KB
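The latency rows of Table 3 can be reproduced in spirit with a small timing harness such as the one below. This is a sketch under stated assumptions: `dummy_predict` is a hypothetical stand-in for the trained classifier, the nearest-rank percentile definition is one of several in common use, and the thresholds are copied from the Requirement column.

```python
import time

# Latency requirements from Table 3, in milliseconds.
REQUIREMENTS_MS = {"mean": 100.0, "p95": 200.0, "p99": 500.0}


def percentile(sorted_values, q):
    """Nearest-rank percentile of a sorted list; q is an integer in 1..100."""
    rank = max(1, (q * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def latency_report(predict, samples, repeats=1000):
    """Time single-sample calls to `predict` and summarise them in ms."""
    timings = []
    for i in range(repeats):
        x = samples[i % len(samples)]
        start = time.perf_counter()
        predict(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    return {
        "mean": sum(timings) / len(timings),
        "p95": percentile(timings, 95),
        "p99": percentile(timings, 99),
    }


def dummy_predict(features):
    """Hypothetical placeholder for model inference on one feature vector."""
    return int(sum(features) > 1.0)


report = latency_report(dummy_predict, [[0.1, 0.2], [0.9, 0.8]])
meets = {name: report[name] < limit for name, limit in REQUIREMENTS_MS.items()}
print(report, meets)
```

Reporting P95 and P99 alongside the mean matters because a near-real-time monitor is constrained by its slowest responses, not by its average.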

3.4. Hyperparameter Optimization

A comprehensive grid search was performed across the three gradient boosting algorithms. The optimal XGBoost configuration, obtained through 5-fold stratified cross-validation (2187 candidates, totalling 10,935 fits), was:

• n_estimators: 300

• max_depth: 5

• learning_rate: 0.2

• subsample: 1.0

• colsample_bytree: 1.0

This systematic approach supports reproducibility, and the use of cross-validation helps address potential overfitting concerns.
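The candidate and fit counts reported above are consistent with a grid of three values for each of the seven XGBoost hyperparameters listed in Table 2, since $3^7 = 2187$ and $2187 \times 5 = 10{,}935$. The sketch below enumerates such a grid. The specific grid values are assumptions chosen for illustration (only the selected values come from Table 2); the actual search space is not stated in the text.

```python
from itertools import product

# Hypothetical search space: three values per hyperparameter.
# Only the selected values (e.g. n_estimators=300) are taken from Table 2.
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.05, 0.1, 0.2],
    "subsample": [0.8, 0.9, 1.0],
    "colsample_bytree": [0.8, 0.9, 1.0],
    "reg_alpha": [0, 0.1, 0.5],
    "reg_lambda": [1.0, 1.5, 2.0],
}

# Every combination of values is one candidate configuration.
candidates = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]
n_folds = 5
total_fits = len(candidates) * n_folds

print(len(candidates), total_fits)  # 2187 10935
```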

3.5. Ablation Study

In this study, ablation results were obtained by removing each feature separately and measuring the resulting performance. The results are given in Table 4, and the five most influential features are discussed below.

The findings suggest that the most critical features affecting the model’s prediction performance are ‘Time Diff between first and last (Mins)’ and ‘TotalTransactions’. Removing either of these features reduces model accuracy by approximately 1.06% in relative terms, a considerably larger impact than that of any other feature. Removing MinValueReceived, TotalEtherReceived, or MaxValueReceived results in relative accuracy losses of only 0.26%, 0.26%, and 0.21%, respectively. These findings indicate that behavioral/temporal characteristics, including transaction timing and frequency, are more informative discriminators for the model than financial value-based features.


Table 4. Ablation study.

                                          Accuracy                                      F1
Feature                                   Baseline  Without   Drop     Relative     Baseline  Without   Drop     Relative
                                                    Feature            Impact (%)             Feature            Impact (%)
Time Diff between first and last (Mins)   0.9589    0.9487    0.0102   1.0593       0.9581    0.9474    0.0107   1.1209
TotalTransactions                         0.9589    0.9487    0.0102   1.0593       0.9581    0.9477    0.0104   1.0895
MinValueReceived                          0.9589    0.9563    0.0025   0.2648       0.9581    0.9556    0.0026   0.2677
TotalEtherReceived                        0.9589    0.9563    0.0025   0.2648       0.9581    0.9554    0.0027   0.2851
MaxValueReceived                          0.9589    0.9568    0.0020   0.2119       0.9581    0.9558    0.0023   0.2375
Sent_tnx                                  0.9589    0.9573    0.0015   0.1589       0.9581    0.9565    0.0017   0.1725
TotalEtherSent                            0.9589    0.9573    0.0015   0.1589       0.9581    0.9565    0.0016   0.1682
UniqueReceivedFrom_Addresses              0.9589    0.9573    0.0015   0.1589       0.9581    0.9566    0.0015   0.1598
AvgValueReceived                          0.9589    0.9573    0.0015   0.1589       0.9581    0.9564    0.0017   0.1768
TotalEtherBalance                         0.9589    0.9573    0.0015   0.1589       0.9581    0.9565    0.0016   0.1682
Avg min between sent tnx                  0.9589    0.9578    0.0010   0.1059       0.9581    0.9569    0.0012   0.1291
MaxValSent                                0.9589    0.9578    0.0010   0.1059       0.9581    0.9571    0.0010   0.1079
UniqueSentTo_Addresses                    0.9589    0.9578    0.0010   0.1059       0.9581    0.9570    0.0011   0.1163
Received_tnx                              0.9589    0.9584    0.0005   0.0530       0.9581    0.9575    0.0006   0.0602
MinValSent                                0.9589    0.9584    0.0005   0.0530       0.9581    0.9577    0.0005   0.0478
NumberofCreated_Contracts                 0.9589    0.9589    0.0000   0.0000       0.9581    0.9580    0.0001   0.0082
AvgValSent                                0.9589    0.9599   −0.0010  −0.1059       0.9581    0.9592   −0.0011  −0.1119
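The leave-one-feature-out procedure behind Table 4 can be sketched as a simple loop. The sketch assumes that the relative impact is computed as 100 × drop / baseline, which appears consistent with the tabulated values when unrounded scores are used; `evaluate` is a hypothetical callback standing in for the evaluation of a model on a given feature subset, and the toy scoring function is invented purely to make the example runnable.

```python
def ablation_study(features, evaluate):
    """Remove each feature in turn and record the change in the metric."""
    baseline = evaluate(features)
    rows = []
    for name in features:
        reduced = [f for f in features if f != name]
        score = evaluate(reduced)
        drop = baseline - score
        rows.append({
            "feature": name,
            "baseline": baseline,
            "without": score,
            "drop": drop,
            "relative_pct": 100.0 * drop / baseline,  # assumed definition
        })
    # Largest drop first, as in Table 4.
    rows.sort(key=lambda r: r["drop"], reverse=True)
    return rows


# Toy stand-in for model evaluation: each feature adds a fixed amount.
toy_gain = {"a": 0.3, "b": 0.1, "c": 0.0}


def toy_evaluate(feature_subset):
    return 0.5 + sum(toy_gain[f] for f in feature_subset)


for row in ablation_study(["a", "b", "c"], toy_evaluate):
    print(row["feature"], round(row["drop"], 4), round(row["relative_pct"], 2))
```

A negative drop, as reported for AvgValSent in Table 4, simply means that the metric improved slightly when the feature was removed.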

4. Results and Discussion

4.1. SHAP (SHapley Additive exPlanations)

To enhance the interpretability of the XGBoost model selected for this study, the SHAP (SHapley Additive exPlanations) algorithm was employed. In the field of explainable artificial intelligence (XAI), SHAP has emerged as one of the most theoretically grounded and model-agnostic approaches for interpreting machine learning models. It is based on cooperative game theory, particularly the concept of Shapley values, which aim to fairly distribute the “payout” (in this case, the model output) among the input features based on their marginal contributions. SHAP assigns each feature a value that quantifies its individual contribution to a particular prediction. These contributions are calculated by considering all possible subsets of the remaining features and computing the weighted average marginal effect of adding the feature to each subset. The result is a set of additive feature attributions that, together with the expected model output, sum to the model’s output for that instance. This makes SHAP both local (interpreting individual predictions) and global (aggregating attributions across many predictions) in scope [37].

In SHAP, the contribution of each feature $i$ to the model’s prediction is calculated using the following Shapley value formula:

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \cdot (|N| - |S| - 1)!}{|N|!} \big[ f(S \cup \{i\}) - f(S) \big] \tag{14}$$

where $N$ is the set of all input features, $S \subseteq N \setminus \{i\}$ is a subset of features excluding feature $i$, $f(S)$ is the model prediction using only the features in subset $S$, $f(S \cup \{i\})$ is the prediction after adding feature $i$, and $\phi_i$ is the SHAP value representing the contribution of feature $i$ to the model output.


This formulation guarantees several desirable properties: local accuracy (the SHAP values sum to the difference between the model prediction and its expected value), missingness (features not in the model get zero contribution), and consistency (if a model changes so that the contribution of a feature increases, its SHAP value will not decrease). As such, SHAP provides a principled and intuitive way to interpret complex machine learning models. Algorithm 1 illustrates the procedure for generating the SHAP plot. Figure 2 depicts the contribution values of each feature to the classification outcome.
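Equation (14) can be evaluated exactly for a small number of features by enumerating every subset. The sketch below does so for a toy value function with three features and checks that the attributions sum to $f(N) - f(\emptyset)$. The value function and feature names are invented for illustration; exact enumeration grows exponentially with the number of features, so this is a didactic sketch rather than the procedure used for the 17-feature model.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values; `value` maps a frozenset of features to f(S)."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):  # |S| ranges over 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_value(subset):
    """Toy model: additive effects for a and b, an a-b interaction, c unused."""
    out = 0.0
    if "a" in subset:
        out += 10.0
    if "b" in subset:
        out += 4.0
    if "a" in subset and "b" in subset:
        out += 6.0
    return out


phi = shapley_values(["a", "b", "c"], toy_value)
print({k: round(v, 6) for k, v in phi.items()})  # {'a': 13.0, 'b': 7.0, 'c': 0.0}
```

The interaction term of 6 is split equally between a and b, the unused feature c receives zero, and the three values sum to 20, the difference between the full-model output and the empty-set output.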

Figure 2. SHAP value.

For instance, in a speciﬁc prediction case, a wallet with high TotalEtherSent and

frequent outgoing transactions showed positive SHAP values, indicating strong association

with suspicious behavior. In contrast, wallets with low diversity in interacting addresses

showed negative SHAP values, correlating with benign activity. This insight is useful for

forensic analysts investigating suspicious wallet activity.

Table 5 presents the features ranked by SHAP importance.

The SHAP plot, in isolation, does not explicitly indicate the model’s prediction for a given instance. Instead, it shows which features the model used to make its prediction and the extent to which each feature influenced the outcome. The vertical axis of the SHAP importance plot lists the features employed in the model, ordered by their estimated impact on the model’s output. The horizontal axis represents the SHAP values, which quantify how far a feature’s value for a particular sample shifts the model’s output from its expected baseline. Positive SHAP values indicate that the feature pushes the prediction upward (e.g., increasing the probability of being considered suspicious). Conversely, negative SHAP values indicate that the feature lowers the output. The colour of each point corresponds to the actual value of the feature, with red representing high values and blue indicating low values.

Table 5. Features and their mean absolute SHAP values.

Feature                                   Mean Absolute SHAP Value
Time Diff between first and last (Mins)   1.6933
UniqueReceivedFrom_Addresses              1.4526
AvgValueReceived                          1.1515
TotalTransactions                         1.0833
Received_tnx                              0.9854
Sent_tnx                                  0.9066
TotalEtherReceived                        0.8005
TotalEtherSent                            0.6614
MaxValueReceived                          0.6227
MinValueReceived                          0.6016
Avg min between sent tnx                  0.5552
MinValSent                                0.5397
AvgValSent                                0.3124
UniqueSentTo_Addresses                    0.2595
MaxValSent                                0.2215
TotalEtherBalance                         0.1875
NumberofCreated_Contracts                 0.1364

SHAP values quantify the extent to which features contribute to model predictions. Table 5 shows that the features with the highest mean absolute SHAP values play a pivotal role in determining the model’s output. Specifically, “Time Diff between first and last (Mins)” and “UniqueReceivedFrom_Addresses”, with values of 1.693 and 1.453, respectively, exert the most significant influence on the model’s decisions. The features that follow, “AvgValueReceived”, “TotalTransactions”, and “Received_tnx”, also have significant effects. Conversely, “NumberofCreated_Contracts”, “TotalEtherBalance”, and “MaxValSent” exhibit the lowest SHAP values, suggesting that their influence on model predictions is limited relative to the other features.

In summary, SHAP analysis offers a reliable indicator for explaining the model’s decision-making mechanism and identifying which features are significant. This enhances the model’s transparency and interpretability.
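The ranking in Table 5 is a global summary obtained by averaging the absolute per-sample SHAP values of each feature. A minimal sketch of that aggregation is shown below; the SHAP matrix is made-up toy data with one row per sample and one column per feature, and the function name is hypothetical.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute value of their SHAP column."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


names = ["TotalTransactions", "Sent_tnx", "MaxValSent"]
toy_shap = [                 # illustrative values only, not study results
    [1.2, -0.4, 0.1],
    [-0.8, 0.6, -0.1],
    [1.0, -0.2, 0.1],
]
ranking = mean_abs_shap(toy_shap, names)
print(ranking)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so a feature that pushes some wallets toward "suspicious" and others toward "normal" is still ranked as influential.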

The integration of SHAP with XGBoost enhances the interpretability of tree ensemble models by decomposing the model’s predictions into individual feature contributions. For each prediction, expressed as $f(x) = \sum_{t=1}^{T} f_t(x)$, SHAP computes the contribution of each feature, denoted as $\phi_i$, in a manner that satisfies the efficiency property, $\sum_i \phi_i = f(x) - E[f(X)]$. This property ensures that the sum of the feature contributions precisely accounts for the difference between the model’s prediction and its expected value, thereby enabling a rigorous and quantitative assessment of how each feature influences the model’s output.

Consequently, the SHAP framework provides a transparent and interpretable mechanism for understanding the decision-making process of complex ensemble models such as XGBoost.

Algorithm 1 Fraud Detection in Blockchain Transactions Using XGBoost and SHAP
Input: Dataset file path
Output: Trained XGBoost model, normalization scaler, accuracy score, SHAP summary plot
1: Load dataset from CSV file
2: Separate input features X and target variable y:
3:     Remove Address column from X
4:     Extract class column as target y
5: Normalize features X using Min-Max scaling to obtain X_scaled
6: Split data into training and test sets:
7:     (X_train, X_test, y_train, y_test) ← train_test_split(X_scaled, y, test_size = 0.2, random_state = 42)
8: Initialize XGBoost classifier with parameters:
9:     use_label_encoder=False, eval_metric="logloss", tree_method="hist"
10: Train model M on training data (X_train, y_train)
11: Save trained model and scaler to disk
12: Predict target values y_pred for X_test using model M
13: Calculate accuracy score between y_test and y_pred
14: Initialize SHAP explainer with model M
15: Compute SHAP values for test data X_test
16: Generate and save SHAP summary plot

4.2. Feature Extraction for Given Wallet Address

One of the modules developed for the present study extracts the relevant features for any given Ethereum wallet address and then predicts whether the wallet is suspicious based on these features. Algorithm 2 illustrates the step-by-step procedure used for feature extraction from the wallet.

4.3. Detection of the Last 10 Active Wallet Addresses and Extraction of the Properties of These Wallets and Model Estimation

The algorithm developed to identify the last 10 active Ethereum wallet addresses is presented in Algorithm 3. After the addresses are identified, their features are extracted and the model is used to predict whether each wallet exhibits suspicious behavior. The script connects to the Ethereum API using Infura and iteratively checks each block from the latest one down to the range limit. For every block, it collects unique “from” and “to” addresses from all transactions. Once it identifies 10 unique active addresses, it stops and writes them to a CSV file.

After identifying the last 10 active addresses, the characteristics of each of these addresses were extracted as illustrated in Algorithm 4, and the resulting data were saved to a CSV file.

The extracted features of the last 10 addresses were then passed to the trained XGBoost model, which predicted whether each wallet address was normal or suspicious. The algorithm developed for this module is presented in Algorithm 5.


Algorithm 2 Ethereum Account Feature Extraction
Input: Ethereum account address, start block, end block
Output: Transaction features for the account in a CSV file
1: Connect to Ethereum via Web3 provider
2: Initialize empty list transactions
3: Loop over blocks from start_block to end_block:
4: for each block number in the range do
5:     Retrieve block with full transaction data
6:     for each transaction in block do
7:         if transaction is sent from or received by the given address then
8:             Extract transaction data: block number, hash, sender, recipient, value in ETH, gas, timestamp
9:             Append data to transactions
10:         end if
11:     end for
12: end for
13: Extract timestamps and compute:
14:     Time difference between first and last transaction
15:     Average time between sent transactions
16: Separate sent and received transactions
17: Compute the number of unique senders and recipients
18: Count the number of contract creations
19: Compute statistical values for sent and received ETH:
20:     Min, Max, and Average values
21:     Total ETH sent and received
22:     Balance = Received - Sent
23: Build a feature dictionary with all computed metrics
24: Save the feature dictionary as a row in a CSV file
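Steps 13 to 23 of Algorithm 2 can be sketched on an in-memory list of transactions, leaving out the blockchain access itself. The transaction schema (`from`, `to`, `value_eth`, `timestamp` in seconds), the treatment of a missing recipient as a contract creation, and the definition of TotalTransactions as sent plus received are assumptions made for this illustration; the feature names follow Table 4.

```python
def _min_max_avg(values):
    """Return (min, max, mean), or zeros for an empty list."""
    if not values:
        return 0.0, 0.0, 0.0
    return min(values), max(values), sum(values) / len(values)


def extract_features(address, transactions):
    addr = address.lower()
    txs = sorted(transactions, key=lambda t: t["timestamp"])
    sent = [t for t in txs if t["from"].lower() == addr]
    received = [t for t in txs if t["to"] and t["to"].lower() == addr]

    # Time statistics, converted from seconds to minutes.
    times = sorted(t["timestamp"] for t in sent + received)
    span_min = (times[-1] - times[0]) / 60 if times else 0.0
    sent_times = [t["timestamp"] for t in sent]
    gaps = [b - a for a, b in zip(sent_times, sent_times[1:])]
    avg_gap_min = sum(gaps) / len(gaps) / 60 if gaps else 0.0

    min_s, max_s, avg_s = _min_max_avg([t["value_eth"] for t in sent])
    min_r, max_r, avg_r = _min_max_avg([t["value_eth"] for t in received])
    total_sent = sum(t["value_eth"] for t in sent)
    total_received = sum(t["value_eth"] for t in received)

    return {
        "Time Diff between first and last (Mins)": span_min,
        "Avg min between sent tnx": avg_gap_min,
        "Sent_tnx": len(sent),
        "Received_tnx": len(received),
        "NumberofCreated_Contracts": sum(1 for t in sent if t["to"] is None),
        "UniqueSentTo_Addresses": len({t["to"] for t in sent if t["to"]}),
        "UniqueReceivedFrom_Addresses": len({t["from"] for t in received}),
        "MinValSent": min_s, "MaxValSent": max_s, "AvgValSent": avg_s,
        "MinValueReceived": min_r, "MaxValueReceived": max_r,
        "AvgValueReceived": avg_r,
        "TotalTransactions": len(sent) + len(received),
        "TotalEtherSent": total_sent,
        "TotalEtherReceived": total_received,
        "TotalEtherBalance": total_received - total_sent,
    }


toy_txs = [  # hypothetical addresses and values
    {"from": "0xabc", "to": "0xd1", "value_eth": 1.0, "timestamp": 0},
    {"from": "0xd2", "to": "0xabc", "value_eth": 3.0, "timestamp": 600},
    {"from": "0xabc", "to": "0xd1", "value_eth": 2.0, "timestamp": 1200},
    {"from": "0xabc", "to": None, "value_eth": 0.0, "timestamp": 1800},
    {"from": "0xd3", "to": "0xd4", "value_eth": 5.0, "timestamp": 100},
]
features = extract_features("0xABC", toy_txs)
print(features)
```

The returned dictionary has one entry for each of the 17 features, so it can be written directly as one CSV row per wallet.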

Algorithm 3 Extracting Latest Active Ethereum Addresses
Input: Ethereum node access, block range N, number of addresses k
Output: A CSV file with the k most recent active Ethereum addresses
1: Connect to Ethereum using Web3
2: Get the latest block number
3: Set the scan range as the last N blocks
4: Initialize an empty set active_addresses
5: for block number from latest to latest − N (in reverse) do
6:     Fetch the block with full transaction data
7:     for each transaction in the block do
8:         Add from and to addresses to active_addresses, if present
9:         if size of active_addresses ≥ k then
10:             Stop scanning and keep the collected addresses
11:         end if
12:     end for
13: end for
14: Save the collected addresses into a CSV file
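The scanning logic of Algorithm 3 can be sketched without a node connection by passing in blocks that have already been fetched, ordered from newest to oldest. The block and transaction dictionaries below are a hypothetical in-memory stand-in for what a Web3 client would return with full transaction data. Unlike the set used in Algorithm 3, the sketch keeps a list alongside the set so that the recency order of the addresses is preserved, and it stops at exactly k addresses.

```python
import csv
import io


def latest_active_addresses(blocks, k=10):
    """Collect the first k unique from/to addresses, newest block first."""
    ordered, seen = [], set()
    for block in blocks:
        for tx in block["transactions"]:
            for address in (tx.get("from"), tx.get("to")):
                if address and address not in seen:
                    seen.add(address)
                    ordered.append(address)
                    if len(ordered) == k:
                        return ordered
    return ordered  # fewer than k addresses found within the scan range


def write_addresses_csv(addresses, handle):
    """Write one address per row under an 'Address' header."""
    writer = csv.writer(handle)
    writer.writerow(["Address"])
    writer.writerows([a] for a in addresses)


toy_blocks = [  # hypothetical data, newest block first
    {"number": 3, "transactions": [
        {"from": "0xA", "to": "0xB"},
        {"from": "0xB", "to": None},      # contract creation: no recipient
    ]},
    {"number": 2, "transactions": [
        {"from": "0xC", "to": "0xA"},
        {"from": "0xD", "to": "0xE"},
    ]},
]
found = latest_active_addresses(toy_blocks, k=3)
buffer = io.StringIO()
write_addresses_csv(found, buffer)
print(found)  # ['0xA', '0xB', '0xC']
```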

A comparison of our model with existing approaches, such as those proposed by Aziz et al. [16] and Ehsan et al. [ ], reveals that our model not only achieves similar or better accuracy, but also emphasizes real-time applicability and modular design. Most existing research concentrates exclusively on offline datasets, whereas our pipeline operates directly on live-chain data using the Web3 API. This design makes the system suitable for deployment in security monitoring platforms. Furthermore, the explainability provided by SHAP not only fosters model transparency but also supports compliance with regulatory requirements in the context of blockchain forensics.


Algorithm 4 Extract Ethereum Account Features
Input: Ethereum address list from CSV, start and end block numbers
Output: Extracted features for each address saved in output CSV
1: Connect to Ethereum mainnet via Web3
2: Load Ethereum addresses from latest_active_addresses.csv
3: Define output CSV eth_account_features.csv
4: Get current block as latest_block, set start_block = latest_block - 500
5: for each address in input CSV do
6:     if address is valid then
7:         Initialize empty transaction list
8:         for each block from start_block to latest_block do
9:             Get block with full transactions
10:             for each transaction in block do
11:                 if transaction involves address then
12:                     Collect transaction info (value, gas, timestamp, etc.)
13:                 end if
14:             end for
15:             Wait 0.5 s to avoid rate limits
16:         end for
17:         Compute:
            • Number of sent/received transactions
            • Number of contracts created
            • Unique sent-to and received-from addresses
            • Min, max, avg sent/received values
            • Total Ether sent/received, balance
            • Time statistics
18:         Write all features to output CSV
19:     end if
20: end for

Algorithm 5 Ethereum Account Fraud Classification Pipeline
Input: CSV file path
Output: Prediction (Normal or Suspicion)
1: Load: Saved scaler from scaler.pkl
2: Load: Trained model from xgboost_fraud_detection_model.pkl
3: Load dataset as dataframe df from CSV
4: Store the Address column separately in addresses
5: Remove the Address column from df
6: for all columns in df do
7:     Convert values to numeric (coerce invalid values as NaN)
8: end for
9: Fill missing values in df with column medians
10: Load scaler using joblib
11: Apply scaler transformation to df to obtain X_new_scaled
12: Load XGBoost model using joblib
13: Predict labels for X_new_scaled using the loaded model
14: Create a new dataframe output_df with:
15:     Address from original data
16:     Class as "NORMAL" if prediction is 0, else "SUSPICION"
17: Save output_df to eth_account_predictions.csv
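The preprocessing in steps 6 to 11 of Algorithm 5 (numeric coercion, median imputation, and Min-Max scaling with previously stored bounds) can be sketched in plain Python as follows. The stored minima and maxima, the example rows, and the threshold rule standing in for the trained classifier are all hypothetical; in the pipeline described above these would come from the saved scaler and the saved XGBoost model.

```python
import math
import statistics


def to_float(value):
    """Coerce a value to float, returning NaN when it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


def preprocess(rows, feature_names, col_min, col_max):
    """Coerce, impute with column medians, then apply stored Min-Max bounds."""
    table = [[to_float(r.get(name)) for name in feature_names] for r in rows]
    for j in range(len(feature_names)):
        observed = [row[j] for row in table if not math.isnan(row[j])]
        median = statistics.median(observed) if observed else 0.0
        for row in table:
            if math.isnan(row[j]):
                row[j] = median
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, col_min, col_max)]
        for row in table
    ]


rows = [  # hypothetical feature rows as read from a CSV file
    {"Address": "0x1", "Sent_tnx": "4", "TotalEtherSent": "n/a"},
    {"Address": "0x2", "Sent_tnx": "8", "TotalEtherSent": "2.0"},
    {"Address": "0x3", "Sent_tnx": "6", "TotalEtherSent": "4.0"},
]
scaled = preprocess(rows, ["Sent_tnx", "TotalEtherSent"], [0.0, 0.0], [10.0, 8.0])

# Placeholder decision rule, NOT the trained model: flags rows with a large sum.
predictions = [int(sum(row) > 1.0) for row in scaled]
labels = ["NORMAL" if p == 0 else "SUSPICION" for p in predictions]
print(list(zip((r["Address"] for r in rows), labels)))
```

Reusing the stored scaling bounds, rather than recomputing them on the new wallets, keeps the inputs on the same scale the model saw during training.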

5. Limitation

While the proposed system exhibits high performance in the realm of near real-time

Ethereum fraud detection, it is imperative to acknowledge certain limitations to provide a

balanced perspective on the scope and applicability of the work.

• The paucity of high-quality, publicly labeled datasets for Ethereum fraud detection poses a significant constraint. The present study is based on a dataset comprising 9841 transactions, which, while substantial, may not encompass the full spectrum of fraudulent behavior present in the current ecosystem.

• Cybercriminals continually adapt in order to circumvent detection systems, which makes any fixed training set an inadequate representation of evolving fraud patterns. This temporal bias has the potential to compromise the model’s efficacy in addressing novel attack vectors.

• The binary classification approach (normal and suspicious) may oversimplify the complex nature of blockchain activity. Some transactions may fall into a gray area that is neither clearly fraudulent nor entirely legitimate. The current model also does not differentiate between different fraudulent activities (e.g., phishing, Ponzi schemes, and money laundering).

• Although our set of 17 features captures fundamental behavioral patterns, these features may not encompass all relevant fraud indicators. More sophisticated fraud schemes may employ patterns not yet captured in the current feature set. The feature engineering approach is also static, which may hinder its ability to adapt to the evolving nature of fraud.

• At present, the system is designed exclusively for Ethereum, which limits its generalizability to other blockchain networks with different transaction structures and consensus mechanisms.

• The system’s reliance on external APIs (Ethereum nodes, Etherscan) introduces potential points of failure and rate-limiting constraints that could impact real-time performance.

• Although the system is optimized for efficiency, concurrent processing of large groups of addresses can require substantial memory, which could limit the system’s applicability in resource-constrained environments.

• In sophisticated attacks, adversaries who understand the model’s decision boundaries may craft transactions specifically to evade detection. This scenario has not been comprehensively evaluated within the scope of the current research.

• While SHAP offers interpretability, it also exposes the model’s decision-making process, which potential attackers could exploit.

While the current work demonstrates advances in near real-time Ethereum fraud detection with high interpretability, the identified limitations provide clear directions for future research and development. The proposed future work encompasses both technical enhancements and broader considerations of practical deployment, ethical implications, and societal impact. Addressing these limitations will contribute to the development of more robust, scalable, and trustworthy blockchain security systems. Because blockchain technology and fraud techniques evolve rapidly, continuous research and adaptation are necessary. The proposed framework provides a foundation that can be extended to meet emerging challenges in blockchain security and fraud detection.

6. Future Works

In the future, the robustness of the models may be enhanced by incorporating token transaction graphs and smart contract call traces. In addition, the pipeline is planned to be validated on real-time unlabeled data with human-in-the-loop verification. A unified framework capable of operating across multiple blockchain networks (Ethereum, Binance Smart Chain, Polygon, etc.) would significantly increase the system’s utility and market viability. Extending the binary classification to identify specific types of fraud (e.g., phishing, pyramid schemes, mixing services) would provide researchers with more actionable intelligence. Incorporating graph-based features through hybrid architectures, while preserving the interpretability and efficiency of the current approach, is expected to enhance the system’s effectiveness. Beyond basic statistical measurements, more sophisticated temporal modeling techniques can be employed to identify patterns in fraud behavior over time. Developing specific models for various types of fraud and integrating them into hybrid systems is a promising area of research.

7. Conclusions

This study presents a machine learning approach for detecting fraudulent Ethereum wallet addresses. The system can evaluate individual or recently active wallet addresses in near real-time by leveraging a pre-trained XGBoost model, which classifies suspicious behavior with an accuracy of approximately 96%. Incorporating SHAP values improves the model’s interpretability and transparency by providing information on the contribution of each feature to the final decision. The findings of this study suggest that explainable artificial intelligence (XAI) techniques have the potential to substantially improve the trustworthiness and usability of blockchain analytics tools. In the future, several enhancements are planned. Firstly, the model will be extended to support multi-chain analysis, incorporating data from other popular blockchains such as Binance Smart Chain or Polygon. Furthermore, the feature set may be expanded to include more granular behavioral indicators derived from smart contract interactions and token transfers. Real-time streaming data integration is also a key development goal, which would allow the system to function continuously on live blockchain activity. Finally, deploying the solution as a publicly accessible API or dashboard has the potential to facilitate broader use in security monitoring, compliance, and financial auditing applications.

Funding: This work is supported by Scientific Research Projects Coordination Unit of Firat University, Türkiye, Project Numbers: TEKF.25.13 and ADEP.25.28.

Data Availability Statement: Data are available from the corresponding author upon reasonable request.

Conflicts of Interest: The author declares no conflicts of interest.

References

1. Xu, M.; Chen, X.; Kou, G. A systematic review of blockchain. Financ. Innov. 2019, 5, 27. [CrossRef]

2. Ressi, D.; Romanello, R.; Piazza, C.; Rossi, S. AI-enhanced blockchain technology: A review of advancements and opportunities. J. Netw. Comput. Appl. 2024, 225, 103858. [CrossRef]

3. Sun, J.; Jia, Y.; Wang, Y.; Tian, Y.; Zhang, S. Ethereum fraud detection via joint transaction language model and graph representation learning. Inf. Fusion 2025, 120, 103074. [CrossRef]

4. Gad, A.G.; Mosa, D.T.; Abualigah, L.; Abohany, A.A. Emerging trends in blockchain technology and applications: A review and outlook. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 6719–6742. [CrossRef]

5. Zheng, Z.; Su, J.; Chen, J.; Lo, D.; Zhong, Z.; Ye, M. Dappscan: Building large-scale datasets for smart contract weaknesses in dapp projects. IEEE Trans. Softw. Eng. 2024, 50, 1360–1373. [CrossRef]

6. Han, H.; Shiwakoti, R.K.; Jarvis, R.; Mordi, C.; Botchie, D. Accounting and auditing with blockchain technology and artificial Intelligence: A literature review. Int. J. Account. Inf. Syst. 2023, 48, 100598. [CrossRef]

7. Tripathi, G.; Ahad, M.A.; Casalino, G. A comprehensive review of blockchain technology: Underlying principles and historical background with future challenges. Decis. Anal. J. 2023, 9, 100344. [CrossRef]

8. Ma, F.; Ren, M.; Fu, Y.; Wang, M.; Li, H.; Song, H.; Jiang, Y. Security reinforcement for Ethereum virtual machine. Inf. Process. Manag. 2021, 58, 102565. [CrossRef]

9. Wu, S.; Yu, Z.; Wang, D.; Zhou, Y.; Wu, L.; Wang, H.; Yuan, X. Defiranger: Detecting DeFI price manipulation attacks. IEEE Trans. Dependable Secur. Comput. 2023, 21, 4147–4161. [CrossRef]

10. Faqir-Rhazoui, Y.; Arroyo, J.; Hassan, S. A comparative analysis of the platforms for decentralized autonomous organizations in the Ethereum blockchain. J. Internet Serv. Appl. 2021, 12, 9. [CrossRef]


11.

Li, S.; Gou, G.; Liu, C.; Xiong, G.; Li, Z.; Xiao, J.; Xing, X. TGC: Transaction Graph Contrast Network for Ethereum Phishing Scam

Detection. In Proceedings of the 39th Annual Computer Security Applications Conference, Austin, TX, USA, 4–8 December 2023;

pp. 352–365.

12.

Wu, J.; Lin, D.; Fu, Q.; Yang, S.; Chen, T.; Zheng, Z.; Song, B. Toward understanding asset ﬂows in crypto money laundering

through the lenses of Ethereum heists. IEEE Trans. Inf. Forensics Secur. 2023,19, 1994–2009. [CrossRef]

13.

Wronka, C. Money laundering through cryptocurrencies-analysis of the phenomenon and appropriate prevention measures.

J. Money Laund. Control 2022,25, 79–94. [CrossRef]

14.

Chainalysis, T. The Chainalysis 2025 Crypto Crime Report. 2025. Available online: https://go.chainalysis.com/2025-Crypto-

Crime-Report.html (accessed on 19 May 2025).

15.

Chen, Z.; Hu, Y.; He, B.; Luo, D.; Wu, L.; Zhou, Y. Dissecting payload-based transaction phishing on Ethereum. arXiv 2024,

arXiv:2409.02386. [CrossRef]

16.

Aziz, R.M.; Baluch, M.F.; Patel, S.; Ganie, A.H. LGBM: A machine learning approach for Ethereum fraud detection. Int. J. Inf.

Technol. 2022,14, 3321–3331. [CrossRef]

17.

Farrugia, S.; Ellul, J.; Azzopardi, G. Detection of illicit accounts over the Ethereum blockchain. Expert Syst. Appl. 2020,150, 113318.

[CrossRef]

18.

Ravindranath, V.; Nallakaruppan, M.; Shri, M.L.; Balusamy, B.; Bhattacharyya, S. Evaluation of performance enhancement in

Ethereum fraud detection using oversampling techniques. Appl. Soft Comput. 2024,161, 111698. [CrossRef]

19.

Dahiya, M.; Mishra, N.; Singh, R. Neural network based approach for Ethereum fraud detection. In Proceedings of the 2023 4th

International Conference on Intelligent Engineering and Management (ICIEM), London, UK, 9–11 May 2023; 2023; pp. 1–4.

20.

Hu, T.; Liu, X.; Chen, T.; Zhang, X.; Huang, X.; Niu, W.; Lu, J.; Zhou, K.; Liu, Y. Transaction-based classiﬁcation and detection

approach for Ethereum smart contract. Inf. Process. Manag. 2021,58, 102462. [CrossRef]

21.

Ehsan, A.; Iqbal, Z.; Abuowaida, S.; Aljaidi, M.; Zia, H.U.; Alshdaifat, N.; Alshammry, N.K. Enhanced Anomaly Detection in

Ethereum: Unveiling and Classifying Threats with Machine Learning. IEEE Access 2024,12, 176440–176456. [CrossRef]

22. Liu, L.; Tsai, W.T.; Bhuiyan, M.Z.A.; Peng, H.; Liu, M. Blockchain-enabled fraud discovery through abnormal smart contract detection on Ethereum. Future Gener. Comput. Syst. 2022, 128, 158–166.

23. Tan, R.; Tan, Q.; Zhang, P.; Li, Z. Graph neural network for Ethereum fraud detection. In Proceedings of the 2021 IEEE International Conference on Big Knowledge (ICBK), Auckland, New Zealand, 7–8 December 2021; pp. 78–85.

24. Jin, C.; Zhou, J.; Xie, C.; Yu, S.; Xuan, Q.; Yang, X. Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision. IEEE Trans. Inf. Forensics Secur. 2024, 20, 839–853.

25. Tan, R.; Tan, Q.; Zhang, Q.; Zhang, P.; Xie, Y.; Li, Z. Ethereum fraud behavior detection based on graph neural networks. Computing 2023, 105, 2143–2170.

26. Liu, S.Z.; Yu, X.Y.; Li, Y.T.; Zhang, H.; Guo, X.P.; Ma, C.H.; Long, H.X. Detection of Ethereum Phishing Fraud Nodes Based on Feature Enhancement Strategy and GBM. Electronics 2024, 13, 5060.

27. Sheng, Z.; Song, L.; Wang, Y. Dynamic Feature Fusion: Combining Global Graph Structures and Local Semantics for Blockchain Phishing Detection. IEEE Trans. Netw. Serv. Manag. 2025, 22, 4706–4718.

28. Jia, Y.; Wang, Y.; Sun, J.; Tian, Y.; Qian, P. LMAE4Eth: Generalizable and Robust Ethereum Fraud Detection by Exploring Transaction Semantics and Masked Graph Embedding. IEEE Trans. Inf. Forensics Secur. 2025, 20, 10260–10274.

29. Li, P.; Xie, Y.; Xu, X.; Zhou, J.; Xuan, Q. Phishing fraud detection on Ethereum using graph neural network. In Proceedings of the International Conference on Blockchain and Trustworthy Systems, Chengdu, China, 4–5 August 2022; Springer: Singapore, 2022; pp. 362–375.

30. Pahuja, L.; Kamal, A. EnLEFD-DM: Ensemble Learning Based Ethereum Fraud Detection Using CRISP-DM Framework. Expert Syst. 2023, 40, e13379.

31. GitHub. GitHub Repository Dataset. 2025. Available online: https://github.com/fatihertam/ethereumfrauddetection (accessed on 19 May 2025).

32. Kilincer, I.F. Explainable AI supported hybrid deep learnig method for layer 2 intrusion detection. Egypt. Inform. J. 2025, 30, 100669.

33. Ahn, J.M.; Kim, J.; Kim, K. Ensemble machine learning of gradient boosting (XGBoost, LightGBM, CatBoost) and attention-based CNN-LSTM for harmful algal blooms forecasting. Toxins 2023, 15, 608.

34. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.

35. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3149–3157.


36. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. Adv. Neural Inf. Process. Syst. 2018, 31, 6639–6649.

37. Li, Z. Extracting spatial effects from machine learning model using local interpretation method: An example of SHAP and XGBoost. Comput. Environ. Urban Syst. 2022, 96, 101845.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
