INTERDEPARTMENTAL
PROGRAMME OF
POSTGRADUATE STUDIES IN
BUSINESS ADMINISTRATION
Dissertation Thesis
Advanced AI and Machine Learning Models in
Cryptocurrency Market Analysis and Prediction
by
Stamatis Kavidopoulos
Under the supervision of:
Theodore Panagiotidis, Professor
Thesis submitted for the degree of Master in Business
Administration
September, 2024
Acknowledgments
I would like to express my deepest gratitude to several individuals who have been instrumental in the
completion of this dissertation.
First and foremost, I extend my heartfelt thanks to my advisor, Professor Theodore Panagiotidis, for his
unwavering guidance and support throughout this academic journey. His insightful feedback and
encouragement have been invaluable in shaping both this research and my growth as a scholar.
I am eternally grateful to my wife Aphrodite, for her boundless love, patience, and understanding during
the countless hours devoted to this work. Her unwavering belief in me and her constant support have
been my anchor throughout this challenging process.
To my beloved daughter Marilia, whose innocent curiosity and joy have been a constant source of
inspiration and motivation. The sacrificed playtime and bedtime stories will forever be appreciated, and
I hope that one day this work will make you proud.
This achievement would not have been possible without the love and support of my family. Your
encouragement has been the driving force behind my perseverance.
Thank you all for being an integral part of this academic milestone in my life.
Abstract
This dissertation investigates the efficacy of advanced AI and machine learning models for Bitcoin price
prediction, addressing the inherent volatility and complexity of the cryptocurrency market.
The study employs a multi-faceted approach, integrating diverse data sources, including technical
indicators, macroeconomic factors, and social media analysis, to enhance predictive accuracy. We develop
and evaluate various time-series models, including statistical models like SARIMAX, machine learning
models such as Prophet, and deep learning architectures like DeepAR and Temporal Fusion
Transformers (TFT).
Furthermore, we introduce a novel AI multi-agent framework comprising specialized AI agents for
financial analysis, research, and investment recommendations. This framework enriches the prediction
models with contextual insights and facilitates a more comprehensive understanding of market dynamics.
Our results demonstrate the superior performance of Prophet in capturing both long-term trends and
short-term fluctuations of Bitcoin price movements. The integration of macroeconomic and technical
indicators further enhances predictive accuracy, highlighting the importance of incorporating market
context.
The developed multi-agent framework showcases the potential of combining specialized AI agents for
cryptocurrency market analysis, enabling more robust and well-informed investment recommendations.
This research contributes to the expanding field of financial forecasting using AI, providing valuable
insights for traders, investors, and researchers navigating the dynamic cryptocurrency landscape.
Keywords
Bitcoin, Cryptocurrency, Price Prediction, Time Series Analysis, Deep Learning, AI Multi-Agent Systems
Disclaimer
The contents of this dissertation, including all analyses, frameworks, and models presented, are intended
for academic purposes only. The Multi-Agent Framework Architecture and its associated BTC
forecasting model are designed to demonstrate the application of multi-agent systems and machine
learning techniques in financial analysis.
It is important to note that the information and results provided herein do not constitute financial advice.
The author and associated institutions do not endorse any investment decisions or strategies based on
the findings of this research. Readers are strongly advised to consult with a certified financial advisor or
conduct independent research before making any investment decisions.
The performance of financial models can vary significantly based on market conditions and unforeseen
factors. Therefore, the accuracy and reliability of the BTC forecasting model, as well as any investment
recommendations derived from it, cannot be guaranteed. The author and affiliated institutions disclaim
any liability for financial losses or damages resulting from the use of this information.
Table of Contents
Acknowledgments
Abstract
Disclaimer
List of Figures
List of Tables
1. Introduction
1.1 Background and Motivation
1.2 Research Objectives
1.3 Significance of the Study
2. Literature Review
2.1 Cryptocurrency Market Analysis
2.2 Machine Learning in Financial Forecasting
2.3 Multi-Agent AI Systems in Finance
3. Methodology
3.1 Scope of the study
3.2 Data Collection and Preprocessing
3.3 Feature Selection
3.4 Time-Series Models for Bitcoin Price Prediction
3.4.1 Statistical Models
3.4.1.1 SARIMAX
3.4.1.2 GARCH
3.4.2 Machine Learning Models
3.4.2.1 Prophet
3.4.2.2 Non-Parametric Time Series (NPTS)
3.4.3 Deep Learning Models
3.4.3.1 Temporal Fusion Transformers (TFT)
3.4.3.2 DeepAR
3.5 Stationarity Tests
3.6 Multi-Agent Framework
3.6.1 Architecture
3.6.2 Financial Analyst Agent
3.6.3 Research Analyst Agent
3.6.4 Investment Advisor Agent
4. Results and Analysis
4.1 Model Performance Evaluation
4.2 Multi-Agent System Output Analysis
5. Conclusion
5.1 Implications for Cryptocurrency Market Analysis
5.2 Limitations of the Study
5.3 Future Research Directions
6. Appendix
6.1 Technical Implementation Details
6.2 Data Dictionary
7. References
List of Figures
Figure 2.1.1: Daily Candlestick Chart of Bitcoin/TetherUS on Binance Exchange.
Figure 2.3.1: Overview of an LLM-powered autonomous agent system.
Figure 3.1.1: Overall architecture of the BTC prediction model pipeline.
Figure 3.2.1: Candlestick interpretation showing the OHLC values.
Figure 3.2.2: Adjusted Close prices for BTC and ETH for 2018 – 2024 in Python.
Figure 3.2.3: Trading volume for BTC and ETH for 2018 – 2024 in Python.
Figure 3.2.4: BTC Candlestick chart for 2024 in Python.
Figure 3.2.5: Correlation Adjusted Close prices for BTC and ETH for 2018 – 2024 in Python.
Figure 3.3.1: Feature selection pipeline for narrowing down the dimensions of the data.
Figure 3.6.1.1: Multi-Agent framework for BTC market analysis.
Figure 4.1.1: Top 20 features (Absolute Feature Importance) of SGD model. Absolute feature
importance is the average of absolute Shapley values computed for each feature.
Figure 4.1.2: Top 20 features (SHAP Plot) of SGD model. The Shapley values (x-axis) display the
relative impact of the feature value (color) on the record's prediction.
Figure 4.1.3: Evaluation metrics comparison between different models. Models from left to right are
Transformers, DeepAR, Prophet, AutoARIMA, NPTS.
Figure 4.1.4: Time-based 5-fold cross-validation test.
Figure 4.2.1: Output Report of Financial Analyst Agent with CrewAI library in Python.
Figure 4.2.2: Output Report of Research Analyst Agent with CrewAI library in Python.
Figure 4.2.3: Output Report of Investment Advisor Agent with CrewAI library in Python.
List of Tables
Table 2.2.1: Literature Review of ML and AI approaches for Cryptocurrency Price Prediction.
Table 4.1.1: Feature Selection Models evaluated to pick the top 20 features for the TS model.
Table 4.1.2: Time Series Models evaluation metrics.
Table 1 (Appendix): Data Taxonomy of features that have been used for this analysis.
1. Introduction
1.1 Background and Motivation
In the last decade, Bitcoin has emerged as the progenitor of a financial revolution, introducing the world
to the concept of cryptocurrency—a digital or virtual form of currency secured by cryptography and built
on blockchain technology (Nakamoto, 2008).
But where does Bitcoin, as an entity, belong? The answer lies in blockchain, the underlying technology
of Bitcoin: a decentralized ledger that records all transactions across a network of computers, ensuring
transparency and security without the need for a central authority (Swan, 2015).
As the flagship cryptocurrency, Bitcoin has not only paved the way for a myriad of other digital currencies
but has also challenged traditional notions of currency and financial sovereignty (Yermack, 2013). The
cryptocurrency markets, characterized by their decentralized nature, have seen exponential growth,
attracting a diverse array of participants from individual investors to large institutional entities (Catalini
and Gans, 2016).
These markets are marked by their high volatility, 24/7 operation, and a global reach that transcends the
constraints of conventional financial systems (Glaser et al., 2014). The allure of Bitcoin and its
counterparts lies in their potential to (Böhme et al., 2015):
Offer high returns
Facilitate rapid and cost-effective transactions
Provide a degree of anonymity
However, these same characteristics also contribute to the complexity and unpredictability of the crypto
markets, making them a fertile ground for the application of advanced AI and Machine Learning models
to navigate their intricacies and harness their potential for profit and innovation (Madan et al., 2015).
1.2 Research Objectives
The primary objectives of this research are:
To develop and evaluate advanced AI and machine learning models for Bitcoin price prediction
using time-series algorithms, such as SARIMAX, Prophet, LSTM, GRU, and DeepAR.
To create a comprehensive multi-agent framework that integrates financial analysis, research, and
investment recommendations for cryptocurrency market analysis, in order to enrich the model
prediction with insights.
To assess the effectiveness of combining traditional technical indicators with macro-economic
factors and sentiment analysis in improving prediction accuracy.
To compare the performance of various AI/ML models and determine their relative strengths in
cryptocurrency market forecasting.
1.3 Significance of the Study
This thesis contributes to the expanding knowledge base on financial time-series forecasting using
advanced AI and Deep Learning algorithms (Sezer et al., 2020) and to the emerging trend of AI
agents. By focusing specifically on the cryptocurrency market, it addresses a gap in the existing research
and provides valuable insights for both academic and practical applications.
The significance of this study lies in several key areas:
Advanced Modeling Techniques: By implementing and comparing multiple time-series
models, including deep learning approaches like LSTM and DeepAR, this research contributes to
the understanding of which techniques are most effective for cryptocurrency price
prediction (Casolaro et al., 2023).
Comprehensive Data Integration: The study incorporates a wide range of data sources, web scraped
and otherwise, including technical indicators, macro-economic variables, the Fear and Greed
index, uncertainty indices and aggregated social media analytics, providing a holistic approach to
market analysis.
Multi-Agent AI Framework: The development of a multi-agent AI compound system
represents an innovative approach to cryptocurrency market analysis, combining the strengths of
different AI agents to produce comprehensive investment recommendations.
Methodological Contribution: By addressing the challenges of data collection, preprocessing,
and feature engineering in the context of cryptocurrency markets, this research provides valuable
insights for future studies in this field (Lotfi et al., 2021).
In conclusion, this study aims to advance the understanding of cryptocurrency market dynamics through
the application of cutting-edge AI and machine learning techniques. By developing a multi-faceted
approach that combines advanced time-series modeling with a multi-agent framework, this research seeks
to provide a more accurate and comprehensive tool for cryptocurrency market analysis and prediction.
2. Literature Review
2.1 Cryptocurrency Market Analysis
The emergence of cryptocurrencies has revolutionized the financial landscape, introducing a novel asset
class that challenges traditional economic paradigms. Bitcoin, the pioneering cryptocurrency, has become
a focal point for investors and researchers alike, owing to its unique characteristics and market behavior
(Narayanan et al., 2016). The cryptocurrency market's inherent volatility and the complex interplay of
factors influencing price movements have spurred the development of sophisticated analytical
approaches.
Market analysis in the cryptocurrency domain encompasses a wide array of techniques, ranging from
fundamental analysis to advanced statistical methods. Fundamental analysis in this context often involves
scrutinizing blockchain metrics, such as transaction volumes, active addresses and hash rates, to gauge
network health and adoption rates (Koutmos, 2018). These on-chain indicators provide insights into user
engagement and potential price trajectories, offering a unique perspective not available in traditional
financial markets.
Technical analysis, a staple of conventional market analysis, has found new applications in cryptocurrency
trading. Traders and analysts employ various chart patterns, trend indicators, and oscillators to identify
potential entry and exit points. However, the effectiveness of these tools in the highly volatile
cryptocurrency market remains a subject of ongoing debate and research (Corbet et al., 2019).
Figure 2.1.1: Daily Candlestick Chart of Bitcoin/TetherUS on Binance Exchange. Source:
CoinMarketCap 2024
Figure 2.1.1 illustrates the daily candlestick chart for Bitcoin/TetherUS (BTC/USDT) on the Binance
exchange, covering a period from late March to late July 2024. Each candlestick represents one day of
trading activity, with green candlesticks indicating a higher closing price compared to the opening price,
and red candlesticks indicating a lower closing price compared to the opening price.
The chart also features Bollinger Bands, a technical analysis tool comprising three components:
A 20-day simple moving average (SMA) as the middle line.
An upper band set two standard deviations above the SMA.
A lower band set two standard deviations below the SMA.
These bands provide insights into market volatility and potential overbought or oversold conditions,
facilitating better-informed trading decisions. Beneath the candlestick chart, the volume bars represent
the daily trading volume, which serves as an indicator of market activity and liquidity. Higher volume
bars correspond to days with increased trading activity, which can signify stronger market trends or
investor sentiment. This figure is essential for understanding the dynamics of Bitcoin's price movements
and market sentiment. By observing the interaction between the candlesticks and the Bollinger Bands,
along with trading volumes, researchers can gain insights into the market's response to various factors
and identify potential trading opportunities or risks. This analysis is crucial for developing predictive
models and strategies in cryptocurrency trading and investment.
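As a minimal sketch of the Bollinger Band construction described above, the following Python function computes the three lines from a list of closing prices. The function name, the sample prices and the choice of the population standard deviation are assumptions made for illustration; charting platforms may follow a slightly different convention.

```python
from statistics import fmean, pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (middle, upper, lower) lists aligned with `closes`.

    Entries are None until `window` prices are available.
    Assumption: population standard deviation (pstdev) over the window.
    """
    middle, upper, lower = [], [], []
    for i in range(len(closes)):
        if i + 1 < window:
            # Not enough history yet for a full window.
            middle.append(None)
            upper.append(None)
            lower.append(None)
            continue
        recent = closes[i + 1 - window : i + 1]
        sma = fmean(recent)                  # middle line: simple moving average
        band = num_std * pstdev(recent)      # band half-width
        middle.append(sma)
        upper.append(sma + band)
        lower.append(sma - band)
    return middle, upper, lower


# Hypothetical closing prices, for illustration only.
closes = [100.0 + (i % 5) for i in range(25)]
middle, upper, lower = bollinger_bands(closes)
```

A close above the upper band or below the lower band is the kind of event the text refers to as a potential overbought or oversold condition; whether it is actionable depends on the trading context.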
Additionally, the integration of sentiment analysis has emerged as a crucial component of cryptocurrency
market analysis. Given the decentralized nature of cryptocurrencies and the significant role of social
media in information dissemination, monitoring public sentiment can provide valuable insights into
market dynamics. Studies have demonstrated correlations between social media sentiment and
cryptocurrency price movements, highlighting the importance of incorporating this dimension into
analytical models (Karalevicius et al., 2018).
Macroeconomic factors have also been shown to influence cryptocurrency markets, albeit in ways that
may differ from traditional asset classes. Research has explored the relationships between cryptocurrency
prices and variables such as interest rates, inflation, and geopolitical events, revealing complex and
sometimes counterintuitive connections (Panagiotidis et al., 2018).
The advent of machine learning and artificial intelligence has opened new avenues for cryptocurrency
market analysis. These technologies enable the processing of vast amounts of data from diverse sources,
uncovering patterns and relationships that may not be apparent through conventional analysis methods.
Deep learning models, in particular, have shown promise in capturing the non-linear dynamics of
cryptocurrency markets (McNally et al., 2018). Despite these advancements, cryptocurrency market
analysis faces unique challenges.
The market's 24/7 nature, global accessibility, and the influence of regulatory developments create a
complex ecosystem that defies simple modeling approaches. As the cryptocurrency market continues to
evolve, so do the methods and tools for its analysis. The integration of multidisciplinary approaches,
combining insights from finance, computer science, and behavioral economics, represents a promising
direction for future research. By synthesizing diverse analytical techniques and leveraging cutting-edge
technologies, researchers and practitioners aim to develop more robust and accurate models for
understanding and predicting cryptocurrency market behavior.
2.2 Artificial Intelligence (AI) in Financial Forecasting
The application of machine learning techniques to financial forecasting has ushered in a new era of
predictive analytics, transforming the landscape of investment strategies and risk management. As
financial markets generate vast quantities of data, traditional statistical methods often fall short in
capturing the intricate patterns and non-linear relationships inherent in these complex systems.
Machine learning algorithms, with their ability to process and learn from large datasets, have emerged as
powerful tools for deciphering market trends and predicting future movements (Henrique et al., 2019).
Among the various machine learning approaches, supervised learning algorithms have gained significant
traction in financial forecasting. These methods, including support vector machines (SVM), random
forests, and gradient boosting trees, have demonstrated remarkable efficacy in predicting stock prices and
market indices. For instance, a study by Patel et al. (2015) compared the performance of ANN, SVM,
random forest, and naive Bayes in forecasting stock and stock price index movements, revealing that random forest
outperformed other techniques in terms of prediction accuracy.
Deep learning, a subset of Machine Learning (ML) and AI inspired by the structure and function of the
human brain, has shown particular promise in financial time series forecasting. Recurrent Neural
Networks (RNNs), especially Long Short-Term Memory (LSTM) networks, have excelled in capturing
temporal dependencies in financial data. These models can remember important information over long
periods, making them well-suited for analyzing market trends that unfold over time.
Fischer and Krauss (2018) demonstrated the superiority of LSTM networks over deep neural networks,
random forests, and logistic regression classifiers in predicting stock market movements. The integration
of natural language processing (NLP) with machine learning has opened new avenues for financial
forecasting. By analyzing textual data from news articles, social media, and financial reports, NLP
techniques can gauge market sentiment and extract valuable insights that impact asset prices.
Xing et al. (2018) developed a novel deep learning framework that combines NLP with technical analysis
for stock market prediction, achieving improved accuracy compared to traditional methods. Ensemble
methods, which combine predictions from multiple models, have gained popularity in financial
forecasting due to their ability to reduce overfitting and improve generalization. Techniques such as
bagging (Random Forest), boosting (XGBoost, GBM, LightGBM), and stacking have been successfully
applied to various financial prediction tasks. For example, Weng et al. (2018) proposed an ensemble
approach combining multiple neural networks for stock market forecasting, demonstrating enhanced
predictive performance compared to individual models.
Despite the promising results, machine learning in financial forecasting faces several challenges. The non-
stationary nature of financial time series, where statistical properties change over time, poses difficulties
for model generalization. Additionally, the interpretability of complex machine learning models remains
a concern, particularly in regulated financial environments where decision-making processes must be
transparent (Bussmann et al., 2020).
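The non-stationarity issue can be illustrated with a simulated random walk, a common stylized model for price levels. The sketch below does not use the thesis data; the number of paths, the Gaussian step distribution and the seed are assumptions chosen for illustration. It compares the cross-sectional variance of the level at two points in time with the variance of the first differences.

```python
import random
from statistics import pvariance


def simulate_random_walks(n_paths=2000, n_steps=400, seed=0):
    """Simulate driftless random walks with unit-variance Gaussian steps."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        level, path = 0.0, []
        for _ in range(n_steps):
            level += rng.gauss(0.0, 1.0)
            path.append(level)
        paths.append(path)
    return paths


paths = simulate_random_walks()

# Variance of the level across paths grows roughly linearly with time,
# so the level series is non-stationary.
var_early = pvariance([p[99] for p in paths])     # after 100 steps
var_late = pvariance([p[399] for p in paths])     # after 400 steps

# First differences recover the i.i.d. steps, whose variance stays near 1.
diff_early = pvariance([p[99] - p[98] for p in paths])
diff_late = pvariance([p[399] - p[398] for p in paths])

print(f"level variance: {var_early:.1f} -> {var_late:.1f}")
print(f"difference variance: {diff_early:.2f} -> {diff_late:.2f}")
```

In this stylized setting the level variance roughly quadruples between step 100 and step 400, while the differenced series keeps a stable variance, which is why differencing or return transformations are often applied before modeling.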
As the field evolves, researchers are exploring innovative approaches to address these challenges.
Transfer learning techniques, which allow models trained on one task to be applied to related tasks, show
potential in improving model adaptability to changing market conditions. Moreover, the development of
explainable AI methods aims to enhance the interpretability of machine learning models in financial
applications (Bracke et al., 2019).
The integration of machine learning with traditional financial theories and the knowledge of subject
matter experts (SMEs) represents a promising direction for future research. By combining data-driven insights with
domain expertise, hybrid approaches can potentially yield more robust and reliable financial forecasting
models. As computational capabilities continue to advance and new algorithms emerge, the role of
machine learning in shaping financial decision-making processes is poised to grow, offering
unprecedented opportunities for market participants to gain competitive advantages in an increasingly
complex financial landscape.
A/A | Author | Year | Title | Data categories marked "-" (of Technical Indices, Social media, Macro-economic Factors, Blockchain, Crypto News) | Models | Conclusions
1 | Panagiotidis et al. | 2018 | On the determinants of bitcoin returns: a LASSO approach | 1 of 5 | GLMNET LASSO, Least Angle Regression LASSO | The study identifies key determinants of Bitcoin returns using the LASSO regression method. Google trends, gold returns, and policy uncertainty emerge as the most significant factors influencing Bitcoin returns. Various other factors such as stock market indices, exchange rates, and central bank rates also affect Bitcoin returns to varying degrees, with these relationships varying across different sub-periods examined within the study timeframe.
2 | Wu et al. | 2018 | A new Forecasting Framework for Bitcoin Price with LSTM | 4 of 5 | Conventional LSTM, LSTM with Autoregressive (AR) features | The paper presents a new forecasting framework that utilizes LSTM models for predicting Bitcoin's daily price. Two variations of the LSTM model were evaluated: a conventional LSTM and an LSTM combined with autoregressive (AR) features. The results indicate that incorporating AR features significantly enhances the forecasting accuracy of the LSTM model. The proposed model outperformed the conventional LSTM in terms of various performance metrics, including MSE = 61170.21, RMSE = 247.33, MAE = 176.37, and MAPE = 2.553.
3 | McNally et al. | 2018 | Predicting the Price of Bitcoin Using Machine Learning | 3 of 5 | ARIMA, RNN, LSTM | The research conducted aimed to predict the price direction of Bitcoin using various ML models. The LSTM model yielded the highest classification accuracy (52%) and an RMSE of 8%, outperforming the RNN and significantly surpassing the ARIMA model's accuracy and RMSE. These findings support the effectiveness of LSTM models for the prediction of Bitcoin prices, demonstrating their capacity to recognize longer-term dependencies in the dataset.
4 | Yiying et al. | 2019 | Cryptocurrency Price Analysis with Artificial Intelligence | 4 of 5 | ANN, LSTM (testing multiple memory lengths) | The study found that both ANN and LSTM models are effective in cryptocurrency price prediction, with ANN relying more on long-term history and LSTM on short-term dynamics. Despite their different internal structures, both frameworks can predict prices with a reasonable degree of accuracy. This indicates that cryptocurrency market price is predictable to some extent, but the efficiency of prediction varies depending on the model used.
5 | Pang et al. | 2019 | Cryptocurrency Price Prediction using Time Series and Social Sentiment Data | 3 of 5 | ARIMA, ARIMAX, LSTM, Decision Tree Models | The integration of sentiment data with ML models significantly improves the accuracy of cryptocurrency price prediction. LSTM models in particular showed robust performance, while decision trees helped filter false positive signals. Sentiment data from social media is critical for capturing the non-linear relationship between price and market behaviors. The results suggest a potential for data-driven algorithmic trading in cryptocurrency markets.
6 | Livieris et al. | 2020 | Ensemble Deep Learning Models for Forecasting Cryptocurrency Time-Series | 4 of 5 | CNN-LSTM, CNN-BiLSTM (Ensemble-Averaging / Bagging Ensemble / Stacking Ensemble) | The study combined three ensemble learning strategies with advanced deep learning models to forecast major cryptocurrency (BTC, ETH, XRP) hourly prices. It was found that ensemble learning strategies could improve the accuracy of price predictions when combined with deep learning techniques, such as LSTM and Bi-directional LSTM networks. The application of these models could result in strong, stable, and reliable forecasting models, beneficial for decision-making and portfolio optimization in the volatile cryptocurrency market.
7 | Derbentsev et al. | 2020 | Comparative Performance of Machine Learning Ensemble Algorithms for Forecasting Cryptocurrency Prices | 4 of 5 | Stochastic GBM (SGBM), Random Forest (RF) | The study verifies the applicability of the ML ensembles approach for the forecasting of cryptocurrency prices. The out-of-sample accuracy of short-term prediction of daily close prices obtained by SGBM and RF in terms of Mean Absolute Percentage Error (MAPE) for the three most capitalized cryptocurrencies (BTC, ETH, and XRP) was within 0.92-2.61%.
8 | Iqbal et al. | 2021 | Time-Series Prediction of Cryptocurrency Market using Machine Learning Techniques | 4 of 5 | ARIMA, XGBoost, FBProphet | The study applied ARIMA, FBProphet, and XGBoost machine learning techniques to forecast BTC prices. ARIMA was determined to be the most accurate model for forecasting Bitcoin prices in the cryptocurrency market, with the lowest RMSE score of 322.4 and MAE score of 227.3, indicating its effectiveness over the other models tested.
9 | Parekh et al. | 2022 | DL-GuesS: Deep Learning and Sentiment Analysis-Based Cryptocurrency Price Prediction | 3 of 5 | Hybrid model of LSTM and GRU NNs | The proposed DL-GuesS framework integrates DL models with sentiment analysis to predict cryptocurrency prices, specifically DASH, LTC, BTC and BCH. The study demonstrates that the inclusion of sentiment analysis from social media significantly enhances the model's prediction capabilities, underscoring the value of fusing various data types to improve forecasting in the highly volatile cryptocurrency market. This holistic approach highlights the importance of both historical data and real-time market sentiment in predicting cryptocurrency price fluctuations and can serve as a robust model for investors and financial institutions.
10 | Murray K et al. | 2023 | On Forecasting Cryptocurrency Prices: A Comparison of Machine Learning, Deep Learning, and Ensembles | 3 of 5 | LSTM, Temporal Fusion Transformer (TFT), hybrid models | Deep learning approaches, especially LSTM, consistently perform best across various cryptocurrencies. This study also introduced the TFT model, demonstrating its potential in improving prediction accuracy.
11 | Proposed approach | | Advanced AI and Machine Learning Models in Cryptocurrency Market Analysis and Prediction | 1 of 5 | ARIMA, FBProphet, DeepAR, TFT, SGD, NPTS | The proposed approach demonstrates that machine learning models, especially Prophet, excel in Bitcoin price prediction by capturing complex market dynamics. Integrating macroeconomic indicators significantly improves predictive accuracy. The developed AI multi-agent framework, combining specialized AI agents, enables market analysis and robust investment recommendations.
Table 2.2.1: Literature Review of ML and AI approaches for Cryptocurrency Price Prediction
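Several studies in Table 2.2.1 report MSE, RMSE, MAE and MAPE. For reference, the sketch below computes these four error measures in plain Python under their usual definitions; the function name and the sample prices are hypothetical, and individual papers may scale or normalize the metrics differently (for example, reporting RMSE as a percentage).

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when an actual value is zero; prices are assumed positive.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


# Hypothetical prices, for illustration only.
metrics = forecast_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(metrics["MAPE"])  # mean of 10%, 5% and 0% absolute percentage errors
```

Because RMSE is the square root of MSE, the two figures quoted for Wu et al. (2018) can be cross-checked: the square root of 61170.21 is approximately 247.33, which matches the reported RMSE.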
2.3 Multi-Agent AI Systems in Finance
The integration of multi-agent artificial intelligence (AI) systems into the financial sector represents a
paradigm shift in how complex market dynamics are modeled, analyzed, and predicted. These systems,
comprising multiple autonomous AI entities collaborating within a shared environment, offer a nuanced
approach to tackling the multifaceted challenges inherent in financial markets. Before going any further,
it is worth clarifying what an AI agent is.
AI agents are autonomous or semi-autonomous systems capable of planning, reasoning, acting and
accessing memory in order to make decisions with minimal human intervention. They utilize large language
models (LLMs) as the core "brain," supported by several key components that enable complex problem-
solving and task execution. This framework allows the agent to tackle intricate challenges by breaking
them down into manageable parts and leveraging various tools and capabilities.
Key Components
Planning
Subgoal Decomposition: The agent breaks down large, complex tasks into smaller,
more manageable subgoals. This enables efficient handling of multifaceted problems by
addressing each segment systematically.
Reflection and Refinement: The agent engages in self-criticism and self-reflection on
past actions, learning from mistakes to improve future performance and enhance the
quality of final results.
Memory
Short-term Memory: Utilizes in-context learning techniques to retain and apply recent
information.
Long-term Memory: Employs external vector stores or databases for retaining and
quickly retrieving large amounts of information over extended periods.
Tool Use
The agent learns to interact with external APIs and resources to access information not
contained in its pre-trained model weights. Examples include connecting to Google, SEC and
ML model APIs.
Figure 2.3.1: Overview of an LLM-powered autonomous agent system.
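The planning, memory and tool-use components listed above can be made concrete with a toy skeleton. The sketch below is purely illustrative and is not the framework used in this thesis: the class name, the tool names and the string-splitting "planner" are hypothetical stand-ins, no LLM is called, and the reflection step is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyAgent:
    """Illustrative agent skeleton; a real agent would call an LLM to plan and reflect."""

    tools: Dict[str, Callable[[str], str]]
    short_term: List[str] = field(default_factory=list)      # recent context
    long_term: Dict[str, str] = field(default_factory=dict)  # stand-in for a vector store

    def plan(self, task: str) -> List[str]:
        # Subgoal decomposition: a hard-coded split on ";" replaces LLM planning.
        return [step.strip() for step in task.split(";") if step.strip()]

    def act(self, subgoal: str) -> str:
        # Tool use: the first word names the tool, the rest is its input.
        name, _, argument = subgoal.partition(" ")
        if name not in self.tools:
            return f"no tool for '{name}'"
        return self.tools[name](argument)

    def run(self, task: str) -> List[str]:
        results = []
        for subgoal in self.plan(task):
            # Long-term memory: reuse a stored answer instead of calling the tool again.
            if subgoal in self.long_term:
                result = self.long_term[subgoal]
            else:
                result = self.act(subgoal)
                self.long_term[subgoal] = result
            self.short_term.append(f"{subgoal} -> {result}")
            results.append(result)
        return results


# Hypothetical tools standing in for external APIs.
tools = {
    "price": lambda symbol: f"latest price for {symbol} (stub)",
    "news": lambda topic: f"headlines about {topic} (stub)",
}
agent = ToyAgent(tools=tools)
outputs = agent.run("price BTC; news bitcoin etf; chart BTC")
```

In an LLM-powered agent, the `plan` and `act` steps would be driven by model output rather than string parsing, and the long-term store would typically be an external vector database as described in the text.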
Multi-agent frameworks involve multiple autonomous entities working together to tackle complex tasks.
These frameworks enable decentralized decision-making, coordination, and emergent behaviors, leading
to more efficient, robust, and scalable solutions in various domains, including finance.
One of the innovative applications of multi-agent AI systems in finance is anomaly detection. A multi-
agent framework grounded in LLMs can advance anomaly detection in financial markets by alleviating
the manual labor involved in verifying anomaly alerts through a network of AI agents, each specializing
in distinct aspects like data analysis and report consolidation. This framework exemplifies how AI can
augment human capabilities, specifically in financial monitoring, and represents a leap towards
autonomous financial analysis (Park, 2024).
AI agents have evolved to employ LLMs for sophisticated functionalities in trading and financial
decision-making. Platforms like FinRobot support multiple financially specialized AI agents, each
powered by LLMs, to tackle intricate tasks in a cooperative manner (Yang et al., 2024).
Multi-agent AI systems, particularly those leveraging LLMs and reinforcement learning, are transforming
the financial sector, including cryptocurrency markets. These systems enhance operational efficiency,
provide data-driven insights, and enable autonomous financial analysis and decision-making. As these
technologies continue to evolve, they promise to further revolutionize the way financial markets operate,
offering new opportunities for innovation and efficiency.
3. Methodology
3.1 Scope of the study
This study processes and analyzes cryptocurrency data with the aim of forecasting the Bitcoin (BTC)
Adjusted Close Price. The predictors include macro-economic indicators, uncertainty indices, technical
indicators and social media sentiment. Combining data from multiple sources yields a comprehensive
dataset for analysis and machine learning purposes.
Time Period
The analysis period spans from 2018 to the end of March 2024. This timeframe was strategically selected
to encompass the most recent data available, while also capturing intervals both before and after the
onset of the COVID-19 pandemic.
Data Capture: A total of 2,282 daily records were gathered.
Figure 3.1.1: Overall architecture of the BTC prediction model pipeline.
Figure 3.1.1 gives an overview of the pipeline, which runs from data acquisition and scraping, through
data pre-processing and feature selection, to time-series modeling and evaluation.
These stages are analyzed in detail in the following sections.
3.2 Data Collection and Preprocessing
Data collection is the foundational step in any AI/ML project. It involves gathering relevant data that
aligns with the project's goals and objectives. This data can come from various sources, including internal
databases, public datasets, APIs, and web scraping techniques.
In our case, data were scraped from the web and drawn from Kaggle and static files. The main data
sources are:
Yahoo Finance: This provides historical data on crypto prices, ETFs, BTC ETFs, gold price etc.
Fear and Greed Index: This index measures the fear and greed sentiment in the BTC market.
FRED (Federal Reserve Economic Data): This provides data related to GDP, CPI, Inflation
rates, mostly for the US market.
Social media (Twitter-X): This refers to the number of tweets related to the BTC topic, derived
from Kaggle.
Technical Indicators: Includes 85 technical indicators to analyze Bitcoin’s market dynamics.
CoinMarketCap: This provides data on different cryptocurrencies, including their market cap,
volume, cryptocurrency dominance.
Global Policy Uncertainty (GPR): This provides data from Global Policy Uncertainty.
World Uncertainty Index (WUI): This provides data from the World Uncertainty Index.
For additional information on the above features, refer to Appendix 6.2.
The most important concept to understand is the OHLC framework, which is also used to derive the
target value of the model. The Open, High, Low, and Close (OHLC) features are crucial
elements in the technical analysis of Bitcoin, providing a detailed snapshot of price activity over a
specified period, as already shown in Figure 2.1.1.
The 'Open' price signifies the initial trading value at the beginning of the period, while the 'Close' price
represents the final trading value at the end. The 'High' price denotes the peak value reached during the
period, and the 'Low' price indicates the lowest value. Those features are also used to produce technical
indicators. Together, these metrics help in identifying trends, assessing volatility, and spotting potential
market reversals, all of which are essential for traders and analysts in developing strategies and making
informed decisions.
The importance of OHLC data is especially pronounced in the context of Bitcoin due to its significant
price fluctuations, driven by market sentiment, regulatory developments, and technological changes.
Figure 3.2.1: Candlestick interpretation showing the OHLC values.
Combining multiple data sources necessitated extensive data pre-processing to ensure the integrity and
usability of the dataset. The sources report data at different frequencies: depending on the source, the
granularity is daily, monthly or quarterly. To merge them, all series must first be brought to a common
frequency.
This involves several crucial steps like:
1. Data Cleaning
Resampling: Converting the data from all sources to a daily frequency so that every
record is a daily record.
Missing Values: Handling missing values within the data, by using a hybrid approach of
back filling and linear interpolation.
Handling Outliers: Identifying and mitigating anomalies to produce a refined dataset.
2. Feature Engineering
Lag Features: Creating features for Bitcoin OPEN, HIGH, LOW, CLOSE and VOLUME
that incorporate previous time steps (1, 3, 5 and 7 days) to enrich the dataset.
Engineered Features: Developing additional features derived from the cleaned data to
enhance predictive power and analytical depth.
These pre-processing steps were essential in transforming raw data into a structured and insightful
dataset, paving the way for robust analysis and accurate conclusions.
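The cleaning and lag steps above can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the exact code used in the study: the helper names `to_daily` and `add_lags`, the toy values, and the order of interpolation before back-filling are all hypothetical choices.

```python
import numpy as np
import pandas as pd

def to_daily(series: pd.Series) -> pd.Series:
    """Upsample a lower-frequency series to daily and fill the gaps."""
    daily = series.resample("D").asfreq()        # missing calendar days become NaN
    daily = daily.interpolate(method="linear")   # fill interior gaps linearly
    return daily.bfill()                         # back-fill any leading NaNs

def add_lags(df: pd.DataFrame, columns, lags=(1, 3, 5, 7)) -> pd.DataFrame:
    """Append lagged copies of the given columns (lag measured in rows/days)."""
    out = df.copy()
    for col in columns:
        for lag in lags:
            out[f"{col}_lag{lag}"] = out[col].shift(lag)
    return out

# Toy monthly macro series (hypothetical values)
monthly = pd.Series(
    [100.0, 103.0, 106.0],
    index=pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
    name="CPI",
)
cpi_daily = to_daily(monthly)

# Toy daily BTC closes (hypothetical values)
days = pd.date_range("2024-01-01", periods=10, freq="D")
btc = pd.DataFrame({"CLOSE": np.arange(10, dtype=float)}, index=days)
btc_lagged = add_lags(btc, ["CLOSE"])
```

Lagging creates missing values in the first rows of each lag column (seven rows for the 7-day lag), so those rows would need to be dropped or filled before modeling.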
Exploratory Data Analysis
Exploratory data analysis revealed significant insights from these pre-processed datasets. The OHLC
framework, fundamental in technical analysis, provided a detailed snapshot of Bitcoin’s price activity over
the analysis period.
The figure below shows the adjusted close prices of BTC and ETH, two of the best-known and most
important cryptocurrencies in the market. It lets us examine how prices fluctuated over specific
periods. For BTC, which is our target, the following milestones stand out:
11 May 2020: Bitcoin’s block reward is halved, from 12.5 to 6.25 Bitcoins. (This is one of the
reasons that the price went up in later months.)
7 September 2021: El Salvador officially becomes the first country in the world to accept Bitcoin
as legal tender.
24 September 2021: China's top regulators ban crypto trading and mining, causing the Bitcoin
price to drop.
11 November 2022: The bankruptcy and collapse of the cryptocurrency exchange FTX causes a
wide ripple effect across cryptocurrency markets, with the price of Bitcoin falling to its lowest
level in two years.
10 January 2024: The SEC approves 11 spot Bitcoin ETFs.
Figure 3.2.2: Adjusted Close prices for BTC and ETH for 2018 – 2024 in Python.
Figure 3.2.3: Trading volume for BTC and ETH for 2018 – 2024 in Python.
The same insights can be derived from the trading volume. In addition, Figure 3.2.3 shows a large
surge in the trading volume on 26 February 202, due to institutional changes.
Figure 3.2.4: BTC Candlestick chart for 2024 in Python.
The above figure shows the candlestick chart for the BTC price from January 2024 to the end of March
2024. The candles encode the OHLC values discussed in Section 3.2. The x-axis represents the
traded volume in millions and the y-axis the BTC price in USD.
The three moving-average lines describe:
Short-term trends: The 10-day moving average (blue) responds more quickly to recent price
changes.
Medium-term trends: The 20-day moving average (orange) provides a balance between
responsiveness and smoothing.
Longer-term trends: The 30-day moving average (green) offers a broader view of the price
trend.
Figure 3.2.5: Correlation of Adjusted Close prices for BTC and ETH for 2018 – 2024 in Python.
The correlation heatmap in Figure 3.2.5 reveals a highly positive correlation (0.93) between the adjusted
closing prices of Bitcoin (BTC) and Ethereum (ETH). This indicates that these two cryptocurrencies
tend to move in the same direction, suggesting a strong interdependence. This high correlation invites
further investigation into the shared factors influencing these price movements, such as market
sentiment, macroeconomic trends, and regulatory impacts.
For additional information regarding the data processing and EDA, you can see the documentation and
information in this GitHub repository: https://github.com/StamKavid/FinAgent
3.3 Feature Selection
Feature selection is a critical step in machine learning, essential for enhancing model performance by
identifying the most relevant features. This process reduces model complexity, improves generalization,
and increases accuracy (Stańczyk, 2015).
Techniques for feature selection are broadly categorized into:
Supervised
o Methods like: Information Gain, Chi-square Test, Fisher’s Score, and Correlation
Coefficient, leverage labeled data to enhance the efficiency of models like regression and
classification.
Unsupervised
o Methods like: Variance Threshold, Mean Absolute Difference, and Dispersion Ratio, are
applied to unlabeled data to select features based on intrinsic properties.
Further classification of these techniques includes:
Filter methods, such as Information Gain and the Chi-square Test, are computationally efficient
and suitable for high-dimensional data.
Wrapper methods, such as Forward Feature Selection, Backward Feature Elimination, and Recursive
Feature Elimination, provide better predictive accuracy by evaluating various feature subsets
through model training.
Embedded methods, such as LASSO Regularization and Random Forest Importance, combine the
benefits of both filter and wrapper methods by integrating feature selection into the model training
process.
Hybrid methods combine two or more of the above.
The proposed methodology uses a hybrid method that combines the Filter and Embedded approaches,
because the combined dataset ended up with a high number of dimensions (329).
First, an algorithm ranks the features by their Pearson correlation with the target (Adjusted Close
BTC Price) and keeps the top 50. Then a Stochastic Gradient Descent (SGD) algorithm is used to
determine the feature importance, so that the exogenous variables of the time-series model can be
narrowed down even further, as shown in Figure 3.3.1.
Figure 3.3.1: Feature selection pipeline for narrowing down the dimensions of the data.
This methodology was motivated by experiments that used all the features in a time-series model,
which exposed the following issues:
Overfitting risk: High-dimensional models are more prone to overfitting.
Computational Efficiency: Models with fewer features are computationally less expensive to
train and evaluate.
Interpretability: A model with a smaller number of features is easier to interpret and understand.
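The two-stage selection described in this section can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the function names `top_k_by_pearson` and `sgd_linear_importance` are hypothetical, and the absolute weights of a linear model fitted by plain SGD stand in for the importance scores used in the study.

```python
import numpy as np

def top_k_by_pearson(X, y, k):
    """Indices of the k columns of X with the largest |Pearson r| against y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    return np.argsort(-np.abs(r))[:k]

def sgd_linear_importance(X, y, epochs=50, lr=0.01, seed=0):
    """Fit a linear model with plain SGD on standardized data; return |weights|."""
    rng = np.random.default_rng(seed)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(ys)):
            err = Xs[i] @ w - ys[i]
            w -= lr * err * Xs[i]          # gradient of 0.5 * err**2
    return np.abs(w)

# Synthetic data: only columns 0 and 3 drive the target
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=500)

filtered = top_k_by_pearson(X, y, k=4)                 # filter step (top 50 in the study)
importance = sgd_linear_importance(X[:, filtered], y)  # embedded step
ranked = filtered[np.argsort(-importance)]             # most important feature first
```

Standardizing the inputs before fitting matters here, because otherwise the magnitude of a weight would reflect the scale of its feature rather than its contribution.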
3.4 Time-Series Models for Bitcoin Price Prediction
3.4.1 Statistical Models
Time series analysis is a crucial tool in understanding and forecasting cryptocurrency market behavior.
Statistical models play a pivotal role in this analysis, offering a structured approach to capturing the
underlying patterns, trends, and volatilities inherent in cryptocurrency price movements (Chu et al., 2017).
These models are designed to account for various characteristics of financial time series data, such as:
1. Trend: Long-term movement in the series
2. Seasonality: Repeating patterns or cycles
3. Autocorrelation: Relationship between a variable's current value and its past values
4. Heteroskedasticity: Varying volatility over time
Two popular models that address these characteristics and are mainly used in the financial sector are
SARIMAX and GARCH. These models have shown effectiveness in analyzing and forecasting
cryptocurrency markets (Dyhrberg, 2016).
3.4.1.1 SARIMAX
SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous variables) is an
extension of the ARIMA model that incorporates seasonal components and exogenous variables. It is
particularly useful for cryptocurrency time series analysis as it can capture both the intrinsic patterns in
the data and the influence of external factors (Box et al., 2015).
Mathematical Representation
The SARIMAX model is denoted as:
\[ \text{SARIMAX}(p, d, q) \times (P, D, Q)_s \]
where:
$p$ is the order of the non-seasonal AR part,
$d$ is the degree of non-seasonal differencing,
$q$ is the order of the non-seasonal MA part,
$P$, $D$, and $Q$ are the seasonal orders of the AR, differencing, and MA parts, respectively,
$s$ is the number of periods in a season,
$X_t$ represents the exogenous variables that may influence the time series.
The full equation of the SARIMAX model is:
\[ y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \sum_{i=1}^{P} \Phi_i y_{t-is} + \sum_{i=1}^{Q} \Theta_i \varepsilon_{t-is} + \beta X_t + \varepsilon_t \]
where:
$y_t$ is the time series at time $t$,
$c$ is a constant (intercept),
$\phi_i$ are the coefficients for the non-seasonal AR terms,
$\theta_i$ are the coefficients for the non-seasonal MA terms,
$\Phi_i$ are the coefficients for the seasonal AR terms,
$\Theta_i$ are the coefficients for the seasonal MA terms,
$\beta$ is the coefficient matrix for the exogenous variables $X_t$,
$\varepsilon_t$ is the error term at time $t$.
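To make the notation concrete, consider a hypothetical specification with p = q = P = Q = 1, no differencing (d = D = 0) and a weekly season (s = 7). These orders are chosen purely for illustration and are not the ones fitted in this study. Writing c for the intercept, φ and θ for the non-seasonal AR and MA coefficients, Φ and Θ for their seasonal counterparts, and β for the exogenous coefficients, the model reduces to:

```latex
\[
y_t = c + \phi_1 y_{t-1} + \theta_1 \varepsilon_{t-1}
        + \Phi_1 y_{t-7} + \Theta_1 \varepsilon_{t-7}
        + \beta X_t + \varepsilon_t
\]
```

Today's value then depends on yesterday's value and shock, on the value and shock from the same weekday one week earlier, and on the current exogenous variables.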
3.4.1.2 GARCH
GARCH (Generalized AutoRegressive Conditional Heteroskedasticity) is a statistical model specifically
designed to handle heteroskedasticity in time series data. It's particularly useful in cryptocurrency analysis
due to the high volatility often observed in these markets (Engle, 2001).
Mathematical Representation
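The GARCH(p, q) model is represented as:
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \]
where:
$\sigma_t^2$ is the conditional variance at time $t$
$\omega$ is a constant term
$\varepsilon_t$ is the error term
$\alpha_i$ and $\beta_j$ are the ARCH and GARCH parameters respectively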
The GARCH model allows the conditional variance to be dependent upon its own lags, which is
particularly useful for capturing volatility clustering often observed in cryptocurrency markets (Chu et al.,
2017).
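The recursion behind volatility clustering can be sketched for the GARCH(1, 1) case as follows. This is a minimal illustration, assuming zero-mean returns as the error terms and hand-picked (not estimated) parameter values; the function name `garch11_variance` is hypothetical.

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance: sigma2[t] = omega + alpha*eps[t-1]**2 + beta*sigma2[t-1]."""
    eps = np.asarray(returns, dtype=float)
    sigma2 = np.empty_like(eps)
    sigma2[0] = eps.var()                  # common initialization: sample variance
    for t in range(1, len(eps)):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Hypothetical daily returns, treated as zero-mean shocks
returns = np.array([0.01, -0.02, 0.05, -0.04, 0.00, 0.01])
sigma2 = garch11_variance(returns, omega=1e-5, alpha=0.1, beta=0.85)
```

The large shock at index 2 raises the conditional variance at index 3, and the high value of beta makes that elevated variance decay only slowly, which is the clustering effect described above.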
3.4.2 Machine Learning Models
3.4.2.1 Prophet
Prophet, developed by Facebook (now Meta), is an open-source forecasting tool designed to handle time
series data with strong seasonal effects and multiple seasons of historical data. It's particularly useful in
cryptocurrency analysis due to its ability to capture both long-term trends and short-term fluctuations
(Taylor & Letham, 2018).
Mathematical Representation
The Prophet model is represented as:
\[ y(t) = g(t) + s(t) + h(t) + \varepsilon(t) \]
where:
y(t) is the observed value at time t
g(t) is the trend component
s(t) is the seasonal component
h(t) is the holiday component
ε(t) is the error term
The trend component g(t) can be modeled as linear or logistic growth. The seasonal component s(t) is
modeled using Fourier series, allowing for multiple periodicities. The holiday component h(t) allows for
incorporating known events that may impact the time series.
Prophet's decomposable time series model and its ability to handle missing data and outliers make it
particularly suitable for analyzing cryptocurrency markets, which often exhibit complex patterns and
sudden changes (Yenidogan et al., 2018).
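The Fourier-series seasonality and the additive structure can be illustrated with a short NumPy sketch. It does not use the Prophet library itself; the helper `fourier_features`, the seasonal weights and the trend values are assumptions made for the example, and the holiday component is omitted.

```python
import numpy as np

def fourier_features(t, period, order):
    """Columns cos(2*pi*n*t/period) and sin(2*pi*n*t/period) for n = 1..order."""
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        cols.append(np.cos(angle))
        cols.append(np.sin(angle))
    return np.column_stack(cols)

t = np.arange(28)                                    # 28 daily time steps
weekly = fourier_features(t, period=7.0, order=3)    # shape (28, 6)

# Additive toy model: linear trend g(t) plus weekly seasonality s(t)
coef = np.array([1.0, 0.5, 0.0, 0.0, 0.0, 0.0])      # hypothetical seasonal weights
g = 100.0 + 2.0 * t
s = weekly @ coef
y_hat = g + s                                        # h(t) is left out of this sketch
```

Because the seasonal term is a linear combination of fixed sine and cosine columns, fitting it reduces to estimating the weights in `coef`, and several periodicities can be handled by stacking feature blocks with different `period` values.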
3.4.2.2 Non-Parametric Time Series (NPTS)
Non-Parametric Time Series (NPTS) techniques offer a versatile approach to analyzing time series data
without relying on specific distribution assumptions or rigid structural models. These methods generate
forecasts by drawing samples from the full range of historical observations in the training dataset.
However, NPTS doesn't treat all past data points equally. Instead, it assigns varying weights to previous
observations based on their temporal distance from the current prediction point. Specifically, NPTS
employs an exponential decay function to weight historical data, giving much higher importance to recent
observations compared to those from the distant past. This weighting scheme ensures that the model
places greater emphasis on more recent trends and patterns when making predictions.
Mathematical Representation
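The function qT(t) is defined as follows:
\[ q_T(t) = \begin{cases} \exp\left(-\alpha \sum_{i=1}^{D} |f_i(t) - f_i(T')|\right) & \text{unweighted} \\ \exp\left(-\sum_{i=1}^{D} a_i |f_i(t) - f_i(T')|\right) & \text{weighted} \end{cases} \]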
where:
t is the current time point for which the prediction is being made.
T′ is a reference time point.
D is the dimensionality of the feature space.
fi(t) is the i-th feature at time t.
α is a scaling factor used in the unweighted version.
ai is a weighting factor specific to the i-th feature in the weighted version.
Unweighted Model: In the unweighted model, all features contribute equally to the computation of
qT(t). The term α acts as a uniform scaling factor across all feature differences.
Weighted Model: In the weighted model, each feature fi is assigned a specific weight ai, allowing for
different contributions of each feature to the overall metric. This model is useful when certain features
are known to be more influential in the prediction process.
NPTS methods offer robustness and flexibility in cryptocurrency analysis, as they can capture complex
patterns without making strong assumptions about the underlying data-generating process. This is
particularly useful given the high volatility and rapid market changes often observed in cryptocurrency
trading (Rangapuram et al., 2023).
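The sampling idea can be sketched in its simplest form, where the only feature is the time index, so the weight of a past observation decays exponentially with its distance from the forecast point. This is a minimal illustration with hypothetical names (`npts_weights`, `npts_sample`) and toy values, not the implementation used in the study.

```python
import numpy as np

def npts_weights(n_obs, alpha):
    """Normalized exponential-decay weights over history indices 0..n_obs-1.

    The forecast point is index n_obs, so the distance of index t is n_obs - t.
    """
    dist = n_obs - np.arange(n_obs)
    w = np.exp(-alpha * dist)
    return w / w.sum()

def npts_sample(history, alpha, n_samples, seed=0):
    """Draw forecast samples for the next step by resampling past values."""
    rng = np.random.default_rng(seed)
    history = np.asarray(history, dtype=float)
    w = npts_weights(len(history), alpha)
    return rng.choice(history, size=n_samples, p=w)

history = np.array([10.0, 11.0, 9.0, 12.0, 15.0, 14.0])   # hypothetical past values
weights = npts_weights(len(history), alpha=0.5)
samples = npts_sample(history, alpha=0.5, n_samples=1000)
```

Every forecast sample is a value that actually occurred in the history, which is why the method needs no distributional assumption. A larger `alpha` concentrates the sampling on the most recent observations.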
3.4.3 Deep Learning Models
3.4.3.1 Temporal Fusion Transformers (TFT)
Temporal Fusion Transformers (TFT) represent a novel architecture specifically designed for
interpretable, multi-horizon forecasting in time series data. Introduced by Lim et al. (2019), TFTs are
particularly well-suited for cryptocurrency analysis due to their ability to handle complex temporal
dynamics and incorporate multiple types of inputs.
Key Components
1. Variable Selection Networks: These networks dynamically select relevant input variables at
each time step, enhancing the model's interpretability and efficiency.
2. Gating Mechanisms: Allow the model to skip over any temporal elements as needed, helping
to mitigate noise in the input data.
3. Temporal Self-Attention Layers: Enable the model to learn long-term dependencies in the time
series data, crucial for capturing market cycles and long-term trends in cryptocurrency prices.
4. Static Covariate Encoders: Process time-independent features, which can be particularly useful
for incorporating metadata about different cryptocurrencies.
Mathematical Representation
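The core of the TFT's multi-head attention mechanism can be represented as:
\[ \text{MultiHead}(Q, K, V) = \text{Concat}(\text{head}_1, \dots, \text{head}_h) \, W^O \]
where each head is computed as:
\[ \text{head}_i = \text{Attention}(Q W_i^Q, K W_i^K, V W_i^V) \]
and:
$Q$, $K$ and $V$ are query, key, and value matrices
$W_i^Q$, $W_i^K$, $W_i^V$, and $W^O$ are learned parameter matrices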
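The mechanism can be sketched in NumPy as standard multi-head scaled dot-product attention. The dimensions, the random (untrained) weights and the function names are assumptions for illustration, and the sketch leaves out TFT-specific details such as decoder masking.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)    # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention; returns (output, attention weights)."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

def multi_head(Q, K, V, W_q, W_k, W_v, W_o):
    """Concatenate per-head attention outputs and project them with W_o."""
    heads = [attention(Q @ wq, K @ wk, V @ wv)[0]
             for wq, wk, wv in zip(W_q, W_k, W_v)]
    return np.concatenate(heads, axis=-1) @ W_o

rng = np.random.default_rng(0)
T, d_model, n_heads = 5, 8, 2                  # 5 time steps, toy sizes
d_head = d_model // n_heads
X = rng.normal(size=(T, d_model))
W_q = rng.normal(size=(n_heads, d_model, d_head))
W_k = rng.normal(size=(n_heads, d_model, d_head))
W_v = rng.normal(size=(n_heads, d_model, d_head))
W_o = rng.normal(size=(n_heads * d_head, d_model))

out = multi_head(X, X, X, W_q, W_k, W_v, W_o)  # self-attention over the sequence
_, attn = attention(X @ W_q[0], X @ W_k[0], X @ W_v[0])
```

Each row of `attn` sums to one and shows how strongly one time step attends to every other time step, which is the quantity typically inspected when interpreting temporal patterns.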
TFTs offer several advantages for cryptocurrency time series analysis:
1. Interpretability: The variable selection networks provide insights into which features are most
important at different forecasting horizons.
2. Multi-horizon Forecasting: TFTs can generate predictions for multiple future time steps
simultaneously, useful for both short-term and long-term cryptocurrency price forecasting.
3. Handling Mixed Frequency Data: The model can effectively process inputs sampled at
different frequencies, which is common in cryptocurrency data (e.g., price data might be available
at minute intervals, while some economic indicators are only available monthly).
4. Incorporating Static Metadata: TFTs can leverage time-invariant information about
cryptocurrencies, potentially improving forecast accuracy.
3.4.3.2 DeepAR
DeepAR, introduced by Salinas et al. (2020), is a probabilistic forecasting model based on autoregressive
recurrent neural networks. It's particularly well-suited for cryptocurrency analysis due to its ability to learn
complex patterns across multiple related time series and generate probabilistic forecasts.
Key Components
1. Autoregressive Structure: DeepAR models the conditional distribution of future observations
given past observations, capturing temporal dependencies in cryptocurrency price movements.
2. Recurrent Neural Network: Typically, an LSTM or GRU, which learns to encode the relevant
information from past observations into a latent state.
3. Likelihood Function: DeepAR can use various likelihood functions (e.g., Gaussian, Negative
Binomial) to model the conditional distribution, allowing flexibility in handling different types of
cryptocurrency data.
Mathematical Representation
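The core of DeepAR can be represented as:
\[ \ell(z_{i,t} \mid \theta_{i,t}) \]
\[ \theta_{i,t} = \theta(h_{i,t}, \Theta) \]
\[ h_{i,t} = h(h_{i,t-1}, z_{i,t-1}, x_{i,t}, \Theta) \]
where:
$z_{i,t}$ is the observation for time series $i$ at time $t$
$\ell$ is the chosen likelihood function
$\theta_{i,t}$ are the parameters of the likelihood function
$h_{i,t}$ is the hidden state of the RNN
$x_{i,t}$ are optional covariates
$\theta$ and $h$ are neural networks with parameters $\Theta$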
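The autoregressive sampling loop can be sketched with a toy recurrent cell and a Gaussian likelihood. Everything here is an assumption for illustration: the weights are random and untrained, a plain tanh cell replaces the LSTM or GRU, and the covariates are zeros. The sketch only shows how each sampled value is fed back as the next input to build a predictive distribution.

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def rnn_step(h_prev, z_prev, x_t, params):
    """Plain RNN cell: h_t = tanh(W_h h_{t-1} + w_z z_{t-1} + w_x x_t + b)."""
    return np.tanh(params["W_h"] @ h_prev + params["w_z"] * z_prev
                   + params["w_x"] * x_t + params["b"])

def gaussian_params(h_t, params):
    """Map the hidden state to (mu, sigma); softplus keeps sigma positive."""
    mu = params["w_mu"] @ h_t
    sigma = softplus(params["w_sigma"] @ h_t) + 1e-6
    return mu, sigma

def sample_path(z_last, covariates, params, rng):
    """Ancestral sampling: each draw becomes the next step's input."""
    h = np.zeros(params["b"].shape)
    z, path = z_last, []
    for x_t in covariates:
        h = rnn_step(h, z, x_t, params)
        mu, sigma = gaussian_params(h, params)
        z = rng.normal(mu, sigma)
        path.append(z)
    return np.array(path)

rng = np.random.default_rng(0)
hidden = 4
params = {  # random, untrained weights for illustration only
    "W_h": rng.normal(scale=0.3, size=(hidden, hidden)),
    "w_z": rng.normal(scale=0.3, size=hidden),
    "w_x": rng.normal(scale=0.3, size=hidden),
    "b": np.zeros(hidden),
    "w_mu": rng.normal(size=hidden),
    "w_sigma": rng.normal(size=hidden),
}
horizon = np.zeros(7)   # hypothetical covariate values for a 7-step horizon
paths = np.array([sample_path(0.1, horizon, params, rng) for _ in range(200)])
q10, q50, q90 = np.quantile(paths, [0.1, 0.5, 0.9], axis=0)
```

The quantiles computed across the sampled paths form the prediction interval for each step of the horizon.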
DeepAR offers several advantages for cryptocurrency time series analysis:
1. Probabilistic Forecasts: DeepAR generates full predictive distributions, providing uncertainty
estimates crucial for risk management in cryptocurrency trading.
2. Learning Across Multiple Time Series: The model can learn patterns from multiple
cryptocurrencies simultaneously, potentially improving forecast accuracy for less liquid assets.
3. Handling Missing Data and Variable Length Series: DeepAR can naturally deal with missing
observations and time series of different lengths, common in cryptocurrency data.
4. Incorporating Covariates: The model can leverage additional features like trading volume or
market sentiment indicators to improve forecasts.
However, challenges remain in interpreting the model's decisions and in ensuring the chosen likelihood
function adequately captures the often heavy-tailed and highly volatile nature of cryptocurrency returns.
3.5 Stationarity Tests
Stationarity is a crucial property in time series analysis, as many statistical procedures assume that the data
is stationary. A stationary process has constant mean, variance, and autocovariance structure over time
(Gujarati & Porter, 2009). Non-stationary data can lead to spurious regressions and unreliable forecasts.
Therefore, before proceeding with further analysis, it is essential to test for stationarity in the time series
data.
This section employs three widely-used stationarity tests:
the Augmented Dickey-Fuller (ADF) test
the Phillips-Perron (PP) test
the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
These tests complement each other and provide a comprehensive assessment of the stationarity
properties of the time series.
Augmented Dickey-Fuller (ADF) Test
The ADF test is an extension of the Dickey-Fuller test and is used to test for the presence of a unit root
in a time series sample (Dickey & Fuller, 1979).
The test equation is:
\[ \Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \cdots + \delta_{p-1} \Delta y_{t-p+1} + \varepsilon_t \]
where $y_t$ is the time series, $\alpha$ is a constant, $\beta$ is the coefficient on a time trend, and $p$ is the lag order of
the autoregressive process. The null hypothesis is that a unit root is present ($\gamma = 0$), while the alternative
hypothesis is that the process is stationary ($\gamma < 0$).
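The regression behind the ADF test can be sketched with ordinary least squares in NumPy. This simplified version includes a constant but no time trend, uses simulated series, and returns only the t-statistic on the lagged level; the function name `df_tstat` is hypothetical. The statistic has to be compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for the constant-only case in large samples), which statistical libraries such as statsmodels provide through `adfuller`.

```python
import numpy as np

def df_tstat(y, lags=0):
    """t-statistic on gamma in: dy_t = alpha + gamma*y_{t-1} + sum_j delta_j*dy_{t-j} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    rows = len(dy) - lags
    cols = [np.ones(rows), y[lags:-1]]                  # constant and lagged level
    cols += [dy[lags - j:len(dy) - j] for j in range(1, lags + 1)]  # lagged differences
    X = np.column_stack(cols)
    target = dy[lags:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (rows - X.shape[1])        # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)               # OLS coefficient covariance
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(1)
noise = rng.normal(size=500)            # stationary by construction
random_walk = np.cumsum(noise)          # unit root by construction

stat_noise = df_tstat(noise, lags=1)
stat_walk = df_tstat(random_walk, lags=1)
```

A strongly negative statistic, as for the white-noise series, rejects the unit-root null. The random walk yields a statistic much closer to zero, so the null is typically not rejected.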
Phillips-Perron (PP) Test
The Phillips-Perron test is a non-parametric method to test for unit roots (Phillips & Perron, 1988). It
addresses the issue of serial correlation in the error terms by using a non-parametric correction to the
t-test statistic. The PP test is robust to general forms of heteroskedasticity in the error term. The test
regression is:
\[ y_t = \alpha + \rho y_{t-1} + \varepsilon_t \]
The null hypothesis is that the series has a unit root ($\rho = 1$), while the alternative hypothesis is that the
process is stationary ($\rho < 1$).
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test
The KPSS test differs from the ADF and PP tests in that its null hypothesis assumes the series is
stationary (Kwiatkowski et al., 1992). This test complements the ADF and PP tests, as it can help
distinguish between series that appear to be stationary, series that have a unit root, and series that are not
sufficiently informative to be sure whether they are stationary or integrated.
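The test statistic is based on the residuals from the OLS regression of the time series on an intercept and
possibly a time trend:
\[ y_t = \alpha + \beta t + \varepsilon_t \]
The test statistic is defined as:
\[ \text{KPSS} = \frac{1}{T^2 \hat{\sigma}^2} \sum_{t=1}^{T} S_t^2 \]
where $S_t = \sum_{i=1}^{t} \hat{\varepsilon}_i$ is the partial sum of residuals, $T$ is the sample size, and
$\hat{\sigma}^2$ is an estimator of the
spectral density at frequency zero.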
The null hypothesis is that the process is trend stationary, while the alternative hypothesis is that the
process has a unit root.
3.6 Multi-Agent Framework
3.6.1 Architecture
In contemporary financial analysis, the complexity and volume of data necessitate sophisticated tools and
methodologies to extract actionable insights. AI multi-agent compound systems represent a robust
approach, integrating various autonomous agents to collectively achieve complex tasks that would be
challenging for a single entity (Zaharia et al., 2024).
This chapter delineates the architecture of a multi-agent framework designed for Bitcoin (BTC)
forecasting, elucidating the roles, interactions, and components of each agent within the system.
The Multi-Agent Framework Architecture depicted in Figure 3.6.1.1 is designed to enhance BTC
forecasting by leveraging multiple specialized agents. The framework comprises three primary agents:
Financial Analyst
Research Analyst
Investment Advisor
Each agent interacts with distinct tools to perform its designated functions, ultimately converging their
outputs to generate a comprehensive forecast through an AI model.
Figure 3.6.1.1: Multi-Agent framework for BTC market analysis.
By leveraging advanced data collection, processing techniques, and machine learning models, this
framework aims to provide accurate and actionable insights into the dynamic cryptocurrency market.
This architecture not only enhances forecasting capabilities but also offers a scalable and adaptable
solution for broader financial analysis applications.
3.6.2 Financial Analyst Agent
The primary objective of this Agent is to collect and summarize recent news articles, press releases, and
market analyses related to the cryptocurrency BTC and its industry. The focus should be on significant
events, market sentiments, analysts' opinions, and upcoming events such as earnings reports.
Specific Goals:
Collect and summarize the latest news.
Identify notable shifts in market sentiment.
Analyze potential impacts on BTC.
Provide a comprehensive report detailing these findings.
Tools
To accomplish this task, the following tools and resources will be utilized:
Google Search: Utilizes web scraping and API calls (SerpAPI) to gather real-time financial data
and market trends relevant to the corresponding task.
Calculator: Employs mathematical knowledge to analyze the collected data, generating key
financial metrics and indicators.
Expected Output
The expected outcome of this task is a comprehensive report that includes:
Report Components:
1. Detailed Summary of the Latest News:
o Provide a thorough summary of the most recent and relevant news articles related to
Bitcoin.
2. Notable Shifts in Market Sentiment:
o Highlight any significant changes in how the market views the cryptocurrency, based on
recent data and sentiment analysis.
3. Potential Impacts on Bitcoin:
o Assess how recent developments and news might affect the cryptocurrency's market
performance and valuation.
4. Cryptocurrency Ticker: Bitcoin
o Ensure the report specifies the ticker symbol for the cryptocurrency Bitcoin.
3.6.3 Research Analyst Agent
The primary objective of this Agent is to conduct a thorough analysis of the cryptocurrency's financial
health and market performance. This includes examining key financial metrics and comparing the
cryptocurrency's performance to its peers and overall market trends.
Specific Goals:
Examine key financial metrics such as 24-hour trading volume, circulating supply, market
capitalization, and total value locked (TVL).
Analyze the cryptocurrency's performance relative to its competitors and market trends.
Provide a detailed assessment of the cryptocurrency's financial standing, including its strengths
and weaknesses.
Tools
To accomplish this task, the following tools and resources will be utilized:
Google Search: Utilizes web scraping and API calls (SerpAPI) to gather real-time financial data
and market trends relevant to the corresponding task.
Crypto Panic: News aggregator and media platform for cryptocurrency news, market data, and
sentiment analysis.
Yahoo Finance News: Provides up-to-date financial news, market data, and analysis for various
financial markets, including cryptocurrencies.
Expected Output
The expected outcome of this task is a detailed report that includes:
Report Components:
1. Expanded Summary of the Analysis:
o Provide a comprehensive summary of the analysis conducted on the cryptocurrency's
financial health and market performance.
2. Assessment of Financial Standing:
o Offer a clear and detailed assessment of the cryptocurrency's financial standing based on
key financial metrics.
3. Strengths and Weaknesses:
o Identify and discuss the strengths and weaknesses of the cryptocurrency, highlighting areas
of financial robustness and potential concerns.
4. Comparison with Competitors:
o Compare the cryptocurrency's performance with its competitors, considering the current
market scenario and trends.
o Include relevant charts and tables to support the comparative analysis.
3.6.4 Investment Advisor Agent
The primary objective of this Agent is to review and synthesize the analyses provided by the Financial
Analyst and the Research Analyst, combining these insights to form a comprehensive investment
recommendation.
Specific Goals:
Review analyses covering financial health, market sentiment, insider trading activity, and
upcoming events.
Create a comprehensive investment recommendation report.
Tools
To accomplish this task, the following tools and resources will be utilized:
ML Model API (Price Forecasting):
o Use the machine learning model (based on the time-series model already established) to
forecast cryptocurrency trends and market movements.
Google Search:
o Perform additional searches to gather supplementary information and validate the
findings from analysts.
Yahoo Finance News:
o Access up-to-date financial news, market data, and analysis to support the investment
recommendation.
Calculator:
o Perform necessary calculations to analyze financial metrics and market data.
Expected Output
The expected outcome of this task is a comprehensive investment recommendation report that includes:
Report Components:
1. Market Stance:
o Clearly state whether the market is bullish, bearish, or neutral for the specific
cryptocurrency.
2. Investment Stance and Strategy:
o Provide a detailed and clear investment stance and strategy based on the synthesized
analyses.
3. Supporting Evidence:
o Include supporting evidence from the financial and market analyses conducted by the
Financial Analyst and Research Analyst.
4. Forecasting capabilities:
o Using the previously built ML model, the agent can forecast the future BTC
price.
5. Professional Presentation:
o Ensure the report is comprehensive, professional, and aesthetically pleasing, with well-
formatted sections for readability and presentation.
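The flow between the three agents can be sketched in plain Python. This is a structural illustration only: the `Agent` class, the stub tools and the `run_pipeline` function are hypothetical names, no LLM or external API is called, and the actual framework used in the study is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Tool = Callable[[str], str]

def stub_search(query: str) -> str:
    """Placeholder for a web-search tool; returns canned text."""
    return f"[search results for: {query}]"

def stub_forecast(ticker: str) -> str:
    """Placeholder for the price-forecasting model API."""
    return f"[forecast for {ticker} not computed in this sketch]"

@dataclass
class Agent:
    role: str
    goal: str
    tools: Dict[str, Tool] = field(default_factory=dict)

    def run(self, task: str, context: List[str]) -> str:
        """Call every tool on the task and bundle the outputs into a report.

        A real agent would let an LLM decide which tool to call and how to
        summarize the results; that reasoning step is left out here.
        """
        findings = [f"{name}: {tool(task)}" for name, tool in self.tools.items()]
        lines = [f"## {self.role}", f"Goal: {self.goal}"] + findings
        if context:
            lines.append(f"Inputs reviewed: {len(context)} upstream report(s)")
        return "\n".join(lines)

financial_analyst = Agent(
    "Financial Analyst", "Summarize recent BTC news and sentiment",
    {"search": stub_search},
)
research_analyst = Agent(
    "Research Analyst", "Assess BTC metrics against peers",
    {"search": stub_search},
)
investment_advisor = Agent(
    "Investment Advisor", "Synthesize analyses into a recommendation",
    {"search": stub_search, "forecast": stub_forecast},
)

def run_pipeline(ticker: str) -> str:
    """Run both analysts first, then pass their reports to the advisor."""
    reports = [a.run(ticker, context=[]) for a in (financial_analyst, research_analyst)]
    return investment_advisor.run(ticker, context=reports)

final_report = run_pipeline("BTC")
```

The two analysts run independently, and only the Investment Advisor receives upstream reports and has access to the forecasting tool, mirroring the division of roles described above.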
4. Results and Analysis
4.1 Model Performance Evaluation
The evaluation of time series model performance is a critical step in determining the most suitable
forecasting approach for BTC price prediction. This section outlines the methodology employed to assess
and compare the efficacy of various time series models, providing insights into their relative strengths
and weaknesses in the context of cryptocurrency price forecasting.
Feature Selection Model(s)
Feature selection is a crucial step in optimizing machine learning models by enhancing performance,
reducing overfitting, and decreasing computational costs. Tree-based models like XGBoost and
LightGBM, as well as linear models such as SGD (Stochastic Gradient Descent), are effective in feature
selection. Below is a table summarizing the key aspects of feature selection using these models, including
their evaluation metrics.
Model | Algorithm type | Evaluation metrics | Top 5 features
Random Forest | Gradient Boosting | R² = 0.956, RMSE = 2602 | Low Lag 1; Trend Psar Up; High Lag 1; Trend Ichimoku Conv
Gradient Boosting Trees | Gradient Boosting | R² = 0.922, RMSE = 3476 | Volume VPT; Trend Psar Up; Volume ADI
XGBoost | Gradient Boosting | R² = 0.932, RMSE = 3239 | Close Lag 1; Volume VPT; Trend Psar Up; Trend Ichimoku Conv; High Lag 1
LightGBM | Gradient Boosting | R² = 0.948, RMSE = 2842 | Close Lag 1; Trend Psar Up; Low Lag 1; High Lag 1; Trend Psar Down
SGD | Linear Model (SGD) | R² = 0.960, RMSE = 2491 | Trend Psar Up; Close Lag 1; SP500 adjusted price; Low Lag 1; Volume VPT
Table 4.1.1: Feature Selection Models evaluated to pick the top 20 features for the TS model.
The SGD model outperforms the other models based on key evaluation metrics, which are critical for
assessing model accuracy and reliability.
Highest R² Score: The R² (coefficient of determination) score of 0.960 indicates that the SGD
model explains 96% of the variance in the target variable. This is the highest R² score among all
models, signifying superior predictive power.
Lowest RMSE: The Root Mean Square Error (RMSE) of 2491 is the lowest among the models,
indicating that the SGD model has the smallest average prediction error. Lower RMSE values
reflect higher accuracy in predictions.
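To make the two ranking criteria concrete, the following minimal Python sketch computes R² from its definition and then ranks the scores reported in Table 4.1.1. The helper name `r2_score` and the dictionary layout are illustrative assumptions; they are not the Dataiku DSS implementation used in this study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Scores copied from Table 4.1.1: model -> (R2, RMSE).
reported = {
    "Random Forest": (0.956, 2602),
    "Gradient Boosting Trees": (0.922, 3476),
    "XGBoost": (0.932, 3239),
    "LightGBM": (0.948, 2842),
    "SGD": (0.960, 2491),
}

# Higher R2 is better, lower RMSE is better.
best_by_r2 = max(reported, key=lambda m: reported[m][0])
best_by_rmse = min(reported, key=lambda m: reported[m][1])
print(best_by_r2, best_by_rmse)
```

Both criteria select the same model here, which is why a single winner can be named without weighting the two metrics against each other.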
Figure 4.1.1: Top 20 features (Absolute Feature Importance) of SGD model. Absolute feature
importance is the average of absolute Shapley values computed for each feature.
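As an illustration of how an absolute feature importance of this kind can be obtained, the sketch below assumes a linear model and independent features, in which case the Shapley value of feature j for a record x reduces to w_j * (x_j - mean(x_j)). The weights and records are hypothetical toy values, not the thesis data, and the function names are illustrative.

```python
def linear_shap_values(weights, rows):
    """Shapley values of a linear model under the feature-independence assumption."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(weights))]
    return [[w * (x - m) for w, x, m in zip(weights, r, means)] for r in rows]


def mean_abs_importance(shap_rows):
    """Absolute feature importance: average of |Shapley value| per feature."""
    n = len(shap_rows)
    n_features = len(shap_rows[0])
    return [sum(abs(r[j]) for r in shap_rows) / n for j in range(n_features)]


# Hypothetical model weights and records (three features, three records).
weights = [2.0, -1.0, 0.5]
rows = [[1.0, 0.0, 4.0], [3.0, 2.0, 0.0], [2.0, 4.0, 2.0]]

shap_rows = linear_shap_values(weights, rows)
importance = mean_abs_importance(shap_rows)
print(importance)
```

The sign of each individual Shapley value is discarded by the averaging step, so this ranking shows how strongly a feature moves predictions but not in which direction.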
In the context of predicting the adjusted Bitcoin price, understanding the importance of each feature
used by the model is crucial.
Here we analyze the top 5 features based on their importance scores and assess their relevance to Bitcoin
price prediction.
Top 5 Features
1. TREND_PSAR_UP (Importance: 0.1268)
Description: The Parabolic SAR (Stop and Reverse) indicator is used to identify potential
reversals in the market. The "UP" variant indicates an upward trend or continuation.
Relevance to target: The TREND_PSAR_UP feature is highly relevant as it helps in
identifying upward trends in Bitcoin prices, which is critical for predicting future price
movements. Given the volatility of Bitcoin, detecting trend reversals is particularly valuable.
2. CLOSE_lag1 (Importance: 0.1188)
Description: This feature represents the closing price of Bitcoin on the previous day.
Relevance to target: The closing price from the previous day is a strong predictor of the
next day's price, capturing the immediate past performance and momentum. This feature is
highly pertinent as it directly reflects recent market behavior.
3. SP500_ADJUSTED (Importance: 0.088)
Description: Adjusted closing price of the S&P 500 index.
Relevance to target: While Bitcoin often behaves independently of traditional markets, there
can be periods where broader economic trends influence cryptocurrency prices. The S&P 500
can provide insights into overall market sentiment, making this feature moderately relevant.
4. LOW_lag1 (Importance: 0.0762)
Description: The lowest price of Bitcoin on the previous day.
Relevance to target: The lowest price from the previous day helps in understanding the
volatility and potential support levels. This feature is useful for risk assessment and
anticipating market lows, which are crucial for accurate price predictions.
5. VOLUME_VPT (Importance: 0.0715)
Description: Volume Price Trend indicator, which combines price and volume to identify
the direction and strength of trends.
Relevance to target: Trading volume is a significant indicator of market activity and
momentum. The Volume Price Trend (VPT) captures both the price and volume, providing
insights into the strength of market movements. This is particularly important for a volatile
asset like Bitcoin.
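The lagged and volume-based features above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions: the VPT recursion follows the textbook definition (previous VPT plus volume times the fractional price change, starting at 0), which may differ in detail from the Python 'TA' library output, and the price and volume values are hypothetical.

```python
def lag(series, k=1):
    """Shift a series by k steps so that position t holds the value from t - k."""
    return [None] * k + list(series[:len(series) - k])


def volume_price_trend(close, volume):
    """Textbook VPT: cumulative sum of volume * fractional price change."""
    vpt = [0.0]  # assumed starting value
    for t in range(1, len(close)):
        pct_change = (close[t] - close[t - 1]) / close[t - 1]
        vpt.append(vpt[-1] + volume[t] * pct_change)
    return vpt


# Hypothetical daily closes and volumes.
close = [100.0, 110.0, 99.0]
volume = [10.0, 20.0, 30.0]

close_lag1 = lag(close)  # analogous to CLOSE_lag1
vpt = volume_price_trend(close, volume)  # analogous to VOLUME_VPT
print(close_lag1, vpt)
```

The first lagged entry is left as `None` because no previous day exists; such rows are typically dropped before model training.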
Figure 4.1.2: Top 20 features (SHAP Plot) of SGD model. The Shapley values (x-axis) display the relative
impact of the feature value (color) on the record's prediction.
As shown in the above figure, based on the SHAP plots we can derive some conclusions on the
importance of the features in accordance with the target, which is the adjusted BTC price.
The feature TREND_PSAR_UP has the 1st most impact. A higher value is associated with a
higher prediction.
o Feature impact: 2958
The feature CLOSE_lag1 has the 2nd most impact. A higher value is associated with a higher
prediction.
o Feature impact: 2773
The feature SP500_ADJUSTED has the 3rd most impact. A lower value is associated with a
higher prediction.
o Feature impact: 2071
Time Series Model(s)
In the context of predicting the adjusted Bitcoin price, employing time series models is essential due to
the temporal nature of the data. Exogenous variables, or external regressors, can provide additional
context and improve the model's accuracy by incorporating influential external factors. These variables
have been selected based on the Top 50 features from the Feature Selection model and also the
stationarity tests that have been applied, as mentioned in the above section.
The time step used for the analysis is 1 day, and the forecasting parameters were:
Forecast horizon: 10 days (models are trained on the historical data to forecast the time steps in the
forecast horizon)
Horizons in evaluation: 1 (metrics are computed on the evaluation time steps)
To ensure a comprehensive assessment of model performance, we employ a range of evaluation metrics,
each offering unique insights into different aspects of predictive accuracy:
Model | Evaluation Metrics | Description
AutoARIMA | MASE = 1.799 (± 7.022); MAPE = 12.2% (± 47.3%); RMSE = 3688; MAE = 3448 | AutoARIMA automatically finds the optimal ARIMA (AutoRegressive Integrated Moving Average) model according to an information criterion.
Transformer | MASE = 0.558 (± 0.505); MAPE = 4.4% (± 4.1%); RMSE = 1143; MAE = 1040 | Transformer estimator is a transformer neural network that forecasts probability distributions for the next forecast horizon values, given the preceding context length values.
DeepAR | MASE = 0.453 (± 0.328); MAPE = 3.5% (± 2.6%); RMSE = 987; MAE = 865 | DeepAR is an autoregressive recurrent neural network that forecasts probability distributions for the next forecast horizon values given the preceding context length values.
Prophet | MASE = 0.034 (± 0.027); MAPE = 0.2% (± 0.3%); RMSE = 79.150; MAE = 65.106 | Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality.
NPTS | MASE = 0.688 (± 0.679); MAPE = 5.3% (± 5.3%); RMSE = 1488; MAE = 1316 | Non-Parametric Time Series predictor predicts future values by sampling from past observations. The sampling weights can follow either a uniform or exponentially decreasing distribution, and optionally take into account the seasonality of the time series.
Table 4.1.2: Time Series Models evaluation metrics.
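Of the models in the table, NPTS has the simplest mechanism, and a simplified sketch may help to picture it. The code below samples future values from past observations with exponentially decreasing weights; the parameter names `decay` and `n_samples`, the per-step mean as point forecast, and the toy history are assumptions for illustration and do not reproduce the implementation used in this study (seasonality handling is omitted).

```python
import math
import random


def npts_weights(n_obs, decay=0.1):
    """Normalized weights that decrease exponentially with the age of an observation."""
    raw = [math.exp(-decay * (n_obs - 1 - i)) for i in range(n_obs)]
    total = sum(raw)
    return [w / total for w in raw]


def npts_forecast(history, horizon, decay=0.1, n_samples=200, seed=0):
    """Sample future paths from past observations and average them per time step."""
    rng = random.Random(seed)
    weights = npts_weights(len(history), decay)
    # Each sampled path holds `horizon` values drawn from the history.
    paths = [rng.choices(history, weights=weights, k=horizon) for _ in range(n_samples)]
    return [sum(p[h] for p in paths) / n_samples for h in range(horizon)]


# Hypothetical price history; a 10-step horizon mirrors the forecast horizon above.
history = [1.0, 2.0, 3.0, 4.0]
forecast = npts_forecast(history, horizon=10)
print(forecast)
```

Because every sampled value comes from the observed history, a predictor of this kind cannot forecast outside the historical range, which is one reason it may lag behind trend-aware models on a trending series.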
Figure 4.1.3: Evaluation metrics comparison between different models. Models from left to right are
Transformers, DeepAR, Prophet, AutoARIMA, NPTS.
To ensure a comprehensive evaluation of the model's performance, several metrics are employed.
The formulas for these metrics and a description of each are provided below:
Mean Absolute Error (MAE)
The Mean Absolute Error (MAE) measures the average magnitude of errors between predicted and actual
values. It is calculated as the average of the absolute differences between the predicted values
($\hat{y}_i$) and the actual values ($y_i$) over the $n$ evaluated observations. MAE is straightforward
to interpret as it provides the error in the same units as the target variable.
Formula:
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \]
Root Mean Squared Error (RMSE)
The Root Mean Squared Error (RMSE) measures the square root of the average squared differences
between predicted and actual values. RMSE penalizes larger errors more than MAE, making it sensitive
to outliers. It provides an overall indication of the magnitude of prediction errors.
Formula:
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \]
Mean Absolute Percentage Error (MAPE)
The Mean Absolute Percentage Error (MAPE) expresses the prediction error as a percentage of the actual
values. It is calculated as the average of the absolute percentage differences between predicted and actual
values. MAPE is useful for understanding the relative error magnitude and is scale-independent.
Formula:
\[ \mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \]
Mean Absolute Scaled Error (MASE)
The Mean Absolute Scaled Error (MASE) is a scale-independent metric that compares the MAE of the
model to the MAE of a naive forecast. It is particularly useful for comparing forecast accuracy across
different time series. MASE helps in understanding how well the model performs relative to a simple
baseline.
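Formula:
\[ \mathrm{MASE} = \frac{\frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|}{\frac{1}{T-1} \sum_{t=2}^{T} \left| y_t - y_{t-1} \right|} \]
The denominator is the MAE of the one-step naive forecast, in which each value is predicted by its
predecessor, computed over the $T$ observations of the training series.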
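The four metrics can be sketched in plain Python as follows. The function names and the toy values are illustrative assumptions; the MASE denominator uses the one-step naive forecast on the training series, which is the standard definition and is assumed here to match the one used by the evaluation platform.

```python
import math


def mae(y_true, y_pred):
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Result in percent; undefined when an actual value is zero."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)


def mase(y_true, y_pred, y_train):
    """Model MAE divided by the MAE of the one-step naive forecast on the training data."""
    naive_errors = [abs(y_train[t] - y_train[t - 1]) for t in range(1, len(y_train))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    return mae(y_true, y_pred) / naive_mae


# Hypothetical training history, actual values, and forecasts.
y_train = [100.0, 102.0, 101.0, 105.0]
y_true = [110.0, 120.0]
y_pred = [108.0, 123.0]

print(mae(y_true, y_pred), rmse(y_true, y_pred))
print(mape(y_true, y_pred), mase(y_true, y_pred, y_train))
```

A MASE below 1 means the model's average error is smaller than that of the naive forecast, while a value above 1 means the naive baseline would have done better on average.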
Cross-Validation Strategy
To ensure robust evaluation and mitigate the risk of overfitting, we implement a time-based cross-
validation approach:
Figure 4.1.4: Time-based 5-fold cross-validation test.
The metrics used to rank models obtained by different algorithms are computed on each of the test folds.
The final model is trained on the sampled dataset.
The offset between consecutive cross-test evaluation folds aims at avoiding overlaps with hyperparameter
search validation fold that could cause an optimistic bias on model evaluation metrics.
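A time-based split of this kind can be sketched as below. This is a minimal illustration, assuming an expanding training window, test folds of 10 steps (matching the forecast horizon), and a fixed offset between consecutive test folds; the exact fold layout of Figure 4.1.4 is not reproduced, and the function and parameter names are hypothetical.

```python
def time_based_folds(n_obs, n_folds=5, test_size=10, offset=10):
    """Return (train_indices, test_indices) pairs in chronological order.

    Each fold trains only on observations that precede its test window,
    so no future information leaks into training.
    """
    folds = []
    for k in range(n_folds):
        # Test windows are laid out backwards from the end of the series.
        test_end = n_obs - k * offset
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("series too short for the requested folds")
        folds.append((range(0, test_start), range(test_start, test_end)))
    return folds[::-1]


folds = time_based_folds(n_obs=100)
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx[0], test_idx[-1])
```

Unlike shuffled k-fold cross-validation, every test window here lies strictly after its training data, which is the property that makes the resulting metrics meaningful for forecasting.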
4.2 Multi-Agent System Output Analysis
This section provides an analysis of the outputs generated by the AI Multi-Agent System. The framework
comprises different agents, each specializing in distinct areas of financial analysis, research, and
investment advice. The figures below summarize the outputs from these agents, dated 19/07/2024.
Figure 4.2.1: Output Report of Financial Analyst Agent with CrewAI library in Python.
The Financial Analyst Agent provides a detailed summary of the latest news impacting Bitcoin (BTC),
highlighting key financial metrics, notable shifts in market sentiment, and potential impacts on Bitcoin's
short-term and long-term prospects. The analysis includes:
Bullish Metrics and Trader Optimism: Despite recent price cooling, bullish metrics suggest a
potential push higher for Bitcoin.
Concerns Over Large BTC Movements: The movement of $3 billion worth of BTC by Mt.
Gox introduces uncertainty and caution among market participants.
Industry Shift to AI: The transition of Texas Bitcoin miners towards AI technologies signifies a
strategic pivot within the industry.
Warnings on Market Volatility: High-profile warnings about volatility and risks in the
cryptocurrency market have created a mixed sentiment.
The detailed summary emphasizes Bitcoin's strong market capitalization, high liquidity, and scarcity, while
also acknowledging the risks associated with volatility, regulatory scrutiny, and energy consumption.
Figure 4.2.2: Output Report of Research Analyst Agent with CrewAI library in Python.
The Research Analyst Agent provides an expanded summary of Bitcoin's financial standing and a
comparison with key competitors such as Ethereum (ETH) and Binance Coin (BNB). Key insights
include:
Key Financial Metrics for Bitcoin (BTC): High trading volume, significant market
capitalization, and substantial circulating supply highlight Bitcoin's dominant market position.
Strengths and Weaknesses: Bitcoin's market dominance, high liquidity, and scarcity are
contrasted with its volatility, regulatory risks, and energy consumption challenges.
Comparison with Competitors: Ethereum offers greater use case diversity and leads in DeFi,
while Binance Coin excels in exchange utility.
The agent's analysis underscores Bitcoin's resilience and potential for long-term value appreciation,
despite its inherent risks and challenges.
Figure 4.2.3: Output Report of Investment Advisor Agent with CrewAI library in Python.
The Investment Advisor Agent delivers a comprehensive investment recommendation for Bitcoin
(BTC) based on current market sentiment, financial health, and upcoming events. Key recommendations
and strategies include:
Forecasting of BTC Price for 20 July 2024: The projected price of Bitcoin is $67,205.
Market Sentiment Analysis: Bullish sentiment is driven by positive market indicators and
industry shifts towards AI.
Regulatory and Insider Trading Concerns: Increased regulatory scrutiny and insider trading
activity influence market sentiment.
Investment Strategy: A bullish investment stance is recommended, with a focus on long-term
holding, diversification, and monitoring regulatory developments.
The agent's report aligns with the broader market optimism, highlighting Bitcoin's strong market presence
and potential for substantial growth, while advising caution in navigating regulatory risks and market
volatility.
These reports collectively provide a multi-faceted view of Bitcoin's current market dynamics, financial
health, and strategic investment considerations, leveraging the expertise of specialized agents to inform
decision-making.
5. Conclusion
5.1 Implications for Cryptocurrency Market Analysis
This dissertation carries significant implications for cryptocurrency market analysis, particularly in
understanding and predicting Bitcoin price movements:
Enhanced Predictive Accuracy: The integration of diverse data sources, including technical
indicators, macroeconomic factors, and sentiment analysis, significantly improves the accuracy of
Bitcoin price predictions. This approach provides a more holistic understanding of market
dynamics compared to traditional methods relying solely on historical price data.
Multi-Agent Frameworks for Comprehensive Analysis: The development of a multi-agent
framework showcases the potential of combining specialized AI agents for cryptocurrency market
analysis. This approach allows for the integration of diverse analytical perspectives, leading to
more robust and well-informed investment recommendations.
5.2 Limitations of the Study
Despite its contributions, this study acknowledges certain limitations:
Focus on Bitcoin: The study primarily focuses on Bitcoin, limiting the generalizability of
findings to other cryptocurrencies. Further research is needed to assess the applicability of the
proposed models and frameworks to altcoins with distinct market dynamics.
Data Availability and Quality: The accuracy of predictions relies heavily on the availability and
quality of data. The data cover a relatively short time span (~6 years), and inconsistencies across
sources and the potential for manipulation in social media sentiment data pose ongoing
challenges.
Rapidly Evolving Market: The cryptocurrency market's dynamic and rapidly evolving nature
necessitates continuous model adaptation and refinement. New trends, regulatory changes, and
technological advancements can quickly render existing models outdated.
5.3 Future Research Directions
This study paves the way for several promising future research directions:
Expanding to Altcoin Analysis: Applying the proposed methodologies and frameworks to
other prominent cryptocurrencies will broaden the scope and generalizability of findings.
Incorporating More Data: Extending the timeframe from 2018-2024 back to the day BTC started
trading would capture more patterns and connections within the data.
Incorporating On-Chain Metrics and Backtesting: Integrating blockchain-specific data, such
as transaction volume, network hash rate, and active addresses, could further enhance predictive
accuracy by capturing fundamental network health indicators. The resulting strategies could also
be backtested using the Alpaca API.
Exploring Explainable AI: Integrating explainable AI techniques will enhance the transparency
and trustworthiness of predictions, allowing users to understand the rationale behind model
decisions.
Real-Time Forecasting and Trading Strategies: Investigating the feasibility of deploying the
developed models and AI Agent frameworks for real-time forecasting and algorithmic trading
strategies could provide practical applications for market participants.
Extended Social Media Sentiment Analysis: Extracting the sentiment of social media content,
such as tweets and Reddit posts, using a pre-trained model like FinBERT.
6. Appendix
6.1 Technical Implementation Details
The technical implementation of the analysis and models described in this document was carried out
using a combination of Python and Dataiku Data Science Studio (DSS). Below are the details of the
technologies and methodologies employed.
Technologies Used:
1. Python:
o Data Processing and Manipulation: Libraries such as Pandas and NumPy were used
for data cleaning, transformation, and analysis.
o Visualization: Matplotlib and Seaborn were utilized to create the visualizations in the
EDA section, including trends and distribution plots.
o Machine Learning: Scikit-learn provided a suite of algorithms and tools for training and
evaluating models. Advanced models such as XGBoost and LightGBM were used for
gradient boosting techniques.
o Time Series Analysis: Libraries like statsmodels and prophet were used for modeling
and forecasting time series data.
o Deep Learning: TensorFlow and PyTorch were used to implement and train deep
learning models, including Transformers and DeepAR.
2. Dataiku Data Science Studio (DSS):
o Data Integration: Dataiku DSS facilitated the integration and preprocessing of data from
various sources.
o AI/ML Models: The platform was used to train both the Feature Selection and Time Series
models, choose the optimal hyperparameters, and derive the evaluation metrics.
3. GitHub repository: https://github.com/StamKavid/FinAgent
6.2 Data Dictionary
The data dictionary presented in this section serves as a comprehensive guide to the features used in our
Bitcoin price prediction model. It provides a clear and concise description of each variable, ensuring
a thorough understanding of the dataset's structure and content.
This dictionary is crucial for interpreting the model's inputs and outputs, facilitating reproducibility, and
enabling other researchers to build upon this work. Each entry in the dictionary includes the feature name
and a detailed description of what it represents in the context of Bitcoin price analysis.
By establishing this common language and understanding, we lay the groundwork for the complex
analyses and discussions that follow in subsequent sections of this study.
No. | Feature | Description | Source
1 | DATE | The date of the observation | Yahoo Finance
2 | OPEN | The opening price of Bitcoin for the given day | Yahoo Finance
3 | HIGH | The highest price of Bitcoin reached during the day | Yahoo Finance
4 | LOW | The lowest price of Bitcoin reached during the day | Yahoo Finance
5 | CLOSE | The closing price of Bitcoin for the day | Yahoo Finance
6 | ADJ_CLOSE | The adjusted closing price of Bitcoin, accounting for corporate actions | Yahoo Finance
7 | VOLUME | The trading volume of Bitcoin for the day | Yahoo Finance
8 | GOLD_ADJ_CLOSE | The adjusted closing price of gold | Yahoo Finance
9 | SILVER_ADJ_CLOSE | The adjusted closing price of silver | Yahoo Finance
10 | OIL_ADJ_CLOSE | The adjusted closing price of oil | Yahoo Finance
11 | GOLD_VOLUME | The trading volume of gold | Yahoo Finance
12 | SILVER_VOLUME | The trading volume of silver | Yahoo Finance
13 | OIL_VOLUME | The trading volume of oil | Yahoo Finance
14 | EUR_USD_ADJ_CLOSE | The adjusted closing exchange rate of Euro to US Dollar | Yahoo Finance
15 | USD_JPY_ADJ_CLOSE | The adjusted closing exchange rate of US Dollar to Japanese Yen | Yahoo Finance
16 | GBP_USD_ADJ_CLOSE | The adjusted closing exchange rate of British Pound to US Dollar | Yahoo Finance
17 | USD_CNY_ADJ_CLOSE | The adjusted closing exchange rate of US Dollar to Chinese Yuan | Yahoo Finance
18 | VIX_ADJ_CLOSE | The adjusted closing value of the CBOE Volatility Index | Yahoo Finance
19 | CBOE_INTEREST_RATE_ADJ_CLOSE | The adjusted closing value of the CBOE Interest Rate | Yahoo Finance
20 | TREASURY_YIELD_5YRS_ADJ_CLOSE | The adjusted closing yield of 5-year US Treasury bonds | Yahoo Finance
21 | RUSSEL_2000_ADJ_CLOSE | The adjusted closing value of the Russell 2000 index | Yahoo Finance
22 | ISHARES_20YR_ADJ_CLOSE | The adjusted closing price of iShares 20+ Year Treasury Bond ETF | Yahoo Finance
23 | TREASURY_BILL_13WK_ADJ_CLOSE | The adjusted closing yield of 13-week US Treasury bills | Yahoo Finance
24 | RUSSEL_2000_VOLUME | The trading volume of the Russell 2000 index | Yahoo Finance
25 | ISHARES_20YR_VOLUME | The trading volume of iShares 20+ Year Treasury Bond ETF | Yahoo Finance
26 | TESLA_ADJ_CLOSE | Adjusted closing price of Tesla stock | Yahoo Finance
27 | AMD_ADJ_CLOSE | Adjusted closing price of AMD stock | Yahoo Finance
28 | INTEL_ADJ_CLOSE | Adjusted closing price of Intel stock | Yahoo Finance
29 | APPLE_ADJ_CLOSE | Adjusted closing price of Apple stock | Yahoo Finance
30 | NVIDIA_ADJ_CLOSE | Adjusted closing price of NVIDIA stock | Yahoo Finance
31 | META_ADJ_CLOSE | Adjusted closing price of Meta (Facebook) stock | Yahoo Finance
32 | GOOGLE_ADJ_CLOSE | Adjusted closing price of Google stock | Yahoo Finance
33 | TESLA_VOLUME | Trading volume of Tesla stock | Yahoo Finance
34 | AMD_VOLUME | Trading volume of AMD stock | Yahoo Finance
35 | INTEL_VOLUME | Trading volume of Intel stock | Yahoo Finance
36 | APPLE_VOLUME | Trading volume of Apple stock | Yahoo Finance
37 | NVIDIA_VOLUME | Trading volume of NVIDIA stock | Yahoo Finance
38 | META_VOLUME | Trading volume of Meta (Facebook) stock | Yahoo Finance
39 | GOOGLE_VOLUME | Trading volume of Google stock | Yahoo Finance
40 | GBTC_ADJ_CLOSE | Adjusted closing price of Grayscale Bitcoin Trust | Yahoo Finance
41 | ARKB_ADJ_CLOSE | Adjusted closing price of ARK 21Shares Bitcoin ETF | Yahoo Finance
42 | BITB_ADJ_CLOSE | Adjusted closing price of Bitwise Bitcoin ETF | Yahoo Finance
43 | FBTC_ADJ_CLOSE | Adjusted closing price of Fidelity Wise Origin Bitcoin Fund | Yahoo Finance
44 | BTCO_ADJ_CLOSE | Adjusted closing price of Invesco Galaxy Bitcoin ETF | Yahoo Finance
45 | IBIT_ADJ_CLOSE | Adjusted closing price of iShares Bitcoin Trust | Yahoo Finance
46 | HODL_ADJ_CLOSE | Adjusted closing price of VanEck Bitcoin Trust ETF | Yahoo Finance
47 | BITO_ADJ_CLOSE | Adjusted closing price of ProShares Bitcoin Strategy ETF | Yahoo Finance
48 | GBTC_VOLUME | Trading volume of Grayscale Bitcoin Trust | Yahoo Finance
49 | ARKB_VOLUME | Trading volume of ARK 21Shares Bitcoin ETF | Yahoo Finance
50 | BITB_VOLUME | Trading volume of Bitwise Bitcoin ETF | Yahoo Finance
51 | FBTC_VOLUME | Trading volume of Fidelity Wise Origin Bitcoin Fund | Yahoo Finance
52 | BTCO_VOLUME | Trading volume of Invesco Galaxy Bitcoin ETF | Yahoo Finance
53 | IBIT_VOLUME | Trading volume of iShares Bitcoin Trust | Yahoo Finance
54 | HODL_VOLUME | Trading volume of VanEck Bitcoin Trust ETF | Yahoo Finance
55 | BITO_VOLUME | Trading volume of ProShares Bitcoin Strategy ETF | Yahoo Finance
56 | ETH_ADJ_CLOSE | Adjusted closing price of Ethereum | Yahoo Finance
57 | ETH_VOLUME | Trading volume of Ethereum | Yahoo Finance
58 | USDT_ADJ_CLOSE | Adjusted closing price of Tether | Yahoo Finance
59 | USDT_VOLUME | Trading volume of Tether | Yahoo Finance
60 | USDC_ADJ_CLOSE | Adjusted closing price of USD Coin | Yahoo Finance
61 | USDC_VOLUME | Trading volume of USD Coin | Yahoo Finance
62 | DOGE_ADJ_CLOSE | Adjusted closing price of Dogecoin | Yahoo Finance
63 | DOGE_VOLUME | Trading volume of Dogecoin | Yahoo Finance
64 | XRP_ADJ_CLOSE | Adjusted closing price of XRP | Yahoo Finance
65 | XRP_VOLUME | Trading volume of XRP | Yahoo Finance
66 | SOL_ADJ_CLOSE | Adjusted closing price of Solana | Yahoo Finance
67 | SOL_VOLUME | Trading volume of Solana | Yahoo Finance
68 | GAS_ADJ_CLOSE | Adjusted closing price of Gas (Ethereum network fees) | Yahoo Finance
69 | GAS_VOLUME | Volume of Gas transactions | Yahoo Finance
70 | GAS_USD | Gas price in USD | Yahoo Finance
71 | BTC_FEAR_AND_GREED_INDEX | Bitcoin market sentiment index | Alternative.me API
72 | EXTREME_FEAR | Indicator of extreme fear in the market | Alternative.me API
73 | EXTREME_GREED | Indicator of extreme greed in the market | Alternative.me API
74 | FEAR | Indicator of fear in the market | Alternative.me API
75 | GREED | Indicator of greed in the market | Alternative.me API
76 | NEUTRAL | Indicator of neutral sentiment in the market | Alternative.me API
77 | SP500_ADJUSTED | Adjusted value of S&P 500 index | FRED
78 | GDP | Gross Domestic Product | FRED
79 | RGDP | Real Gross Domestic Product | FRED
80 | UNRATE | Unemployment rate | FRED
81 | CPI | Consumer Price Index | FRED
82 | INTEREST_RATE_ADJUSTED | Adjusted interest rate | FRED
83 | TREASURE_MATURITY_ADJUSTED | Adjusted treasury maturity rate | FRED
84 | INFLATION_RATE_ADJUSTED | Adjusted inflation rate | FRED
85 | STICKY_CPI | Sticky Price Consumer Price Index | FRED
86 | M2_MONEY_STOCK_ADJUSTED | Adjusted M2 Money Stock | FRED
87 | VOLUME_ADI | Accumulation/Distribution Index | Python ‘TA’ library
88 | VOLUME_OBV | On-Balance Volume | Python ‘TA’ library
89 | VOLUME_CMF | Chaikin Money Flow | Python ‘TA’ library
90 | VOLUME_FI | Force Index | Python ‘TA’ library
91 | VOLUME_EM | Ease of Movement | Python ‘TA’ library
92 | VOLUME_SMA_EM | Simple Moving Average of Ease of Movement | Python ‘TA’ library
93 | VOLUME_VPT | Volume Price Trend | Python ‘TA’ library
94 | VOLUME_VWAP | Volume Weighted Average Price | Python ‘TA’ library
95 | VOLUME_MFI | Money Flow Index | Python ‘TA’ library
96 | VOLUME_NVI | Negative Volume Index | Python ‘TA’ library
97 | VOLATILITY_BBM | Bollinger Bands Middle Band | Python ‘TA’ library
98 | VOLATILITY_BBH | Bollinger Bands High Band | Python ‘TA’ library
99 | VOLATILITY_BBL | Bollinger Bands Low Band | Python ‘TA’ library
100 | VOLATILITY_BBW | Bollinger Bands Width | Python ‘TA’ library
101 | VOLATILITY_BBP | Bollinger Bands Percentage | Python ‘TA’ library
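102 | VOLATILITY_BBHI | Bollinger Bands High Indicator | Python ‘TA’ library
103 | VOLATILITY_BBLI | Bollinger Bands Low Indicator | Python ‘TA’ library
104 | VOLATILITY_KCC | Keltner Channel Central | Python ‘TA’ library
105 | VOLATILITY_KCH | Keltner Channel High | Python ‘TA’ library
106 | VOLATILITY_KCL | Keltner Channel Low | Python ‘TA’ library
107 | VOLATILITY_KCW | Keltner Channel Width | Python ‘TA’ library
108 | VOLATILITY_KCP | Keltner Channel Percentage | Python ‘TA’ library
109 | VOLATILITY_KCHI | Keltner Channel High Indicator | Python ‘TA’ library
110 | VOLATILITY_KCLI | Keltner Channel Low Indicator | Python ‘TA’ library
111 | VOLATILITY_DCL | Donchian Channel Low | Python ‘TA’ library
112 | VOLATILITY_DCH | Donchian Channel High | Python ‘TA’ library
113 | VOLATILITY_DCM | Donchian Channel Middle | Python ‘TA’ library
114 | VOLATILITY_DCW | Donchian Channel Width | Python ‘TA’ library
115 | VOLATILITY_DCP | Donchian Channel Percentage | Python ‘TA’ library
116 | VOLATILITY_ATR | Average True Range | Python ‘TA’ library
117 | VOLATILITY_UI | Ulcer Index | Python ‘TA’ library
118 | TREND_MACD | Moving Average Convergence Divergence | Python ‘TA’ library
119 | TREND_MACD_SIGNAL | MACD Signal Line | Python ‘TA’ library
120 | TREND_MACD_DIFF | MACD Difference | Python ‘TA’ library
121 | TREND_SMA_FAST | Fast Simple Moving Average | Python ‘TA’ library
122 | TREND_SMA_SLOW | Slow Simple Moving Average | Python ‘TA’ library
123 | TREND_EMA_FAST | Fast Exponential Moving Average | Python ‘TA’ library
124 | TREND_EMA_SLOW | Slow Exponential Moving Average | Python ‘TA’ library
125 | TREND_VORTEX_IND_POS | Vortex Indicator Positive | Python ‘TA’ library
126 | TREND_VORTEX_IND_NEG | Vortex Indicator Negative | Python ‘TA’ library
127 | TREND_VORTEX_IND_DIFF | Vortex Indicator Difference | Python ‘TA’ library
128 | TREND_TRIX | Triple Exponential Average | Python ‘TA’ library
129 | TREND_MASS_INDEX | Mass Index | Python ‘TA’ library
130 | TREND_DPO | Detrended Price Oscillator | Python ‘TA’ library
131 | TREND_KST | Know Sure Thing Oscillator | Python ‘TA’ library
132 | TREND_KST_SIG | Know Sure Thing Signal Line | Python ‘TA’ library
133 | TREND_KST_DIFF | Know Sure Thing Difference | Python ‘TA’ library
134 | TREND_ICHIMOKU_CONV | Ichimoku Cloud Conversion Line | Python ‘TA’ library
135 | TREND_ICHIMOKU_BASE | Ichimoku Cloud Base Line | Python ‘TA’ library
136 | TREND_ICHIMOKU_A | Ichimoku Cloud Span A | Python ‘TA’ library
137 | TREND_ICHIMOKU_B | Ichimoku Cloud Span B | Python ‘TA’ library
138 | TREND_STC | Schaff Trend Cycle | Python ‘TA’ library
139 | TREND_ADX | Average Directional Movement Index | Python ‘TA’ library
140 | TREND_ADX_POS | ADX Positive Directional Indicator | Python ‘TA’ library
141 | TREND_ADX_NEG | ADX Negative Directional Indicator | Python ‘TA’ library
142 | TREND_CCI | Commodity Channel Index | Python ‘TA’ library
143 | TREND_VISUAL_ICHIMOKU_A | Visual Ichimoku Cloud Span A | Python ‘TA’ library
144 | TREND_VISUAL_ICHIMOKU_B | Visual Ichimoku Cloud Span B | Python ‘TA’ library
145 | TREND_AROON_UP | Aroon Up Indicator | Python ‘TA’ library
146 | TREND_AROON_DOWN | Aroon Down Indicator | Python ‘TA’ library
147 | TREND_AROON_IND | Aroon Indicator | Python ‘TA’ library
148 | TREND_PSAR_UP | Parabolic SAR Uptrend | Python ‘TA’ library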
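149 | TREND_PSAR_DOWN | Parabolic SAR Downtrend | Python ‘TA’ library
150 | TREND_PSAR_UP_INDICATOR | Parabolic SAR Uptrend Indicator | Python ‘TA’ library
151 | TREND_PSAR_DOWN_INDICATOR | Parabolic SAR Downtrend Indicator | Python ‘TA’ library
152 | MOMENTUM_RSI | Relative Strength Index | Python ‘TA’ library
153 | MOMENTUM_STOCH_RSI | Stochastic RSI | Python ‘TA’ library
154 | MOMENTUM_STOCH_RSI_K | Stochastic RSI %K | Python ‘TA’ library
155 | MOMENTUM_STOCH_RSI_D | Stochastic RSI %D | Python ‘TA’ library
156 | MOMENTUM_TSI | True Strength Index | Python ‘TA’ library
157 | MOMENTUM_UO | Ultimate Oscillator | Python ‘TA’ library
158 | MOMENTUM_STOCH | Stochastic Oscillator | Python ‘TA’ library
159 | MOMENTUM_STOCH_SIGNAL | Stochastic Oscillator Signal | Python ‘TA’ library
160 | MOMENTUM_WR | Williams %R | Python ‘TA’ library
161 | MOMENTUM_AO | Awesome Oscillator | Python ‘TA’ library
162 | MOMENTUM_ROC | Rate of Change | Python ‘TA’ library
163 | MOMENTUM_PPO | Percentage Price Oscillator | Python ‘TA’ library
164 | MOMENTUM_PPO_SIGNAL | PPO Signal Line | Python ‘TA’ library
165 | MOMENTUM_PPO_HIST | PPO Histogram | Python ‘TA’ library
166 | MOMENTUM_PVO | Percentage Volume Oscillator | Python ‘TA’ library
167 | MOMENTUM_PVO_SIGNAL | PVO Signal Line | Python ‘TA’ library
168 | MOMENTUM_PVO_HIST | PVO Histogram | Python ‘TA’ library
169 | MOMENTUM_KAMA | Kaufman's Adaptive Moving Average | Python ‘TA’ library
170 | OTHERS_DR | Daily Return | Python ‘TA’ library
171 | OTHERS_DLR | Daily Log Return | Python ‘TA’ library
172 | OTHERS_CR | Cumulative Return | Python ‘TA’ library
173 | UNIQUE_USERS | Number of unique users (BTC tweets) | Kaggle Dataset for BTC Tweets
174 | FOLLOWERS | Number of followers (BTC tweets) | Kaggle Dataset for BTC Tweets
175 | TWEET_COUNT | Number of tweets (BTC tweets) | Kaggle Dataset for BTC Tweets
176 | BTC_PERCENTAGE_DOMINANCE | Bitcoin's market dominance percentage | CoinMarketCap (CMC)
177 | ETH_PERCENTAGE_DOMINANCE | Ethereum's market dominance percentage | CoinMarketCap (CMC)
178 | USDT_PERCENTAGE_DOMINANCE | Tether's market dominance percentage | CoinMarketCap (CMC)
179 | BNB_PERCENTAGE_DOMINANCE | Binance Coin's market dominance percentage | CoinMarketCap (CMC)
180 | SOL_PERCENTAGE_DOMINANCE | Solana's market dominance percentage | CoinMarketCap (CMC)
181 | OTHERS_PERCENTAGE_DOMINANCE | Other cryptocurrencies' combined market dominance percentage | CoinMarketCap (CMC)
182 | GPR | Geopolitical Risk Index | Economic Policy Uncertainty
183 | GPRT | Geopolitical Risk Index (Threats) | Economic Policy Uncertainty
184 | GPRA | Geopolitical Risk Index (Acts) | Economic Policy Uncertainty
185 | GPRH | Geopolitical Risk Index (Historical) | Economic Policy Uncertainty
186 | GPRHT | Geopolitical Risk Index (Historical Threats) | Economic Policy Uncertainty
187 | GPRHA | Geopolitical Risk Index (Historical Acts) | Economic Policy Uncertainty
188 | SHARE_GPR | Share of Geopolitical Risk | Economic Policy Uncertainty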
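189 | N10 | N10 | Economic Policy Uncertainty
190 | SHARE_GPRH | Share of Historical Geopolitical Risk | Economic Policy Uncertainty
191 | N3H | N3H | Economic Policy Uncertainty
192 | GPRH_NOEW | Geopolitical Risk Index (Historical, No Equal Weighting) | Economic Policy Uncertainty
193 | GPR_NOEW | Geopolitical Risk Index (No Equal Weighting) | Economic Policy Uncertainty
194 | GPRH_AND | Geopolitical Risk Index (Historical, AND method) | Economic Policy Uncertainty
195 | GPR_AND | Geopolitical Risk Index (AND method) | Economic Policy Uncertainty
196 | GPRH_BASIC | Basic Geopolitical Risk Index (Historical) | Economic Policy Uncertainty
197 | GPR_BASIC | Basic Geopolitical Risk Index | Economic Policy Uncertainty
198 | SHAREH_CAT_1 | Share of Historical Category 1 GPR | Economic Policy Uncertainty
199 | SHAREH_CAT_2 | Share of Historical Category 2 GPR | Economic Policy Uncertainty
200 | SHAREH_CAT_3 | Share of Historical Category 3 GPR | Economic Policy Uncertainty
201 | SHAREH_CAT_4 | Share of Historical Category 4 GPR | Economic Policy Uncertainty
202 | SHAREH_CAT_5 | Share of Historical Category 5 GPR | Economic Policy Uncertainty
203 | SHAREH_CAT_6 | Share of Historical Category 6 GPR | Economic Policy Uncertainty
204 | SHAREH_CAT_7 | Share of Historical Category 7 GPR | Economic Policy Uncertainty
205 | SHAREH_CAT_8 | Share of Historical Category 8 GPR | Economic Policy Uncertainty
206 | GPRC_ARG | Geopolitical Risk Index for Argentina | Economic Policy Uncertainty
207 | GPRC_AUS | Geopolitical Risk Index for Australia | Economic Policy Uncertainty
208 | GPRC_BEL | Geopolitical Risk Index for Belgium | Economic Policy Uncertainty
209 | GPRC_BRA | Geopolitical Risk Index for Brazil | Economic Policy Uncertainty
210 | GPRC_CAN | Geopolitical Risk Index for Canada | Economic Policy Uncertainty
211 | GPRC_CHE | Geopolitical Risk Index for Switzerland | Economic Policy Uncertainty
212 | GPRC_CHL | Geopolitical Risk Index for Chile | Economic Policy Uncertainty
213 | GPRC_CHN | Geopolitical Risk Index for China | Economic Policy Uncertainty
214 | GPRC_COL | Geopolitical Risk Index for Colombia | Economic Policy Uncertainty
215 | GPRC_DEU | Geopolitical Risk Index for Germany | Economic Policy Uncertainty
216 | GPRC_DNK | Geopolitical Risk Index for Denmark | Economic Policy Uncertainty
217 | GPRC_EGY | Geopolitical Risk Index for Egypt | Economic Policy Uncertainty
218 | GPRC_ESP | Geopolitical Risk Index for Spain | Economic Policy Uncertainty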
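219 | GPRC_FIN | Geopolitical Risk Index for Finland | Economic Policy Uncertainty
220 | GPRC_FRA | Geopolitical Risk Index for France | Economic Policy Uncertainty
221 | GPRC_GBR | Geopolitical Risk Index for United Kingdom | Economic Policy Uncertainty
222 | GPRC_HKG | Geopolitical Risk Index for Hong Kong | Economic Policy Uncertainty
223 | GPRC_HUN | Geopolitical Risk Index for Hungary | Economic Policy Uncertainty
224 | GPRC_IDN | Geopolitical Risk Index for Indonesia | Economic Policy Uncertainty
225 | GPRC_IND | Geopolitical Risk Index for India | Economic Policy Uncertainty
226 | GPRC_ISR | Geopolitical Risk Index for Israel | Economic Policy Uncertainty
227 | GPRC_ITA | Geopolitical Risk Index for Italy | Economic Policy Uncertainty
228 | GPRC_JPN | Geopolitical Risk Index for Japan | Economic Policy Uncertainty
229 | GPRC_KOR | Geopolitical Risk Index for South Korea | Economic Policy Uncertainty
230 | GPRC_MEX | Geopolitical Risk Index for Mexico | Economic Policy Uncertainty
231 | GPRC_MYS | Geopolitical Risk Index for Malaysia | Economic Policy Uncertainty
232 | GPRC_NLD | Geopolitical Risk Index for Netherlands | Economic Policy Uncertainty
233 | GPRC_NOR | Geopolitical Risk Index for Norway | Economic Policy Uncertainty
234 | GPRC_PER | Geopolitical Risk Index for Peru | Economic Policy Uncertainty
235 | GPRC_PHL | Geopolitical Risk Index for Philippines | Economic Policy Uncertainty
236 | GPRC_POL | Geopolitical Risk Index for Poland | Economic Policy Uncertainty
237 | GPRC_PRT | Geopolitical Risk Index for Portugal | Economic Policy Uncertainty
238 | GPRC_RUS | Geopolitical Risk Index for Russia | Economic Policy Uncertainty
239 | GPRC_SAU | Geopolitical Risk Index for Saudi Arabia | Economic Policy Uncertainty
240 | GPRC_SWE | Geopolitical Risk Index for Sweden | Economic Policy Uncertainty
241 | GPRC_THA | Geopolitical Risk Index for Thailand | Economic Policy Uncertainty
242 | GPRC_TUN | Geopolitical Risk Index for Tunisia | Economic Policy Uncertainty
243 | GPRC_TUR | Geopolitical Risk Index for Turkey | Economic Policy Uncertainty
244 | GPRC_TWN | Geopolitical Risk Index for Taiwan | Economic Policy Uncertainty
245 | GPRC_UKR | Geopolitical Risk Index for Ukraine | Economic Policy Uncertainty
246 | GPRC_USA | Geopolitical Risk Index for United States | Economic Policy Uncertainty
247 | GPRC_VEN | Geopolitical Risk Index for Venezuela | Economic Policy Uncertainty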
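248 | GPRC_VNM | Geopolitical Risk Index for Vietnam | Economic Policy Uncertainty
249 | GPRC_ZAF | Geopolitical Risk Index for South Africa | Economic Policy Uncertainty
250 | GPRHC_ARG | Historical Geopolitical Risk Index for Argentina | Economic Policy Uncertainty
251 | GPRHC_AUS | Historical Geopolitical Risk Index for Australia | Economic Policy Uncertainty
252 | GPRHC_BEL | Historical Geopolitical Risk Index for Belgium | Economic Policy Uncertainty
253 | GPRHC_BRA | Historical Geopolitical Risk Index for Brazil | Economic Policy Uncertainty
254 | GPRHC_CAN | Historical Geopolitical Risk Index for Canada | Economic Policy Uncertainty
255 | GPRHC_CHE | Historical Geopolitical Risk Index for Switzerland | Economic Policy Uncertainty
256 | GPRHC_CHL | Historical Geopolitical Risk Index for Chile | Economic Policy Uncertainty
257 | GPRHC_CHN | Historical Geopolitical Risk Index for China | Economic Policy Uncertainty
258 | GPRHC_COL | Historical Geopolitical Risk Index for Colombia | Economic Policy Uncertainty
259 | GPRHC_DEU | Historical Geopolitical Risk Index for Germany | Economic Policy Uncertainty
260 | GPRHC_DNK | Historical Geopolitical Risk Index for Denmark | Economic Policy Uncertainty
261 | GPRHC_EGY | Historical Geopolitical Risk Index for Egypt | Economic Policy Uncertainty
262 | GPRHC_ESP | Historical Geopolitical Risk Index for Spain | Economic Policy Uncertainty
263 | GPRHC_FIN | Historical Geopolitical Risk Index for Finland | Economic Policy Uncertainty
264 | GPRHC_FRA | Historical Geopolitical Risk Index for France | Economic Policy Uncertainty
265 | GPRHC_GBR | Historical Geopolitical Risk Index for United Kingdom | Economic Policy Uncertainty
266 | GPRHC_HKG | Historical Geopolitical Risk Index for Hong Kong | Economic Policy Uncertainty
267 | GPRHC_HUN | Historical Geopolitical Risk Index for Hungary | Economic Policy Uncertainty
268 | GPRHC_IDN | Historical Geopolitical Risk Index for Indonesia | Economic Policy Uncertainty
269 | GPRHC_IND | Historical Geopolitical Risk Index for India | Economic Policy Uncertainty
270 | GPRHC_ISR | Historical Geopolitical Risk Index for Israel | Economic Policy Uncertainty
271 | GPRHC_ITA | Historical Geopolitical Risk Index for Italy | Economic Policy Uncertainty
272 | GPRHC_JPN | Historical Geopolitical Risk Index for Japan | Economic Policy Uncertainty
273 | GPRHC_KOR | Historical Geopolitical Risk Index for South Korea | Economic Policy Uncertainty
274 | GPRHC_MEX | Historical Geopolitical Risk Index for Mexico | Economic Policy Uncertainty
275 | GPRHC_MYS | Historical Geopolitical Risk Index for Malaysia | Economic Policy Uncertainty
276 | GPRHC_NLD | Historical Geopolitical Risk Index for Netherlands | Economic Policy Uncertainty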
277 | GPRHC_NOR | Historical Geopolitical Risk Index for Norway | Economic Policy Uncertainty
278 | GPRHC_PER | Historical Geopolitical Risk Index for Peru | Economic Policy Uncertainty
279 | GPRHC_PHL | Historical Geopolitical Risk Index for Philippines | Economic Policy Uncertainty
280 | GPRHC_POL | Historical Geopolitical Risk Index for Poland | Economic Policy Uncertainty
281 | GPRHC_PRT | Historical Geopolitical Risk Index for Portugal | Economic Policy Uncertainty
282 | GPRHC_RUS | Historical Geopolitical Risk Index for Russia | Economic Policy Uncertainty
283 | GPRHC_SAU | Historical Geopolitical Risk Index for Saudi Arabia | Economic Policy Uncertainty
284 | GPRHC_SWE | Historical Geopolitical Risk Index for Sweden | Economic Policy Uncertainty
285 | GPRHC_THA | Historical Geopolitical Risk Index for Thailand | Economic Policy Uncertainty
286 | GPRHC_TUN | Historical Geopolitical Risk Index for Tunisia | Economic Policy Uncertainty
287 | GPRHC_TUR | Historical Geopolitical Risk Index for Turkey | Economic Policy Uncertainty
288 | GPRHC_TWN | Historical Geopolitical Risk Index for Taiwan | Economic Policy Uncertainty
289 | GPRHC_UKR | Historical Geopolitical Risk Index for Ukraine | Economic Policy Uncertainty
290 | GPRHC_USA | Historical Geopolitical Risk Index for USA | Economic Policy Uncertainty
291 | GPRHC_VEN | Historical Geopolitical Risk Index for Venezuela | Economic Policy Uncertainty
292 | GPRHC_VNM | Historical Geopolitical Risk Index for Vietnam | Economic Policy Uncertainty
293 | GPRHC_ZAF | Historical Geopolitical Risk Index for South Africa | Economic Policy Uncertainty
294 | MARKET_CAP | Total market capitalization of crypto market | CoinMarketCap (CMC)
295 | CRYPTO_VOLUME_24 | Trading volume of cryptocurrencies over the past 24 hours | CoinMarketCap (CMC)
296 | WTUI | World Trade Uncertainty Index | World Uncertainty Index
297 | WUI | World Uncertainty Index | World Uncertainty Index
298 | BTC_DAILY_ABSOLUTE_CHANGE | Daily absolute change in Bitcoin prices | Processed Data from Yahoo Finance
299 | BTC_DAILY_RETURNS_PERC | Daily returns in percentage of Bitcoin prices | Processed Data from Yahoo Finance
300 | BTC_LOG_DIFFERENCE | Logarithmic difference in Bitcoin prices | Processed Data from Yahoo Finance
301 | BTC_PRICE_MIN_7D | Minimum Bitcoin price over the past 7 days | Processed Data from Yahoo Finance
302 | BTC_PRICE_MAX_7D | Maximum Bitcoin price over the past 7 days | Processed Data from Yahoo Finance
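303 | BTC_PRICE_MIN_14D | Minimum Bitcoin price over the past 14 days | Processed Data from Yahoo Finance
304 | BTC_PRICE_MAX_14D | Maximum Bitcoin price over the past 14 days | Processed Data from Yahoo Finance
305 | BTC_PRICE_MIN_21D | Minimum Bitcoin price over the past 21 days | Processed Data from Yahoo Finance
306 | BTC_PRICE_MAX_21D | Maximum Bitcoin price over the past 21 days | Processed Data from Yahoo Finance
307 | BTC_PRICE_MIN_30D | Minimum Bitcoin price over the past 30 days | Processed Data from Yahoo Finance
308 | BTC_PRICE_MAX_30D | Maximum Bitcoin price over the past 30 days | Processed Data from Yahoo Finance
309 | BTC_PRICE_MIN_60D | Minimum Bitcoin price over the past 60 days | Processed Data from Yahoo Finance
310 | BTC_PRICE_MAX_60D | Maximum Bitcoin price over the past 60 days | Processed Data from Yahoo Finance
Table 1: Data Taxonomy of features that have been used for this analysis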
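The price-derived features in rows 298 to 310 follow mechanically from the daily Bitcoin price series. The sketch below illustrates one way they could be computed; it is not the code used in this thesis. It rests on three assumptions that the table does not settle: that "absolute change" means the signed day-over-day difference in price units, that "logarithmic difference" is the difference of natural logs of consecutive prices, and that each rolling window includes the current day. The function name `derive_btc_features` and the sample prices are hypothetical.

```python
import math


def derive_btc_features(prices, windows=(7, 14, 21, 30, 60)):
    """Derive price-based features from daily closing prices.

    `prices` is a list of floats ordered oldest to newest. The result maps
    each feature name to a list aligned with `prices`; entries that cannot
    be computed (no previous day, or an incomplete window) are None.
    """
    n = len(prices)
    features = {
        "BTC_DAILY_ABSOLUTE_CHANGE": [None] * n,
        "BTC_DAILY_RETURNS_PERC": [None] * n,
        "BTC_LOG_DIFFERENCE": [None] * n,
    }
    for t in range(1, n):
        prev, curr = prices[t - 1], prices[t]
        # Assumption: signed change in price units, not its magnitude.
        features["BTC_DAILY_ABSOLUTE_CHANGE"][t] = curr - prev
        features["BTC_DAILY_RETURNS_PERC"][t] = 100.0 * (curr - prev) / prev
        features["BTC_LOG_DIFFERENCE"][t] = math.log(curr) - math.log(prev)

    for w in windows:
        lows, highs = [None] * n, [None] * n
        for t in range(w - 1, n):
            # Assumption: the w-day window ends on (and includes) day t.
            window = prices[t - w + 1 : t + 1]
            lows[t], highs[t] = min(window), max(window)
        features[f"BTC_PRICE_MIN_{w}D"] = lows
        features[f"BTC_PRICE_MAX_{w}D"] = highs
    return features


# Hypothetical prices, used only to exercise the function.
prices = [100.0, 110.0, 99.0, 105.0, 120.0, 115.0, 118.0, 130.0]
feats = derive_btc_features(prices, windows=(7,))
print(feats["BTC_DAILY_RETURNS_PERC"])
print(feats["BTC_PRICE_MIN_7D"], feats["BTC_PRICE_MAX_7D"])
```

If the rolling minimum and maximum are used as predictors of the same day's price, including the current day in the window would leak the target. In that case the window should end on day t-1 instead.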
58
7. References
Böhme, R., Christin, N., Edelman, B., & Moore, T. (2015). Bitcoin: Economics, technology, and
governance. Journal of economic Perspectives, 29(2), 213-238.
Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and
control. John Wiley & Sons.
Bracke, P., Datta, A., Jung, C., & Sen, S. (2019). Machine learning explainability in finance: an application
to default risk analysis.
Bussmann, N., Giudici, P., Marinelli, D., & Papenbrock, J. (2020). Explainable AI in fintech risk
management. Frontiers in Artificial Intelligence, 3, 26.
Casolaro, A., Capone, V., Iannuzzo, G., & Camastra, F. (2023). Deep learning for time series forecasting:
Advances and open problems. Information, 14(11), 598.
Catalini, C., & Gans, J. S. (2020). Some simple economics of the blockchain. Communications of the
ACM, 63(7), 80-90.
Chu, J., Chan, S., Nadarajah, S., & Osterrieder, J. (2017). GARCH modelling of cryptocurrencies. Journal
of Risk and Financial Management, 10(4), 17.
Corbet, S., Lucey, B., Urquhart, A., & Yarovaya, L. (2019). Cryptocurrencies as a financial asset: A
systematic analysis. International Review of Financial Analysis, 62, 182-199.
Derbentsev, V., Babenko, V., Khrustalev, K. I. R. I. L. L., Obruch, H., & Khrustalova, S. O. F. I. I. A.
(2021). Comparative performance of machine learning ensemble algorithms for forecasting
cryptocurrency prices. International Journal of Engineering, 34(1), 140-148.
Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series with
a unit root. Journal of the American statistical association, 74(366a), 427-431.
Dyhrberg, A. H. (2016). Bitcoin, gold and the dollar–A GARCH volatility analysis. Finance research
letters, 16, 85-92.
Engle, R. (2001). GARCH 101: The use of ARCH/GARCH models in applied econometrics. Journal of
economic perspectives, 15(4), 157-168.
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for financial
market predictions. European journal of operational research, 270(2), 654-669.
59
Glaser, F., Zimmermann, K., Haferkorn, M., Weber, M. C., & Siering, M. (2014). Bitcoin-asset or
currency? revealing users' hidden intentions. Revealing Users' Hidden Intentions (April 15, 2014). ECIS.
Gujarati, D. N. (2009). Basic econometrics.
Henrique, B. M., Sobreiro, V. A., & Kimura, H. (2019). Literature review: Machine learning techniques
applied to financial market prediction. Expert Systems with Applications, 124, 226-251.
Iqbal, M., Iqbal, M., Jaskani, F., Iqbal, K., & Hassan, A. (2021). Time-series prediction of cryptocurrency
market using machine learning techniques. EAI Endorsed Transactions on Creative Technologies, 8(28).
Karalevicius, V., Degrande, N., & De Weerdt, J. (2018). Using sentiment analysis to predict interday
Bitcoin price movements. The Journal of Risk Finance, 19(1), 56-75.
Koutmos, D. (2018). Bitcoin returns and transaction activity. Economics Letters, 167, 81-85.
Kwiatkowski, D., Phillips, P. C., Schmidt, P., & Shin, Y. (1992). Testing the null hypothesis of stationarity
against the alternative of a unit root: How sure are we that economic time series have a unit root?. Journal
of econometrics, 54(1-3), 159-178.
Lim, B., Arık, S. Ö., Loeff, N., & Pfister, T. (2021). Temporal fusion transformers for interpretable multi-
horizon time series forecasting. International Journal of Forecasting, 37(4), 1748-1764.
Livieris, I. E., Pintelas, E., Stavroyiannis, S., & Pintelas, P. (2020). Ensemble deep learning models for
forecasting cryptocurrency time-series. Algorithms, 13(5), 121.
Lotfi, C., Srinivasan, S., Ertz, M., & Latrous, I. Web Scraping Techniques and Applications: A Literature.
Madan, I., Saluja, S., & Zhao, A. (2015). Automated bitcoin trading via machine learning algorithms.
URL: http://cs229. stanford. edu/proj2014/Isaac% 20Madan, 20.
McNally, S., Roche, J., & Caton, S. (2018, March). Predicting the price of bitcoin using machine learning.
In 2018 26th euromicro international conference on parallel, distributed and network-based processing
(PDP) (pp. 339-343). IEEE.
Murray, K., Rossi, A., Carraro, D., & Visentin, A. (2023). On forecasting cryptocurrency prices: A
comparison of machine learning, deep learning, and ensembles. Forecasting, 5(1), 196-209.
Nakamoto, S. (2008). Bitcoin: A peer-to-peer electronic cash system.
Narayanan, A., Bonneau, J., Felten, E., Miller, A., & Goldfeder, S. (2016). Bitcoin and cryptocurrency
technologies: a comprehensive introduction. Princeton University Press.
Panagiotidis, T., Stengos, T., & Vravosinos, O. (2018). On the determinants of bitcoin returns: A LASSO
60
approach. Finance Research Letters, 27, 235-240.
Pang, Y., Sundararaj, G., & Ren, J. (2019, December). Cryptocurrency price prediction using time series
and social sentiment data. In Proceedings of the 6th IEEE/ACM International Conference on Big Data
Computing, Applications and Technologies (pp. 35-41).
Parekh, R., Patel, N. P., Thakkar, N., Gupta, R., Tanwar, S., Sharma, G., ... & Sharma, R. (2022). DL-
GuesS: Deep learning and sentiment analysis-based cryptocurrency price prediction. IEEE Access, 10,
35398-35409.
Park, T. (2024). Enhancing Anomaly Detection in Financial Markets with an LLM-based Multi-Agent
Framework. arXiv preprint arXiv:2403.19735.
Patel, J., Shah, S., Thakkar, P., & Kotecha, K. (2015). Predicting stock and stock price index movement
using trend deterministic data preparation and machine learning techniques. Expert systems with
applications, 42(1), 259-268.
Phillips, P. C. B. (1988). Testing for a Unit Root in Time Series Regression. Biometrika.
Rangapuram, S. S., Gasthaus, J., Stella, L., Flunkert, V., Salinas, D., Wang, Y., & Januschowski, T. (2023).
Deep Non-Parametric Time Series Forecaster. arXiv preprint arXiv:2312.14657.
Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). DeepAR: Probabilistic forecasting with
autoregressive recurrent networks. International journal of forecasting, 36(3), 1181-1191.
Sezer, O. B., Gudelek, M. U., & Ozbayoglu, A. M. (2020). Financial time series forecasting with deep
learning: A systematic literature review: 20052019. Applied soft computing, 90, 106181.
Stańczyk, U., & Jain, L. C. (2015). Feature selection for data and pattern recognition: An introduction
(pp. 1-7). Springer Berlin Heidelberg.
Swan, M. (2015). Blockchain: Blueprint for a new economy. " O'Reilly Media, Inc.".
Taylor, S. J., & Letham, B. (2018). Forecasting at scale. The American Statistician, 72(1), 37-45.
Weng, B., Lu, L., Wang, X., Megahed, F. M., & Martinez, W. (2018). Predicting short-term stock prices
using ensemble methods and online data sources. Expert Systems with Applications, 112, 258-273.
Wu, C. H., Lu, C. C., Ma, Y. F., & Lu, R. S. (2018, November). A new forecasting framework for bitcoin
price with LSTM. In 2018 IEEE international conference on data mining workshops (ICDMW) (pp.
168-175). IEEE.
Xing, F. Z., Cambria, E., & Welsch, R. E. (2018). Natural language based financial forecasting: a survey.
61
Artificial Intelligence Review, 50(1), 49-73.
Yang, H., Zhang, B., Wang, N., Guo, C., Zhang, X., Lin, L., ... & Wang, C. D. (2024). FinRobot: An
Open-Source AI Agent Platform for Financial Applications using Large Language Models. arXiv preprint
arXiv:2405.14767.
Yenidoğan, I., Çayir, A., Kozan, O., Dağ, T., & Arslan, Ç. (2018, September). Bitcoin forecasting using
ARIMA and PROPHET. In 2018 3rd international conference on computer science and engineering
(UBMK) (pp. 621-624). IEEE.
Yermack, D. (2024). Is Bitcoin a real currency? An economic appraisal. In Handbook of digital currency (pp.
29-40). Academic Press.
Yiying, W., & Yeze, Z. (2019, March). Cryptocurrency price analysis with artificial intelligence. In 2019
5th international conference on information management (ICIM) (pp. 97-101). IEEE.
Zaharia, M., Khattab, O., Chen, L., Davis, J. Q., Miller, H., Potts, C., ... & Ghodsi, A. (2024). The shift
from models to compound ai systems. Berkeley Artificial Intelligence Research Lab. Available online at:
https://bair. berkeley. edu/blog/2024/02/18/compound-ai-systems/(accessed February 27, 2024).