Brigham Young University
BYU ScholarsArchive
Undergraduate Honors Theses
2025-06-13
Who's on the Phone: Effectiveness of Cloned Vishing Messages
Katherine A. Rackliffe
Brigham Young University
Follow this and additional works at: https://scholarsarchive.byu.edu/studentpub_uht
BYU ScholarsArchive Citation
Rackliffe, Katherine A., "Who's on the Phone: Effectiveness of Cloned Vishing Messages" (2025).
Undergraduate Honors Theses. 441.
https://scholarsarchive.byu.edu/studentpub_uht/441
This Honors Thesis is brought to you for free and open access by BYU ScholarsArchive. It has been accepted for
inclusion in Undergraduate Honors Theses by an authorized administrator of BYU ScholarsArchive. For more
information, please contact ellen_amatangelo@byu.edu.
Honors Thesis
WHO’S ON THE PHONE: EFFECTIVENESS OF CLONED VISHING MESSAGES
By
Katherine Rackliffe
Submitted to Brigham Young University in partial fulfillment of graduation requirements
for University Honors
Electrical and Computer Engineering Department
Brigham Young University
April 2025
Advisor: Benjamin Schooley
Honors Coordinator: Derek Hansen
ABSTRACT
WHO’S ON THE PHONE: EFFECTIVENESS OF CLONED VISHING MESSAGES
Katherine Rackliffe
Electrical and Computer Engineering Department
Bachelor of Cybersecurity
Voice cloning is increasingly used as a form of phone-based phishing attack (i.e.,
vishing attack). The purpose of this research is to understand the impact of voice cloning
on the success rate of vishing/phishing attacks. We hypothesize that calls utilizing cloned
voices are more likely to deceive recipients compared to traditional vishing calls or text
messages with the same phishing content. As AI cloning is a new technology, there is not
much literature on this topic. To evaluate this hypothesis, we conduct a field experiment
with students in university courses who receive a message purporting to be from a
professor from one of their classes. Our results indicate that text messages were more
frequently reported than voice messages, but there was no statistically significant
difference in deception rates between voice cloning and text-based phishing. These
findings suggest that while AI-generated voices do not significantly outperform text
messages in phishing effectiveness, vishing remains a serious cybersecurity threat that
warrants further study, especially as people are unaware of the technology and will not
report it at the same rate as traditional text phishing.
ACKNOWLEDGMENTS
I would like to thank Dr. Derek Hansen, who helped advise me on the structure of
the study and throughout much of the process. Dr. Ben Schooley and Dr. Nancy Fulda are
both on my committee and have provided valuable feedback for the workings of the
study. David Wood worked on a similar study and assisted with data analysis and
collection. The BYU Honors program contributed funding to the project that paid for
software and participant incentives. I would also like to thank my family for supporting
me throughout my education and this project.
TABLE OF CONTENTS
TITLE PAGE
ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF TABLES AND FIGURES
INTRODUCTION
METHODS
RESULTS
ANALYSIS
CONCLUSION
WORKS CITED
LIST OF TABLES AND FIGURES
FIGURE 1: Vocloner clone creation
FIGURE 2: Vocloner voice input box
FIGURE 3: Responses
FIGURE 4: Text vs. Voice messages
FIGURE 5: Reporting behavior
FIGURE 6: Gender-based responses
FIGURE 7: Confidence levels and responses
INTRODUCTION
Phishing is a type of social engineering that attackers use to obtain information.
Common phishing attacks include text (SMS) messages and phone calls from scammers
pretending to be an authority figure. For example, they may pose as a worker at a bank,
requesting a password to access the victim's account. Phishing attacks are the most
frequent and costly type of cybersecurity attack. Cloudflare estimated that 90 percent of
successful cyber-attacks begin with a phishing attack (Introducing Cloudflare’s 2023
Phishing Threats Report, n.d.). A report from the Anti-Phishing Working Group (APWG)
found that phones were used in 25 percent of fraudulent attacks (APWG | Phishing
Activity Trends Reports, n.d.). Phone-based attacks include vishing, which is voice
phishing, and smishing, which is SMS (text-based) phishing (Griffin & Rackley, 2008).
Voice cloning has improved drastically in recent years due to advances in artificial
intelligence (Smith, 2024). Text-to-speech technology has been around for some time,
beginning with synthetic, robotic voices and progressing to deepfake versions of
popular voices (The Biden Deepfake Robocall Is Only the Beginning | WIRED, n.d.).
Until recently, it was difficult to clone someone’s voice without many audio samples.
Improvements in machine learning algorithms now make it possible to clone a
convincing voice from a mere 15-second audio sample of any verbal message
(Smith, 2024). Some people have taken advantage of this technology to mimic celebrities'
voices, and scammers have used this software to make phone calls in the
voice of a trusted figure (The Terrifying A.I. Scam That Uses Your Loved One’s Voice |
The New Yorker, n.d.). For example, scammers clone the voice of a family member who is
supposedly kidnapped or in jail and needs a ransom or bail payment (Karimi,
2023). Although scammers have begun to use voice cloning, little research has evaluated
the effectiveness of voice cloning compared to other techniques.
This paper is among the first systematic studies of how effective
voice cloning can be. Because convincing AI voice clones are a recent development,
there has been little research on their implications and human impact, particularly regarding
phishing. As the news articles cited above show, this technology has a direct impact on many people’s
lives and should be researched.
AI voice cloning enhances these scams by targeting victims' emotions and
spreading misinformation (Features, 2023). Reported cases of voice-cloned phishing messages
include cloned voices of family members calling and begging for money (Rogers, n.d.). This
study aims to determine the effectiveness of voice cloning in phishing messages. There
have been some cases of voice-cloned phishing messages, but not much research has
determined how effective these scams are compared to more traditional phishing
messages. The cost of voice cloning has dropped significantly, and it is now much easier for
someone to clone a voice, making it a tempting prospect for a scammer. This study
highlights the risk of these messages and serves to warn against cloned vishing messages.
The purpose of this research is to understand the impact of voice cloning on the
success rate of vishing/phishing attacks and underscore the importance of training
individuals to recognize and respond to these advanced phishing techniques.
We hypothesize that calls utilizing cloned voices are more likely to deceive
recipients compared to traditional vishing calls or text messages with the same phishing
content. To evaluate this hypothesis, we conduct a field experiment with students in
university courses who receive a message purporting to be from a professor from one of
their classes.
The primary objectives are to:
1. Evaluate Success Rate: Measure the percentage of participants who respond to
vishing calls, believing them to be legitimate, compared to text-only and actual
voice calls.
2. Analyze Influence of Technical Background: Determine if participants with a
technical background are less likely to fall for voice-cloned vishing attacks
compared to those without such a background.
3. Identify Awareness and Training Needs: Identify gaps in awareness and training
among participants regarding the threat of voice cloning in phishing attacks.
4. Determine Which Techniques Are Most Effective: Find the percentage of people
who respond to voicemail messages versus voice note messages.
The research questions that we hope to answer include the following:
1. Which type of impersonation phishing message is most effective (real voice,
voice clone, text-to-speech, or text message)?
2. Are people with a technical background less likely to fall for impersonation
phishing messages?
3. Do students in an advanced accounting course fall for impersonation phishing
messages less often than students in an introductory GE course?
METHODS
We conducted a field experiment with professors who taught large classes at
Brigham Young University (BYU) and with the students in those classes. We identified
and approached professors teaching large, diverse classes to ensure a substantial and
varied participant pool. Initial recruitment involved sending emails to potential
professors, explaining the study's objectives and seeking their collaboration. We obtained
consent from two BYU professors to clone their voices and send messages to their students.
We sent the following message to students:
Hello,
This is Professor [Professor's Name]. I hope you're doing well. I’m currently
updating the class registration records and I need your student ID number to fix
an error with your registration. Could you please text me your BYU student ID
number? You can find your 9-digit student ID number on your BYU ID card.
Thanks!
We used four message types to test different phishing scenarios.
Participants were randomly assigned to one of four groups.
a. Real Voice: Actual recording of the professor's voice requesting the student’s ID.
This group serves as a comparison for the cloned version of the professor's voice, to
determine whether students pick up on differences between the cloned
voice and the real voice. Studies (cite this) show that many people
struggle to tell the difference between AI clones and real voices, and we wanted to
test this comparison.
b. Cloned Vishing: Cloned version of the professor's voice requesting the student’s
ID.
c. Generic Vishing: Generic robocall voicemail in someone else's voice of the
same gender as the professor, also requesting the student’s ID. Very few phishing
attacks use this technique (cite), and we included this group for comparison.
d. Smishing: Text message with the same content.
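The thesis does not specify how the random assignment was carried out. As an illustration only, the following minimal Python sketch shows one common approach: shuffle the participant list, then deal participants into the four conditions in round-robin order so that group sizes differ by at most one. The condition labels, the `assign_conditions` helper, the seed, and the placeholder participant IDs are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter

# Hypothetical labels for the four message types described above.
CONDITIONS = ["real_voice", "cloned_vishing", "generic_vishing", "smishing"]


def assign_conditions(participant_ids, seed=0):
    """Randomly assign each participant to one condition with balanced group sizes.

    The seed is only there to make this sketch reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal shuffled participants round-robin so group sizes differ by at most one.
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(shuffled)}


# Placeholder IDs standing in for a class roster (not real data).
assignment = assign_conditions(range(138))
group_sizes = Counter(assignment.values())
print(group_sizes)
```

With 138 placeholder participants, this scheme produces two groups of 35 and two groups of 34. Which participant lands in which group depends on the seed.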
We first collected audio samples from the professors. We made two different
recordings of each professor: one of the real message, and one minute
of the professor speaking about different topics, such as their research. This is meant to emulate
how a phisher in the real world would use audio and video of the target that is
unrelated to the message the phisher would try to send. For example, a scammer could
find audio from a professor's lecture and use it as the data to clone the
professor's voice.
Even without the professor saying words like ‘ID’ and ‘registration’, both words that we
use in the fake messages, we were able to mimic those exact words in a realistic voice clone.
To create the voice clone, we used a free website called Vocloner. Other software that can
clone voices includes ElevenLabs, Speechify, and many more. This software is easy to use,
and many tools are free, which means that scammers can create these scams with
little effort and at low cost.
Fig 1. Vocloner clone creation
Vocloner was able to make a realistic clone of the professor’s voice within a few seconds,
with only one minute of audio. We then placed the malicious message text in the cloning
software and downloaded the malicious message.
Fig 2. Vocloner voice input box
To send out messages, we used a service called TextP2P. We used TextP2P as it had the
capability to send multi-media text messages, which was a requirement for sending voice
messages as mp4 files.
RESULTS
In our study, we investigated the effectiveness of various phishing message
modalities—text messages, real voice recordings, and cloned voice messages—in
deceiving participants. Our analysis utilized Fisher's Exact Test to assess the relationships
between these variables.
Which type of impersonation phishing message is most effective (Real voice, voice clone,
text-to-speech, or text message)?
Most participants did not respond to the message or reach out to the professor. A
total of 16% of participants responded with their student ID. Some students asked for
verification of the professor's identity over text. For example, one student (in the text
message group) asked the professor to provide the topic of the previous class lecture to
verify that the professor was really sending the message. Other students sent the student ID
to the professor through official channels, such as the professor's email or phone
number. Finally, some students reached out directly to the professor and reported the
phishing attempt. Students who asked for verification, sent a message through official
channels, or reported phishing are grouped under ‘Expressed suspicion’. All voice
messages are in the voice group.
Response                     Text       Voice      Voice clone  Real voice  Text-to-speech  Total
Responded with ID            9 (26%)    14 (13%)   7 (20%)      4 (12%)     3 (9%)          23 (16%)
Expressed suspicion          11 (31%)   11 (11%)   4 (11%)      3 (9%)      4 (12%)         22 (16%)
Asked for verification       7 (20%)    5 (5%)     4 (11%)      0 (0%)      1 (3%)          12 (9%)
Sent via official channel    3 (9%)     2 (2%)     0 (0%)       1 (3%)      1 (3%)          5 (4%)
Reported phishing            1 (3%)     3 (3%)     0 (0%)       2 (6%)      1 (3%)          4 (3%)
No response                  15 (43%)   79 (77%)   24 (69%)     27 (79%)    28 (82%)        94 (68%)
Fig. 3: Responses (percentages rounded)
We observed that participants' response rates to cloned voice messages were
comparable to those for real voice recordings. This suggests that voice cloning
technology can produce messages that are as convincing as authentic recordings. Seven
participants responded with their student ID to the cloned message, and four
participants responded with their ID to the real voice.
           Responded with ID   Did not respond with ID
Text       9 (26%)             26 (74%)
Voice      14 (14%)            89 (86%)
Fig. 4: Text vs. Voice messages
Our data indicated that text messages were more likely to be reported as
suspicious compared to voice messages. Overall, 15.22% of participants (21 of 138)
reported the message as a phishing attempt: 31% of text recipients (11 of 35) compared
with 10% of voice recipients (10 of 103). Fisher's Exact Test revealed a statistically significant
association between the mode of communication and reporting behavior (p = .0048), with
an odds ratio of 4.26. This implies that the odds of reporting a text-based phishing
attempt were over four times the odds of reporting a voice-based one.
           Reported message as phishing   Did not report message as phishing
Text       11 (31%)                       24 (69%)
Voice      10 (10%)                       93 (90%)
Fig. 5: Reporting behavior
P-value: 0.0048 (statistically significant)
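For readers who want to check these figures, the following is a minimal Python sketch of a two-sided Fisher's Exact Test applied to the counts in Fig. 4 and Fig. 5. It uses only the standard library and sums the probabilities of every table that is no more likely than the observed one. The thesis does not state which software or two-sided convention was used, so the `fisher_exact_two_sided` helper is an illustrative assumption and its p-values may differ slightly from the reported ones.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Return (odds_ratio, p_value) for the 2x2 table [[a, b], [c, d]].

    The p-value sums the hypergeometric probabilities of all tables with the
    same margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x when all margins are fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Fig. 5 counts: rows are text and voice, columns are reported and not reported.
reporting_or, reporting_p = fisher_exact_two_sided(11, 24, 10, 93)
# Fig. 4 counts: rows are text and voice, columns are responded with ID and did not.
deception_or, deception_p = fisher_exact_two_sided(9, 26, 14, 89)

print(f"Reporting: odds ratio = {reporting_or:.2f}, p = {reporting_p:.4f}")
print(f"Responded with ID: odds ratio = {deception_or:.2f}, p = {deception_p:.4f}")
```

The sample odds ratio for the reporting table is (11 × 93) / (24 × 10) ≈ 4.26, matching the value reported above.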
We found that gender and age were not significant factors in how people responded to the
messages. Age wasn’t significant, as most participants were around the same age.
             Asked for verification  Reported phishing  Responded with ID  Sent via official channels  No response  Grand Total
Female       5 (9%)                  1 (2%)             9 (15%)            4 (7%)                      40 (68%)     59
Male         7 (9%)                  3 (4%)             14 (17%)           1 (1%)                      54 (67%)     81
Grand total  12 (8%)                 4 (3%)             23 (16%)           5 (4%)                      94 (65%)     144
Fig. 6: Gender-based responses
Do students in an advanced accounting course fall for impersonation phishing messages
less often than students in an introductory GE course?
Ages ranged from 18 to 29, and the average age of participants was 22.4. The introductory-level
class had an average age of 21.2, and the higher-level accounting class had an average age of 24.6
(as it is a graduate-level class). As such, many confounding variables make it difficult to determine
whether differences in how students responded to the messages stem from the class itself or from
age, level of education, or choice of major.
Are people with a technical background less likely to fall for impersonation phishing
messages?
We assessed whether participants' self-reported confidence levels affected their
responses to phishing messages. The analysis showed no significant association between
confidence levels and the likelihood of falling for or reporting phishing attempts (p >
.05). This indicates that confidence in one's ability to detect phishing does not necessarily
correlate with actual susceptibility or reporting behavior.
Fig 7. Confidence levels and responses
Some participants reached out to the professor through email or text or responded
to the false phone number asking for clarification. Some students instantly thought the
message was a scam, while other students asked if the professor could verify their
identity. When evaluating participants' skepticism, we found that those who received text
messages exhibited higher levels of skepticism compared to those who received voice
messages. The association between message modality and skepticism was statistically
significant (p = .0285), with an odds ratio of 3.52. This indicates that the odds of a text
message creating suspicion were more than three times the odds for a voice message.
           Skeptical   Not skeptical
Text       8           27
Voice      8           95
Fig. 8
P-value: 0.0285 (statistically significant)
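An odds ratio from a sample this small carries considerable uncertainty. As a supplementary sketch only, the Python below computes the sample odds ratio for the Fig. 8 counts together with an approximate 95% confidence interval using the Woolf (log odds ratio) method. The thesis does not report a confidence interval, so the `odds_ratio_ci` helper and the choice of method are assumptions added for illustration.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (odds_ratio, lower, upper) for the 2x2 table [[a, b], [c, d]].

    Uses the Woolf approximation: the standard error of the log odds ratio is
    sqrt(1/a + 1/b + 1/c + 1/d). All four cells must be nonzero.
    """
    odds_ratio = (a * d) / (b * c)
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * standard_error)
    upper = exp(log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Fig. 8 counts: rows are text and voice, columns are skeptical and not skeptical.
skepticism_or, skepticism_low, skepticism_high = odds_ratio_ci(8, 27, 8, 95)
print(f"Odds ratio = {skepticism_or:.2f}, "
      f"approx. 95% CI = ({skepticism_low:.2f}, {skepticism_high:.2f})")
```

The point estimate is (8 × 95) / (27 × 8) ≈ 3.52, as reported above. The width of the interval gives a sense of how much a larger follow-up sample could narrow the estimate.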
We also examined whether a participant's field of study influenced their
susceptibility to phishing attempts. Comparing students from Biology and Accounting
disciplines, the data did not show a significant difference in the likelihood of falling for
phishing messages (p = .473). This suggests that academic background, in this context,
does not play a significant role in phishing susceptibility.
ANALYSIS
Our study aimed to evaluate the effectiveness of various phishing messages: an AI cloned
voice, a real voice, a text-to-speech voice, and a text message. We found varying results
in reporting and responses.
Some participants chose to send their student ID to the professor through official
channels, such as email or text. Students stated in their messages that they felt more
comfortable reaching out through official channels. Others asked the professor or the
phone number to verify that the message came from them. Many participants did not
respond, which matches the results of other phishing studies (Ray et al., n.d.).
Participants who suspected the message was a scam sent messages to the professors, and
one participant asked the professor to warn others in the class. This shows that many
college students are aware of these types of scams, particularly when considering the high
report rate for the text message. However, the response rate to the cloned message
suggests that people are not as aware of this new technique for retrieving sensitive information.
Fig 9. Responses
The majority of participants did not respond to the message (labeled "none" in the figure).
Interestingly, the text message drew more responses than the voice messages, possibly
because people are more familiar with text. The text messages also received more phishing
reports from students, showing that many students understand that phishing messages can
come from a text number.
The real voice recording had a lower response rate than the cloned voice. Some research
has been done on how persuasive cloned voices are when compared to a real voice
(Dubiel et al., 2020), but further research is needed to understand why the cloned voice
had a higher response rate than the original voice. The tone of the voice recording for
both cloned voice samples is more monotone than the real voice recording. One student
noticed this about the cloned voice and mentioned the monotone voice while reporting the
cloned voice message to the professor. This might indicate that participants responded
better to the more monotone voice, but again, more research is needed.
Another interesting factor is gender. The sample size wasn’t large enough to determine if
gender played a role in the responses, but more women asked to verify the professor's
identity than men.
Key Findings:
1. Response with Personal Information:
o Text messages prompted the highest rate of participants responding with
their ID (25.71%), followed by cloned voice messages (20.00%).
o Generic text-to-speech messages resulted in the lowest response rate with
personal information (8.82%).
2. Requests for Verification:
o Participants receiving text messages were more inclined to seek
verification (20.00%) compared to those receiving cloned voice messages
(11.43%).
o No participants asked for verification in response to real voice messages.
3. Reporting Phishing Attempts:
o Counting only students who reported the attempt directly to the professor,
real voice messages led to the highest rate of phishing reports (5.88%),
while text messages had a lower reporting rate (2.86%).
o Notably, no phishing reports were made in response to cloned voice
messages.
CONCLUSION
This study examined the effectiveness of various phishing messages and the responses
from students at Brigham Young University (BYU). Our findings indicate that cloned
voice vishing messages are as effective as real voice recordings from professors in
situations where a message requests sensitive information. Text-based phishing
messages resulted in high report rates, with the odds of a report 4.26 times higher for
these messages than for the collective voice messages. Participants who
received text messages questioned the legitimacy of the message more often than
those who received voice messages. This shows that many people are aware of and
respond to text phishing. These results show the necessity for cybersecurity awareness
and understanding of new phishing threats. With recent technologies that can create novel
phishing attacks, it is essential for people to recognize and respond to these threats.
Further research should expand the sample size and diversity of participants. We plan to
complete a follow-up study with more classes and a larger sample size to gain further
insights into this new phishing technique. Other research should explore the
psychological factors behind how individuals respond to attacks and identify what
helps spread awareness or prompts people to warn others.
WORKS CITED
APWG | Phishing Activity Trends Reports. (n.d.). Retrieved August 7, 2024, from
https://apwg.org/trendsreports/
Dubiel, M., Halvey, M., Gallegos, P. O., & King, S. (2020). Persuasive Synthetic Speech:
Voice Perception and User Behaviour. Proceedings of the 2nd Conference on
Conversational User Interfaces, 1–9. https://doi.org/10.1145/3405755.3406120
Features, T. H. last updated C. from V. V. in. (2023, July 12). 4 AI scams to be aware of.
Moneyweekuk.
https://moneyweek.com/personal-finance/ai-scams-to-be-aware-of
Griffin, S. E., & Rackley, C. C. (2008). Vishing. Proceedings of the 5th Annual
Conference on Information Security Curriculum Development, 33–35.
https://doi.org/10.1145/1456625.1456635
Introducing Cloudflare’s 2023 phishing threats report. (n.d.). Retrieved July 27, 2024,
from https://blog.cloudflare.com/2023-phishing-report/
Karimi, F. (2023, April 29). ‘Mom, these bad men have me’: She believes scammers
cloned her daughter’s voice in a fake kidnapping. CNN.
https://www.cnn.com/2023/04/29/us/ai-scam-calls-kidnapping-cec/index.html
Ray, A., Saha, S., Chakrabarty, K., Collins, L., Lafata, K., & Emami-Naeini, P. (n.d.).
Exploring the Impact of Ethnicity on Susceptibility to Voice Phishing.
Rogers, R. (n.d.). How to Protect Yourself (and Your Loved Ones) From AI Scam Calls.
Wired. Retrieved August 7, 2024, from
https://www.wired.com/story/how-to-protect-yourself-ai-scam-calls-detect/
Smith, C. (2024, July 10). TerifAI is a terrifying AI chatbot that steals your voice while
you talk to it. BGR.
https://bgr.com/tech/terifai-is-a-terrifying-ai-chatbot-that-steals-your-voice-while-you-talk-to-it/
The Biden Deepfake Robocall Is Only the Beginning | WIRED. (n.d.). Retrieved August 7,
2024, from https://www.wired.com/story/biden-robocall-deepfake-danger/
The Terrifying A.I. Scam That Uses Your Loved One’s Voice | The New Yorker. (n.d.).
Retrieved July 11, 2024, from
https://www.newyorker.com/science/annals-of-artificial-intelligence/the-terrifying-ai-scam-that-uses-your-loved-ones-voice