DOI: 10.1145/3517745.3561467

PhishInPatterns: measuring elicited user interactions at scale on phishing websites

Authors: Karthika Subramani, William Melicher, Oleksii Starov, Phani Vadrevu, Roberto Perdisci
IMC '22: Proceedings of the 22nd ACM Internet Measurement Conference
Pages 589 - 604
Published: 25 October 2022
  • Abstract

    Despite phishing attacks and detection systems being extensively studied, phishing is still on the rise and has recently reached an all-time high. Attacks are becoming increasingly sophisticated, leveraging new web design patterns to add perceived legitimacy and, at the same time, evade state-of-the-art detectors and web security crawlers.
    In this paper, we study phishing attacks from a new angle, focusing on how modern phishing websites are designed. Specifically, we aim to better understand what types of user interactions are elicited by phishing websites and how their user experience (UX) and interface (UI) design patterns can help them accomplish two main goals: i) lend a sense of professionalism and legitimacy to the phishing website, and ii) contribute to evading phishing detectors and web security crawlers. To study phishing at scale, we built an intelligent crawler that combines browser automation with machine learning methods to simulate user interactions with phishing pages and explore their UX and UI characteristics. Using our novel methodology, we explore more than 50,000 phishing websites and make the following new observations: i) modern phishing sites often impersonate a brand (e.g., Microsoft Office), but surprisingly, without necessarily cloning or closely mimicking the design of the corresponding legitimate website; ii) they often elicit personal information using a multi-step (or multi-page) process, to mimic users' experience on legitimate sites; iii) they embed modern user verification systems (including CAPTCHAs); and ironically, iv) they sometimes conclude the phishing experience by reassuring the user that their private data was not stolen. We believe our findings can help the community gain a more in-depth understanding of how web-based phishing attacks work from a user's perspective and can be used to inform the development of more accurate and robust phishing detectors.
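
The multi-step elicitation described above can be illustrated with a minimal sketch. This is not the authors' crawler: it replaces browser automation and machine learning with an in-memory, hypothetical `Page` model, and the names `Page`, `FAKE_VALUES`, `crawl_flow`, and the `phish.example` URLs are assumptions made purely for illustration. It only shows the general idea of submitting placeholder data at each step and recording what every page asks for.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Page:
    """Hypothetical snapshot of one page in a phishing flow."""
    url: str
    input_fields: List[str]         # names of the fields the page asks for
    has_captcha: bool = False       # whether a user-verification widget is shown
    next_url: Optional[str] = None  # page reached after submitting the form


# Placeholder values (assumption); a real crawler would generate realistic fake data.
FAKE_VALUES: Dict[str, str] = {
    "email": "user@example.com",
    "password": "Passw0rd!",
    "card": "4111111111111111",
}


def crawl_flow(start_url: str,
               fetch: Callable[[str], Page],
               max_steps: int = 10) -> List[dict]:
    """Follow a multi-page flow and record what each page requests."""
    steps: List[dict] = []
    seen = set()
    url: Optional[str] = start_url
    # Stop on a dead end, a revisited page, or the step budget.
    while url is not None and url not in seen and len(steps) < max_steps:
        seen.add(url)
        page = fetch(url)
        # "Submit" placeholder data for every requested field.
        submitted = {name: FAKE_VALUES.get(name, "test")
                     for name in page.input_fields}
        steps.append({
            "url": url,
            "requested": list(page.input_fields),
            "captcha": page.has_captcha,
            "submitted": submitted,
        })
        url = page.next_url
    return steps


# Simulated three-page flow: login, verification with a CAPTCHA, final page.
SITE = {
    "https://phish.example/login": Page(
        "https://phish.example/login", ["email", "password"],
        next_url="https://phish.example/verify"),
    "https://phish.example/verify": Page(
        "https://phish.example/verify", ["card"], has_captcha=True,
        next_url="https://phish.example/done"),
    "https://phish.example/done": Page("https://phish.example/done", []),
}

steps = crawl_flow("https://phish.example/login", SITE.__getitem__)
data_steps = [s for s in steps if s["requested"]]  # pages that elicit data
```

In this toy flow, `data_steps` holds the pages that request information, so its length gives the number of data-eliciting steps; a real measurement would obtain the page model from a live browser session instead of a dictionary.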

    Published In

    IMC '22: Proceedings of the 22nd ACM Internet Measurement Conference
    October 2022
    796 pages
    ISBN:9781450392594
    DOI:10.1145/3517745

    Sponsors

    In-Cooperation

    • USENIX Assoc

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. captcha
    2. crawler
    3. neural networks
    4. phishing
    5. user experience

    Qualifiers

    • Research-article

    Conference

    IMC '22
    IMC '22: ACM Internet Measurement Conference
    October 25 - 27, 2022
    Nice, France

    Acceptance Rates

    Overall Acceptance Rate 277 of 1,083 submissions, 26%
