A Survey on Breaking Technique of Text-Based CAPTCHA

Authors: Daofu Gong

Academic Editor: Zhenxing Qian

Security and Communication Networks, Volume 2017

https://doi.org/10.1155/2017/6898617

Published: 01 January 2017

Abstract

CAPTCHA has become an important topic in multimedia security. Focusing on the widely used text-based CAPTCHA, this paper outlines typical breaking methods and summarizes technical progress in the field. First, it presents a comprehensive review of recent developments in text-based CAPTCHA breaking. Second, it proposes a framework for text-based CAPTCHA breaking that consists mainly of preprocessing, segmentation, combination, recognition, postprocessing, and other modules. Third, it introduces the research progress on the techniques involved in each module and compares and analyzes some typical segmentation and recognition methods. Lastly, it discusses several problems that merit further research.
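The modular framework named in the abstract can be pictured as a chain of small functions. The following is only a toy sketch under stated assumptions, not the paper's method: the fixed binarization threshold, blank-column segmentation, fragment merging rule, exact-size template matching, and every function and variable name (`preprocess`, `segment`, `combine`, `recognize`, `postprocess`, `TEMPLATES`, `GRAY`) are hypothetical choices made for illustration on a tiny synthetic image.

```python
from typing import Dict, List

Image = List[List[int]]  # rows of pixel values


def preprocess(gray: Image, threshold: int = 128) -> Image:
    """Binarize a grayscale image: 1 marks ink (dark), 0 marks background.
    The fixed threshold is an assumption; real systems often adapt it."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment(binary: Image) -> List[Image]:
    """Split the image at blank columns (a simple vertical projection)."""
    width = len(binary[0])
    col_has_ink = [any(row[x] for row in binary) for x in range(width)]
    pieces: List[Image] = []
    start = None
    # The trailing False closes a piece that touches the right border.
    for x, ink in enumerate(col_has_ink + [False]):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            pieces.append([row[start:x] for row in binary])
            start = None
    return pieces


def combine(pieces: List[Image], min_width: int = 2) -> List[Image]:
    """Merge fragments narrower than min_width into the preceding piece
    (a hypothetical rule standing in for the combination module)."""
    merged: List[Image] = []
    for piece in pieces:
        if merged and len(piece[0]) < min_width:
            merged[-1] = [a + b for a, b in zip(merged[-1], piece)]
        else:
            merged.append(piece)
    return merged


def recognize(piece: Image, templates: Dict[str, Image]) -> str:
    """Pick the template that agrees with the piece on the most pixels."""
    def score(template: Image) -> int:
        if len(template) != len(piece) or len(template[0]) != len(piece[0]):
            return -1  # shapes differ, so this template cannot match
        return sum(a == b
                   for t_row, p_row in zip(template, piece)
                   for a, b in zip(t_row, p_row))
    return max(templates, key=lambda ch: score(templates[ch]))


def postprocess(chars: List[str]) -> str:
    """Join per-character results into the final answer string."""
    return "".join(chars)


def break_captcha(gray: Image, templates: Dict[str, Image]) -> str:
    pieces = combine(segment(preprocess(gray)))
    return postprocess([recognize(p, templates) for p in pieces])


# Hypothetical 3x3 glyph templates (1 = ink).
TEMPLATES: Dict[str, Image] = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# Synthetic test image: "L", one blank column, then "T" (0 = black ink).
_INK = [
    [1, 0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 1, 0],
]
GRAY: Image = [[0 if b else 255 for b in row] for row in _INK]

if __name__ == "__main__":
    print(break_captcha(GRAY, TEMPLATES))
```

Keeping each stage as a separate function mirrors the modular view in the abstract: any one stage (for example, a learned recognizer in place of template matching) could be swapped without touching the others.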


Published In

Security and Communication Networks, Volume 2017

1833 pages

ISSN: 1939-0114

EISSN: 1939-0122

Copyright © 2017 Jun Chen et al.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Publisher

John Wiley & Sons, Inc.

United States

Qualifiers

Review-article
