Article
DOI: 10.1109/SP.2012.34

Detecting Hoaxes, Frauds, and Deception in Writing Style Online

Published: 20 May 2012

Abstract

In digital forensics, questions often arise about the authors of documents: their identity, demographic background, and whether they can be linked to other documents. The field of stylometry uses linguistic features and machine learning techniques to answer these questions. While stylometry techniques can identify authors with high accuracy in non-adversarial scenarios, their accuracy is reduced to random guessing when faced with authors who intentionally obfuscate their writing style or attempt to imitate that of another author. While these results are good for privacy, they raise concerns about fraud. We argue that some linguistic features change when people hide their writing style and by identifying those features, stylistic deception can be recognized. The major contribution of this work is a method for detecting stylistic deception in written documents. We show that using a large feature set, it is possible to distinguish regular documents from deceptive documents with 96.6% accuracy (F-measure). We also present an analysis of linguistic features that can be modified to hide writing style.
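The general idea described in the abstract, linguistic features fed to a classifier and scored by F-measure, can be illustrated with a minimal sketch. Everything specific here is an assumption made for brevity: the tiny function-word list, the two length features, the nearest-centroid classifier, and the toy labels "regular" and "deceptive". None of it is the paper's actual feature set, classifier, or corpus.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny function-word list; real stylometric feature sets are far larger.
FUNCTION_WORDS = ("the", "of", "and", "to", "i", "that", "it", "but")


def extract_features(text):
    """Map a document to a small fixed-length vector of style features."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    counts = Counter(words)
    # Relative frequency of each function word.
    features = [counts[w] / n_words for w in FUNCTION_WORDS]
    # Average word length in characters.
    features.append(sum(len(w) for w in words) / n_words)
    # Average sentence length in words.
    features.append(len(words) / max(len(sentences), 1))
    return features


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def train(labeled_docs):
    """Build one centroid per label from (text, label) pairs."""
    by_label = {}
    for text, label in labeled_docs:
        by_label.setdefault(label, []).append(extract_features(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, text):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = extract_features(text)
    return min(model, key=lambda label: math.dist(vec, model[label]))


def f_measure(true_labels, predicted_labels, positive):
    """F1 score: harmonic mean of precision and recall for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

To use the sketch, call `train` on labeled example texts, `predict` on unseen texts, and `f_measure` on the resulting label lists. A nearest-centroid rule is used only because it fits in a few lines; any standard classifier could take its place.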

    Published In

    SP '12: Proceedings of the 2012 IEEE Symposium on Security and Privacy
    May 2012
    600 pages
    ISBN:9780769546810

    Publisher

    IEEE Computer Society

    United States

    Author Tags

    1. deception
    2. machine learning
    3. privacy
    4. stylometry

    Qualifiers

    • Article

    Cited By

    • (2023) A multi-scenario approach to continuously learn and understand norm violations. Autonomous Agents and Multi-Agent Systems 37:2. DOI: 10.1007/s10458-023-09619-4. Online publication date: 16-Aug-2023
    • (2022) A Self-Supervised Representation Learning of Sentence Structure for Authorship Attribution. ACM Transactions on Knowledge Discovery from Data 16:4 (1-16). DOI: 10.1145/3491203. Online publication date: 8-Jan-2022
    • (2021) Advanced Machine Learning techniques for fake news (online disinformation) detection. Applied Soft Computing 101:C. DOI: 10.1016/j.asoc.2020.107050. Online publication date: 1-Mar-2021
    • (2020) The Limitations of Stylometry for Detecting Machine-Generated Fake News. Computational Linguistics 46:2 (499-510). DOI: 10.1162/coli_a_00380. Online publication date: 1-Jun-2020
    • (2020) A Survey of Fake News. ACM Computing Surveys 53:5 (1-40). DOI: 10.1145/3395046. Online publication date: 28-Sep-2020
    • (2020) FNED. ACM Transactions on Information Systems 38:3 (1-33). DOI: 10.1145/3386253. Online publication date: 5-May-2020
    • (2020) A Survey on Privacy in Social Media. ACM/IMS Transactions on Data Science 1:1 (1-38). DOI: 10.1145/3343038. Online publication date: 12-Mar-2020
    • (2019) Polarization and Fake News. ACM Transactions on the Web 13:2 (1-22). DOI: 10.1145/3316809. Online publication date: 27-Mar-2019
    • (2019) Text Analysis in Adversarial Settings. ACM Computing Surveys 52:3 (1-36). DOI: 10.1145/3310331. Online publication date: 18-Jun-2019
    • (2019) The Web of False Information. Journal of Data and Information Quality 11:3 (1-37). DOI: 10.1145/3309699. Online publication date: 7-May-2019
