
Web Content Classification Using Distributions of Subjective Quality Evaluations

Authors: Maria Rafalak, Dominik Deja, Adam Wierzbicki, Radosław Nielek, Michał Kąkol
Article No.: 21, Pages 1 - 30
Published: 15 November 2016
Abstract

Machine learning algorithms and recommender systems trained on human ratings are widely in use today. However, human ratings may be associated with a high level of uncertainty and are subjective, influenced by demographic or psychological factors. We propose a new approach to the design of object classes from human ratings: the use of entire distributions to construct classes. By avoiding aggregation for class definition, our approach loses no information and can deal with highly volatile or conflicting ratings. The approach is based on the concept of the Earth Mover's Distance (EMD), a measure of distance for distributions. We evaluate the proposed approach on four datasets obtained from diverse Web content or movie quality evaluation services or experiments. We show that clusters discovered in these datasets using the EMD measure are characterized by a consistent and simple interpretation. Quality classes defined using entire rating distributions can be fitted to clusters of distributions in the four datasets using two parameters, resulting in a good overall fit. We also consider the impact of the composition of small samples on the distributions that are the basis of our classification approach. We show that using distributions based on small samples of 10 evaluations is still robust to several demographic and psychological variables. This observation suggests that the proposed approach can be used in practice for quality evaluation, even for highly uncertain and subjective ratings.
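The EMD the abstract refers to has a simple closed form for one-dimensional histograms on an ordered rating scale with unit spacing: it equals the area between the two cumulative distribution functions. The sketch below is illustrative only, not the authors' implementation; the `emd_1d` helper and the toy rating histograms are hypothetical.

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two rating histograms on the same
    ordered scale (e.g. 1-5 stars).  For 1-D distributions, EMD reduces
    to the sum of absolute differences between the two CDFs."""
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]   # normalize counts to probabilities
    q = [x / total_q for x in q]
    dist, carry = 0.0, 0.0
    for pi, qi in zip(p, q):
        carry += pi - qi           # running CDF difference
        dist += abs(carry)         # mass carried over to the next bin
    return dist

# All raters agree on 1 star vs. all agree on 5 stars: mass moves 4 bins.
print(emd_1d([10, 0, 0, 0, 0], [0, 0, 0, 0, 10]))  # 4.0
# A polarized (conflicting) distribution vs. a unimodal one.
print(emd_1d([5, 0, 0, 0, 5], [0, 0, 10, 0, 0]))   # 2.0
```

Because the comparison uses whole distributions rather than mean ratings, the polarized histogram in the second call is kept distinct from a unimodal one with the same average, which is exactly the property the classification approach relies on.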


    Cited By

• (2020) A comprehensive survey of data mining. International Journal of Information Technology. DOI: 10.1007/s41870-020-00427-7. Online publication date: 6-Feb-2020.
• (2018) An Ontology-based Term Weighting Technique for Web Document Categorization. Procedia Computer Science 133, 75-81. DOI: 10.1016/j.procs.2018.07.010. Online publication date: 2018.
• (2018) Computing controversy: Formal model and algorithms for detecting controversy on Wikipedia and in search queries. Information Processing & Management 54:1, 14-36. DOI: 10.1016/j.ipm.2017.08.005. Online publication date: Jan-2018.


    Published In

    ACM Transactions on the Web  Volume 10, Issue 4
    December 2016
    169 pages
    ISSN:1559-1131
    EISSN:1559-114X
    DOI:10.1145/3017848
    This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 November 2016
    Accepted: 01 August 2016
    Revised: 01 August 2016
    Received: 01 April 2015
    Published in TWEB Volume 10, Issue 4


    Author Tags

    1. Web content quality
    2. classification design
    3. earth mover’s distance
    4. rating distribution
    5. robustness
    6. sample composition

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • European Union's Seventh Framework Programme for research, technological development and demonstration
