DOI: 10.1145/3292500.3330890
research-article

Relation Extraction via Domain-aware Transfer Learning

Published: 25 July 2019

Abstract

Relation extraction for knowledge base construction has been studied for decades because it applies to many problems. Most classical approaches, such as supervised information extraction and distant supervision, focus on constructing the knowledge base (KB) from a large number of labels or from certain related KBs. However, in many real-world scenarios, existing methods may perform poorly when a new knowledge base is required but only scarce labels or few related KBs are available. In this paper, we propose a novel approach called Relation Extraction via Domain-aware Transfer Learning (ReTrans), which extracts relation mentions from a given text corpus by exploiting the experience of a large number of existing KBs that may not be closely related to the target relation. We first initialize the representations of relation mentions from the massive text corpus and update those representations according to the existing KBs. Based on these representations, we investigate the contribution of each KB to the target task and select useful KBs to boost the effectiveness of the proposed approach. Based on the selected KBs, we develop a novel domain-aware transfer learning framework that transfers knowledge from the source domains to the target domain, aiming to infer the true relation mentions in the unstructured text corpus. Most importantly, we give the stability and generalization bound of ReTrans. Experimental results on real-world datasets demonstrate the effectiveness of our approach, which outperforms all the state-of-the-art baselines.
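The abstract does not say how the contribution of each KB is measured, so the following is only a minimal sketch of the KB-selection idea under a stated assumption: each KB is represented by the embeddings of its relation mentions, and a source KB counts as "useful" when the mean of its embeddings lies close to the mean of the target embeddings (a linear-kernel MMD estimate). The function names, the distance choice, and the toy vectors are all hypothetical and are not taken from ReTrans.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def linear_mmd_sq(source: List[Vector], target: List[Vector]) -> float:
    """Squared distance between the mean embeddings of two samples.

    This equals the biased MMD^2 estimate under a linear kernel. It is an
    assumed stand-in for the paper's (unspecified here) contribution measure.
    """
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def select_source_kbs(
    source_kbs: Dict[str, List[Vector]],
    target: List[Vector],
    k: int,
) -> List[str]:
    """Return the names of the k source KBs closest to the target domain."""
    ranked = sorted(
        source_kbs,
        key=lambda name: linear_mmd_sq(source_kbs[name], target),
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy 2-d "relation mention embeddings"; real ones would be learned.
    target = [[1.0, 0.0], [0.8, 0.2]]
    sources = {
        "kb_close": [[0.9, 0.1], [1.0, 0.0]],
        "kb_far": [[-1.0, 1.0], [-0.8, 0.9]],
        "kb_mid": [[0.5, 0.5], [0.4, 0.6]],
    }
    print(select_source_kbs(sources, target, k=2))
```

With the toy data above, the two KBs whose mean embeddings sit nearest the target are kept and the distant one is dropped. A transfer step would then use only the selected KBs as source domains.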

    Published In

    KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    July 2019
    3305 pages
    ISBN:9781450362016
    DOI:10.1145/3292500

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. relation extraction
    2. transfer learning

    Qualifiers

    • Research-article

    Funding Sources

    • National Science Foundation of China (NSFC)
    • Didi-HKUST joint research lab project
    • Hong Kong ITC ITF grants
    • Microsoft Research Asia Collaborative Research Grant
    • Hong Kong RGC
    • Science and Technology Planning Project of Guangdong Province
    • Wechat Research Grant

    Conference

    KDD '19

    Acceptance Rates

    KDD '19 Paper Acceptance Rate: 110 of 1,200 submissions, 9%
    Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%


    Article Metrics

    • Downloads (last 12 months): 65
    • Downloads (last 6 weeks): 7
    Reflects downloads up to 20 Sep 2024


    Cited By

    • (2024) Offloading the computational complexity of transfer learning with generic features. PeerJ Computer Science 10 (e1938). DOI: 10.7717/peerj-cs.1938. Online publication date: 25-Mar-2024
    • (2024) A Survey on Challenges and Advances in Natural Language Processing with a Focus on Legal Informatics and Low-Resource Languages. Electronics 13:3 (648). DOI: 10.3390/electronics13030648. Online publication date: 4-Feb-2024
    • (2024) Communication-efficient Multi-service Mobile Traffic Prediction by Leveraging Cross-service Correlations. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (794-805). DOI: 10.1145/3637528.3671730. Online publication date: 25-Aug-2024
    • (2024) Effective Data Selection and Replay for Unsupervised Continual Learning. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (1449-1463). DOI: 10.1109/ICDE60146.2024.00119. Online publication date: 13-May-2024
    • (2024) GradGCL: Gradient Graph Contrastive Learning. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (1171-1184). DOI: 10.1109/ICDE60146.2024.00095. Online publication date: 13-May-2024
    • (2024) Biomedical event causal relation extraction with deep knowledge fusion and Roberta-based data augmentation. Methods. DOI: 10.1016/j.ymeth.2024.08.007. Online publication date: Sep-2024
    • (2024) Extracting Structural Knowledge for Professional Text Inference. Computer Supported Cooperative Work and Social Computing (334-347). DOI: 10.1007/978-981-99-9640-7_25. Online publication date: 5-Jan-2024
    • (2023) A Survey on Multimodal Knowledge Graphs: Construction, Completion and Applications. Mathematics 11:8 (1815). DOI: 10.3390/math11081815. Online publication date: 11-Apr-2023
    • (2023) A Review: Data and Semantic Augmentation for Relation Classification in Low Resource. Proceedings of the 2023 6th International Conference on Algorithms, Computing and Artificial Intelligence (195-201). DOI: 10.1145/3639631.3639665. Online publication date: 22-Dec-2023
    • (2023) Incremental Tabular Learning on Heterogeneous Feature Space. Proceedings of the ACM on Management of Data 1:1 (1-18). DOI: 10.1145/3588698. Online publication date: 30-May-2023
