Semantic frame-based statistical approach for topic detection

Yung Chun Chang, Yu Lun Hsieh, Cen Chieh Chen, Chad Liu, Chun Hung Lu, Wen Lian Hsu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

We propose a statistical frame-based approach (FBA) for natural language processing, and demonstrate its advantage over traditional machine learning methods by using topic detection as a case study. FBA perceives and identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a unique flexible matching scheme that allows word insertion, deletion and substitution (IDS) to capture linguistic structures within the text. In addition, FBA can also overcome major issues of the rule-based approach by reducing human effort through its highly automated pattern generation and summarization. Using a Yahoo! Chinese news corpus containing about 140,000 news articles, we provide a comprehensive performance evaluation that demonstrates the effectiveness of FBA in detecting the topic of a document by exploiting the semantic association and the context within the text. Moreover, it outperforms common topic models like Naïve Bayes, Vector Space Model, and LDA-SVM.
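The abstract does not give the details of the matching scheme, so the following is only a minimal sketch of what matching with insertion, deletion and substitution could look like. It assumes unit costs and exact comparison between pattern slots and tokens, and it uses a standard edit-distance alignment rather than the authors' actual scoring. The names `ids_match_cost` and `matches`, the `max_cost` threshold, and the English example tokens are all hypothetical.

```python
def ids_match_cost(pattern, tokens, ins_cost=1, del_cost=1, sub_cost=1):
    """Minimum cost of aligning a pattern to a token sequence.

    Insertion  = an extra token in the text that the pattern lacks.
    Deletion   = a pattern slot with no counterpart in the text.
    Substitution = a pattern slot aligned to a different token.
    Costs are illustrative assumptions, not values from the paper.
    """
    m, n = len(pattern), len(tokens)
    # dp[i][j] holds the cost of aligning pattern[:i] with tokens[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * del_cost
    for j in range(1, n + 1):
        dp[0][j] = j * ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = pattern[i - 1] == tokens[j - 1]
            dp[i][j] = min(
                dp[i - 1][j] + del_cost,                       # deletion
                dp[i][j - 1] + ins_cost,                       # insertion
                dp[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dp[m][n]


def matches(pattern, tokens, max_cost=1):
    """Treat the pattern as matched when the alignment cost is within max_cost."""
    return ids_match_cost(pattern, tokens) <= max_cost


pattern = ["team", "win", "game"]
print(matches(pattern, ["team", "easily", "win", "game"]))  # one insertion
print(matches(pattern, ["team", "win"]))                    # one deletion
print(matches(pattern, ["team", "lose", "game"]))           # one substitution
print(matches(pattern, ["stock", "price", "fall"]))         # three substitutions
```

With a threshold of one edit, the first three sentences still match the pattern and the unrelated one does not, which is the kind of tolerance a flexible matching scheme is meant to provide over exact rule matching.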

Original language: English
Title of host publication: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
Publisher: Faculty of Pharmaceutical Sciences, Chulalongkorn University
Pages: 75-84
Number of pages: 10
ISBN (Electronic): 9786165518871
Publication status: Published - 2014
Externally published: Yes
Event: 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014 - Phuket, Thailand
Duration: Dec 12 2014 to Dec 14 2014

Conference

Conference: 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
Country: Thailand
City: Phuket
Period: 12/12/14 to 12/14/14

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)

  • Cite this

    Chang, Y. C., Hsieh, Y. L., Chen, C. C., Liu, C., Lu, C. H., & Hsu, W. L. (2014). Semantic frame-based statistical approach for topic detection. In Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014 (pp. 75-84). Faculty of Pharmaceutical Sciences, Chulalongkorn University.