Semantic frame-based statistical approach for topic detection

Yung Chun Chang, Yu Lun Hsieh, Cen Chieh Chen, Chad Liu, Chun Hung Lu, Wen Lian Hsu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We propose a statistical frame-based approach (FBA) for natural language processing, and demonstrate its advantage over traditional machine learning methods by using topic detection as a case study. FBA perceives and identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a unique flexible matching scheme that allows word insertion, deletion and substitution (IDS) to capture linguistic structures within the text. In addition, FBA can also overcome major issues of the rule-based approach by reducing human effort through its highly automated pattern generation and summarization. Using the Yahoo! Chinese news corpus containing about 140,000 news articles, we provide a comprehensive performance evaluation that demonstrates the effectiveness of FBA in detecting the topic of a document by exploiting the semantic association and the context within the text. Moreover, it outperforms common topic models like Naïve Bayes, Vector Space Model, and LDA-SVM.
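
The flexible matching idea mentioned in the abstract (tolerating word insertion, deletion and substitution) can be illustrated with a token-level edit distance. The following is only a minimal sketch under stated assumptions: the function names `ids_cost` and `is_flexible_match`, the unit costs, the threshold, and the example pattern are all hypothetical, and the abstract does not describe how FBA actually scores or generates its patterns.

```python
from typing import Sequence


def ids_cost(pattern: Sequence[str], tokens: Sequence[str],
             ins: int = 1, dele: int = 1, sub: int = 1) -> int:
    """Minimum total cost of insertions, deletions and substitutions
    needed to align `pattern` with `tokens` (token-level edit distance).

    Unit costs are an assumption made for illustration only.
    """
    # prev[j] = cost of aligning an empty pattern prefix with tokens[:j]
    prev = [j * ins for j in range(len(tokens) + 1)]
    for i, slot in enumerate(pattern, start=1):
        # Aligning pattern[:i] with no tokens means i pattern slots are missing.
        curr = [i * dele]
        for j, tok in enumerate(tokens, start=1):
            curr.append(min(
                prev[j] + dele,                          # pattern slot missing from the text
                curr[j - 1] + ins,                       # extra token inserted in the text
                prev[j - 1] + (0 if slot == tok else sub),  # exact match or substitution
            ))
        prev = curr
    return prev[-1]


def is_flexible_match(pattern: Sequence[str], tokens: Sequence[str],
                      max_cost: int = 1) -> bool:
    """Accept the text if it is within `max_cost` edits of the pattern."""
    return ids_cost(pattern, tokens) <= max_cost


# Hypothetical pattern made of semantic labels and a keyword.
pattern = ["PERSON", "win", "AWARD"]
print(is_flexible_match(pattern, ["PERSON", "finally", "win", "AWARD"]))
print(is_flexible_match(pattern, ["PERSON", "receive", "AWARD"]))
print(is_flexible_match(pattern, ["TEAM", "lose", "GAME"]))
```

A dynamic-programming alignment is used here because it treats all three edit types uniformly; a weighted variant could penalise substitutions of key slots more heavily than insertions of filler words.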

Original language: English
Title of host publication: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
Publisher: Faculty of Pharmaceutical Sciences, Chulalongkorn University
Pages: 75-84
Number of pages: 10
ISBN (Electronic): 9786165518871
Publication status: Published - 2014
Externally published: Yes
Event: 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014 - Phuket, Thailand
Duration: Dec 12 2014 to Dec 14 2014

Conference

Conference: 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
Country: Thailand
City: Phuket
Period: 12/12/14 to 12/14/14

Fingerprint

Linguistics
Semantics
Vector spaces
Learning systems
Substitution reactions
Association reactions
Processing
Frame Semantics

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)

Cite this

Chang, Y. C., Hsieh, Y. L., Chen, C. C., Liu, C., Lu, C. H., & Hsu, W. L. (2014). Semantic frame-based statistical approach for topic detection. In Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014 (pp. 75-84). Faculty of Pharmaceutical Sciences, Chulalongkorn University.

Semantic frame-based statistical approach for topic detection. / Chang, Yung Chun; Hsieh, Yu Lun; Chen, Cen Chieh; Liu, Chad; Lu, Chun Hung; Hsu, Wen Lian.

Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014. Faculty of Pharmaceutical Sciences, Chulalongkorn University, 2014. p. 75-84.

Chang, YC, Hsieh, YL, Chen, CC, Liu, C, Lu, CH & Hsu, WL 2014, Semantic frame-based statistical approach for topic detection. in Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014. Faculty of Pharmaceutical Sciences, Chulalongkorn University, pp. 75-84, 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014, Phuket, Thailand, 12/12/14.
Chang YC, Hsieh YL, Chen CC, Liu C, Lu CH, Hsu WL. Semantic frame-based statistical approach for topic detection. In Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014. Faculty of Pharmaceutical Sciences, Chulalongkorn University. 2014. p. 75-84
Chang, Yung Chun ; Hsieh, Yu Lun ; Chen, Cen Chieh ; Liu, Chad ; Lu, Chun Hung ; Hsu, Wen Lian. / Semantic frame-based statistical approach for topic detection. Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014. Faculty of Pharmaceutical Sciences, Chulalongkorn University, 2014. pp. 75-84
@inproceedings{4c0c26219eda485aa3cfe33139fcbc0b,
title = "Semantic frame-based statistical approach for topic detection",
abstract = "We propose a statistical frame-based approach (FBA) for natural language processing, and demonstrate its advantage over traditional machine learning methods by using topic detection as a case study. FBA perceives and identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a unique flexible matching scheme that allows word insertion, deletion and substitution (IDS) to capture linguistic structures within the text. In addition, FBA can also overcome major issues of the rule-based approach by reducing human effort through its highly automated pattern generation and summarization. Using the Yahoo! Chinese news corpus containing about 140,000 news articles, we provide a comprehensive performance evaluation that demonstrates the effectiveness of FBA in detecting the topic of a document by exploiting the semantic association and the context within the text. Moreover, it outperforms common topic models like Na{\"i}ve Bayes, Vector Space Model, and LDA-SVM.",
author = "Chang, {Yung Chun} and Hsieh, {Yu Lun} and Chen, {Cen Chieh} and Liu, Chad and Lu, {Chun Hung} and Hsu, {Wen Lian}",
year = "2014",
language = "English",
pages = "75--84",
booktitle = "Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014",
publisher = "Faculty of Pharmaceutical Sciences, Chulalongkorn University",

}

TY - GEN

T1 - Semantic frame-based statistical approach for topic detection

AU - Chang, Yung Chun

AU - Hsieh, Yu Lun

AU - Chen, Cen Chieh

AU - Liu, Chad

AU - Lu, Chun Hung

AU - Hsu, Wen Lian

PY - 2014

Y1 - 2014

N2 - We propose a statistical frame-based approach (FBA) for natural language processing, and demonstrate its advantage over traditional machine learning methods by using topic detection as a case study. FBA perceives and identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a unique flexible matching scheme that allows word insertion, deletion and substitution (IDS) to capture linguistic structures within the text. In addition, FBA can also overcome major issues of the rule-based approach by reducing human effort through its highly automated pattern generation and summarization. Using the Yahoo! Chinese news corpus containing about 140,000 news articles, we provide a comprehensive performance evaluation that demonstrates the effectiveness of FBA in detecting the topic of a document by exploiting the semantic association and the context within the text. Moreover, it outperforms common topic models like Naïve Bayes, Vector Space Model, and LDA-SVM.

AB - We propose a statistical frame-based approach (FBA) for natural language processing, and demonstrate its advantage over traditional machine learning methods by using topic detection as a case study. FBA perceives and identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a unique flexible matching scheme that allows word insertion, deletion and substitution (IDS) to capture linguistic structures within the text. In addition, FBA can also overcome major issues of the rule-based approach by reducing human effort through its highly automated pattern generation and summarization. Using the Yahoo! Chinese news corpus containing about 140,000 news articles, we provide a comprehensive performance evaluation that demonstrates the effectiveness of FBA in detecting the topic of a document by exploiting the semantic association and the context within the text. Moreover, it outperforms common topic models like Naïve Bayes, Vector Space Model, and LDA-SVM.

UR - http://www.scopus.com/inward/record.url?scp=84994070599&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84994070599&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84994070599

SP - 75

EP - 84

BT - Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014

PB - Faculty of Pharmaceutical Sciences, Chulalongkorn University

ER -