Classification of PICO elements by text features systematically extracted from PubMed abstracts

Ke Chun Huang, Charles Chih Ho Liu, Shung Shiang Yang, Furen Xiao, Jau Min Wong, Chun Chih Liao, I. Jen Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

We propose and evaluate a systematic approach to detecting and classifying Patient/Problem, Intervention, Comparison and Outcome (PICO) elements in the medical literature. The training and test corpora were generated systematically and automatically from structured PubMed abstracts: 23,472 sentences were obtained by exact pattern matching of the head words of the P-I-O categories. The most frequent terms were then used as the features of a Naïve Bayesian classifier. This approach achieves F-measure values of 0.91 for Patient/Problem, 0.75 for Intervention and 0.88 for Outcome, comparable to previous studies based on mixed textual, paragraphical, and semantic features. In conclusion, we show that with stricter pattern-matching criteria for the training set, detection and classification of PICO elements can be made reproducible with minimal expert intervention. The results of this work are higher than those reported in previous studies.
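The pipeline summarized in the abstract (label sentences by exact heading match, keep the most frequent terms, train a Naïve Bayes classifier, score with the F-measure) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the heading lists in `HEAD_WORDS`, the tokenizer, the `top_k` cutoff, the Laplace smoothing, and the toy sentences are all assumptions made for illustration, since the record does not give those details.

```python
import math
import re
from collections import Counter

# Hypothetical heading lists: the paper's actual head words are not given in this record.
HEAD_WORDS = {
    "P": ("PATIENTS", "PARTICIPANTS", "SUBJECTS"),
    "I": ("INTERVENTION", "INTERVENTIONS"),
    "O": ("OUTCOME", "OUTCOMES", "MAIN OUTCOME MEASURES"),
}


def label_sentence(line):
    """Return (label, text) when the section heading matches a head word exactly."""
    match = re.match(r"^([A-Z ]+):\s*(.+)$", line)
    if not match:
        return None
    heading, text = match.group(1).strip(), match.group(2)
    for label, heads in HEAD_WORDS.items():
        if heading in heads:
            return label, text
    return None


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (an assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled, top_k=50):
    """Count terms per class, restricted to the top_k most frequent terms overall."""
    totals = Counter()
    for _, text in labeled:
        totals.update(tokenize(text))
    vocab = {term for term, _ in totals.most_common(top_k)}
    class_counts = Counter(label for label, _ in labeled)
    term_counts = {label: Counter() for label in class_counts}
    for label, text in labeled:
        term_counts[label].update(t for t in tokenize(text) if t in vocab)
    return vocab, class_counts, term_counts


def predict(model, text):
    """Pick the class with the highest multinomial Naive Bayes log-score."""
    vocab, class_counts, term_counts = model
    n_sentences = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / n_sentences)
        # Laplace (add-one) smoothing; the smoothing scheme is an assumption.
        denom = sum(term_counts[label].values()) + len(vocab)
        for term in tokenize(text):
            if term in vocab:
                score += math.log((term_counts[label][term] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f_measure(tp, fp, fn):
    """Balanced F-measure: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Toy structured-abstract lines, invented for illustration only.
lines = [
    "PATIENTS: Adults with chronic heart failure were enrolled.",
    "PARTICIPANTS: Children with asthma were enrolled.",
    "INTERVENTIONS: Daily oral aspirin or placebo was given.",
    "INTERVENTION: Weekly exercise training or placebo was given.",
    "MAIN OUTCOME MEASURES: Mortality at one year was measured.",
    "OUTCOMES: Hospital admission rate was measured.",
    "RESULTS: Mortality fell in the treatment group.",
]
labeled = [pair for pair in map(label_sentence, lines) if pair]
model = train(labeled)
print(predict(model, "Adults with diabetes were enrolled."))
```

Because labels come only from headings that match exactly, lines such as the `RESULTS:` sentence are dropped rather than guessed, which is one way to read the abstract's point about stricter matching criteria reducing the need for expert annotation.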

Original language: English
Title of host publication: Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011
Pages: 279-283
Number of pages: 5
DOIs: https://doi.org/10.1109/GRC.2011.6122608
Publication status: Published - 2011
Event: 2011 IEEE International Conference on Granular Computing, GrC 2011 - Kaohsiung, Taiwan
Duration: Nov 8 2011 → Nov 10 2011

Other

Other: 2011 IEEE International Conference on Granular Computing, GrC 2011
Country: Taiwan
City: Kaohsiung
Period: 11/8/11 → 11/10/11

Fingerprint

Pattern matching
Classifiers
Semantics

Keywords

  • information extraction
  • natural language processing
  • question answering
  • text mining

ASJC Scopus subject areas

  • Software

Cite this

Huang, K. C., Liu, C. C. H., Yang, S. S., Xiao, F., Wong, J. M., Liao, C. C., & Chiang, I. J. (2011). Classification of PICO elements by text features systematically extracted from PubMed abstracts. In Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011 (pp. 279-283). [6122608] https://doi.org/10.1109/GRC.2011.6122608

@inproceedings{01073af1c551473385559ef2ac1f731c,
title = "Classification of PICO elements by text features systematically extracted from PubMed abstracts",
abstract = "We propose and evaluate a systematic approach to detecting and classifying Patient/Problem, Intervention, Comparison and Outcome (PICO) elements in the medical literature. The training and test corpora were generated systematically and automatically from structured PubMed abstracts: 23,472 sentences were obtained by exact pattern matching of the head words of the P-I-O categories. The most frequent terms were then used as the features of a Na{\"i}ve Bayesian classifier. This approach achieves F-measure values of 0.91 for Patient/Problem, 0.75 for Intervention and 0.88 for Outcome, comparable to previous studies based on mixed textual, paragraphical, and semantic features. In conclusion, we show that with stricter pattern-matching criteria for the training set, detection and classification of PICO elements can be made reproducible with minimal expert intervention. The results of this work are higher than those reported in previous studies.",
keywords = "information extraction, natural language processing, question answering, text mining",
author = "Huang, {Ke Chun} and Liu, {Charles Chih Ho} and Yang, {Shung Shiang} and Furen Xiao and Wong, {Jau Min} and Liao, {Chun Chih} and Chiang, {I. Jen}",
year = "2011",
doi = "10.1109/GRC.2011.6122608",
language = "English",
isbn = "9781457703713",
pages = "279--283",
booktitle = "Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011",

}

TY - GEN

T1 - Classification of PICO elements by text features systematically extracted from PubMed abstracts

AU - Huang, Ke Chun

AU - Liu, Charles Chih Ho

AU - Yang, Shung Shiang

AU - Xiao, Furen

AU - Wong, Jau Min

AU - Liao, Chun Chih

AU - Chiang, I. Jen

PY - 2011

Y1 - 2011

AB - We propose and evaluate a systematic approach to detecting and classifying Patient/Problem, Intervention, Comparison and Outcome (PICO) elements in the medical literature. The training and test corpora were generated systematically and automatically from structured PubMed abstracts: 23,472 sentences were obtained by exact pattern matching of the head words of the P-I-O categories. The most frequent terms were then used as the features of a Naïve Bayesian classifier. This approach achieves F-measure values of 0.91 for Patient/Problem, 0.75 for Intervention and 0.88 for Outcome, comparable to previous studies based on mixed textual, paragraphical, and semantic features. In conclusion, we show that with stricter pattern-matching criteria for the training set, detection and classification of PICO elements can be made reproducible with minimal expert intervention. The results of this work are higher than those reported in previous studies.

KW - information extraction

KW - natural language processing

KW - question answering

KW - text mining

UR - http://www.scopus.com/inward/record.url?scp=84863028228&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863028228&partnerID=8YFLogxK

U2 - 10.1109/GRC.2011.6122608

DO - 10.1109/GRC.2011.6122608

M3 - Conference contribution

AN - SCOPUS:84863028228

SN - 9781457703713

SP - 279

EP - 283

BT - Proceedings - 2011 IEEE International Conference on Granular Computing, GrC 2011

ER -