PICO element detection in medical text without metadata: Are first sentences enough?

Ke Chun Huang, I-Jen Chiang, Furen Xiao, Chun Chih Liao, Charles Chih Ho Liu, Jau Min Wong

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Efficient identification of patient, intervention, comparison, and outcome (PICO) components in medical articles is helpful in evidence-based medicine. The purpose of this study is to clarify whether first sentences of these components are good enough to train naive Bayes classifiers for sentence-level PICO element detection. We extracted 19,854 structured abstracts of randomized controlled trials with any P/I/O label from PubMed for naive Bayes classifier training. Performances of classifiers trained by first sentences of each section (CF) and those trained by all sentences (CA) were compared using all sentences by ten-fold cross-validation. The results measured by recall, precision, and F-measures show that there are no significant differences in performance between CF and CA for detection of O-elements (F-measure = 0.731 ± 0.009 vs. 0.738 ± 0.010, p = 0.123). However, CA performs better for I-elements, in terms of recall (0.752 ± 0.012 vs. 0.620 ± 0.007, p
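The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy abstract, the tokenizer, the add-one smoothing, and the function names `train_nb`, `predict`, and `prf` are all assumptions made for illustration, and the sketch scores the classifiers on their own training sentences rather than using ten-fold cross-validation.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and keep alphabetic tokens only (an illustrative choice)."""
    return re.findall(r"[a-z]+", text.lower())

def train_nb(sentences):
    """Train a multinomial naive Bayes model from (label, text) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in sentences:
        tokens = tokenize(text)
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def prf(gold, pred, label):
    """Precision, recall, and F-measure for one label."""
    pairs = list(zip(gold, pred))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Hypothetical toy data: one structured abstract, sentences listed per section.
abstracts = [
    {"P": ["Sixty adults with hypertension were enrolled.", "All were outpatients."],
     "I": ["Patients received drug A daily.", "Controls received placebo."],
     "O": ["Blood pressure fell in the treated group.", "No adverse events occurred."]},
]

all_sents = [(lab, s) for a in abstracts for lab, ss in a.items() for s in ss]
first_sents = [(lab, ss[0]) for a in abstracts for lab, ss in a.items()]

cf = train_nb(first_sents)  # CF: trained on the first sentence of each section
ca = train_nb(all_sents)    # CA: trained on all sentences

# Score both classifiers on all sentences (toy evaluation, no cross-validation).
gold = [lab for lab, _ in all_sents]
for name, model in (("CF", cf), ("CA", ca)):
    pred = [predict(model, s) for _, s in all_sents]
    for label in ("P", "I", "O"):
        p, r, f = prf(gold, pred, label)
        print(f"{name} {label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

A faithful replication would replace the toy data with labeled PubMed abstracts and wrap training and scoring in a ten-fold split, reporting the mean and standard deviation of each measure across folds.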

Original language: English
Pages (from-to): 940-946
Number of pages: 7
Journal: Journal of Biomedical Informatics
Volume: 46
Issue number: 5
DOI: https://doi.org/10.1016/j.jbi.2013.07.009
Publication status: Published - Oct 2013

Fingerprint

Metadata
Classifiers
Evidence-Based Medicine
PubMed
Randomized Controlled Trials
Medicine
Labels

Keywords

  • Evidence-based medicine
  • Information extraction
  • Information retrieval
  • Natural language processing
  • Question answering
  • Text mining

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

Cite this

Huang, K. C., Chiang, I-J., Xiao, F., Liao, C. C., Liu, C. C. H., & Wong, J. M. (2013). PICO element detection in medical text without metadata: Are first sentences enough? Journal of Biomedical Informatics, 46(5), 940-946. https://doi.org/10.1016/j.jbi.2013.07.009

PICO element detection in medical text without metadata: Are first sentences enough? / Huang, Ke Chun; Chiang, I-Jen; Xiao, Furen; Liao, Chun Chih; Liu, Charles Chih Ho; Wong, Jau Min.

In: Journal of Biomedical Informatics, Vol. 46, No. 5, 10.2013, p. 940-946.

Huang, Ke Chun; Chiang, I-Jen; Xiao, Furen; Liao, Chun Chih; Liu, Charles Chih Ho; Wong, Jau Min. / PICO element detection in medical text without metadata: Are first sentences enough? In: Journal of Biomedical Informatics. 2013; Vol. 46, No. 5. pp. 940-946.

@article{64546e29b9bb4e58a534c457c4ef9c30,
title = "PICO element detection in medical text without metadata: Are first sentences enough?",
abstract = "Efficient identification of patient, intervention, comparison, and outcome (PICO) components in medical articles is helpful in evidence-based medicine. The purpose of this study is to clarify whether first sentences of these components are good enough to train naive Bayes classifiers for sentence-level PICO element detection. We extracted 19,854 structured abstracts of randomized controlled trials with any P/I/O label from PubMed for naive Bayes classifier training. Performances of classifiers trained by first sentences of each section (CF) and those trained by all sentences (CA) were compared using all sentences by ten-fold cross-validation. The results measured by recall, precision, and F-measures show that there are no significant differences in performance between CF and CA for detection of O-elements (F-measure = 0.731 ± 0.009 vs. 0.738 ± 0.010, p = 0.123). However, CA performs better for I-elements, in terms of recall (0.752 ± 0.012 vs. 0.620 ± 0.007, p",
keywords = "Evidence-based medicine, Information extraction, Information retrieval, Natural language processing, Question answering, Text mining",
author = "Huang, {Ke Chun} and I-Jen Chiang and Furen Xiao and Liao, {Chun Chih} and Liu, {Charles Chih Ho} and Wong, {Jau Min}",
year = "2013",
month = "10",
doi = "10.1016/j.jbi.2013.07.009",
language = "English",
volume = "46",
pages = "940--946",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Academic Press Inc.",
number = "5",

}

TY  - JOUR
T1  - PICO element detection in medical text without metadata
T2  - Are first sentences enough?
AU  - Huang, Ke Chun
AU  - Chiang, I-Jen
AU  - Xiao, Furen
AU  - Liao, Chun Chih
AU  - Liu, Charles Chih Ho
AU  - Wong, Jau Min
PY  - 2013/10
Y1  - 2013/10
AB  - Efficient identification of patient, intervention, comparison, and outcome (PICO) components in medical articles is helpful in evidence-based medicine. The purpose of this study is to clarify whether first sentences of these components are good enough to train naive Bayes classifiers for sentence-level PICO element detection. We extracted 19,854 structured abstracts of randomized controlled trials with any P/I/O label from PubMed for naive Bayes classifier training. Performances of classifiers trained by first sentences of each section (CF) and those trained by all sentences (CA) were compared using all sentences by ten-fold cross-validation. The results measured by recall, precision, and F-measures show that there are no significant differences in performance between CF and CA for detection of O-elements (F-measure = 0.731 ± 0.009 vs. 0.738 ± 0.010, p = 0.123). However, CA performs better for I-elements, in terms of recall (0.752 ± 0.012 vs. 0.620 ± 0.007, p
KW  - Evidence-based medicine
KW  - Information extraction
KW  - Information retrieval
KW  - Natural language processing
KW  - Question answering
KW  - Text mining
UR  - http://www.scopus.com/inward/record.url?scp=84883806449&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84883806449&partnerID=8YFLogxK
U2  - 10.1016/j.jbi.2013.07.009
DO  - 10.1016/j.jbi.2013.07.009
M3  - Article
C2  - 23899909
AN  - SCOPUS:84883806449
VL  - 46
SP  - 940
EP  - 946
JO  - Journal of Biomedical Informatics
JF  - Journal of Biomedical Informatics
SN  - 1532-0464
IS  - 5
ER  -