Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields

Hong Jie Dai, Shabbir Syed-Abdul, Chih Wei Chen, Chieh Chen Wu

Research output: Contribution to journal, Article, peer-reviewed

25 Citations (Scopus)

Abstract

An electronic health record (EHR) is a digital data format that collects electronic health information about an individual patient or population. To enhance the meaningful use of EHRs, information extraction techniques have been developed to recognize clinical concepts mentioned in EHRs. Nevertheless, the clinical judgment recorded in an EHR cannot be determined solely from the recognized concepts without considering the contextual information. To improve the readability and accessibility of EHRs, this work developed a section heading recognition system for clinical documents. Instead of formulating section heading recognition as a sentence classification problem, this work proposed a token-based formulation with a conditional random field (CRF) model. A standard section heading recognition corpus was compiled by annotators with clinical experience to evaluate the system and to compare it with sentence classification and dictionary-based approaches. In the experiments, the proposed method achieved an F-score of 0.942, outperforming the sentence-based approach and the best dictionary-based system by 0.087 and 0.096, respectively. One important advantage of the token-based formulation over the sentence-based approach is that it provides an integrated solution, with no need to develop additional heuristic rules for isolating the headings from the surrounding section contents.
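The difference between the two formulations can be illustrated with a small sketch. It is not the authors' implementation: the B/I/O label names, the helper functions `bio_labels` and `extract_heading`, and the sample line are all assumptions made for illustration, and no CRF is trained here. The sketch only shows what a token-level label sequence looks like and why it marks the heading boundary directly, whereas a sentence-level label would cover the whole line.

```python
from typing import List, Tuple

def bio_labels(tokens: List[str], heading_len: int) -> List[Tuple[str, str]]:
    """Tag the first `heading_len` tokens as a heading span with B/I/O labels.

    Hypothetical helper: in a real system these labels would be predicted
    by a trained sequence model such as a CRF, not supplied by hand.
    """
    tagged = []
    for i, tok in enumerate(tokens):
        if i >= heading_len:
            tagged.append((tok, "O"))            # outside any heading
        elif i == 0:
            tagged.append((tok, "B-HEADING"))    # first token of the heading
        else:
            tagged.append((tok, "I-HEADING"))    # continuation of the heading
    return tagged

def extract_heading(tagged: List[Tuple[str, str]]) -> str:
    """Recover the heading text from a token-level label sequence."""
    return " ".join(tok for tok, label in tagged if label != "O")

# Assumed sample line in which the heading and its content share one line.
line = "Chief Complaint : chest pain for two days"
tokens = line.split()

tagged = bio_labels(tokens, heading_len=2)
heading = extract_heading(tagged)

print(tagged)
print(heading)
```

With a sentence-level label, the entire sample line would be classified as containing a heading, and a separate rule would still be needed to cut "Chief Complaint" away from the text that follows it. With token-level labels, the span boundary is part of the prediction itself.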

Original language: English
Article number: 873012
Journal: BioMed Research International
2015
Publication status: Published - 2015

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
