TEMPTING system: A hybrid method of rule and machine learning for temporal relation extraction in patient discharge summaries

Yung Chun Chang, Hong Jie Dai, Johnny Chi Yang Wu, Jian Ming Chen, Richard Tzong Han Tsai, Wen Lian Hsu

Research output: Contribution to journal › Article

22 Citations (Scopus)

Abstract

Patient discharge summaries provide detailed medical information about individuals who have been hospitalized. To make a precise and legitimate assessment of the abundant data, a proper time layout of the sequence of relevant events should be compiled and used to drive a patient-specific timeline, which could further assist medical personnel in making clinical decisions. The process of identifying the chronological order of entities is called temporal relation extraction. In this paper, we propose a hybrid method to identify appropriate temporal links between a pair of entities. The method combines two approaches: one is rule-based and the other is based on the maximum entropy model. We develop an integration algorithm to fuse the results of the two approaches. All rules and the integration algorithm are formally stated so that one can easily reproduce the system and results. To optimize the system's configuration, we used the 2012 i2b2 challenge TLINK track dataset and applied threefold cross validation to the training set. Then, we evaluated its performance on the training and test datasets. The experiment results show that the proposed TEMPTING (TEMPoral relaTion extractING) system (ranked seventh) achieved an F-score of 0.563, which was at least 30% better than that of the baseline system, which randomly selects TLINK candidates from all pairs and assigns the TLINK types. The TEMPTING system using the hybrid method also outperformed the stage-based TEMPTING system. Its F-scores were 3.51% and 0.97% better than those of the stage-based system on the training set and test set, respectively.
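
The abstract describes fusing a rule-based component with a maximum entropy classifier, but it does not spell out the integration algorithm itself. The following is only a minimal sketch of one plausible fusion policy, not the authors' method: a rule that fires takes precedence, and otherwise the classifier's most probable label is accepted if it clears a confidence threshold. The names `fuse`, `f_score`, the `threshold` parameter, and the entity IDs are hypothetical, and the F-score shown is a plain exact-match score that may differ from the official challenge scoring.

```python
from typing import Callable, Dict, Optional, Tuple

Pair = Tuple[str, str]                       # (entity_a, entity_b); IDs are hypothetical
RuleFn = Callable[[Pair], Optional[str]]     # returns a TLINK type, or None if no rule fires
ProbFn = Callable[[Pair], Dict[str, float]]  # returns label -> probability


def fuse(pair: Pair, rule: RuleFn, classifier: ProbFn,
         threshold: float = 0.5) -> Optional[str]:
    """Return a TLINK type for `pair`, or None if no link is proposed.

    Assumed policy (not taken from the paper): a rule that fires wins;
    otherwise the classifier's top label is used if it is confident enough.
    """
    label = rule(pair)
    if label is not None:
        return label
    probs = classifier(pair)
    if not probs:
        return None
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None


def f_score(predicted: Dict[Pair, str], gold: Dict[Pair, str]) -> float:
    """Balanced F-score over exact (pair, type) matches."""
    correct = sum(1 for p, t in predicted.items() if gold.get(p) == t)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Under this assumed policy, raising `threshold` trades recall for precision, since fewer classifier-only links are emitted while rule-derived links are unaffected.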

Original language: English
Pages (from-to): S54-S62
Journal: Journal of Biomedical Informatics
Volume: 46
Issue number: SUPPL.
DOI: 10.1016/j.jbi.2013.09.007
Publication status: Published - 2013

Keywords

  • Hybrid method
  • Maximum entropy
  • Natural language processing
  • Temporal relation extraction
  • Text mining

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

Cite this

TEMPTING system: A hybrid method of rule and machine learning for temporal relation extraction in patient discharge summaries. / Chang, Yung Chun; Dai, Hong Jie; Wu, Johnny Chi Yang; Chen, Jian Ming; Tsai, Richard Tzong Han; Hsu, Wen Lian.

In: Journal of Biomedical Informatics, Vol. 46, No. SUPPL., 2013, p. S54-S62.

@article{42e9eaf8bfca4459894fbaa113d89004,
title = "TEMPTING system: A hybrid method of rule and machine learning for temporal relation extraction in patient discharge summaries",
abstract = "Patient discharge summaries provide detailed medical information about individuals who have been hospitalized. To make a precise and legitimate assessment of the abundant data, a proper time layout of the sequence of relevant events should be compiled and used to drive a patient-specific timeline, which could further assist medical personnel in making clinical decisions. The process of identifying the chronological order of entities is called temporal relation extraction. In this paper, we propose a hybrid method to identify appropriate temporal links between a pair of entities. The method combines two approaches: one is rule-based and the other is based on the maximum entropy model. We develop an integration algorithm to fuse the results of the two approaches. All rules and the integration algorithm are formally stated so that one can easily reproduce the system and results. To optimize the system's configuration, we used the 2012 i2b2 challenge TLINK track dataset and applied threefold cross validation to the training set. Then, we evaluated its performance on the training and test datasets. The experiment results show that the proposed TEMPTING (TEMPoral relaTion extractING) system (ranked seventh) achieved an F-score of 0.563, which was at least 30{\%} better than that of the baseline system, which randomly selects TLINK candidates from all pairs and assigns the TLINK types. The TEMPTING system using the hybrid method also outperformed the stage-based TEMPTING system. Its F-scores were 3.51{\%} and 0.97{\%} better than those of the stage-based system on the training set and test set, respectively.",
keywords = "Hybrid method, Maximum entropy, Natural language processing, Temporal relation extraction, Text mining",
author = "Chang, {Yung Chun} and Dai, {Hong Jie} and Wu, {Johnny Chi Yang} and Chen, {Jian Ming} and Tsai, {Richard Tzong Han} and Hsu, {Wen Lian}",
year = "2013",
doi = "10.1016/j.jbi.2013.09.007",
language = "English",
volume = "46",
pages = "S54--S62",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Academic Press Inc.",
number = "SUPPL.",

}

TY - JOUR

T1 - TEMPTING system

T2 - A hybrid method of rule and machine learning for temporal relation extraction in patient discharge summaries

AU - Chang, Yung Chun

AU - Dai, Hong Jie

AU - Wu, Johnny Chi Yang

AU - Chen, Jian Ming

AU - Tsai, Richard Tzong Han

AU - Hsu, Wen Lian

PY - 2013

Y1 - 2013

AB - Patient discharge summaries provide detailed medical information about individuals who have been hospitalized. To make a precise and legitimate assessment of the abundant data, a proper time layout of the sequence of relevant events should be compiled and used to drive a patient-specific timeline, which could further assist medical personnel in making clinical decisions. The process of identifying the chronological order of entities is called temporal relation extraction. In this paper, we propose a hybrid method to identify appropriate temporal links between a pair of entities. The method combines two approaches: one is rule-based and the other is based on the maximum entropy model. We develop an integration algorithm to fuse the results of the two approaches. All rules and the integration algorithm are formally stated so that one can easily reproduce the system and results. To optimize the system's configuration, we used the 2012 i2b2 challenge TLINK track dataset and applied threefold cross validation to the training set. Then, we evaluated its performance on the training and test datasets. The experiment results show that the proposed TEMPTING (TEMPoral relaTion extractING) system (ranked seventh) achieved an F-score of 0.563, which was at least 30% better than that of the baseline system, which randomly selects TLINK candidates from all pairs and assigns the TLINK types. The TEMPTING system using the hybrid method also outperformed the stage-based TEMPTING system. Its F-scores were 3.51% and 0.97% better than those of the stage-based system on the training set and test set, respectively.

KW - Hybrid method

KW - Maximum entropy

KW - Natural language processing

KW - Temporal relation extraction

KW - Text mining

UR - http://www.scopus.com/inward/record.url?scp=84897052209&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84897052209&partnerID=8YFLogxK

U2 - 10.1016/j.jbi.2013.09.007

DO - 10.1016/j.jbi.2013.09.007

M3 - Article

C2 - 24060600

AN - SCOPUS:84897052209

VL - 46

SP - S54

EP - S62

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

SN - 1532-0464

IS - SUPPL.

ER -