Medical knowledge infused convolutional neural networks for cohort selection in clinical trials

Chi Jen Chen, Neha Warikoo, Yung Chun Chang, Jin Hua Chen, Wen Lian Hsu

Research output: Contribution to journal › Article

Abstract

OBJECTIVE: In this era of digitized health records, there has been a marked interest in using de-identified patient records for conducting various health-related surveys. To assist in this research effort, we developed a novel clinical data representation model entitled medical knowledge-infused convolutional neural network (MKCNN), which is used for learning the clinical trial criteria eligibility status of patients to participate in cohort studies.

MATERIALS AND METHODS: In this study, we propose a clinical text representation infused with medical knowledge (MK). First, we isolate the noise from the relevant data using a medically relevant description extractor; then we utilize log-likelihood ratio-based weights from selected sentences to highlight "met" and "not-met" knowledge-infused representations in a bichannel setting for each instance. The combined medical knowledge-infused representation (MK) from these modules helps identify significant clinical criteria semantics, which in turn renders effective learning when used with a convolutional neural network architecture.

RESULTS: MKCNN outperforms other Medical Knowledge (MK) relevant learning architectures by approximately 3%, notably the SVM and XGBoost implementations developed in this study. MKCNN scored 86.1% on the F1 metric, a gain of 6% above the average performance assessed from the submissions for the n2c2 task. Although pattern/rule-based methods show a higher average performance for the n2c2 clinical data set, MKCNN significantly improves performance of machine learning implementations for clinical datasets.

CONCLUSION: MKCNN scored 86.1% on the F1 score metric. In contrast to many of the rule-based systems introduced during the n2c2 challenge workshop, our system presents a model that heavily draws on machine-based learning. In addition, the MK representations add more value to clinical comprehension and interpretation of natural texts.
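The abstract mentions log-likelihood ratio based weights for "met" and "not-met" sentences but does not give the formula. As a rough illustration only, the sketch below uses the standard G² log-likelihood ratio statistic over a 2x2 term/class contingency table; whether the authors use this exact form is an assumption. The function name `llr_weights`, the whitespace tokenizer, and the signing convention (positive for "met", negative for "not-met") are hypothetical choices made for this example, not taken from the paper.

```python
import math
from collections import Counter


def llr_weights(met_sentences, not_met_sentences):
    """Return {term: signed G^2}; positive leans "met", negative leans "not-met".

    Hypothetical helper: a generic log-likelihood ratio term weighting,
    not the authors' published implementation.
    """
    # Naive whitespace tokenization (an assumption made for brevity).
    met = Counter(t for s in met_sentences for t in s.lower().split())
    not_met = Counter(t for s in not_met_sentences for t in s.lower().split())
    n_met, n_not = sum(met.values()), sum(not_met.values())
    if n_met == 0 or n_not == 0:
        raise ValueError("both classes need at least one token")
    total = n_met + n_not

    weights = {}
    for term in set(met) | set(not_met):
        a, b = met[term], not_met[term]   # occurrences of the term per class
        c, d = n_met - a, n_not - b       # occurrences of all other terms
        g2 = 0.0
        # Sum observed * ln(observed / expected) over the four table cells.
        for observed, row, col in (
            (a, a + b, n_met), (b, a + b, n_not),
            (c, c + d, n_met), (d, c + d, n_not),
        ):
            if observed > 0:
                expected = row * col / total
                g2 += observed * math.log(observed / expected)
        g2 = max(0.0, 2.0 * g2)           # clamp tiny negative rounding error
        # Sign by the class in which the term is relatively more frequent.
        sign = 1.0 if a / n_met >= b / n_not else -1.0
        weights[term] = sign * g2
    return weights


if __name__ == "__main__":
    # Toy sentences, invented for illustration.
    w = llr_weights(["diabetes diabetes insulin"],
                    ["denies smoking denies alcohol"])
    for term, score in sorted(w.items(), key=lambda kv: kv[1]):
        print(f"{term:10s} {score:+.3f}")
```

Under these assumptions, terms concentrated in one class receive large-magnitude scores, while terms spread evenly across both classes score near zero. In a two-channel setup, such scores could scale token embeddings separately for each class.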

Original language: English
Pages (from-to): 1227-1236
Number of pages: 10
Journal: Journal of the American Medical Informatics Association : JAMIA
Volume: 26
Issue number: 11
DOI: 10.1093/jamia/ocz128
Publication status: Published - Nov 1 2019

Fingerprint

Clinical Trials
Learning
Health Surveys
Semantics
Noise
Cohort Studies
Education
Weights and Measures
Health
Research
Datasets
Machine Learning

Keywords

  • clinical trials
  • cohort selection
  • convolutional neural network
  • medical records
  • natural language processing

ASJC Scopus subject areas

  • Health Informatics

Cite this

@article{fdc25acca6ed498c88c5aad4bc9f319d,
title = "Medical knowledge infused convolutional neural networks for cohort selection in clinical trials",
abstract = "OBJECTIVE: In this era of digitized health records, there has been a marked interest in using de-identified patient records for conducting various health-related surveys. To assist in this research effort, we developed a novel clinical data representation model entitled medical knowledge-infused convolutional neural network (MKCNN), which is used for learning the clinical trial criteria eligibility status of patients to participate in cohort studies. MATERIALS AND METHODS: In this study, we propose a clinical text representation infused with medical knowledge (MK). First, we isolate the noise from the relevant data using a medically relevant description extractor; then we utilize log-likelihood ratio-based weights from selected sentences to highlight {"}met{"} and {"}not-met{"} knowledge-infused representations in a bichannel setting for each instance. The combined medical knowledge-infused representation (MK) from these modules helps identify significant clinical criteria semantics, which in turn renders effective learning when used with a convolutional neural network architecture. RESULTS: MKCNN outperforms other Medical Knowledge (MK) relevant learning architectures by approximately 3{\%}, notably the SVM and XGBoost implementations developed in this study. MKCNN scored 86.1{\%} on the F1 metric, a gain of 6{\%} above the average performance assessed from the submissions for the n2c2 task. Although pattern/rule-based methods show a higher average performance for the n2c2 clinical data set, MKCNN significantly improves performance of machine learning implementations for clinical datasets. CONCLUSION: MKCNN scored 86.1{\%} on the F1 score metric. In contrast to many of the rule-based systems introduced during the n2c2 challenge workshop, our system presents a model that heavily draws on machine-based learning. In addition, the MK representations add more value to clinical comprehension and interpretation of natural texts.",
keywords = "clinical trials, cohort selection, convolutional neural network, medical records, natural language processing",
author = "Chen, {Chi Jen} and Warikoo, Neha and Chang, {Yung Chun} and Chen, {Jin Hua} and Hsu, {Wen Lian}",
year = "2019",
month = "11",
day = "1",
doi = "10.1093/jamia/ocz128",
language = "English",
volume = "26",
pages = "1227--1236",
journal = "Journal of the American Medical Informatics Association : JAMIA",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "11",
}

TY - JOUR

T1 - Medical knowledge infused convolutional neural networks for cohort selection in clinical trials

AU - Chen, Chi Jen

AU - Warikoo, Neha

AU - Chang, Yung Chun

AU - Chen, Jin Hua

AU - Hsu, Wen Lian

PY - 2019/11/1

Y1 - 2019/11/1

N2 - OBJECTIVE: In this era of digitized health records, there has been a marked interest in using de-identified patient records for conducting various health-related surveys. To assist in this research effort, we developed a novel clinical data representation model entitled medical knowledge-infused convolutional neural network (MKCNN), which is used for learning the clinical trial criteria eligibility status of patients to participate in cohort studies. MATERIALS AND METHODS: In this study, we propose a clinical text representation infused with medical knowledge (MK). First, we isolate the noise from the relevant data using a medically relevant description extractor; then we utilize log-likelihood ratio-based weights from selected sentences to highlight "met" and "not-met" knowledge-infused representations in a bichannel setting for each instance. The combined medical knowledge-infused representation (MK) from these modules helps identify significant clinical criteria semantics, which in turn renders effective learning when used with a convolutional neural network architecture. RESULTS: MKCNN outperforms other Medical Knowledge (MK) relevant learning architectures by approximately 3%, notably the SVM and XGBoost implementations developed in this study. MKCNN scored 86.1% on the F1 metric, a gain of 6% above the average performance assessed from the submissions for the n2c2 task. Although pattern/rule-based methods show a higher average performance for the n2c2 clinical data set, MKCNN significantly improves performance of machine learning implementations for clinical datasets. CONCLUSION: MKCNN scored 86.1% on the F1 score metric. In contrast to many of the rule-based systems introduced during the n2c2 challenge workshop, our system presents a model that heavily draws on machine-based learning. In addition, the MK representations add more value to clinical comprehension and interpretation of natural texts.

KW - clinical trials

KW - cohort selection

KW - convolutional neural network

KW - medical records

KW - natural language processing

UR - http://www.scopus.com/inward/record.url?scp=85073184183&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85073184183&partnerID=8YFLogxK

U2 - 10.1093/jamia/ocz128

DO - 10.1093/jamia/ocz128

M3 - Article

C2 - 31390470

AN - SCOPUS:85073184183

VL - 26

SP - 1227

EP - 1236

JO - Journal of the American Medical Informatics Association : JAMIA

JF - Journal of the American Medical Informatics Association : JAMIA

SN - 1067-5027

IS - 11

ER -