Integrative Analysis of High-throughput Cancer Studies With Contrasted Penalization

Xingjie Shi, Jin Liu, Jian Huang, Yong Zhou, Benchang Shia, Shuangge Ma

Research output: Contribution to journal, Article

14 Citations (Scopus)

Abstract

In cancer studies with high-throughput genetic and genomic measurements, integrative analysis provides a way to effectively pool and analyze heterogeneous raw data from multiple independent studies, and it outperforms "classic" meta-analysis and single-dataset analysis. When marker selection is of interest, the genetic basis of multiple datasets can be described using the homogeneity model or the heterogeneity model. In this study, we consider marker selection under the heterogeneity model, which includes the homogeneity model as a special case and can be more flexible. Penalization methods for marker selection have been developed in the literature. This study advances beyond the published methods by introducing contrast penalties, which can accommodate the within- and across-dataset structures of covariates/regression coefficients and, by doing so, further improve marker selection performance. Specifically, we develop a penalization method that accommodates the across-dataset structures by smoothing over regression coefficients. An effective iterative algorithm, which calls an inner coordinate descent iteration, is developed. Simulation shows that the proposed method outperforms the benchmark, with more accurate marker identification. The analysis of breast cancer and lung cancer prognosis studies with gene expression measurements shows that the proposed method identifies genes different from those identified by the benchmark and has better prediction performance.
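
The idea of smoothing regression coefficients across datasets inside a coordinate descent loop can be illustrated with a small sketch. This is not the authors' method: the abstract does not state the exact loss or penalty, so the sketch assumes squared-error loss, swaps in a plain lasso penalty for sparsity, and uses a quadratic contrast term that pulls the coefficients of the same marker in different datasets toward each other. The function name `contrasted_lasso` and the parameters `lam1` and `lam2` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate updates."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def contrasted_lasso(Xs, ys, lam1, lam2, n_iter=200, tol=1e-8):
    """Illustrative only (assumed objective, not taken from the paper):

        sum_m 1/(2 n_m) ||y_m - X_m b_m||^2
          + lam1 * sum_{j,m} |B[j, m]|
          + lam2/2 * sum_j sum_{m<l} (B[j, m] - B[j, l])^2

    Xs, ys: lists of per-dataset design matrices (same p columns) and responses.
    Returns B with shape (p, M); column m holds the coefficients of dataset m.
    """
    M = len(Xs)
    p = Xs[0].shape[1]
    B = np.zeros((p, M))
    # Residuals y_m - X_m b_m; with B = 0 they equal y_m.
    resid = [np.asarray(y, dtype=float).copy() for y in ys]
    for _ in range(n_iter):
        max_change = 0.0
        for m in range(M):
            X = Xs[m]
            n = X.shape[0]
            for j in range(p):
                xj = X[:, j]
                old = B[j, m]
                a = xj @ xj / n                  # curvature from the loss
                z = xj @ resid[m] / n + a * old  # partial-residual correlation
                s = B[j].sum() - old             # same marker, other datasets
                denom = a + lam2 * (M - 1)
                new = soft_threshold(z + lam2 * s, lam1) / denom if denom > 0 else 0.0
                if new != old:
                    resid[m] += xj * (old - new)  # keep residual in sync
                    B[j, m] = new
                    max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return B
```

Under these assumptions, setting `lam2 = 0` reduces the sketch to separate lasso fits per dataset, while a larger `lam2` shrinks the differences between datasets for each marker, which is the smoothing behavior the abstract describes in general terms.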

Original language: English
Pages (from-to): 144-151
Number of pages: 8
Journal: Genetic Epidemiology
Volume: 38
Issue number: 2
DOIs: https://doi.org/10.1002/gepi.21781
Publication status: Published - Feb 2014
Externally published: Yes

Fingerprint

Benchmarking
Lung Neoplasms
Neoplasms
Meta-Analysis
Breast Neoplasms
Gene Expression
Datasets
Genes

Keywords

  • Contrasted penalization
  • High-throughput cancer studies
  • Integrative analysis
  • Marker selection

ASJC Scopus subject areas

  • Genetics (clinical)
  • Epidemiology

Cite this

Integrative Analysis of High-throughput Cancer Studies With Contrasted Penalization. / Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Shia, Benchang; Ma, Shuangge.

In: Genetic Epidemiology, Vol. 38, No. 2, 02.2014, p. 144-151.

@article{7054d4ff5b0740dca994aa1f4b888e5a,
title = "Integrative Analysis of High-throughput Cancer Studies With Contrasted Penalization",
keywords = "Contrasted penalization, High-throughput cancer studies, Integrative analysis, Marker selection",
author = "Xingjie Shi and Jin Liu and Jian Huang and Yong Zhou and Benchang Shia and Shuangge Ma",
year = "2014",
month = "2",
doi = "10.1002/gepi.21781",
language = "English",
volume = "38",
pages = "144--151",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "2",

}

TY - JOUR
T1 - Integrative Analysis of High-throughput Cancer Studies With Contrasted Penalization
AU - Shi, Xingjie
AU - Liu, Jin
AU - Huang, Jian
AU - Zhou, Yong
AU - Shia, Benchang
AU - Ma, Shuangge
PY - 2014/2
Y1 - 2014/2
KW - Contrasted penalization
KW - High-throughput cancer studies
KW - Integrative analysis
KW - Marker selection
UR - http://www.scopus.com/inward/record.url?scp=84892545864&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84892545864&partnerID=8YFLogxK
U2 - 10.1002/gepi.21781
DO - 10.1002/gepi.21781
M3 - Article
C2 - 24395534
AN - SCOPUS:84892545864
VL - 38
SP - 144
EP - 151
JO - Genetic Epidemiology
JF - Genetic Epidemiology
SN - 0741-0395
IS - 2
ER -