Identification of cancer genomic markers via integrative sparse boosting

Yuan Huang, Jian Huang, Ben Chang Shia, Shuangge Ma

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

In high-throughput cancer genomic studies, markers identified from the analysis of single data sets often suffer a lack of reproducibility because of the small sample sizes. An ideal solution is to conduct large-scale prospective studies, which are extremely expensive and time consuming. A cost-effective remedy is to pool data from multiple comparable studies and conduct integrative analysis. Integrative analysis of multiple data sets is challenging because of the high dimensionality of genomic measurements and heterogeneity among studies. In this article, we propose a sparse boosting approach for marker identification in integrative analysis of multiple heterogeneous cancer diagnosis studies with gene expression measurements. The proposed approach can effectively accommodate the heterogeneity among multiple studies and identify markers with consistent effects across studies. Simulation shows that the proposed approach has satisfactory identification results and outperforms alternatives including an intensity approach and meta-analysis. The proposed approach is used to identify markers of pancreatic cancer and liver cancer. © The Author 2011. Published by Oxford University Press. All rights reserved.

Original language: English
Pages (from-to): 509-522
Number of pages: 14
Journal: Biostatistics
Volume: 13
Issue number: 3
DOIs: 10.1093/biostatistics/kxr033
Publication status: Published - Jul 2012
Externally published: Yes

Fingerprint

Boosting
Liver Neoplasms
Genomics
Cancer
Pancreatic Neoplasms
Sample Size
Meta-Analysis
Neoplasms
Prospective Studies
Gene Expression
Costs and Cost Analysis
Reproducibility
Small Sample Size
High Throughput
Liver
Dimensionality
Datasets
Alternatives
Costs
Simulation

Keywords

  • Cancer genomics
  • Marker identification
  • Sparse boosting

ASJC Scopus subject areas

  • Medicine(all)
  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Identification of cancer genomic markers via integrative sparse boosting. / Huang, Yuan; Huang, Jian; Shia, Ben Chang; Ma, Shuangge. In: Biostatistics, Vol. 13, No. 3, 07.2012, p. 509-522.

@article{0bf8fb0c1d184a178be2dc8690e5807c,
title = "Identification of cancer genomic markers via integrative sparse boosting",
abstract = "In high-throughput cancer genomic studies, markers identified from the analysis of single data sets often suffer a lack of reproducibility because of the small sample sizes. An ideal solution is to conduct large-scale prospective studies, which are extremely expensive and time consuming. A cost-effective remedy is to pool data from multiple comparable studies and conduct integrative analysis. Integrative analysis of multiple data sets is challenging because of the high dimensionality of genomic measurements and heterogeneity among studies. In this article, we propose a sparse boosting approach for marker identification in integrative analysis of multiple heterogeneous cancer diagnosis studies with gene expression measurements. The proposed approach can effectively accommodate the heterogeneity among multiple studies and identify markers with consistent effects across studies. Simulation shows that the proposed approach has satisfactory identification results and outperforms alternatives including an intensity approach and meta-analysis. The proposed approach is used to identify markers of pancreatic cancer and liver cancer. {\textcopyright} The Author 2011. Published by Oxford University Press. All rights reserved.",
keywords = "Cancer genomics, Marker identification, Sparse boosting",
author = "Yuan Huang and Jian Huang and Shia, {Ben Chang} and Shuangge Ma",
year = "2012",
month = "7",
doi = "10.1093/biostatistics/kxr033",
language = "English",
volume = "13",
pages = "509--522",
journal = "Biostatistics",
issn = "1465-4644",
publisher = "Oxford University Press",
number = "3",

}

TY - JOUR

T1 - Identification of cancer genomic markers via integrative sparse boosting

AU - Huang, Yuan

AU - Huang, Jian

AU - Shia, Ben Chang

AU - Ma, Shuangge

PY - 2012/7

Y1 - 2012/7

AB - In high-throughput cancer genomic studies, markers identified from the analysis of single data sets often suffer a lack of reproducibility because of the small sample sizes. An ideal solution is to conduct large-scale prospective studies, which are extremely expensive and time consuming. A cost-effective remedy is to pool data from multiple comparable studies and conduct integrative analysis. Integrative analysis of multiple data sets is challenging because of the high dimensionality of genomic measurements and heterogeneity among studies. In this article, we propose a sparse boosting approach for marker identification in integrative analysis of multiple heterogeneous cancer diagnosis studies with gene expression measurements. The proposed approach can effectively accommodate the heterogeneity among multiple studies and identify markers with consistent effects across studies. Simulation shows that the proposed approach has satisfactory identification results and outperforms alternatives including an intensity approach and meta-analysis. The proposed approach is used to identify markers of pancreatic cancer and liver cancer. © The Author 2011. Published by Oxford University Press. All rights reserved.

KW - Cancer genomics

KW - Marker identification

KW - Sparse boosting

UR - http://www.scopus.com/inward/record.url?scp=84863597943&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863597943&partnerID=8YFLogxK

U2 - 10.1093/biostatistics/kxr033

DO - 10.1093/biostatistics/kxr033

M3 - Article

VL - 13

SP - 509

EP - 522

JO - Biostatistics

JF - Biostatistics

SN - 1465-4644

IS - 3

ER -