Batch normalizer: A fast total abundance regression calibration method to simultaneously adjust batch and injection order effects in liquid chromatography/time-of-flight mass spectrometry-based metabolomics data and comparison with current calibration methods

San Yuan Wang, Ching Hua Kuo, Yufeng J. Tseng

Research output: Contribution to journal › Article

59 Citations (Scopus)

Abstract

Metabolomics is a powerful tool for understanding phenotypes and discovering biomarkers. Combinations of multiple batches or data sets in large cross-sectional epidemiology studies are frequently utilized in metabolomics, but various systematic biases can introduce both batch and injection order effects and often require proper calibrations prior to chemometric analyses. We present a novel algorithm, Batch Normalizer, to calibrate large scale metabolomic data. Batch Normalizer utilizes a regression model with consideration of the total abundance of each sample to improve its calibration performance, and it is able to remove both batch effect and injection order effects. This calibration method was tested using liquid chromatography/time-of-flight mass spectrometry (LC/TOF-MS) chromatograms of 228 plasma samples and 23 pooled quality control (QC) samples. We evaluated the performance of Batch Normalizer by examining the distribution of relative standard deviation (RSD) for all peaks detected in the pooled QC samples, the average Pearson correlation coefficients for all peaks between any two of QC samples, and the distribution of QC samples in the scores plot of a principal component analysis (PCA). After calibration by Batch Normalizer, the number of peaks in QC samples with RSD less than 15% increased from 11 to 914, all of the QC samples were closely clustered in PCA scores plot, and the average Pearson correlation coefficients for all peaks of QC samples increased from 0.938 to 0.976. This method was compared to 7 commonly used calibration methods. We discovered that using Batch Normalizer to calibrate LC/TOF-MS data produces the best calibration results.
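The RSD criterion used in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the use of the sample standard deviation, and the intensity values are all assumptions chosen for illustration. It computes the relative standard deviation of each peak across pooled QC injections and counts the peaks that fall below a 15% threshold.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) of one peak across QC injections.

    Assumes the sample standard deviation (n - 1 denominator) and a
    non-zero mean intensity.
    """
    return 100.0 * stdev(values) / mean(values)


def count_stable_peaks(qc_peaks, threshold=15.0):
    """Count peaks whose RSD across QC injections is below the threshold."""
    return sum(
        1 for intensities in qc_peaks.values()
        if rsd_percent(intensities) < threshold
    )


# Hypothetical intensities for two peaks measured in three QC injections.
qc_peaks = {
    "peak_a": [100.0, 110.0, 90.0],  # tight spread
    "peak_b": [50.0, 90.0, 10.0],    # wide spread
}

print(count_stable_peaks(qc_peaks))
```

Running the same count on QC data before and after calibration would give the kind of comparison the abstract reports (11 versus 914 peaks).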

Original language: English
Pages (from-to): 1037-1046
Number of pages: 10
Journal: Analytical Chemistry
Volume: 85
Issue number: 2
DOIs: 10.1021/ac302877x
Publication status: Published - Jan 15 2013
Externally published: Yes

Fingerprint

Liquid chromatography
Quality control
Mass spectrometry
Calibration
Principal component analysis
Epidemiology
Biomarkers
Metabolomics
Plasmas

ASJC Scopus subject areas

  • Analytical Chemistry

Cite this

@article{7f5e3dfc419b4f2abb5a0399b0f6ba87,
title = "Batch normalizer: A fast total abundance regression calibration method to simultaneously adjust batch and injection order effects in liquid chromatography/time-of-flight mass spectrometry-based metabolomics data and comparison with current calibration methods",
abstract = "Metabolomics is a powerful tool for understanding phenotypes and discovering biomarkers. Combinations of multiple batches or data sets in large cross-sectional epidemiology studies are frequently utilized in metabolomics, but various systematic biases can introduce both batch and injection order effects and often require proper calibrations prior to chemometric analyses. We present a novel algorithm, Batch Normalizer, to calibrate large scale metabolomic data. Batch Normalizer utilizes a regression model with consideration of the total abundance of each sample to improve its calibration performance, and it is able to remove both batch effect and injection order effects. This calibration method was tested using liquid chromatography/time-of-flight mass spectrometry (LC/TOF-MS) chromatograms of 228 plasma samples and 23 pooled quality control (QC) samples. We evaluated the performance of Batch Normalizer by examining the distribution of relative standard deviation (RSD) for all peaks detected in the pooled QC samples, the average Pearson correlation coefficients for all peaks between any two of QC samples, and the distribution of QC samples in the scores plot of a principal component analysis (PCA). After calibration by Batch Normalizer, the number of peaks in QC samples with RSD less than 15{\%} increased from 11 to 914, all of the QC samples were closely clustered in PCA scores plot, and the average Pearson correlation coefficients for all peaks of QC samples increased from 0.938 to 0.976. This method was compared to 7 commonly used calibration methods. We discovered that using Batch Normalizer to calibrate LC/TOF-MS data produces the best calibration results.",
author = "Wang, {San Yuan} and Kuo, {Ching Hua} and Tseng, {Yufeng J.}",
year = "2013",
month = "1",
day = "15",
doi = "10.1021/ac302877x",
language = "English",
volume = "85",
pages = "1037--1046",
journal = "Analytical Chemistry",
issn = "0003-2700",
publisher = "American Chemical Society",
number = "2"
}

TY - JOUR

T1 - Batch normalizer

T2 - A fast total abundance regression calibration method to simultaneously adjust batch and injection order effects in liquid chromatography/time-of-flight mass spectrometry-based metabolomics data and comparison with current calibration methods

AU - Wang, San Yuan

AU - Kuo, Ching Hua

AU - Tseng, Yufeng J.

PY - 2013/1/15

Y1 - 2013/1/15

AB - Metabolomics is a powerful tool for understanding phenotypes and discovering biomarkers. Combinations of multiple batches or data sets in large cross-sectional epidemiology studies are frequently utilized in metabolomics, but various systematic biases can introduce both batch and injection order effects and often require proper calibrations prior to chemometric analyses. We present a novel algorithm, Batch Normalizer, to calibrate large scale metabolomic data. Batch Normalizer utilizes a regression model with consideration of the total abundance of each sample to improve its calibration performance, and it is able to remove both batch effect and injection order effects. This calibration method was tested using liquid chromatography/time-of-flight mass spectrometry (LC/TOF-MS) chromatograms of 228 plasma samples and 23 pooled quality control (QC) samples. We evaluated the performance of Batch Normalizer by examining the distribution of relative standard deviation (RSD) for all peaks detected in the pooled QC samples, the average Pearson correlation coefficients for all peaks between any two of QC samples, and the distribution of QC samples in the scores plot of a principal component analysis (PCA). After calibration by Batch Normalizer, the number of peaks in QC samples with RSD less than 15% increased from 11 to 914, all of the QC samples were closely clustered in PCA scores plot, and the average Pearson correlation coefficients for all peaks of QC samples increased from 0.938 to 0.976. This method was compared to 7 commonly used calibration methods. We discovered that using Batch Normalizer to calibrate LC/TOF-MS data produces the best calibration results.

UR - http://www.scopus.com/inward/record.url?scp=84872531645&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872531645&partnerID=8YFLogxK

U2 - 10.1021/ac302877x

DO - 10.1021/ac302877x

M3 - Article

C2 - 23240878

AN - SCOPUS:84872531645

VL - 85

SP - 1037

EP - 1046

JO - Analytical Chemistry

JF - Analytical Chemistry

SN - 0003-2700

IS - 2

ER -