Measurement-Error and Heterogeneity-Driven Stochastic Models for Personalized Health Care with Sampling Design

Project: A - Government Institution; b - Ministry of Science and Technology

Project Details


Markov regression models for personalized medicine and personalized health care have gained popularity, but some issues remain. First, identifying subjects at risk of disease progression to each state of a multi-state outcome plays an important role in personalized health care, yet measurements of multi-state outcomes are prone to classification errors and unexplained heterogeneity. A stochastic model (such as a Hidden Markov model) that deals with both measurement errors and residual heterogeneity is lacking. Second, although the application of stochastic models to estimate the forces of multi-state disease progression using data from population-based programs is well developed, collecting such big data is costly for certain covariates (for example, new biomarkers). Moreover, it is often not feasible to obtain empirical data on these covariates for the whole target population in order to quantify their roles in identifying subjects at higher risk of disease progression. A two-stage sampling design is an efficient alternative that addresses this issue. As for the role of the current project within the integrated project: to match the framework of the generalized stochastic regression model (Project 1) for personalized health care and to calibrate various types of multi-state stochastic processes, the current Project 2 forms one part of the integrated project entitled “Stochastic models for systematic personalized health care”. It will develop a calibrated, two-stage-sample-based, efficient generalized stochastic regression model to support three other applications: personalized chronic disease and cancer prevention (Project 3), infectious disease control (Project 4), and the calibrated economic decision model (the second part of Project 1).
We therefore propose a three-year project with the following aims: (1) to develop a generalized measurement-error-/heterogeneity-driven Hidden Markov model (HMM) that calibrates measurement errors based on small data from a two-stage sampling design and incorporates a latent variable (random effect) to capture unmeasured covariates; (2) to calibrate the effect sizes of risk factors on the multi-state outcome with a two-stage sampling validation design, in order to elucidate the mechanism of misclassification; (3) to develop a computer algorithm to examine how different types of misclassification and various degrees of unobserved heterogeneity affect the effect sizes of covariates given multi-state outcomes (first year, August 2017-July 2018); (4) to develop a generalized k-state model for fitting the data obtained from the two-stage design, in comparison with the conventional multi-state stochastic model; (5) to investigate how different sampling schemes influence the effect sizes of covariates; (6) to compare the estimates from (4)-(5) with those obtained using the full data, in order to evaluate possible biases inherent in the sampling scheme (second year, August 2018-July 2019); (7) to assess whether and how the final multi-state outcomes are affected after the calibration of causal Hidden Markov regression models (the combination of at least two types of stochastic process); (8) to quantify the degree of measurement error identified by the causal Hidden Markov model using various types of Hidden Markov model; (9) to determine whether and how a two-stage sampling design can still be applied to the causal Hidden Markov regression model; and (10) to evaluate the power function or sample size given the calibrated bias and the two-stage sampling schemes (third year, August 2019-July 2020).
The proposed Hidden Markov model, which deals with measurement error and unexplained heterogeneity in a systematic way, will help shed light on the influence of both on personalized health care. Moreover, the development of causal Hidden Markov models with calibration will help evaluate intervention and prevention under personalized health care in a precise and efficient way.
Effective start/end date: 8/1/18 to 7/1/19


  • measurement error
  • heterogeneity
  • two-stage sampling design
  • personalized health care
  • stochastic model
  • multi-state model
  • generalized stochastic regression model