Background: The recent controversy about using mammography to screen for breast cancer, based on randomized controlled trials conducted over 3 decades in Western countries, has not only eclipsed the paradigm of evidence-based medicine but has also left health decision-makers in countries where breast cancer screening is still being considered in a dilemma over whether to adopt or abandon such a well-established screening modality. Methods: We reanalyzed the empirical data from the Health Insurance Plan trial in 1963 through the UK Age trial in 1991, together with their follow-up data published until 2015. We first performed Bayesian conjugated meta-analyses on the heterogeneity of attendance rate, sensitivity, and over-detection, and on their impacts on advanced-stage breast cancer and death from breast cancer across trials, using Bayesian Poisson fixed- and random-effect regression models. A Bayesian meta-analysis of a causal model was then developed to assess a cascade of causal relationships regarding the impact of both attendance and sensitivity on the 2 main outcomes. Results: The causes of heterogeneity responsible for the disparities across the trials were clearly manifested in 3 components. The attendance rate ranged from 61.3% to 90.4%. The sensitivity estimates varied substantially, from 57.26% to 87.97%, but improved over time, from 64% in 1963 to 82% in 1980, when the Bayesian conjugated meta-analysis was conducted in chronological order. The percentage of over-detection ranged widely, from 0% to 28%, after adjusting for long lead time. The impacts of the attendance rate and sensitivity on the 2 main outcomes were statistically significant. Causal inference made by linking these causal relationships, with emphasis on the heterogeneity of the attendance rate and sensitivity, accounted for the variation in the reduction of advanced breast cancer (none to 30%) and of mortality (none to 31%).
We estimated a 33% (95% CI: 24-42%) and a 13% (95% CI: 6-20%) reduction in breast cancer mortality for the best scenario (90% attendance rate and 95% sensitivity) and the poor scenario (30% attendance rate and 55% sensitivity), respectively. Conclusion: Elucidating the scenarios from high to low performance and learning from the experiences of these trials help screening policy-makers contemplate how to avoid the errors made in ineffective studies and emulate the effective studies to save women's lives. Abbreviations: CI = confidence interval, DAG = directed acyclic graph, HIP = Health Insurance Plan, I/E = incidence of interval cancer/expected incidence, IC = interval cancer, LNS = last negative screening, MCMC = Markov Chain Monte Carlo, NBSS = National Breast Screening Study, NSO = number of screenings required for over-detecting, PCDP = preclinical detectable phase, RR = relative risk.
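The chronological conjugate updating mentioned in the Methods and Results can be illustrated with a minimal sketch. It assumes a simple Beta-binomial model in which sensitivity is approximated as screen-detected cancers divided by screen-detected plus interval cancers; the estimator actually used in the study may differ. The class `BetaPosterior`, the function `sequential_sensitivity`, the flat Beta(1, 1) prior, and all counts below are hypothetical and are not taken from any of the trials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Beta(alpha, beta) distribution over a proportion such as sensitivity."""
    alpha: float
    beta: float

    def update(self, detected: int, missed: int) -> "BetaPosterior":
        # Conjugacy: a Beta prior combined with binomial data gives a Beta posterior.
        return BetaPosterior(self.alpha + detected, self.beta + missed)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def sequential_sensitivity(trials, prior=BetaPosterior(1.0, 1.0)):
    """Update the posterior trial by trial, in the order given.

    `trials` is a list of (label, screen_detected, interval_cancers) tuples,
    assumed to be sorted chronologically. Returns the cumulative posterior
    mean after each trial.
    """
    posterior = prior
    history = []
    for label, detected, missed in trials:
        # The posterior from earlier trials serves as the prior for the next one.
        posterior = posterior.update(detected, missed)
        history.append((label, posterior.mean))
    return history


# Hypothetical counts for illustration only (not data from the trials).
hypothetical_trials = [
    ("trial A", 60, 40),
    ("trial B", 80, 20),
    ("trial C", 90, 10),
]

history = sequential_sensitivity(hypothetical_trials)
for label, mean in history:
    print(f"{label}: cumulative posterior mean sensitivity = {mean:.3f}")
```

Because each posterior becomes the prior for the next trial, the cumulative estimate moves gradually as later trials contribute, which is one way a pooled estimate ordered by calendar time can show improvement.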