Many clinical and epidemiological studies collect longitudinal measurements together with time-to-event outcomes. Accurately estimating the association between longitudinal markers and event risks, and identifying the markers that matter most for prediction, is especially important in the presence of competing risks. However, as the number of markers increases, fitting a full joint model becomes computationally demanding and may fail to converge. We propose a two-stage Bayesian approach for variable selection in joint models with multiple longitudinal markers and competing risks. The method efficiently identifies important longitudinal markers and covariates. In the first stage, a one-marker joint model is fitted for each marker together with the competing risks outcome, and individual marker trajectories are predicted from it, which reduces the bias caused by informative dropout. In the second stage, a cause-specific hazards model is fitted that includes the predicted current values of all markers as time-dependent covariates. We consider both continuous and Dirac spike-and-slab priors for Bayesian variable selection, implemented through MCMC algorithms. Our approach enables risk prediction from a large number of longitudinal markers, which is often infeasible with standard joint models. We evaluate its performance in simulation studies, examining both variable selection and predictive accuracy. Finally, we apply the method to predict dementia risk in the Three-City (3C) study, a French cohort in which death is a competing risk. To facilitate use, we provide an R package, VSJM, available at https://github.com/tbaghfalaki/VSJM.
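As a rough illustration of the second stage and the two families of priors, the model can be sketched as follows. The notation is assumed here for exposition and is not taken from the paper: $\lambda_{ik}(t)$ is the hazard of cause $k$ for subject $i$, $\lambda_{0k}(t)$ a baseline hazard, $w_i$ a vector of time-fixed covariates, $\hat{\eta}_{im}(t)$ the current value of marker $m$ predicted in the first stage, and $\pi_k$, $\sigma^2$ and $\tau_0^2 \ll \tau_1^2$ generic prior hyperparameters. The exact parameterization and hyperpriors used in VSJM may differ.

```latex
\begin{align*}
% Stage 2: cause-specific hazard with predicted current values
% of the M markers entering as time-dependent covariates
\lambda_{ik}(t) &= \lambda_{0k}(t)\,
  \exp\Big( \gamma_k^\top w_i + \sum_{m=1}^{M} \alpha_{km}\,\hat{\eta}_{im}(t) \Big),
  \qquad k = 1,\dots,K, \\[4pt]
% Dirac spike-and-slab: point mass at zero (spike) mixed with a normal slab
\alpha_{km} \mid \pi_k, \sigma^2 &\sim (1-\pi_k)\,\delta_0 + \pi_k\,\mathcal{N}(0,\sigma^2), \\[4pt]
% Continuous spike-and-slab: narrow normal (spike) mixed with a wide normal (slab)
\alpha_{km} \mid \pi_k &\sim (1-\pi_k)\,\mathcal{N}(0,\tau_0^2) + \pi_k\,\mathcal{N}(0,\tau_1^2).
\end{align*}
```

Under either prior, marker $m$ would typically be retained for cause $k$ when the posterior probability that $\alpha_{km}$ belongs to the slab component is high; the covariate coefficients $\gamma_k$ can be given priors of the same form.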