Small and medium-sized enterprises (SMEs) play a crucial role in driving economic growth. Monitoring their financial performance and identifying relevant covariates are essential for risk assessment, business planning, and policy formulation. This paper focuses on predicting SME profits. The study faces two major challenges: 1) SME data are stored across different institutions, and centralized analysis is restricted by data security concerns; 2) data from different institutions contain different levels of missing values, producing complex missingness patterns. To tackle these issues, we introduce Vertical Federated Expectation Maximization (VFEM), an approach designed for federated learning with missing data. We embed a new EM algorithm into VFEM to handle complex missingness patterns when access to the full dataset is infeasible. Furthermore, we establish a linear convergence rate for VFEM and develop a statistical inference framework, which enables assessment of covariate influence and enhances model interpretability. Extensive simulation studies validate its finite-sample performance. Finally, we investigate a real-life SME profit prediction problem using VFEM. Our findings demonstrate that VFEM provides a promising solution for addressing data isolation and missing values, ultimately improving the understanding of SMEs' financial performance.