This paper investigates the theoretical foundations of sample size and power calculations for causal inference with observational data and develops analytical formulas for them. By analyzing the variance of the inverse probability weighting estimator of the average treatment effect, we decompose the power calculation into three components: the propensity score distribution, the potential outcome distribution, and their correlation. We show that, under mild conditions, determining the minimal sample size of an observational study requires only two parameters in addition to the standard inputs of the power calculation for randomized trials. These two parameters quantify the strength of the confounder-treatment association and of the confounder-outcome association, respectively. For the former, we propose using the Bhattacharyya coefficient, which measures covariate overlap and, together with the treatment proportion, leads to a uniquely identifiable and easily computable propensity score distribution. For the latter, we propose a sensitivity parameter bounded by the R-squared statistic of the regression of the outcome on the covariates. Because it applies the Lyapunov Central Limit Theorem to a linear combination of the covariates, our procedure does not require distributional assumptions on the multivariate covariates. We develop an associated R package, PSpower.
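To make the two central quantities concrete, the following is a minimal Python sketch and not the PSpower implementation. It assumes the propensity scores e(X) are known, and it uses the identity BC = E[sqrt(e(X)(1 - e(X)))] / sqrt(r(1 - r)), which follows from writing the treated and control covariate densities as e(x)f(x)/r and (1 - e(x))f(x)/(1 - r), with r the treatment proportion. The normalized (Hajek-type) form of the inverse probability weighting estimator is also an assumption here; the paper's exact estimator may differ. The function names `bhattacharyya_overlap` and `hajek_ipw_ate` are hypothetical.

```python
import math


def bhattacharyya_overlap(propensity):
    """Sample analogue of BC = E[sqrt(e(1-e))] / sqrt(r(1-r)).

    `propensity` holds the scores e(X_i), each strictly between 0 and 1.
    A value of 1 means the treated and control covariate distributions
    coincide; smaller values mean poorer overlap.
    """
    n = len(propensity)
    r = sum(propensity) / n  # treatment proportion implied by the scores
    mean_root = sum(math.sqrt(e * (1.0 - e)) for e in propensity) / n
    return mean_root / math.sqrt(r * (1.0 - r))


def hajek_ipw_ate(z, y, propensity):
    """Normalized IPW estimate of the average treatment effect.

    z: 0/1 treatment indicators, y: observed outcomes,
    propensity: known or estimated scores e(X_i).
    Requires at least one treated and one control unit.
    """
    w1 = [zi / ei for zi, ei in zip(z, propensity)]                # treated weights
    w0 = [(1 - zi) / (1.0 - ei) for zi, ei in zip(z, propensity)]  # control weights
    mean1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mean0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mean1 - mean0


if __name__ == "__main__":
    # Toy inputs chosen for illustration only; they are not data from the paper.
    scores = [0.2, 0.8, 0.2, 0.8]
    treatment = [0, 1, 1, 0]
    outcome = [1.0, 4.0, 3.0, 2.0]
    print("overlap (BC):", bhattacharyya_overlap(scores))
    print("IPW ATE:", hajek_ipw_ate(treatment, outcome, scores))
```

With a constant propensity score, as in a randomized trial, the overlap coefficient equals 1 and the weighted estimator reduces to a difference in group means, which is the baseline the power calculation for observational studies is compared against.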