The compatibility condition and the compatibility constant are commonly used to evaluate the prediction error of the lasso when the number of variables exceeds the number of observations. However, computing the compatibility constant is generally difficult because it requires solving a complicated nonlinear optimization problem. In this study, we present a numerical approach for computing the compatibility constant when the zero/nonzero pattern of the true regression coefficients is given. We show that the optimization problem reduces to a quadratic program (QP) once the signs of the nonzero coefficients are specified, so the compatibility constant can be obtained by solving one QP for each possible sign combination and taking the smallest optimal value. Because the number of sign combinations grows exponentially with the number of true nonzero coefficients, we also formulate a mixed-integer quadratic programming (MIQP) approach that can be applied when that number is moderately large. We investigate the finite-sample behavior of the compatibility constant on simulated data under a wide variety of parameter settings and compare the mean squared error with the theoretical error bound based on the compatibility constant. The finite-sample behavior of the compatibility constant is also examined through a real data analysis.
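The sign-enumeration idea can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the commonly used definition of the squared compatibility constant, the minimum of s‖Xβ‖₂² / (n‖β_S‖₁²) over the cone ‖β_{S^c}‖₁ ≤ L‖β_S‖₁, where S is the given support, s = |S|, and L is the cone constant (3 is a common choice). Normalizing ‖β_S‖₁ = 1 and fixing the signs of β_S makes every constraint linear, which leaves a convex QP. The function names are hypothetical, and SciPy's general-purpose SLSQP routine stands in for a dedicated QP solver.

```python
import itertools

import numpy as np
from scipy.optimize import minimize


def compat_sq_for_signs(X, S, signs, L):
    """Solve the QP for one fixed sign pattern of the nonzero coefficients.

    Minimises s * ||X b||^2 / n subject to ||b_S||_1 = 1,
    sign(b_S) = signs and ||b_Sc||_1 <= L (assumed formulation).
    """
    n, p = X.shape
    S = np.asarray(S)
    Sc = np.setdiff1d(np.arange(p), S)
    s, m = len(S), len(Sc)

    # Write b = A z with z = (w, u, v) >= 0:
    #   b_S = signs * w,  b_Sc = u - v  (positive and negative parts).
    A = np.zeros((p, s + 2 * m))
    A[S, np.arange(s)] = signs
    A[Sc, s + np.arange(m)] = 1.0
    A[Sc, s + m + np.arange(m)] = -1.0

    XA = X @ A
    Q = s * (XA.T @ XA) / n  # positive semidefinite, so the QP is convex

    constraints = [
        # ||b_S||_1 = sum(w) = 1
        {"type": "eq", "fun": lambda z: z[:s].sum() - 1.0},
        # ||b_Sc||_1 <= sum(u + v) <= L
        {"type": "ineq", "fun": lambda z: L - z[s:].sum()},
    ]

    # Feasible starting point: equal weight on S, zero elsewhere.
    z0 = np.zeros(s + 2 * m)
    z0[:s] = 1.0 / s

    res = minimize(
        lambda z: z @ Q @ z,
        z0,
        jac=lambda z: 2.0 * Q @ z,
        bounds=[(0.0, None)] * len(z0),
        constraints=constraints,
        method="SLSQP",
    )
    return res.fun


def compatibility_constant_sq(X, S, L=3.0):
    """Smallest QP value over all 2^s sign patterns of the support S."""
    patterns = itertools.product([-1.0, 1.0], repeat=len(S))
    return min(compat_sq_for_signs(X, S, np.array(sg), L) for sg in patterns)
```

For a design with X'X/n equal to the identity, for example `X = 2.0 * np.eye(4)` with `S = [0, 1]`, the result should be close to 1. The loop visits 2^s sign patterns (half of them are redundant because the objective is unchanged under β → −β), so this brute-force sketch is only practical for small supports, which is the regime where an MIQP formulation becomes attractive.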