This paper presents and discusses estimation by regularized regression and model selection using the LASSO (Least Absolute Shrinkage and Selection Operator) method. LASSO is recognized as one of the main supervised learning methods applied to high-dimensional econometrics, making it possible to work with large volumes of data and many correlated controls. The paper addresses conceptual issues concerning the consequences of high dimensionality in modern econometrics and the principle of sparsity, which underpins regularization procedures. It examines the main post-double-selection and post-regularization models, including variants applied to instrumental variable models. It also briefly describes the lassopack package of routines and its syntax, with examples of HD (high-dimensional), HDS (high-dimensional sparse), and IV-HDS models, including combinations with fixed effects estimators. Finally, the potential application of the approach to air transport research is discussed, with emphasis on an empirical study of airline operational efficiency and aircraft fuel consumption.