Recognizing driver distraction behavior from in-vehicle cameras demands real-time inference on edge devices. However, lightweight models often fail to capture fine-grained behavioral cues, which reduces performance on unseen drivers and under varying conditions. ROI-based methods recover such cues but increase computational cost, making it difficult to balance efficiency and accuracy. This work addresses the need for a lightweight architecture that overcomes these constraints. We propose Computationally efficient Dynamic region of Interest Routing and domain-invariant Adversarial learning for lightweight driver behavior recognition (C-DIRA). The framework combines saliency-driven Top-K ROI pooling with fused classification to extract and integrate local features. Dynamic ROI routing enables selective computation by applying ROI inference only to high-difficulty samples. In addition, pseudo-domain labeling and adversarial learning are used to learn domain-invariant features that are robust to driver and background variation. Experiments on the State Farm Distracted Driver Detection Dataset show that C-DIRA maintains high accuracy with significantly fewer FLOPs and lower latency than prior lightweight models. It also remains robust under visual degradation such as blur and low light, and performs stably across unseen domains. These results confirm C-DIRA's effectiveness in achieving compactness, efficiency, and generalization.
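The idea of saliency-driven Top-K ROI selection combined with dynamic ROI routing can be illustrated with a minimal sketch. The abstract does not state how sample difficulty is measured or how the global and ROI predictions are fused, so the code below makes two labeled assumptions: a sample counts as "easy" when the maximum softmax probability of the global prediction reaches a threshold, and fusion is a plain average of logits. The names `top_k_cells`, `route`, `roi_branch`, and `threshold` are hypothetical and are not taken from C-DIRA.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_cells(saliency: Sequence[Sequence[float]], k: int) -> List[Tuple[int, int]]:
    """Return (row, col) positions of the k most salient grid cells."""
    cells = [(value, r, c)
             for r, row in enumerate(saliency)
             for c, value in enumerate(row)]
    cells.sort(key=lambda t: t[0], reverse=True)
    return [(r, c) for _, r, c in cells[:k]]


def route(global_logits: Sequence[float],
          roi_branch: Callable[[], Sequence[float]],
          threshold: float = 0.9) -> Tuple[List[float], bool]:
    """Run the ROI branch only when the global prediction is uncertain.

    Assumption: difficulty is approximated by max softmax confidence.
    Assumption: fusion averages the global and ROI logits.
    Returns the class probabilities and whether the ROI branch was used.
    """
    probs = softmax(global_logits)
    if max(probs) >= threshold:
        return probs, False            # easy sample: skip ROI inference
    roi_logits = roi_branch()          # hard sample: pay for ROI inference
    fused = [(g + r) / 2.0 for g, r in zip(global_logits, roi_logits)]
    return softmax(fused), True
```

In this sketch, `roi_branch` is passed as a callable so that its cost is incurred only for samples that fall below the confidence threshold, which is the source of the compute savings that selective routing aims for. Raising `threshold` sends more samples through the ROI branch, trading latency for potential accuracy.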