Early-exit networks are an effective way to reduce the overall energy consumption and latency of deep learning models by scaling computation to the complexity of each input. By incorporating intermediate exit branches into the architecture, they spend less computation on simpler samples, which is particularly beneficial for resource-constrained devices where energy consumption is a critical concern. However, designing early-exit networks is a challenging and time-consuming process because efficiency must be balanced against accuracy. Recent works have used Neural Architecture Search (NAS) to design more efficient early-exit networks, determining the best positions and number of exit branches in the architecture to reduce average latency while improving model accuracy. Another important factor affecting the efficiency and accuracy of early-exit networks is the depth and the types of layers in the exit branches. In this paper, we use hardware-aware NAS to strengthen the exit branches, considering both accuracy and efficiency during optimization. Our evaluation on the CIFAR-10, CIFAR-100, and SVHN datasets demonstrates that our proposed framework, which searches over varying depths and layer types for the exit branches and combines this with adaptive threshold tuning, designs early-exit networks that achieve higher accuracy at the same or a lower average number of multiply-accumulate operations (MACs) than state-of-the-art approaches.
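To make the early-exit mechanism described above concrete, here is a minimal PyTorch sketch of confidence-threshold early exiting, assuming a hypothetical two-exit CNN on CIFAR-sized inputs. The class name `TwoExitNet`, the layer sizes, and the 0.9 threshold are illustrative assumptions, not the architectures or settings found by the paper's NAS.

```python
# Minimal sketch of confidence-threshold early exiting.
# TwoExitNet and all sizes below are hypothetical, not the paper's model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoExitNet(nn.Module):
    """Backbone with one intermediate exit branch and one final exit."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Early backbone stage.
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 32x32 -> 16x16 for CIFAR-sized input
        )
        # Exit branch 1: its depth and layer types are exactly the kind of
        # design choice the paper's hardware-aware NAS searches over.
        self.exit1 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )
        # Later backbone stage feeding the final classifier.
        self.stage2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.exit2 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor, threshold: float = 0.9):
        """Single-sample inference: stop at exit1 if its softmax
        confidence clears `threshold`, otherwise run the full network."""
        h = self.stage1(x)
        logits1 = self.exit1(h)
        conf = F.softmax(logits1, dim=1).max(dim=1).values
        if conf.item() >= threshold:   # easy sample: exit early
            return logits1, "exit1"
        h = self.stage2(h)             # hard sample: keep computing
        return self.exit2(h), "exit2"

if __name__ == "__main__":
    model = TwoExitNet().eval()
    x = torch.randn(1, 3, 32, 32)  # one CIFAR-sized input
    with torch.no_grad():
        logits, taken = model(x)
    print(f"exited at {taken}, prediction: {logits.argmax(dim=1).item()}")
```

Every sample that clears the threshold at `exit1` skips `stage2` entirely, which is where the reduction in the average number of MACs comes from.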
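The abstract also mentions adaptive threshold tuning without spelling out the procedure; below is one plausible minimal sketch, assuming a held-out validation set and the `TwoExitNet` from the previous block. The function name `tune_threshold`, the `tolerance`, and the candidate `grid` are hypothetical choices for illustration, not the paper's method.

```python
# Hedged sketch of threshold tuning: sweep candidate thresholds on held-out
# data and keep the lowest one whose mixed early/final accuracy stays within
# a tolerance of always running the full network. Grid and tolerance are
# assumptions; the paper's actual adaptive tuning procedure may differ.
import torch
import torch.nn.functional as F

@torch.no_grad()
def tune_threshold(model, loader, tolerance=0.01,
                   grid=(0.5, 0.6, 0.7, 0.8, 0.9, 0.95)):
    # Cache both exits' logits once so the sweep itself is cheap.
    all_l1, all_l2, all_y = [], [], []
    for x, y in loader:
        h = model.stage1(x)
        all_l1.append(model.exit1(h))
        all_l2.append(model.exit2(model.stage2(h)))
        all_y.append(y)
    l1, l2, y = torch.cat(all_l1), torch.cat(all_l2), torch.cat(all_y)

    conf1 = F.softmax(l1, dim=1).max(dim=1).values
    full_acc = (l2.argmax(1) == y).float().mean().item()

    best = None
    for t in sorted(grid):  # ascending: lowest passing threshold wins
        early = conf1 >= t
        pred = torch.where(early, l1.argmax(1), l2.argmax(1))
        acc = (pred == y).float().mean().item()
        best = (t, acc, early.float().mean().item())  # (thr, acc, exit rate)
        if full_acc - acc <= tolerance:
            break  # lowest threshold within tolerance: most early exits
    return best
```

Choosing the lowest threshold that stays within the accuracy tolerance maximizes the fraction of samples taking the cheap exit, which directly minimizes the average MACs per inference.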