We focus on a binary classification problem in an edge intelligence system where false negatives are more costly than false positives. The system runs a compact, locally deployed model, supplemented by a larger remote model that is accessible over the network at an offloading cost. For each sample, the system first runs inference with the local model; based on the local model's output, the sample may be offloaded to the remote model. This work aims to characterise the fundamental trade-off between classification accuracy and offloading cost in such a hierarchical inference (HI) system. To optimise this system, we propose an online learning framework that continuously adapts a pair of thresholds on the local model's confidence scores. These thresholds determine both the local model's prediction and whether a sample is classified locally or offloaded to the remote model. We derive a closed-form solution for the setting where the local model is calibrated. For the more general case of uncalibrated models, we introduce H2T2, an online two-threshold hierarchical inference policy, and prove that it achieves sublinear regret. H2T2 is model-agnostic, requires no training, and learns during the inference phase using limited feedback. Simulations on real-world datasets show that H2T2 consistently outperforms naive and single-threshold HI policies, sometimes even surpassing offline optima. The policy also demonstrates robustness to distribution shifts and adapts effectively to mismatched classifiers.
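The two-threshold decision rule described above can be sketched as follows. This is an illustrative sketch under our own naming assumptions (`hi_decision`, `theta_lo`, `theta_hi` are hypothetical); it shows only the per-sample routing rule, not the paper's H2T2 online update that adapts the thresholds from feedback:

```python
def hi_decision(p, theta_lo, theta_hi):
    """Two-threshold hierarchical inference rule (illustrative sketch).

    p        -- local model's confidence that the sample is positive
    theta_lo -- below this, classify locally as negative (class 0)
    theta_hi -- at or above this, classify locally as positive (class 1)

    Samples whose confidence falls between the two thresholds are
    ambiguous for the local model and are offloaded to the remote
    model, incurring the offloading cost.
    """
    assert 0.0 <= theta_lo <= theta_hi <= 1.0
    if p < theta_lo:
        return ("local", 0)
    if p >= theta_hi:
        return ("local", 1)
    return ("offload", None)
```

Widening the gap between the two thresholds trades higher offloading cost for fewer local mistakes; since false negatives are costlier here, `theta_lo` would typically be pushed down so that borderline negatives are offloaded rather than decided locally.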