Mobile edge devices (e.g., AR/VR headsets) typically need to complete timely inference tasks while operating with limited on-board computing and energy resources. In this paper, we investigate the problem of collaborative inference in wireless edge networks, where energy-constrained edge devices aim to complete inference tasks within given deadlines. These tasks are carried out using neural networks, and each edge device seeks to optimize inference performance under energy and delay constraints. The inference process can be split between the edge device and an edge server, thereby achieving collaborative inference over wireless networks. We formulate an inference utility optimization problem subject to energy and delay constraints, and propose a novel solution called Bayes-Split-Edge, which leverages Bayesian optimization for collaborative split inference over wireless edge networks. Our solution jointly optimizes the transmission power and the neural network split point. The Bayes-Split-Edge framework incorporates a novel hybrid acquisition function that balances inference task utility, sample efficiency, and constraint violation penalties. We evaluate our approach using the VGG19 model on the ImageNet-Mini dataset and the ResNet101 model on Tiny-ImageNet, together with real-world mMobile wireless channel datasets. Numerical results demonstrate that Bayes-Split-Edge achieves up to a 2.4x reduction in evaluation cost compared to standard Bayesian optimization and achieves near-linear convergence. It also outperforms several baselines, including CMA-ES, DIRECT, exhaustive search, and Proximal Policy Optimization (PPO), while matching exhaustive search performance under tight constraints. These results confirm that the proposed framework provides a sample-efficient, constraint-aware solution for wireless split inference in edge computing systems, requiring at most 20 function evaluations.
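
The constrained problem described above can be sketched in generic form as follows. This is only an illustrative formulation, not the paper's exact statement: the symbols $l$ (split point), $p$ (transmission power), $U$ (inference utility), $E$ and $T$ (energy and delay), the budgets $E_{\max}$ and $T_{\max}$, the number of candidate split points $L$, and the power range $[p_{\min}, p_{\max}]$ are all notation assumed here for illustration.

```latex
% Illustrative notation only; none of these symbols are taken from the paper.
\begin{aligned}
\max_{l,\,p} \quad & U(l, p) \\
\text{s.t.} \quad & E(l, p) \le E_{\max}, \\
                  & T(l, p) \le T_{\max}, \\
                  & l \in \{1, \dots, L\}, \quad p \in [p_{\min}, p_{\max}].
\end{aligned}
```

Under this notation, the search space is mixed: the split point is discrete while the transmission power is continuous, and each evaluation of $U$, $E$, and $T$ requires running the split model over the wireless link. That cost per evaluation is why sample efficiency matters for this kind of problem.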