Personalized Attention-Aware Exposure Control Using Reinforcement Learning

We propose a reinforcement learning approach to real-time, personalizable exposure control for a mobile camera. Our approach is based on a Markov Decision Process (MDP). In the camera viewfinder or live preview mode, given the current frame, our system predicts the change in exposure so as to optimize the trade-off among image quality, fast convergence, and minimal temporal oscillation. We model the exposure prediction function as a fully convolutional neural network that can be trained end to end through Gaussian policy gradient. As a result, our system can associate scene semantics with exposure values; it can also be extended to personalize the exposure adjustments for a user and device. We improve the learning performance by incorporating an attention module that links semantics with exposure. This attention module generalizes the conventional spot or matrix metering techniques. We validate our system using the MIT FiveK dataset and our own datasets captured with an iPhone 7 and a Google Pixel. Experimental results show that our system exhibits stable real-time behavior while improving visual quality compared with native camera control.
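To make the Gaussian policy gradient idea concrete, the following is a minimal sketch under strong simplifying assumptions, not the system described in the abstract. The fully convolutional network is replaced by a two-parameter linear policy over a single scene feature (log2 mean brightness), the reward is a hypothetical brightness-matching term standing in for the quality, convergence, and oscillation trade-off, and the names `policy_mean`, `toy_reward`, `grad_log_prob_mean`, and `train`, along with all constants, are illustrative choices.

```python
import random

SIGMA = 0.3         # fixed policy std-dev in stops (assumed, not from the paper)
TARGET_LOG2 = -1.0  # hypothetical target brightness 0.5, expressed as log2(0.5)
CENTER = -1.5       # midpoint of the toy scene range, used to center the feature


def policy_mean(theta, scene_log2):
    """Linear stand-in for the network: mean exposure change in stops."""
    w, b = theta
    return w * (scene_log2 - CENTER) + b


def toy_reward(scene_log2, delta_ev):
    """Hypothetical reward: negative squared distance (in stops) from the target."""
    error = scene_log2 + delta_ev - TARGET_LOG2
    return -error * error


def grad_log_prob_mean(action, mean):
    """d/d(mean) of log N(action; mean, SIGMA^2), the Gaussian score function."""
    return (action - mean) / (SIGMA * SIGMA)


def train(steps=3000, lr=0.01, seed=0):
    """REINFORCE with a running-average baseline on one-step toy episodes."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]   # [w, b]
    baseline = 0.0       # running estimate of the average reward
    for _ in range(steps):
        scene = rng.uniform(-3.0, 0.0)        # random scene brightness (log2)
        mean = policy_mean(theta, scene)
        action = rng.gauss(mean, SIGMA)       # sample an exposure change
        reward = toy_reward(scene, action)
        advantage = reward - baseline
        score = grad_log_prob_mean(action, mean)
        # Chain rule: d(mean)/dw = (scene - CENTER), d(mean)/db = 1.
        theta[0] += lr * advantage * score * (scene - CENTER)
        theta[1] += lr * advantage * score
        baseline += 0.05 * (reward - baseline)
    return theta


if __name__ == "__main__":
    w, b = train()
    print("learned w, b:", w, b)
    print("dark scene   -> delta EV", policy_mean((w, b), -3.0))
    print("bright scene -> delta EV", policy_mean((w, b), 0.0))
```

In this toy setting the reward-maximizing policy is w = -1, b = 0.5, so training should drift toward raising exposure for dark scenes and lowering it for bright ones. A real system along the lines of the abstract would replace the linear mean with a network over the full frame and use a richer reward.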
