Paired underwater videos are difficult to obtain because of the complexity of underwater imaging. As a result, most existing underwater video enhancement methods simply apply a single-image enhancement model frame by frame, which naturally leads to a lack of temporal consistency. To alleviate this problem, we rethink the temporal manifold inherent in natural videos and observe a temporal consistency prior in dynamic scenes from the perspective of local temporal frequency. Building on this prior, and without requiring paired data, we propose WaterWave, an implicit representation of enhanced video signals built in a wavelet-based temporal consistency field. Specifically, under the constraints of the prior, we progressively filter and attenuate the inconsistent components while preserving motion details and scene content, yielding a naturally flowing video. Furthermore, to represent temporal frequency bands more accurately, we design an underwater flow correction module that rectifies the estimated flows by taking the transmission in underwater scenes into account. Extensive experiments demonstrate that WaterWave significantly improves the quality of videos produced by single-image underwater enhancement methods. In addition, our method shows high potential for downstream underwater tracking with trackers such as UOSTrack and MAT, outperforming the original video by a large margin, i.e., by 19.7% and 9.7% in precision, respectively.
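To make the idea of attenuating temporally inconsistent components concrete, the following is a minimal sketch, not the WaterWave model itself. It assumes a static scene with no motion, so no flow alignment or flow correction is involved, and it uses a single-level Haar wavelet along the time axis. The function name `haar_temporal_attenuate` and the `gain` parameter are hypothetical and introduced only for illustration.

```python
import numpy as np


def haar_temporal_attenuate(frames, gain=0.5):
    """Scale the temporal Haar detail band of a frame stack by `gain`.

    frames: array of shape (T, H, W) with an even number of frames T.
    gain:   1.0 leaves the video unchanged; 0.0 replaces each pair of
            consecutive frames with their average.
    This is an illustrative stand-in, not the method from the paper.
    """
    frames = np.asarray(frames, dtype=np.float64)
    if frames.shape[0] % 2 != 0:
        raise ValueError("expected an even number of frames")

    even, odd = frames[0::2], frames[1::2]

    # One-level orthonormal Haar analysis along time.
    approx = (even + odd) / np.sqrt(2.0)   # low temporal frequency
    detail = (even - odd) / np.sqrt(2.0)   # high temporal frequency

    # Attenuate the high-frequency band, where frame-to-frame flicker lives.
    detail = gain * detail

    # Haar synthesis back to individual frames.
    out = np.empty_like(frames)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out


# Usage: a static random scene whose brightness flickers between frames,
# mimicking the inconsistency of frame-by-frame enhancement.
rng = np.random.default_rng(0)
scene = rng.random((4, 4))
flicker = np.array([1.0, 0.8, 1.0, 0.8])
video = scene[None] * flicker[:, None, None]
smoothed = haar_temporal_attenuate(video, gain=0.0)
```

In this toy setting all of the temporal variation is flicker, so removing the detail band is harmless. In a dynamic scene the same band also carries genuine motion, which is why the abstract stresses a prior that separates inconsistent components from motion details and a flow correction step before the temporal frequency bands are formed.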