Many GPUs incorporate hardware-accelerated video encoders, which offload video encoding tasks from the main CPU and provide higher power efficiency. Over the years, newer video codecs such as H.265/HEVC, VP9, and AV1 have been added to the latest GPU boards. Recently, the rise of live video content, such as VTuber streams, game live-streaming, and live event broadcasts, has driven demand for high-efficiency hardware encoders in GPUs to handle these real-time video encoding tasks, especially at higher resolutions such as 4K/8K UHD. In this paper, the rate-distortion (RD) performance, encoding speed, and power consumption of hardware encoders in several generations of NVIDIA and Intel GPUs, as well as Qualcomm Snapdragon mobile SoCs, were evaluated and compared with their software counterparts, including the latest H.266/VVC codec, using several metrics including PSNR, SSIM, and the machine-learning-based VMAF. The results show that modern GPU hardware encoders can match the RD performance of software encoders in real-time encoding scenarios, and that while encoding speed has increased in newer hardware, the RD performance improvement between hardware generations is mostly negligible. Finally, the bitrate required for each hardware encoder to match YouTube transcoding quality was also calculated.
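The abstract does not say how the bitrate needed to match a reference quality was derived. One common approach, sketched here purely as an assumption, is to interpolate along an encoder's measured RD curve in the log-bitrate domain until the target score is reached. The function name `bitrate_for_target_quality` and the sample numbers are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple


def bitrate_for_target_quality(
    rd_points: Sequence[Tuple[float, float]], target: float
) -> float:
    """Estimate the bitrate at which an encoder reaches `target` quality.

    rd_points: (bitrate_kbps, quality_score) pairs, e.g. VMAF or PSNR values.
    Assumes quality does not decrease as bitrate increases.
    Interpolates linearly in log(bitrate) between the two bracketing points.
    """
    pts = sorted(rd_points)  # ascending bitrate
    for (r0, q0), (r1, q1) in zip(pts, pts[1:]):
        if q0 <= target <= q1:
            if q1 == q0:
                return r0  # flat segment: the lower bitrate already suffices
            t = (target - q0) / (q1 - q0)
            return math.exp(math.log(r0) + t * (math.log(r1) - math.log(r0)))
    raise ValueError("target quality lies outside the measured RD curve")


# Illustrative numbers only (not results from the paper).
curve = [(1000.0, 80.0), (4000.0, 90.0), (16000.0, 95.0)]
estimate = bitrate_for_target_quality(curve, 85.0)
print(f"estimated bitrate: {estimate:.0f} kbps")
```

Interpolating in log-bitrate rather than linear bitrate reflects the usual shape of RD curves, where quality gains flatten as bitrate grows. A target outside the measured range raises an error instead of extrapolating.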