The demand for high-quality, real-time video streaming has grown rapidly, and 4K Ultra High Definition (UHD) has become the standard for many applications such as live broadcasting, TV services, and interactive cloud gaming. This trend has driven the integration of dedicated hardware encoders into modern Graphics Processing Units (GPUs). These encoders now support advanced codecs such as HEVC and AV1 and offer specialized Low-Latency and Ultra Low-Latency tuning modes, targeting end-to-end (E2E) latencies below 2 s and below 500 ms, respectively. As demand for such capabilities grows toward the 6G era, a clear understanding of their performance implications is essential. In this work, we evaluate the low-latency encoding modes of GPUs from NVIDIA, Intel, and AMD in terms of both Rate-Distortion (RD) performance and latency. We compare the results against the normal-latency tuning of the same hardware encoders and against leading software encoders. The results show that hardware encoders achieve significantly lower E2E latency than software encoders while delivering slightly better RD performance. While the standard Low-Latency tuning yields a poor quality-latency trade-off, the Ultra Low-Latency mode reduces E2E latency to 83 ms (5 frames) without additional RD penalty. Furthermore, hardware encoder latency is largely insensitive to the quality preset, enabling high-quality, low-latency streams without compromise.