Capture stages are high-end sources of state-of-the-art recordings for downstream applications in movies, games, and other media. A crucial step in almost all pipelines is matting, i.e., separating the captured performance from the background. While common matting algorithms perform remarkably well in other applications such as teleconferencing and mobile entertainment, we found that they struggle significantly with the peculiarities of capture stage content. Our work shares insights into these challenges in the form of a curated list of characteristics, discusses how they can be addressed proactively, and gives practitioners a guideline for an improved workflow that mitigates the challenges that remain unresolved. To this end, we also demonstrate an efficient pipeline for adapting state-of-the-art approaches to such custom setups without extensive annotations, for both offline and real-time use. For an objective evaluation, we introduce a validation methodology based on a state-of-the-art diffusion model and use it to demonstrate the benefits of our approach.