2.5D effects, such as occlusion and perspective foreshortening, enhance visual dynamics and realism by incorporating 3D depth cues into 2D designs. However, creating such effects remains challenging and labor-intensive because designers must reason about depth manually. We introduce DepthScape, a human-AI collaborative system that facilitates 2.5D effect creation by placing design elements directly into 3D reconstructions. Using monocular depth reconstruction, DepthScape transforms an image into a 3D reconstruction in which visual content can be placed so that realistic occlusion and perspective foreshortening arise automatically. To further simplify 3D placement through a 2D viewport, DepthScape uses a vision-language model to analyze the source image and extract key visual components as content anchors for direct-manipulation editing. We evaluate DepthScape with nine participants of varying design backgrounds, confirming the effectiveness of our creation pipeline. We also test the system on 100 professional stock images to assess robustness, and conduct an expert evaluation that confirms the quality of DepthScape's results.
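To make the two depth cues named above concrete, the following is a minimal sketch of how a per-pixel depth map can yield occlusion and foreshortening for a placed element. It is not DepthScape's implementation. The function names (`perspective_scale`, `visible_mask`), the pinhole-camera scale model, and the convention that smaller depth values are closer to the camera are all assumptions made for illustration.

```python
from typing import List

DepthMap = List[List[float]]
Mask = List[List[bool]]


def perspective_scale(depth: float, focal_length: float = 1.0) -> float:
    """Apparent-size factor under an assumed pinhole camera: size falls off as 1/depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return focal_length / depth


def visible_mask(scene_depth: DepthMap, element_mask: Mask, element_depth: float) -> Mask:
    """Per-pixel depth test for a flat element placed at a single depth.

    A pixel of the element is drawn only where the element covers it AND the
    element is closer to the camera than the scene (assumed: smaller = closer).
    """
    visible: Mask = []
    for depth_row, mask_row in zip(scene_depth, element_mask):
        visible.append(
            [covered and element_depth < scene_d
             for scene_d, covered in zip(depth_row, mask_row)]
        )
    return visible


# Hypothetical 2x3 scene: a foreground object (depth 1.0) in front of a wall (depth 5.0).
scene = [
    [1.0, 1.0, 5.0],
    [1.0, 5.0, 5.0],
]
# The element covers every pixel and sits between the object and the wall.
element = [[True, True, True], [True, True, True]]

mask = visible_mask(scene, element, element_depth=3.0)
near_scale = perspective_scale(2.0)
far_scale = perspective_scale(4.0)
```

In this toy scene the element is hidden wherever the foreground object is closer than depth 3.0 and visible against the wall, and doubling the element's depth halves its apparent size. A real pipeline would first have to convert the relative depth produced by a monocular estimator into a consistent scale before applying such a test.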