Training deep networks with limited labeled data while achieving strong generalization is key in the quest to reduce human annotation effort. This is the goal of semi-supervised learning, which exploits more widely available unlabeled data to complement small labeled data sets. In this paper, we propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels. Concretely, we learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images supplemented with only a few labeled ones. We build our architecture on top of StyleGAN2, augmented with a label synthesis branch. Image labeling at test time is achieved by first embedding the target image into the joint latent space via an encoder network and test-time optimization, and then generating the label from the inferred embedding. We evaluate our approach in two important domains: medical image segmentation and part-based face segmentation. We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization, such as transferring from CT to MRI in medical imaging, and from photographs of real faces to paintings, sculptures, and even cartoons and animal faces. Project Page: \url{https://nv-tlabs.github.io/semanticGAN/}
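The test-time labeling procedure described above (embed the image via an encoder, refine the latent by optimization against a reconstruction loss, then read off the label from the generator's label branch) can be sketched roughly as follows. This is a minimal toy sketch, not the paper's implementation: `JointGenerator`, `Encoder`, and `label_image` are hypothetical stand-ins, with small linear layers in place of StyleGAN2 and the real encoder.

```python
import torch
import torch.nn as nn

LATENT_DIM, IMG_DIM = 8, 16  # toy sizes, far smaller than the real model


class JointGenerator(nn.Module):
    """Stand-in for a generator of (image, label) pairs from one latent code."""

    def __init__(self):
        super().__init__()
        self.img_head = nn.Linear(LATENT_DIM, IMG_DIM)    # image branch
        self.label_head = nn.Linear(LATENT_DIM, IMG_DIM)  # label-synthesis branch

    def forward(self, w):
        return self.img_head(w), self.label_head(w)


class Encoder(nn.Module):
    """Stand-in encoder that predicts an initial latent from an image."""

    def __init__(self):
        super().__init__()
        self.net = nn.Linear(IMG_DIM, LATENT_DIM)

    def forward(self, x):
        return self.net(x)


def label_image(target, generator, encoder, steps=50, lr=0.1):
    """Embed `target` into the joint latent space, refine the latent by
    test-time optimization, then generate the label from the result."""
    # Encoder gives the starting point; the latent itself is then optimized.
    w = encoder(target).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img, _ = generator(w)
        loss = ((img - target) ** 2).mean()  # image reconstruction loss only
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        _, label = generator(w)  # label read off the inferred embedding
    return label
```

Note that only the image branch drives the optimization; the label branch is queried once at the end, which is what lets the joint model label images it was never supervised on.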