Deep convolutional neural networks have made outstanding contributions to many fields, such as computer vision, in the past few years, and many researchers have published well-trained networks for download. However, recent studies have raised serious concerns about the integrity of these networks because of model-reuse attacks and backdoor attacks. Many algorithms, such as watermarking, have been proposed to protect these open-source networks. However, the existing algorithms modify the contents of the network permanently and are therefore not suitable for integrity authentication. In this paper, we propose a reversible watermarking algorithm for integrity authentication. Specifically, we formulate the reversible watermarking problem for deep convolutional neural networks and use the pruning theory of model compression to construct a host sequence into which the watermark information is embedded by histogram shifting. Experiments show that embedding the reversible watermark changes classification performance by less than 0.5%, and that the model parameters can be fully recovered after the watermark is extracted. The reversible watermark also allows the integrity of the model to be verified: if the model has been modified illegally, the authentication information generated from the original model will differ from the extracted watermark information.
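The embedding step described in the abstract can be illustrated with a minimal sketch of generic histogram-shifting reversible watermarking on an integer sequence. This is not the authors' implementation: the abstract does not specify how the host sequence is built from the pruned convolutional weights, so the host here is assumed to be a plain list of integers, and the names `hs_embed`, `hs_extract`, `auth_bits`, and `verify` are hypothetical. The SHA-256 digest used as authentication information is likewise an assumption made for illustration.

```python
import hashlib
from collections import Counter


def hs_embed(host, bits):
    """Embed bits into an integer sequence by histogram shifting.

    Values above the peak bin are shifted up by one to free the bin
    peak + 1; each peak-valued entry then carries one bit.
    """
    peak = Counter(host).most_common(1)[0][0]
    if len(bits) > host.count(peak):
        raise ValueError("message exceeds embedding capacity")
    remaining = iter(bits)
    marked = []
    for v in host:
        if v > peak:
            marked.append(v + 1)                   # shifted, carries no bit
        elif v == peak:
            marked.append(v + next(remaining, 0))  # unused slots embed 0
        else:
            marked.append(v)                       # below the peak: untouched
    return marked, peak


def hs_extract(marked, peak, n_bits):
    """Recover the first n_bits embedded bits and the original sequence."""
    bits, restored = [], []
    for v in marked:
        if v in (peak, peak + 1):
            bits.append(v - peak)
            restored.append(peak)
        elif v > peak + 1:
            restored.append(v - 1)                 # undo the shift
        else:
            restored.append(v)
    return bits[:n_bits], restored


def auth_bits(host, n_bits=256):
    """Assumed authentication information: leading bits of a SHA-256 digest."""
    digest = hashlib.sha256(repr(list(host)).encode()).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(n_bits)]


def verify(marked, peak, n_bits=256):
    """Extract the watermark, restore the host, and compare the two."""
    bits, restored = hs_extract(marked, peak, n_bits)
    return bits == auth_bits(restored, n_bits), restored


# Toy host sequence (hypothetical values, not real network parameters).
host = [0] * 300 + [1, -2, 5, 1]
marked, peak = hs_embed(host, auth_bits(host))
ok, restored = verify(marked, peak)
```

In this sketch the peak value and the message length are side information that must accompany the marked sequence. Shifting every value above the peak keeps the example short; a practical scheme would typically shift only the values between the peak bin and a nearby empty bin, so that fewer parameters are changed.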