
【CVPR 2021】Joint Generative and Contrastive Learning for Unsupervised Person Re-identification



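Method overview

1. The paper proposes a joint generative and contrastive learning framework for unsupervised person re-identification, in which the generative and contrastive modules improve each other's performance.
2. In the generative module, a 3D mesh generator is introduced.
3. In the contrastive module, a view-invariant loss is proposed to reduce the intra-class variation between generated samples and original samples.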


Summary

Paper title: Joint Generative and Contrastive Learning for Unsupervised Person Re-identification
Abbreviation: GCL
Venue: CVPR
Year: 2021
Baseline: JVTC (Li, J., Zhang, S.: Joint visual and temporal consistency for unsupervised domain adaptive person re-identification. pp. 1–14 (2020))
Backbone: ImageNet [32] pre-trained ResNet50 [17] with slight modifications
Datasets: Market-1501, DukeMTMC-reID, MSMT17 [41]

Paper link: https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Joint_Generative_and_Contrastive_Learning_for_Unsupervised_Person_Re-Identification_CVPR_2021_paper.html
Source code: https://github.com/chenhao2345/GCL

Contributions overview

1. We incorporate a Generative Adversarial Network (GAN) and a contrastive learning module into one joint training framework.
2. While the GAN provides online data augmentation for contrastive learning, the contrastive module learns view-invariant features for generation.
3. We propose a mesh-based view generator. Specifically, mesh projections serve as references towards generating novel views of a person.
4. We propose a view-invariant loss to facilitate contrastive learning between original and generated views.

Results overview

Our method significantly outperforms state-of-the-art methods under both fully unsupervised and unsupervised domain adaptive settings on several large-scale ReID datasets.

Method details

Framework


Figure 2: (a) General architecture of GCL: Generative and contrastive modules are coupled by the shared identity encoder E_id. (b) Generative module: The decoder G combines the identity features encoded by E_id and structure features E_str to generate a novel view x'_new with a cycle consistency. (c) Contrastive module: View-invariance is enhanced by maximizing the agreement between original E_id(x), synthesized E_id(x'_new) and memory f_pos representations.


Figure 3: Example images as generated by the View Generator via 3D mesh rotation based on left input image.
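
The view generation in Figure 3 can be illustrated with a minimal sketch. It assumes that a mesh is just a list of (x, y, z) vertices, that new views come from rotating the mesh about the vertical axis, and that the 2D reference is a plain orthographic projection. The function names, the toy vertices and the angles are hypothetical and are not taken from the paper or its code.

```python
import math

def rotate_about_vertical_axis(vertices, angle_deg):
    """Rotate (x, y, z) vertices around the y (vertical) axis by angle_deg degrees."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Standard rotation about the y axis: y is unchanged, x and z are mixed.
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in vertices]

def project_orthographic(vertices):
    """Drop the depth coordinate to get a 2D projection of the mesh."""
    return [(x, y) for x, y, _ in vertices]

# Toy "mesh" with three vertices (assumption: a real mesh would come from HMR).
toy_mesh = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Illustrative angles only; each rotated projection would serve as a structure reference.
novel_views = {
    angle: project_orthographic(rotate_about_vertical_axis(toy_mesh, angle))
    for angle in (0, 90, 180, 270)
}
```

Each entry of `novel_views` plays the role of one rotated mesh projection that a generator could use as the structure input for a new view of the same person.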

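Implementation details

1. The GCL framework consists of two main modules: a generative module and a contrastive module.
2. In the generative module, the paper uses HMR to build a 3D mesh and extracts the appearance and pose from the image. New samples are then synthesized by rotating the pose to different angles, and the GAN loss is built at three levels: samples, features, and decoded outputs.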

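3. In the contrastive module, the paper maintains a memory bank that stores the feature vectors of the samples and updates it during training according to Eq. 5. Positive and negative pairs are then constructed from the samples described above, and a contrastive loss is computed over them.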

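4. Joint training uses a warm start: building on the baseline, the model is first trained for 40 epochs with the GAN loss only, and the overall loss (Eq. 9) is used only in the last 20 epochs.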
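
The warm-start schedule in step 4 can be sketched as a small loss selector. The epoch counts (40 and 20) come from the note above. The form of the overall loss is an assumption: Eq. 9 is not reproduced here, so a weighted sum of the GAN loss and the contrastive loss stands in for it, and `contrastive_weight` is a hypothetical parameter.

```python
WARMUP_EPOCHS = 40  # epochs trained with the GAN loss only
JOINT_EPOCHS = 20   # final epochs trained with the overall loss

def training_loss(epoch, gan_loss, contrastive_loss, contrastive_weight=1.0):
    """Return the loss to optimise at a given 0-based epoch."""
    if not 0 <= epoch < WARMUP_EPOCHS + JOINT_EPOCHS:
        raise ValueError("epoch outside the 60-epoch schedule")
    if epoch < WARMUP_EPOCHS:
        # Warm-start phase: only the generative module is supervised.
        return gan_loss
    # Joint phase: assumed weighted sum standing in for the paper's Eq. 9.
    return gan_loss + contrastive_weight * contrastive_loss
```

For example, `training_loss(10, 2.0, 3.0)` returns only the GAN term, while `training_loss(45, 2.0, 3.0)` adds the contrastive term.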
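
The memory bank and contrastive loss from step 3 can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: Eq. 5 is not reproduced in this note, so a common momentum update followed by L2 normalisation is assumed, and the loss is a generic softmax-style contrastive loss over the memory entries. The momentum and temperature defaults are placeholder values, and all function names are hypothetical.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def update_memory(memory, index, feature, momentum=0.2):
    """Assumed momentum update: blend the stored entry with the new feature."""
    mixed = [momentum * old + (1.0 - momentum) * new
             for old, new in zip(memory[index], feature)]
    memory[index] = l2_normalize(mixed)

def contrastive_loss(query, memory, positive_index, temperature=0.04):
    """Negative log-softmax of the positive entry; all other entries act as negatives."""
    logits = [sum(q * m for q, m in zip(query, entry)) / temperature
              for entry in memory]
    max_logit = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[positive_index]

# Toy usage: two stored features; the query matches entry 0.
memory_bank = [[1.0, 0.0], [0.0, 1.0]]
query_feature = l2_normalize([0.9, 0.1])
loss = contrastive_loss(query_feature, memory_bank, positive_index=0)
update_memory(memory_bank, 0, query_feature)
```

The loss is small when the query is close to its positive entry and far from the others, which is the behaviour the contrastive module relies on.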

Experimental results


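Overall assessment

1. Essentially every contribution stems from the initial idea of using a 3D mesh to generate samples; once that was in place, the remaining contributions followed naturally.
2. The synthesis and combination of the various samples feels somewhat convoluted.
3. When there is no single polished overview figure, a multi-panel figure can also serve as the framework diagram. The figures here are not especially polished.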

Citation

@inproceedings{DBLP:conf/cvpr/ChenWLDB21,
author = {Hao Chen and
Yaohui Wang and
Benoit Lagadec and
Antitza Dantcheva and
Fran{\c{c}}ois Br{\'{e}}mond},
title = {Joint Generative and Contrastive Learning for Unsupervised Person
Re-Identification},
booktitle = {{CVPR}},
pages = {2004--2013},
publisher = {Computer Vision Foundation / {IEEE}},
year = {2021}
}

References

[1] Slawomir Bak, Peter Carr, and Jean-Francois Lalonde. Domain adaptation through synthesis for unsupervised person re-identification. In ECCV, 2018.
[2] Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. In ICLR, 2019.
[3] Caroline Chan, Shiry Ginosar, Tinghui Zhou, and Alexei A Efros. Everybody dance now. In ICCV, 2019.
[4] Hao Chen, Benoit Lagadec, and Francois Bremond. Learning discriminative and generalizable representations by spatial-channel partition for person re-identification. In WACV, 2020.
[5] Hao Chen, Benoit Lagadec, and Francois Bremond. Enhancing diversity in teacher-student networks via asymmetric branches for unsupervised person re-identification. In WACV, 2021.
[6] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In ICML, 2020.
[7] Xinlei Chen, Haoqi Fan, Ross Girshick, and Kaiming He. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297, 2020.
[8] Yanbei Chen, Xiatian Zhu, and Shaogang Gong. Instance-guided context rendering for cross-domain person re-identification. In ICCV, 2019.
[9] Weijian Deng, Liang Zheng, Qixiang Ye, Guoliang Kang, Yi Yang, and Jianbin Jiao. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In CVPR, 2018.
[10] Chanho Eom and Bumsub Ham. Learning disentangled representation for robust person re-identification. In NeurIPS, 2019.
[11] Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD, 1996.
[12] Yang Fu, Yunchao Wei, Guanshuo Wang, Yuqian Zhou, Honghui Shi, and Thomas S Huang. Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification. In ICCV, 2019.
[13] Yixiao Ge, Dapeng Chen, and Hongsheng Li. Mutual mean-teaching: Pseudo label refinery for unsupervised domain adaptation on person re-identification. In ICLR, 2020.
[14] Yixiao Ge, Zhuowan Li, Haiyu Zhao, Guojun Yin, Shuai Yi, Xiaogang Wang, and Hongsheng Li. FD-GAN: Pose-guided feature distilling GAN for robust person re-identification. In NeurIPS, 2018.
[15] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In NeurIPS, 2014.
[16] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In CVPR, 2020.
[17] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016.
[18] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In NeurIPS, 2017.
[19] Xun Huang and Serge Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In ICCV, 2017.
[20] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation with conditional adversarial networks. In CVPR, 2017.
[21] Angjoo Kanazawa, Michael J. Black, David W. Jacobs, and Jitendra Malik. End-to-end recovery of human shape and pose. In CVPR, 2018.
[22] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In CVPR, 2019.
[23] Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. Analyzing and improving the image quality of StyleGAN. In CVPR, 2020.
[24] Jianing Li and Shiliang Zhang. Joint visual and temporal consistency for unsupervised domain adaptive person re-identification. In ECCV, 2020.
[25] Yu-Jhe Li, Ci-Siang Lin, Yan-Bo Lin, and Yu-Chiang Frank Wang. Cross-dataset person re-identification via unsupervised pose disentanglement and adaptation. In ICCV, 2019.
[26] Shan Lin, Haoliang Li, Chang-Tsun Li, and Alex Chichung Kot. Multi-task mid-level feature alignment network for unsupervised cross-dataset person re-identification. In BMVC, 2018.
[27] Yutian Lin, Xuanyi Dong, Liang Zheng, Yan Yan, and Yi Yang. A bottom-up clustering approach to unsupervised person re-identification. In AAAI, 2019.
[28] Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, and Qi Tian. Unsupervised person re-identification via softened similarity learning. In CVPR, 2020.
[29] Chuanchen Luo, Chunfeng Song, and Zhaoxiang Zhang. Generalizing person re-identification by camera-aware invariance learning and cross-domain mixup. In ECCV, 2020.
[30] Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018.
[31] Ergys Ristani, Francesco Solera, Roger Zou, Rita Cucchiara, and Carlo Tomasi. Performance measures and a data set for multi-target, multi-camera tracking. In ECCVW, 2016.
[32] Olga Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Zhiheng Huang, A. Karpathy, A. Khosla, M. Bernstein, A. Berg, and Li Fei-Fei. ImageNet large scale visual recognition challenge. IJCV, 2015.
[33] Yifan Sun, Liang Zheng, Yi Yang, Qi Tian, and Shengjin Wang. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In ECCV, 2018.
[34] Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, and Jan Kautz. MoCoGAN: Decomposing motion and content for video generation. In CVPR, 2018.
[35] Dongkai Wang and Shiliang Zhang. Unsupervised person re-identification via multi-label classification. In CVPR, 2020.
[36] Jingya Wang, Xiatian Zhu, Shaogang Gong, and Wei Li. Transferable joint attribute-identity deep learning for unsupervised person re-identification. In CVPR, 2018.
[37] Yaohui Wang, Piotr Bilinski, Francois Bremond, and Antitza Dantcheva. G3AN: Disentangling appearance and motion for video generation. In CVPR, 2020.
[38] Yaohui Wang, Piotr Bilinski, Francois Bremond, and Antitza Dantcheva. ImaGINator: Conditional spatio-temporal GAN for video generation. In WACV, 2020.
[39] Yaohui Wang, Francois Bremond, and Antitza Dantcheva. InMoDeGAN: Interpretable motion decomposition generative adversarial network for video generation. arXiv preprint arXiv:2101.03049, 2021.
[40] Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. Image quality assessment: from error visibility to structural similarity. TIP, 2004.
[41] Longhui Wei, Shiliang Zhang, Wen Gao, and Qi Tian. Person transfer GAN to bridge domain gap for person re-identification. In CVPR, 2018.
[42] Guile Wu, Xiatian Zhu, and Shaogang Gong. Tracklet self-supervised learning for unsupervised person re-identification. In AAAI, 2020.
[43] Yu Wu, Yutian Lin, Xuanyi Dong, Yan Yan, Wei Bian, and Yi Yang. Progressive learning for person re-identification with one example. TIP, 2019.
[44] Zhirong Wu, Yuanjun Xiong, Stella X. Yu, and Dahua Lin. Unsupervised feature learning via non-parametric instance discrimination. In CVPR, 2018.
[45] Fengxiang Yang, Ke Li, Zhun Zhong, Zhiming Luo, Xing Sun, Hao Cheng, Xiaowei Guo, Feiyue Huang, Rongrong Ji, and Shaozi Li. Asymmetric co-teaching for unsupervised cross-domain person re-identification. In AAAI, 2020.
[46] Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, and Qi Tian. Scalable person re-identification: A benchmark. In ICCV, 2015.
[47] Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, and Jan Kautz. Joint discriminative and generative learning for person re-identification. In CVPR, 2019.
[48] Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, and Yi Yang. Random erasing data augmentation. In AAAI, 2020.
[49] Zhun Zhong, Liang Zheng, Shaozi Li, and Yi Yang. Generalizing a person retrieval model hetero- and homogeneously. In ECCV, 2018.
[50] Zhun Zhong, Liang Zheng, Zhiming Luo, Shaozi Li, and Yi Yang. Invariance matters: Exemplar memory for domain adaptive person re-identification. In CVPR, 2019.
[51] Zhun Zhong, Liang Zheng, Zhiming Luo, Shaozi Li, and Yi Yang. Learning to adapt invariance in memory for person re-identification. PAMI, 2020.
[52] Zhun Zhong, Liang Zheng, Zhedong Zheng, Shaozi Li, and Yi Yang. Camera style adaptation for person re-identification. In CVPR, 2018.
[53] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV, 2017.
[54] Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, and Jan Kautz. Joint disentangling and adaptation for cross-domain person re-identification. In ECCV, 2020.

Source: https://blog.csdn.net/NGUever15/article/details/120854735