VLDeformer: Vision–Language Decomposed Transformer for fast cross-modal retrieval L Zhang, H Wu, Q Chen, Y Deng, J Siebert, Z Li, Y Han, D Kong, Z Cao Knowledge-Based Systems 252, 109316, 2022 | 13 | 2022 |
Contrastive Label Correlation Enhanced Unified Hashing Encoder for Cross-modal Retrieval H Wu, L Zhang, Q Chen, Y Deng, J Siebert, Y Han, Z Li, D Kong, Z Cao Proceedings of the 31st ACM International Conference on Information …, 2022 | 7 | 2022 |
A network representation learning method based on topology W Wang, D Ma, G Xin, Y Han, J Huang, B Wang Information Sciences 571, 443-458, 2021 | 7 | 2021 |
FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training Y Han, L Zhang, Q Chen, Z Chen, Z Li, J Yang, Z Cao Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 6 | 2023 |
VLDeformer: learning visual-semantic embeddings by vision-language transformer decomposing L Zhang, H Wu, Q Chen, Y Deng, Z Li, D Kong, Z Cao, J Siebert, Y Han arXiv preprint arXiv:2110.11338, 2021 | 2 | 2021 |
Replacement as a Self-supervision for Fine-grained Vision-language Pre-training L Zhang, Q Chen, Z Chen, Y Han, Z Li, Z Cao arXiv preprint arXiv:2303.05313, 2023 | | 2023 |