Where Does the Performance Improvement Come From? - A Reproducibility Concern about Image-Text Retrieval J Rao, F Wang, L Ding, S Qi, Y Zhan, W Liu, D Tao Proceedings of the 45th international ACM SIGIR conference on research and …, 2022 | 27 | 2022 |
Parameter-efficient and student-friendly knowledge distillation J Rao, X Meng, L Ding, S Qi, X Liu, M Zhang, D Tao IEEE Transactions on Multimedia, 2023 | 17 | 2023 |
Dynamic contrastive distillation for image-text retrieval J Rao, L Ding, S Qi, M Fang, Y Liu, L Shen, D Tao IEEE Transactions on Multimedia, 2023 | 13 | 2023 |
Student can also be a good teacher: Extracting knowledge from vision-and-language model for cross-modal retrieval J Rao, T Qian, S Qi, Y Wu, Q Liao, X Wang Proceedings of the 30th ACM International Conference on Information …, 2021 | 10 | 2021 |
What is the limitation of multimodal LLMs? A deeper look into multimodal LLMs through prompt probing S Qi, Z Cao, J Rao, L Wang, J Xiao, X Wang Information Processing & Management 60 (6), 103510, 2023 | 8 | 2023 |
Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining? F Wang, L Ding, J Rao, Y Liu, L Shen, C Ding arXiv preprint arXiv:2308.12898, 2023 | 4 | 2023 |
Finetuning Language Models for Multimodal Question Answering X Zhang, W Xie, Z Dai, J Rao, H Wen, X Luo, M Zhang, M Zhang Proceedings of the 31st ACM International Conference on Multimedia, 9420-9424, 2023 | 1 | 2023 |
Watch and Buy: A Practical Solution for Real-time Fashion Product Identification in Live Stream J Rao, Y Cao, S Qi, Z Dong, T Qian, X Wang Proceedings of the 1st Workshop on Multimodal Product Identification in …, 2021 | 1 | 2021 |
3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset X Ma, X Liu, DF Wong, J Rao, B Li, L Ding, LS Chao, D Tao, M Zhang COLING 2024, 2024 | | 2024 |