Video question answering via gradually refined attention over appearance and motion | D Xu, Z Zhao, J Xiao, F Wu, H Zhang, X He, Y Zhuang | Proceedings of the 25th ACM international conference on Multimedia, 1645-1653, 2017 | 585 | 2017 |
Self-supervised spatiotemporal learning via video clip order prediction | D Xu, J Xiao, Z Zhao, J Shao, D Xie, Y Zhuang | Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019 | 538 | 2019 |
ActivityNet-QA: A dataset for understanding complex web videos via question answering | Z Yu, D Xu, J Yu, T Yu, Z Zhao, Y Zhuang, D Tao | Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 9127-9134, 2019 | 365 | 2019 |
Multichannel attention refinement for video question answering | Y Zhuang, D Xu, X Yan, W Cheng, Z Zhao, S Pu, J Xiao | ACM Transactions on Multimedia Computing, Communications, and Applications …, 2020 | 34 | 2020 |
Explore video clip order with self-supervised and curriculum learning for video applications | J Xiao, L Li, D Xu, C Long, J Shao, S Zhang, S Pu, Y Zhuang | IEEE Transactions on Multimedia 23, 3454-3466, 2020 | 10 | 2020 |