Deyao Zhu

Cited by

	All	Since 2019
Citations	1913	1913
h-index	9	9
i10-index	8	8

Citations per year
Year	Citations
2022	10
2023	574
2024	1327

Co-authors

Mohamed Elhoseiny, Ph.D. - Associate Professor, KAUST (hiring postdocs & grad students) - Verified email at kaust.edu.sa
Jun Chen - KAUST - Verified email at kaust.edu.sa
Xiaoqian Shen - CS PhD @ KAUST - Verified email at kaust.edu.sa
Xiang Li - KAUST - Verified email at kaust.edu.sa
Li Erran Li - IEEE Fellow and ACM Fellow, AWS AI, Amazon - Verified email at cs.columbia.edu
Abduallah Mohamed - Applied Research Scientist, Meta Reality Labs - Verified email at fb.com
Mohamed Zahran - Udacity - Verified email at udacity.com

Deyao Zhu

Research Scientist, ByteDance

Verified email at bytedance.com

AGI, Vision Language Models, AI Agents


Title	Cited by	Year
MiniGPT-4: Enhancing vision-language understanding with advanced large language models. D Zhu, J Chen, X Shen, X Li, M Elhoseiny. International Conference on Learning Representations 2024, 2023	1409	2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning. J Chen, D Zhu, X Shen, X Li, Z Liu, P Zhang, R Krishnamoorthi, ... 2nd MMFM Workshop in CVPR2024, 2023	270	2023
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions. D Zhu, J Chen, K Haydarov, X Shen, W Zhang, M Elhoseiny. Transactions on Machine Learning Research (TMLR), 2023	73	2023
Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation. A Mohamed, D Zhu, W Vu, M Elhoseiny, C Claudel. European Conference on Computer Vision (ECCV) 2022, 2022	51	2022
Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions. J Chen, D Zhu, K Haydarov, X Li, M Elhoseiny. arXiv preprint arXiv:2304.04227, 2023	26	2023
Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only. J Chen, D Zhu, G Qian, B Ghanem, Z Yan, C Zhu, F Xiao, SC Culatana, ... Proceedings of the IEEE/CVF International Conference on Computer Vision, 699-710, 2023	23*	2023
Motion forecasting with unlikelihood training in continuous space. D Zhu, M Zahran, LE Li, M Elhoseiny. Conference on Robot Learning, 1003-1012, 2022	15	2022
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition. J Chen, A Agarwal, S Abdelkarim, D Zhu, M Elhoseiny. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	15*	2022
Minigpt4-video: Advancing multimodal llms for video understanding with interleaved visual-textual tokens. K Ataallah, X Shen, E Abdelrahman, E Sleiman, D Zhu, J Ding, ... 2nd MMFM Workshop in CVPR2024, 2024	9	2024
HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents. D Zhu, M Zahran, LE Li, M Elhoseiny. International Conference on Learning Representations, 2021, 2021	7	2021
Guiding Online Reinforcement Learning with Action-Free Offline Pretraining. D Zhu, Y Wang, J Schmidhuber, M Elhoseiny. arXiv preprint arXiv:2301.12876, 2023	6	2023
Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning. D Zhu, LE Li, M Elhoseiny. International Conference on Learning Representations 2023, 2022	5	2022
Learning to disentangle latent physical factors for video prediction. D Zhu, M Munderloh, B Rosenhahn, J Stückler. Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Dortmund …, 2019	4	2019
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos. K Ataallah, X Shen, E Abdelrahman, E Sleiman, M Zhuge, J Ding, D Zhu, ... European Conference on Computer Vision (ECCV) 2024, 2024		2024
MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis. A Alkhaldi, R Alnajim, L Alabdullatef, R Alyahya, J Chen, D Zhu, A Alsinan, ... arXiv preprint arXiv:2407.04106, 2024		2024
