Minsu Kim

Cited by

	All	Since 2019
Citations	362	362
h-index	11	11
i10-index	12	12

Citations per year: 2021: 4, 2022: 47, 2023: 198, 2024: 113

Co-authors

Yong Man Ro - Professor of Electrical Engineering, KAIST - Verified email at kaist.ac.kr
Joanna Hong - Ph.D. at Korea Advanced Institute of Science and Technology - Verified email at kaist.ac.kr
Jeongsoo Choi - KAIST - Verified email at kaist.ac.kr
Se Jin Park - Korea Advanced Institute of Science and Technology (KAIST) - Verified email at kaist.ac.kr
Jeong Hun Yeo - Korea Advanced Institute of Science and Technology - Verified email at kaist.ac.kr
Shinji Watanabe - Carnegie Mellon University - Verified email at cmu.edu
Junho Kim - Korea Advanced Institute of Science and Technology (KAIST) - Verified email at kaist.ac.kr
Hong Joo, Lee - Technical University of Munich - Verified email at tum.de
Hyunjun Kim - Korea Advanced Institute of Science and Technology - Verified email at kaist.ac.kr
Hyung-Il Kim - Senior Researcher, ETRI - Verified email at etri.re.kr
Sangmin Lee - University of Illinois Urbana-Champaign - Verified email at illinois.edu
Soumi Maiti - Carnegie Mellon University - Verified email at andrew.cmu.edu
Jung Uk Kim - Assistant Professor of Computer Science, Kyung Hee University - Verified email at khu.ac.kr
Siddhant Arora - Graduate Student, Carnegie Mellon University - Verified email at andrew.cmu.edu
Xuankai Chang - Carnegie Mellon University, Student - Verified email at andrew.cmu.edu
Jee-weon Jung - Carnegie Mellon University - Verified email at ieee.org
Dahun Kim - Research Scientist, Google DeepMind - Verified email at google.com
Sungjune Park - Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST) - Verified email at kaist.ac.kr

Minsu Kim

Korea Advanced Institute of Science and Technology

Verified email at kaist.ac.kr - Homepage

Multimodal Learning, Audio-Visual Speech Processing, Multimodal Language Processing


Title	Cited by	Year
SyncTalkFace: Talking face generation with precise lip-syncing via audio-lip memory. SJ Park, M Kim, J Hong, J Choi, YM Ro. Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 2062-2070, 2022	52	2022
Distinguishing homophenes using multi-head visual-audio memory for lip reading. M Kim, JH Yeo, YM Ro. Proceedings of the AAAI Conference on Artificial Intelligence 36 (1), 1174-1182, 2022	43	2022
Multi-modality associative bridging through memory: Speech sound recollected from face video. M Kim, J Hong, SJ Park, YM Ro. Proceedings of the IEEE/CVF International Conference on Computer Vision, 296-306, 2021	38	2021
Lip to speech synthesis with visual context attentional GAN. M Kim, J Hong, YM Ro. Advances in Neural Information Processing Systems 34, 2758-2770, 2021	37	2021
CroMM-VSR: Cross-modal memory augmented visual speech recognition. M Kim, J Hong, SJ Park, YM Ro. IEEE Transactions on Multimedia 24, 4342-4355, 2021	27	2021
Speaker-adaptive lip reading with user-dependent padding. M Kim, H Kim, YM Ro. European Conference on Computer Vision, 576-593, 2022	18	2022
Speech reconstruction with reminiscent sound via visual voice memory. J Hong, M Kim, SJ Park, YM Ro. IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3654-3667, 2021	18	2021
Prompt tuning of deep neural networks for speaker-adaptive visual speech recognition. M Kim, HI Kim, YM Ro. arXiv preprint arXiv:2302.08102, 2023	17	2023
Visual context-driven audio feature enhancement for robust end-to-end audio-visual speech recognition. J Hong, M Kim, D Yoo, YM Ro. INTERSPEECH 2022, 2022	17	2022
Watch or listen: Robust audio-visual speech recognition with visual corruption modeling and reliability scoring. J Hong, M Kim, J Choi, YM Ro. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	16	2023
Lip-to-speech synthesis in the wild with multi-task learning. M Kim, J Hong, YM Ro. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	14	2023
Intelligible Lip-to-Speech Synthesis with Speech Units. J Choi, M Kim, YM Ro. INTERSPEECH 2023, 2023	11	2023
Multi-temporal lip-audio memory for visual speech recognition. JH Yeo, M Kim, YM Ro. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	8	2023
Interpretation of lesional detection via counterfactual generation. J Kim, M Kim, YM Ro. 2021 IEEE International Conference on Image Processing (ICIP), 96-100, 2021	8	2021
Many-to-many spoken language translation via unified speech and text representation learning with unit-to-unit translation. M Kim, J Choi, D Kim, YM Ro. arXiv preprint arXiv:2308.01831, 2023	7	2023
AKVSR: Audio knowledge empowered visual speech recognition by compressing audio knowledge of a pretrained model. JH Yeo, M Kim, J Choi, DH Kim, YM Ro. IEEE Transactions on Multimedia, 2024	6	2024
Lip reading for low-resource languages by learning and combining general speech knowledge and language-specific knowledge. M Kim, JH Yeo, J Choi, YM Ro. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	6	2023
VisageSynTalk: Unseen speaker video-to-speech synthesis via speech-visage feature selection. J Hong, M Kim, YM Ro. European Conference on Computer Vision, 452-468, 2022	5	2022
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper. JH Yeo, M Kim, S Watanabe, YM Ro. ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	3*	2024
Towards practical and efficient image-to-speech captioning with vision-language pre-training and multi-modal tokens. M Kim, J Choi, S Maiti, JH Yeo, S Watanabe, YM Ro. ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	3	2024

Articles 1–20
