Minsu Kim
Title
Cited by
Year
SyncTalkFace: Talking face generation with precise lip-syncing via audio-lip memory
SJ Park, M Kim, J Hong, J Choi, YM Ro
Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 2062-2070, 2022
Cited by 52, 2022
Distinguishing homophenes using multi-head visual-audio memory for lip reading
M Kim, JH Yeo, YM Ro
Proceedings of the AAAI Conference on Artificial Intelligence 36 (1), 1174-1182, 2022
Cited by 41, 2022
Multi-modality associative bridging through memory: Speech sound recollected from face video
M Kim, J Hong, SJ Park, YM Ro
Proceedings of the IEEE/CVF International Conference on Computer Vision, 296-306, 2021
Cited by 37, 2021
Lip to speech synthesis with visual context attentional GAN
M Kim, J Hong, YM Ro
Advances in Neural Information Processing Systems 34, 2758-2770, 2021
Cited by 36, 2021
CroMM-VSR: Cross-modal memory augmented visual speech recognition
M Kim, J Hong, SJ Park, YM Ro
IEEE Transactions on Multimedia 24, 4342-4355, 2021
Cited by 26, 2021
Speaker-adaptive lip reading with user-dependent padding
M Kim, H Kim, YM Ro
European Conference on Computer Vision, 576-593, 2022
Cited by 18, 2022
Speech reconstruction with reminiscent sound via visual voice memory
J Hong, M Kim, SJ Park, YM Ro
IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3654-3667, 2021
Cited by 18, 2021
Prompt tuning of deep neural networks for speaker-adaptive visual speech recognition
M Kim, HI Kim, YM Ro
arXiv preprint arXiv:2302.08102, 2023
Cited by 17, 2023
Visual context-driven audio feature enhancement for robust end-to-end audio-visual speech recognition
J Hong, M Kim, D Yoo, YM Ro
INTERSPEECH 2022, 2022
Cited by 17, 2022
Watch or listen: Robust audio-visual speech recognition with visual corruption modeling and reliability scoring
J Hong, M Kim, J Choi, YM Ro
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 15, 2023
Lip-to-speech synthesis in the wild with multi-task learning
M Kim, J Hong, YM Ro
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 13, 2023
Intelligible Lip-to-Speech Synthesis with Speech Units
J Choi, M Kim, YM Ro
INTERSPEECH 2023, 2023
Cited by 10, 2023
Multi-temporal lip-audio memory for visual speech recognition
JH Yeo, M Kim, YM Ro
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 8, 2023
Interpretation of lesional detection via counterfactual generation
J Kim, M Kim, YM Ro
2021 IEEE International Conference on Image Processing (ICIP), 96-100, 2021
Cited by 8, 2021
Many-to-many spoken language translation via unified speech and text representation learning with unit-to-unit translation
M Kim, J Choi, D Kim, YM Ro
arXiv preprint arXiv:2308.01831, 2023
Cited by 7, 2023
AKVSR: Audio knowledge empowered visual speech recognition by compressing audio knowledge of a pretrained model
JH Yeo, M Kim, J Choi, DH Kim, YM Ro
IEEE Transactions on Multimedia, 2024
Cited by 6, 2024
Lip reading for low-resource languages by learning and combining general speech knowledge and language-specific knowledge
M Kim, JH Yeo, J Choi, YM Ro
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by 6, 2023
VisageSynTalk: Unseen speaker video-to-speech synthesis via speech-visage feature selection
J Hong, M Kim, YM Ro
European Conference on Computer Vision, 452-468, 2022
Cited by 5, 2022
Towards practical and efficient image-to-speech captioning with vision-language pre-training and multi-modal tokens
M Kim, J Choi, S Maiti, JH Yeo, S Watanabe, YM Ro
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 3, 2024
Robust video facial authentication with unsupervised mode disentanglement
M Kim, HJ Lee, S Lee, YM Ro
2020 IEEE International Conference on Image Processing (ICIP), 1321-1325, 2020
Cited by 3, 2020