Shivam Mehta
Title
Cited by
Year
Neural HMMs are all you need (for high-quality attention-free TTS)
S Mehta, É Székely, J Beskow, GE Henter
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by 20 · 2022
Prosody-controllable spontaneous TTS with neural HMMs
H Lameris, S Mehta, GE Henter, J Gustafson, É Székely
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 12 · 2023
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
S Mehta, S Wang, S Alexanderson, J Beskow, É Székely, GE Henter
Proc. 12th ISCA Speech Synthesis Workshop (SSW2023), 150--156, 2023
Cited by 11 · 2023
OverFlow: Putting flows on top of neural transducers for better TTS
S Mehta, A Kirkland, H Lameris, J Beskow, É Székely, GE Henter
Proceedings of INTERSPEECH 2023, 4279--4283, 2023
Cited by 9 · 2023
Diffusion-based co-speech gesture generation using joint text and audio representation
A Deichler, S Mehta, S Alexanderson, J Beskow
Proceedings of the 25th International Conference on Multimodal Interaction …, 2023
Cited by 9 · 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching
S Mehta, R Tu, J Beskow, É Székely, GE Henter
arXiv preprint arXiv:2309.03199, 2023
Cited by 7 · 2023
Penetration testing as a test phase in web service testing a black box pen testing approach
S Mehta, G Raj, D Singh
Smart Computing and Informatics: Proceedings of the First International …, 2018
Cited by 7 · 2018
Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation
A Kirkland, S Mehta, H Lameris, GE Henter, É Székely, J Gustafson
12th Speech Synthesis Workshop (SSW) 2023, 2023
Cited by 4 · 2023
Speech data augmentation for improving phoneme transcriptions of aphasic speech using wav2vec 2.0 for the psst challenge
B Moëll, J O'Regan, S Mehta, A Kirkland, H Lameris, J Gustafsson, ...
13th Language Resources and Evaluation Conference (LREC), 62-70, 2022
Cited by 3 · 2022
Unified speech and gesture synthesis using flow matching
S Mehta, R Tu, S Alexanderson, J Beskow, É Székely, GE Henter
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 2 · 2024
Finding the Blank with Sequence Labeling for English Learning
S Mehta, I Smetannikov
Proceedings of the 2020 1st International Conference on Control, Robotics …, 2020
Cited by 1 · 2020
Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech
S Mehta, H Lameris, R Punmiya, J Beskow, É Székely, GE Henter
arXiv preprint arXiv:2406.05401, 2024
2024
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis
S Mehta, A Deichler, J O'Regan, B Moëll, J Beskow, GE Henter, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
2024
Stereotypical nationality representations in HRI: perspectives from international young adults
R Cumbal, A Axelsson, S Mehta, O Engwall
Frontiers in Robotics and AI 10, 2023
2023
Learning fast with fewer data samples using Neural HMMs
S Mehta, H Lameris, É Székely, J Beskow, GE Henter
2022
Spontaneous Neural HMM TTS with Prosodic Feature Modification
H Lameris, S Mehta, GE Henter, A Kirkland, B Moëll, J O’Regan, ...
Fonetik 2022, Stockholm 13-15 May, 202, 2022
2022