Hugo Laurençon
Hugging Face
Verified email at huggingface.co
Title
Cited by
Year
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
arXiv preprint arXiv:2211.05100, 2022
Cited by: 1202; Year: 2022
The bigscience roots corpus: A 1.6 tb composite multilingual dataset
H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ...
Advances in Neural Information Processing Systems 35, 31809-31826, 2022
Cited by: 116; Year: 2022
Obelics: An open web-scale filtered dataset of interleaved image-text documents
H Laurençon, L Saulnier, L Tronchon, S Bekman, A Singh, A Lozhkov, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 71; Year: 2024
The ROOTS search tool: Data transparency for LLMs
A Piktus, C Akiki, P Villegas, H Laurençon, G Dupont, AS Luccioni, ...
arXiv preprint arXiv:2302.14035, 2023
Cited by: 21; Year: 2023
DP-Parse: Finding word boundaries from raw speech with an instance lexicon
R Algayres, T Ricoul, J Karadayi, H Laurençon, S Zaiem, A Mohamed, ...
Transactions of the Association for Computational Linguistics 10, 1051-1065, 2022
Cited by: 11; Year: 2022
Continuous homeostatic reinforcement learning for self-regulated autonomous agents
H Laurençon, CR Ségerie, J Lussange, BS Gutkin
arXiv preprint arXiv:2109.06580, 2021
Cited by: 6; Year: 2021
Calm: A multi-task benchmark for comprehensive assessment of language model bias
V Gupta, PN Venkit, H Laurençon, S Wilson, RJ Passonneau
arXiv preprint arXiv:2308.12539, 2023
Cited by: 1; Year: 2023
What matters when building vision-language models?
H Laurençon, L Tronchon, M Cord, V Sanh
arXiv preprint arXiv:2405.02246, 2024
Year: 2024
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
H Laurençon, L Tronchon, V Sanh
arXiv preprint arXiv:2403.09029, 2024
Year: 2024