Llama: Open and efficient foundation language models H Touvron, T Lavril, G Izacard, X Martinet, MA Lachaux, T Lacroix, ... arXiv preprint arXiv:2302.13971, 2023 | 9019 | 2023 |
Llama 2: Open foundation and fine-tuned chat models H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 2023 | 8392 | 2023 |
Mixtral of experts AQ Jiang, A Sablayrolles, A Roux, A Mensch, B Savary, C Bamford, ... arXiv preprint arXiv:2401.04088, 2024 | 744 | 2024 |
Mistral 7B AQ Jiang, A Sablayrolles, A Mensch, C Bamford, DS Chaplot, D Casas, ... arXiv preprint arXiv:2310.06825, 2023 | 648 | 2023 |
CCNet: Extracting high quality monolingual datasets from web crawl data G Wenzek, MA Lachaux, A Conneau, V Chaudhary, F Guzmán, A Joulin, ... arXiv preprint arXiv:1911.00359, 2019 | 580 | 2019 |
Poly-encoders: Transformer architectures and pre-training strategies for fast and accurate multi-sentence scoring S Humeau, K Shuster, MA Lachaux, J Weston arXiv preprint arXiv:1905.01969, 2019 | 565 | 2019 |
Unsupervised translation of programming languages MA Lachaux, B Roziere, L Chanussot, G Lample arXiv preprint arXiv:2006.03511, 2020 | 377* | 2020 |
LLaMA: Open and efficient foundation language models H Touvron, T Lavril, G Izacard, X Martinet, MA Lachaux, T Lacroix, ... arXiv preprint arXiv:2302.13971, 2023 | 162 | 2023 |
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages MA Lachaux, B Roziere, M Szafraniec, G Lample Advances in Neural Information Processing Systems 34, 2021 | 140* | 2021 |
Hypertree proof search for neural theorem proving G Lample, T Lacroix, MA Lachaux, A Rodriguez, A Hayat, T Lavril, ... Advances in neural information processing systems 35, 26337-26349, 2022 | 93 | 2022 |
Llama 2: Open foundation and fine-tuned chat models H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 2023 | 86 | 2023 |
Llama 2: Open foundation and fine-tuned chat models H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 2023 | 78 | 2023 |
Target conditioning for one-to-many generation MA Lachaux, A Joulin, G Lample arXiv preprint arXiv:2009.09758, 2020 | 15 | 2020 |
Llama: Open and efficient foundation language models H Touvron, T Lavril, G Izacard, X Martinet, MA Lachaux arXiv preprint arXiv:2302.13971, 2023 | 10 | 2023 |