| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Scalable agent alignment via reward modeling: a research direction | J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg | arXiv preprint arXiv:1811.07871 | 244 | 2018 |
| Reducing sentiment bias in language models via counterfactual evaluation | PS Huang, H Zhang, R Jiang, R Stanforth, J Welbl, J Rae, V Maini, ... | arXiv preprint arXiv:1911.03064 | 159 | 2019 |
| Machine learning for humans | V Maini, S Sabri | Retrieved May 1, 2022 | 93 | 2017 |
| Building safe artificial intelligence: specification, robustness, and assurance | PA Ortega, V Maini, DeepMind Safety Team | DeepMind Safety Research Blog | 36 | 2018 |