Silviu Pitis
University of Toronto, Vector Institute
Verified email at cs.toronto.edu
Title
Cited by
Year
Large language models are human-level prompt engineers
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
International Conference on Learning Representations (ICLR 2023), 2023
Cited by: 442 | Year: 2023
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning
S Pitis, H Chan, S Zhao, B Stadie, J Ba
International Conference on Machine Learning (ICML 2020), 2020
Cited by: 112 | Year: 2020
Counterfactual data augmentation using locally factored dynamics
S Pitis, E Creager, A Garg
Neural Information Processing Systems (NeurIPS 2020), 2020
Cited by: 70 | Year: 2020
Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach
S Pitis
The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019
Cited by: 48 | Year: 2019
Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning
K De Asis, A Chan, S Pitis, RS Sutton, D Graves
The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020
Cited by: 31 | Year: 2020
Boosted prompt ensembles for large language models
S Pitis, MR Zhang, A Wang, J Ba
arXiv preprint arXiv:2304.05970, 2023
Cited by: 25 | Year: 2023
An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
S Pitis, H Chan, K Jamali, J Ba
Eighth International Conference on Learning Representations (ICLR 2020), 2020
Cited by: 21 | Year: 2020
MoCoDA: Model-based Counterfactual Data Augmentation
S Pitis, E Creager, A Mandlekar, A Garg
Neural Information Processing Systems (NeurIPS 2022), 2022
Cited by: 20 | Year: 2022
Source Traces for Temporal Difference Learning
S Pitis
The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018
Cited by: 19 | Year: 2018
Identifying the risks of lm agents with an lm-emulated sandbox
Y Ruan, H Dong, A Wang, S Pitis, Y Zhou, J Ba, Y Dubois, CJ Maddison, ...
arXiv preprint arXiv:2309.15817, 2023
Cited by: 17 | Year: 2023
Large language models are human-level prompt engineers (2022)
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
arXiv preprint arXiv:2211.01910, 2022
Cited by: 16 | Year: 2022
Failure modes of learning reward models for llms and other sequence models
S Pitis
ICML 2023 Workshop The Many Facets of Preference-Based Learning, 2023
Cited by: 5 | Year: 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
S Pitis
Neural Information Processing Systems (NeurIPS 2023), 2023
Cited by: 5* | Year: 2023
Calibrating language models via augmented prompt ensembles
M Jiang, Y Ruan, S Huang, S Liao, S Pitis, RB Grosse, J Ba
Cited by: 4 | Year: 2023
Return augmentation gives supervised RL temporal compositionality
K Paster, S Pitis, SA McIlraith, J Ba
Deep Reinforcement Learning Workshop NeurIPS 2022, 2022
Cited by: 4 | Year: 2022
Steering large language models using APE
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
NeurIPS ML Safety Workshop, 2022
Cited by: 3 | Year: 2022
Objective Social Choice: Using Auxiliary Information to Improve Voting Outcomes
S Pitis, MR Zhang
International Conference on Autonomous Agents and Multi-Agent Systems 2020, 2020
Cited by: 3 | Year: 2020
ProtoGE: Prototype Goal Encodings for Multi-goal Reinforcement Learning
S Pitis, H Chan, J Ba
The 4th Multidisciplinary Conference on Reinforcement Learning and Decision …, 2019
Cited by: 3 | Year: 2019
Methods for retrieving alternative contract language using a prototype
S Pitis
The Sixteenth International Conference on Law and Artificial Intelligence …, 2017
Cited by: 3 | Year: 2017
CSC 311: Introduction to machine learning
R Grosse, RG Krishnan, G Zhang
University of Toronto, Fall, 2020
Cited by: 2 | Year: 2020