Kenny Young
Title
Cited by
Year
Neurohex: A deep q-learning hex agent
K Young, G Vasan, R Hayward
Computer Games, 3-18, 2016
Cited by: 19, Year: 2016
Directly estimating the variance of the λ-return using temporal-difference methods
C Sherstan, B Bennett, K Young, DR Ashley, A White, M White, RS Sutton
arXiv preprint arXiv:1801.08287, 2018
Cited by: 11, Year: 2018
Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return.
C Sherstan, DR Ashley, B Bennett, K Young, A White, M White, RS Sutton
UAI, 63-72, 2018
Cited by: 6, Year: 2018
Minatar: An atari-inspired testbed for thorough and reproducible reinforcement learning experiments
K Young, T Tian
arXiv preprint arXiv:1903.03176, 2019
Cited by: 5, Year: 2019
Metatrace: Online step-size tuning by meta-gradient descent for reinforcement learning control
K Young, B Wang, ME Taylor
arXiv preprint arXiv:1805.04514, 2018
Cited by: 5, Year: 2018
Integrating episodic memory into a reinforcement learning agent using reservoir sampling
KJ Young, RS Sutton, S Yang
arXiv preprint arXiv:1806.00540, 2018
Cited by: 3, Year: 2018
MinAtar: An Atari-inspired Testbed for More Efficient Reinforcement Learning Experiments.
K Young, T Tian
arXiv preprint arXiv:1903.03176, 2019
Cited by: 2, Year: 2019
Metatrace actor-critic: Online step-size tuning by meta-gradient descent for reinforcement learning control
K Young, B Wang, ME Taylor
arXiv preprint arXiv:1805.04514, 2018
Cited by: 2, Year: 2018
MOHEX WINS HEX 11X11 AND 13X13 TOURNAMENTS
RB Hayward, N Weninger
unpublished
Cited by: 2
A Reverse Hex Solver
K Young, RB Hayward
International Conference on Computers and Games, 137-148, 2016
Cited by: 1, Year: 2016
Variance Reduced Advantage Estimation with Hindsight Credit Assignment
K Young
arXiv preprint arXiv:1911.08362, 2019
Year: 2019
A Reverse Hex Solver
RB Hayward, B Toft, ...
Hex, Inside and Out: The Full Story 98 (9), xi-xiii, 2019
Year: 2019
Learning What to Remember with Online Policy Gradient Over a Reservoir
K Young, RS Sutton
MOHEX WINS 2016 HEX 11X11 AND 13X13 TOURNAMENTS
R Hayward, N Weninger, K Young, K Takada, T Zhang