Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration
Zhao, Tingting; Hachiya, Hirotaka; Tangkaratt, Voot; Morimoto, Jun; Sugiyama, Masashi
Volume:
25
Language:
English
Journal:
Neural Computation
DOI:
10.1162/NECO_a_00452
Date:
June 2013