How to Combine Tree-Search Methods in Reinforcement Learning

2020-11-21 20:00:30 Author: Internet

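Disclaimer: see the title for the original paper. In case of any infringement, please contact the author and this post will be taken down.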

AAAI 2019 Best Paper

Abstract

1 Introduction

2 Preliminaries

3 The h-Greedy Policy and h-PI

4 h-Greedy Consistency

5 The h-Greedy Policy Alone is Not Sufficient For Partial Evaluation

6 Backup the Tree-Search Byproducts

7 Relation to Existing Work

8 Experiments

9 Summary and Future Work

A Proof of Lemma 1

B Affinity of T^π and Consequences

C Proof of Proposition 2

D Proof of Theorem 3

E h-Greedy Consistency in Each Iteration

F A Note on the Alternative λ-Return Operator

G More Experimental Results

Tags: Search, Methods, Tree, Greedy, Consistency, Policy, Proof
Source: https://www.cnblogs.com/lucifer1997/p/14017015.html