
How to Combine Tree-Search Methods in Reinforcement Learning

Author: Internet

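Disclaimer: the original paper is the one named in the title. In case of infringement, please contact the author and this post will be withdrawn.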

 

AAAI 2019 Best Paper

 

Abstract

 

1 Introduction

 

2 Preliminaries

 

3 The h-Greedy Policy and h-PI

 

4 h-Greedy Consistency

 

5 The h-Greedy Policy Alone is Not Sufficient For Partial Evaluation

 

6 Backup the Tree-Search Byproducts

 

7 Relation to Existing Work

 

8 Experiments

 

9 Summary and Future Work

 

A Proof of Lemma 1

 

B Affinity of T^π and Consequences

 

C Proof of Proposition 2

 

D Proof of Theorem 3

 

E h-Greedy Consistency in Each Iteration

 

F A Note on the Alternative λ-Return Operator

 

G More Experimental Results

 

Source: https://www.cnblogs.com/lucifer1997/p/14017015.html