Playout Policy Adaptation with Move Features
Cazenave, Tristan (2016), Playout Policy Adaptation with Move Features, Theoretical Computer Science, 644, pp. 43-52. doi: 10.1016/j.tcs.2016.06.024
Type
Article accepted for publication or published
Date
2016
Journal name
Theoretical Computer Science
Volume
644
Publisher
Elsevier
Pages
43-52
Publication identifier
10.1016/j.tcs.2016.06.024
Author(s)
Cazenave, Tristan (Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE])
Abstract (EN)
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We also propose to learn a policy not only using the moves but also according to the features of the moves. We test the resulting algorithms, named Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Features (PPAF), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Knightthrough, Misere Knightthrough and Nogo. The experiments compare PPA and PPAF to Upper Confidence for Trees (UCT) and to the closely related Move-Average Sampling Technique (MAST) algorithm.
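As an illustration of the idea summarized in the abstract, the following is a minimal, hypothetical Python sketch of an adaptive playout policy: playout moves are drawn from a Gibbs (softmax) distribution over learned weights, and after each playout the weights of the winner's moves are reinforced. The game interface (to_play, legal_moves, play, is_terminal, winner) and the exact update rule are assumptions made for this sketch; the paper's PPA and PPAF algorithms may differ, with PPAF indexing weights by move features rather than by the moves alone.

```python
# Illustrative sketch of online playout policy adaptation for MCTS playouts.
# The game interface used here is hypothetical; the paper's update rule and
# feature encoding (PPAF) may differ from this simplified version.
import math
import random
from collections import defaultdict


class AdaptivePlayoutPolicy:
    def __init__(self, temperature=1.0, learning_rate=1.0):
        self.weights = defaultdict(float)  # one weight per (player, move) key
        self.tau = temperature
        self.alpha = learning_rate

    def _gibbs(self, player, moves):
        # Softmax (Gibbs) distribution over the learned move weights.
        exps = [math.exp(self.weights[(player, m)] / self.tau) for m in moves]
        total = sum(exps)
        return [e / total for e in exps]

    def choose(self, player, moves):
        # Sample a playout move according to the current policy.
        probs = self._gibbs(player, moves)
        return random.choices(moves, weights=probs, k=1)[0]

    def adapt(self, winner, history):
        # After a playout, reinforce the winner's played moves and lower the
        # probability of the alternatives that were legal in the same states.
        for player, played, legal in history:
            if player != winner:
                continue
            probs = self._gibbs(player, legal)
            for m, p in zip(legal, probs):
                grad = (1.0 if m == played else 0.0) - p
                self.weights[(player, m)] += self.alpha * grad


def playout(state, policy):
    # Run one playout guided by the adaptive policy, then update the policy.
    history = []
    while not state.is_terminal():
        player = state.to_play()
        legal = state.legal_moves()
        move = policy.choose(player, legal)
        history.append((player, move, legal))
        state = state.play(move)
    policy.adapt(state.winner(), history)
    return state.winner()
```

A PPAF-style variant would replace the (player, move) weight key with keys derived from features of the move in its state, so that statistics generalize across moves sharing the same features.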
Subjects / Keywords
Computer Games; Monte Carlo Tree Search; Reinforcement Learning; Playout policy; Machine learning
Related items
Showing items related by title and author.
- Sironi, Chiara; Cazenave, Tristan; Winands, Mark (2021) Conference paper
- Cazenave, Tristan; Teytaud, Fabien (2012) Conference paper
- Cazenave, Tristan; Diemert, Eustache (2018) Conference paper
- Cazenave, Tristan; Teytaud, Fabien (2012) Conference paper
- Cazenave, Tristan; Sevestre, Jean-Baptiste; Toulemont, Matthieu (2020) Conference paper