Variance Reduction in Actor Critic Methods (ACM)

Benhamou, Éric (2019), Variance Reduction in Actor Critic Methods (ACM). https://basepub.dauphine.fr/handle/123456789/21200

Type
Working paper
External document link
https://hal.archives-ouvertes.fr/hal-02886487
Date
2019
Series title
Preprint Lamsade
Published in
Paris
Author(s)
Benhamou, Éric
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN)
After presenting Actor Critic Methods (ACM), we show that ACM are control variate estimators. Using the projection theorem, we prove that the Q Actor Critic (QAC) and Advantage Actor Critic (AAC) methods are optimal in the sense of the L² norm among the control variate estimators spanned by functions conditioned on the current state and action. This straightforward application of Pythagoras' theorem provides a theoretical justification for the strong performance of QAC and AAC (the latter most often referred to as A2C) in deep policy gradient methods. It also enables us to derive a new formulation of Advantage Actor Critic methods that has lower variance and improves on the traditional A2C method.
Subjects / Keywords
Actor critic method; Variance reduction; Projection; Deep RL
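
The control variate idea named in the abstract can be illustrated with a generic toy example. The sketch below is not the paper's algorithm and involves no reinforcement learning: it estimates E[exp(U)] for U uniform on [0, 1], uses U - 0.5 as a control variate with known mean zero, and picks the coefficient by least squares, which is the L² projection of the target onto the control variate. The function name `control_variate_demo`, the toy target, and the sample size are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def control_variate_demo(n_samples=20000, seed=0):
    """Compare plain Monte Carlo with a control variate estimator of E[exp(U)]."""
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n_samples)]
    f = [math.exp(x) for x in u]  # target samples; the true mean is e - 1
    g = [x - 0.5 for x in u]      # control variate; its true mean is 0

    mean_f = statistics.fmean(f)
    mean_g = statistics.fmean(g)

    # Least-squares (L2 projection) coefficient: Cov(f, g) / Var(g).
    cov_fg = sum((a - mean_f) * (b - mean_g) for a, b in zip(f, g)) / (n_samples - 1)
    coefficient = cov_fg / statistics.variance(g)

    # Subtracting coefficient * g leaves the mean unchanged (g has mean 0)
    # but removes the part of f's fluctuation that g explains.
    adjusted = [a - coefficient * b for a, b in zip(f, g)]

    return {
        "plain_mean": mean_f,
        "cv_mean": statistics.fmean(adjusted),
        "plain_var": statistics.variance(f),
        "cv_var": statistics.variance(adjusted),
        "coefficient": coefficient,
    }


result = control_variate_demo()
print(result)
```

Because the coefficient is the least-squares fit on the same samples, the adjusted sample variance cannot exceed the plain one; in this toy case the reduction is large because exp(U) is almost linear in U. In the actor critic setting described in the abstract, the role of the control variate is played by functions of the current state and action.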

Related items

Showing items related by title and author.

  • Similarities between policy gradient methods (PGM) in reinforcement learning (RL) and supervised learning (SL)
    Benhamou, Éric (2019) Working paper
  • Gram Charlier and Edgeworth expansion for sample variance
    Benhamou, Eric (2018) Article accepted for publication or published
  • Three remarkable properties of the Normal distribution for simple variance
    Benhamou, Eric; Guez, Beatrice; Paris, Nicolas (2018) Article accepted for publication or published
  • A few properties of sample variance
    Benhamou, Eric (2018) Working paper
  • Kalman filter demystified: from intuition to probabilistic graphical model to real case in financial markets
    Benhamou, Eric (2018) Working paper