Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes
hal.structure.identifier | Centre d'économie de la Sorbonne [CES] | |
dc.contributor.author | Venel, Xavier (HAL ID: 8219, ORCID: 0000-0003-1150-9139) | |
hal.structure.identifier | CEntre de REcherches en MAthématiques de la DEcision [CEREMADE] | |
dc.contributor.author | Ziliotto, Bruno | |
dc.date.accessioned | 2019-11-08T10:44:08Z | |
dc.date.available | 2019-11-08T10:44:08Z | |
dc.date.issued | 2016 | |
dc.identifier.issn | 0363-0129 | |
dc.identifier.uri | https://basepub.dauphine.fr/handle/123456789/20210 | |
dc.language.iso | en | en |
dc.subject | dynamic programming | en |
dc.subject | Markov decision processes | en |
dc.subject | partial observation | en |
dc.subject | uniform value | en |
dc.subject | long-run average payoff | en |
dc.subject.ddc | 515 | en |
dc.title | Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes | en |
dc.type | Article accepted for publication or published | |
dc.description.abstracten | In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a robust notion of value for the infinitely repeated problem, namely the strong uniform value. This solves two open problems. First, this shows that for any ε > 0, the decision-maker has a pure strategy σ which is ε-optimal in any n-stage problem, provided that n is big enough (this result was only known for behavior strategies, that is, strategies which use randomization). Second, for any ε > 0, the decision-maker can guarantee the limit of the n-stage value minus ε in the infinite problem where the payoff is the expectation of the inferior limit of the time average payoff. | en |
dc.relation.isversionofjnlname | SIAM Journal on Control and Optimization | |
dc.relation.isversionofjnlvol | 54 | en |
dc.relation.isversionofjnlissue | 4 | en |
dc.relation.isversionofjnldate | 2016-08 | |
dc.relation.isversionofjnlpages | 1983-2008 | en |
dc.relation.isversionofdoi | 10.1137/15M1043340 | en |
dc.relation.isversionofjnlpublisher | SIAM - Society for Industrial and Applied Mathematics | en |
dc.subject.ddclabel | Analysis | en |
dc.relation.forthcoming | no | en |
dc.relation.forthcomingprint | no | en |
dc.description.ssrncandidate | no | en |
dc.description.halcandidate | no | en |
dc.description.readership | research | en |
dc.description.audience | International | en |
dc.relation.Isversionofjnlpeerreviewed | yes | en |
dc.date.updated | 2019-11-08T10:40:33Z | |
hal.author.function | aut | |
hal.author.function | aut | |