Optimal Stopping and Dynamic Allocation


A class of optimal stopping problems for the Wiener process is studied herein, and asymptotic expansions for the optimal stopping boundaries are derived. These results lead to a simple index-type class of asymptotically optimal solutions to the classical discounted multi-armed bandit problem: given a discount factor $0 < \beta < 1$ and $k$ populations with densities from an exponential family, how should $x_1, x_2, \ldots$ be sampled sequentially from these populations to maximize the expected value of $\sum_{i=1}^{\infty} \beta^{i-1} x_i$, in ignorance of the parameters of the densities?
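
To make the objective concrete, the following is a minimal sketch of the discounted reward that a sampling rule accumulates. It assumes Bernoulli populations (one member of the exponential family), a finite horizon that truncates the infinite sum, and a simple greedy posterior-mean rule with Beta(1, 1) priors. The greedy rule is only a stand-in for illustration and is not the index-type rule derived from the stopping boundaries; the function names `discounted_reward` and `simulate_greedy` are hypothetical.

```python
import random


def discounted_reward(rewards, beta):
    """Return the sum over i >= 1 of beta**(i-1) * x_i for a finite reward list."""
    total = 0.0
    weight = 1.0  # beta**(i-1), starting at beta**0 for the first observation
    for x in rewards:
        total += weight * x
        weight *= beta
    return total


def simulate_greedy(true_means, beta, horizon, seed=0):
    """Sample Bernoulli arms sequentially and return the discounted reward.

    Assumption: a greedy posterior-mean rule with Beta(1, 1) priors is used
    as a placeholder policy; it is NOT the index rule studied in the paper.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [1.0] * k  # Beta prior pseudo-counts of successes
    failures = [1.0] * k   # Beta prior pseudo-counts of failures
    rewards = []
    for _ in range(horizon):
        # The true means are hidden from the rule; it sees only the counts.
        post_means = [successes[j] / (successes[j] + failures[j]) for j in range(k)]
        arm = max(range(k), key=lambda j: post_means[j])
        x = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += x
        failures[arm] += 1 - x
        rewards.append(x)
    return discounted_reward(rewards, beta)


if __name__ == "__main__":
    value = simulate_greedy([0.3, 0.5, 0.7], beta=0.9, horizon=200, seed=1)
    print(f"discounted reward of one simulated run: {value:.4f}")
```

Because the rewards here lie in [0, 1], the discounted sum is bounded by 1 / (1 - beta), so truncating at a horizon of a few hundred steps changes the value only negligibly when beta = 0.9.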