Volkan Cevher, Jonathan Mark Scarlett, Ilija Bogunovic
We consider the sequential Bayesian optimization problem with bandit feedback, adopting a formulation that allows for the reward function to vary with time. We model the reward function using a Gaussian process whose evolution obeys a simple Markov model. ...
2016