Bayesian inference provides an attractive framework for learning from data and for sequentially updating knowledge as data stream in. Unfortunately, exact Bayesian inference is rarely feasible in practice, and approximation methods are usually employed instead; but do such methods preserve the generalization properties of Bayesian inference? In this talk, I will show that this is indeed the case for some variational inference (VI) algorithms. First, I will present generalization bounds for estimation in the batch setting. These results can be seen as extensions of the "concentration of the posterior" theorems to variational approximations of the posterior. I will then focus on the sequential case (streaming data). I will propose various online VI algorithms and derive generalization bounds for them. In this case, our theoretical result relies on the convexity of the variational objective, but we argue that the result should hold more generally and present empirical evidence in support of this.
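To fix ideas on the streaming setting, here is a minimal sketch of online variational updates in a toy conjugate Gaussian model; the function name, the model choice, and all parameters are illustrative assumptions, not the algorithms analyzed in the talk. The current variational posterior serves as the prior for the next observation, and in this conjugate case the update is available in closed form:

```python
import numpy as np

def online_vi_gaussian_mean(stream, prior_mean=0.0, prior_var=10.0, noise_var=1.0):
    """Streaming updates for the mean of a Gaussian with known noise variance,
    using a Gaussian variational family q(theta) = N(m, v).

    Hypothetical toy example: in this conjugate model the variational update
    has a closed form and coincides with exact streaming Bayes, which is not
    the case for the more general online VI algorithms discussed in the talk.
    """
    m, v = prior_mean, prior_var
    for x in stream:
        # Precision-weighted update: combine the current q (acting as the
        # prior for this step) with the likelihood of the new observation.
        prec = 1.0 / v + 1.0 / noise_var
        m = (m / v + x / noise_var) / prec
        v = 1.0 / prec
    return m, v

# Example: data drawn from N(2, 1); the variational mean concentrates near 2
# and the variational variance shrinks as observations accumulate.
rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=500)
m, v = online_vi_gaussian_mean(data)
```

Because each observation is processed once and then discarded, memory stays constant in the stream length, which is the appeal of the online formulation.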
Joint work with James Ridgway (https://arxiv.org/abs/1706.09293), Badr-Eddine Chérief-Abdellatif (https://projecteuclid.org/euclid.ejs/1537344604) and Emti Khan (to appear on arXiv in the next few days).