Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Randomized trace estimation is a popular and well-studied technique that approximates the trace of a large-scale matrix B by computing the average of x(T) Bx for many samples of a random vector X. Often, B is symmetric positive definite (SPD) but a number of applications give rise to indefinite B. Most notably, this is the case for log-determinant estimation, a task that features prominently in statistical learning, for instance in maximum likelihood estimation for Gaussian process regression. The analysis of randomized trace estimates, including tail bounds, has mostly focused on the SPD case. In this work, we derive new tail bounds for randomized trace estimates applied to indefinite B with Rademacher or Gaussian random vectors. These bounds significantly improve existing results for indefinite B, reducing the number of required samples by a factor n or even more, where n is the size of B. Even for an SPD matrix, ourwork improves an existing result by Roosta-Khorasani and Ascher (Found Comput Math, 15(5):1187-1212, 2015) for Rademacher vectors. This work also analyzes the combination of randomized trace estimates with the Lanczos method for approximating the trace of f(B). Particular attention is paid to the matrix logarithm, which is needed for log-determinant estimation. We improve and extend an existing result, to not only cover Rademacher but also Gaussian random vectors.