Deep Dive into High Dimensional Data Science with Andrew Stuart
The times and locations for the tutorial and seminar given by Andrew Stuart (Caltech) are as follows:
Timetable
December 12th, 9:30-10:30, Bayes 5.10 | Tutorial Lecture | Gaussian Processes: Interpolation, Regression and PDEs
December 13th, 16:00, Bayes G.03 | Seminar | Data Science Algorithms In High Dimensions
Organisers: Benedict Leimkuhler and Aretha Teckentrup
The abstracts for the talks can be found below.
Abstracts
Tutorial Talk
Gaussian processes are versatile tools for function approximation. The ideas were systematically developed by the statistics community in the context of linear regression and spline models [1]. More recently they have been developed for a variety of problems in machine learning [2]; and they underpin certain collocation methods for differential equations [3]. In this tutorial I will describe the methodology from a unified perspective based on the calculus of variations, covering representer theorems, making comparisons with neural network based approximation theory, and showing how the approach may be used to solve nonlinear partial differential equations [4].
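By way of illustration (a minimal sketch, not drawn from the tutorial itself), the following numpy snippet performs Gaussian process regression on noisy 1D data with a squared-exponential kernel; the kernel choice, length-scale and noise level are illustrative assumptions. The posterior mean is a linear combination of kernel functions centred at the training points, which is the structure guaranteed by the representer theorem mentioned in the abstract.

```python
# Minimal Gaussian process regression sketch (illustrative assumptions only).
import numpy as np

def sq_exp_kernel(x, y, length_scale=0.2):
    """Squared-exponential (RBF) covariance between two sets of 1D points."""
    diff = x[:, None] - y[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

# Training data: noisy samples of sin(2*pi*x); both choices are illustrative.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + 0.05 * rng.standard_normal(10)

# GP posterior on a fine test grid, with observational noise variance.
noise_var = 0.05 ** 2
K = sq_exp_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
x_test = np.linspace(0.0, 1.0, 200)
K_star = sq_exp_kernel(x_test, x_train)

# Representer-theorem structure: the posterior mean is a linear combination
# of kernel functions centred at the training points.
alpha = np.linalg.solve(K, y_train)
posterior_mean = K_star @ alpha
posterior_cov = sq_exp_kernel(x_test, x_test) - K_star @ np.linalg.solve(K, K_star.T)
posterior_std = np.sqrt(np.clip(np.diag(posterior_cov), 0.0, None))

print(posterior_mean[::50])  # a few values of the reconstructed function
print(posterior_std[::50])   # and the pointwise posterior uncertainty
```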
Seminar
Many of the objects that are routinely processed in data science may be thought of in two distinct ways: as high dimensional vectors, or as infinite dimensional functions. A canonical example is an image: when pixellated, it is represented as a vector of RGB values; but underlying this vector is a function ascribing RGB values to every point in a subset of Euclidean space. The pixellation depends on the specific level of resolution used to capture the image, and may change; in contrast the underlying function may be thought of as existing independently of any specific pixellation used to represent it.
Many classical vector-based algorithms used in data science do not scale well to changes in the dimension of the high dimensional vector representation of the problem. However, by conceiving of the algorithms as acting on infinite dimensional functions (function-based algorithms), and only then introducing high dimensional vector representations, new algorithms can be developed; these new algorithms scale well to changes in dimension, yet can be implemented with minor modifications of existing vector-based algorithms. This philosophical approach to the development of algorithms is simple to adopt, but leads to orders of magnitude improvement in performance. The ideas will be illustrated in the context of Markov chain Monte Carlo methods for sampling probability distributions in Banach spaces, and in the context of the data-driven approximation of maps between Banach spaces, in particular supervised learning and deep neural networks.
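As an illustration of the function-based viewpoint (a sketch under stated assumptions, not material taken from the seminar), the preconditioned Crank-Nicolson (pCN) proposal is a standard example of an MCMC method formulated directly on function space: its acceptance probability involves only the likelihood, so its behaviour is robust as the grid used to represent the unknown function is refined. The prior, observation model and step size below are chosen purely for illustration.

```python
# Illustrative pCN sampler sketch (toy prior, likelihood and tuning assumed).
import numpy as np

def pcn_sample(log_likelihood, prior_sample, n_steps=2000, beta=0.2, rng=None):
    """Sample a posterior with Gaussian prior via the pCN proposal.

    log_likelihood: maps a state (vector of grid values) to its log-likelihood.
    prior_sample:   returns one draw from the Gaussian prior.
    beta:           step-size parameter in (0, 1].
    """
    rng = rng or np.random.default_rng(0)
    u = prior_sample()
    ll = log_likelihood(u)
    accepted = 0
    for _ in range(n_steps):
        # pCN proposal: a prior-preserving autoregressive move.
        v = np.sqrt(1.0 - beta ** 2) * u + beta * prior_sample()
        ll_v = log_likelihood(v)
        # The acceptance ratio involves only the likelihood; the prior
        # terms cancel, which is what makes the method dimension-robust.
        if np.log(rng.uniform()) < ll_v - ll:
            u, ll = v, ll_v
            accepted += 1
    return u, accepted / n_steps

# Function-space object: a Brownian-bridge prior, represented on a grid via a
# truncated Karhunen-Loeve expansion. Refining n_grid (the "pixellation")
# leaves the acceptance rate essentially unchanged.
n_grid, n_modes = 256, 64
x = np.linspace(0.0, 1.0, n_grid)
k = np.arange(1, n_modes + 1)
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(x, k)) / (np.pi * k)
prior_rng = np.random.default_rng(1)
prior_sample = lambda: basis @ prior_rng.standard_normal(n_modes)

# Toy data: a single noisy observation of the function's midpoint value.
obs, noise = 0.5, 0.1
log_likelihood = lambda u: -0.5 * ((u[n_grid // 2] - obs) / noise) ** 2

u_final, acc_rate = pcn_sample(log_likelihood, prior_sample)
print(f"acceptance rate ~ {acc_rate:.2f}")
```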
[1] Spline Models for Observational Data, G. Wahba, CBMS-NSF Regional Conference Series in Applied Mathematics (SIAM), https://epubs.siam.org/doi/book/10.1137/1.9781611970128
[2] Gaussian Processes for Machine Learning, C. E. Rasmussen & C. K. I. Williams, MIT Press, 2006, http://gaussianprocess.org/gpml/chapters/RW.pdf
[3] Kernel Techniques: From Machine Learning to Meshless Methods, R. Schaback and H. Wendland, Acta Numerica, 2006, https://doi.org/10.1017/S0962492906270016
[4] Solving and Learning Nonlinear PDEs with Gaussian Processes, Y. Chen, B. Hosseini, H. Owhadi and A. M. Stuart, Journal of Computational Physics, 2021, https://doi.org/10.1016/j.jcp.2021.110668