Upcoming Seminars

Co-sponsored by the Department of Computer Science and the Department of Electrical and Computer Engineering
Users who wish to exercise privacy rights or make privacy choices must often rely on website or app user interfaces. However, too often, these user interfaces suffer from usability deficiencies ranging from being…
Speaker
Previous Seminars

Many data-driven problems in the modern world involve solving nonconvex optimization problems. The large-scale nature of many of these problems also necessitates the use of first-order optimization methods, i.e., methods that rely only on gradient information, for computational purposes. However, first-order methods, which include the prototypical gradient descent algorithm, face a major hurdle in nonconvex optimization: since the gradient of a function vanishes at a saddle point, first-order methods can get stuck at the saddle points of the objective function. And while recent works have established that the gradient descent algorithm almost surely escapes saddle points under mild conditions, there remains a concern that first-order methods can spend an inordinate amount of time in saddle neighborhoods. In this regard, we revisit the behavior of gradient descent trajectories within saddle neighborhoods and ask whether it is possible to provide necessary and sufficient conditions under which these trajectories escape a saddle neighborhood in linear time. The ensuing analysis relies on precise approximations of the discrete gradient descent trajectories in terms of the spectrum of the Hessian at the saddle point, and it leads to a simple boundary condition that can be readily checked to ensure that gradient descent escapes the saddle neighborhoods of a class of nonconvex functions in linear time. The boundary condition check also leads to a simple variant of vanilla gradient descent, termed Curvature Conditioned Regularized Gradient Descent (CCRGD). The talk concludes with a convergence analysis of the CCRGD algorithm, including its rate of convergence to a local minimum for a class of nonconvex optimization problems.
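For intuition, the following is a toy sketch (not the speaker's CCRGD algorithm) of gradient descent on a simple nonconvex function with a saddle at the origin, augmented with an illustrative curvature check: when the iterate sits in a saddle neighborhood with too little component along the Hessian's negative-curvature direction, it is nudged along that direction. The function, the threshold tau, and the nudge rule are hypothetical stand-ins for the boundary condition and regularization described above.

```python
import numpy as np

# Toy illustration only: gradient descent near a saddle point, with a check of
# the iterate's component along the Hessian's negative-curvature direction.
# The function f, the threshold tau, and the "nudge" are hypothetical stand-ins
# for the boundary condition / curvature-conditioned regularization in the talk.

def f(z):                      # nonconvex function with a saddle at the origin
    x, y = z
    return 0.5 * x**2 - 0.5 * y**2 + 0.25 * y**4

def grad_f(z):
    x, y = z
    return np.array([x, -y + y**3])

def hess_f(z):
    x, y = z
    return np.array([[1.0, 0.0], [0.0, -1.0 + 3.0 * y**2]])

def gd_with_curvature_check(z0, eta=0.1, tau=1e-3, n_iters=200):
    z = np.array(z0, dtype=float)
    for _ in range(n_iters):
        eigvals, eigvecs = np.linalg.eigh(hess_f(z))
        u_min = eigvecs[:, 0]                  # direction of most negative curvature
        if eigvals[0] < 0 and abs(u_min @ z) < tau:
            z = z + tau * u_min                # too little mass along the escape direction: nudge
        z = z - eta * grad_f(z)                # plain gradient step
    return z

print(gd_with_curvature_check([1e-6, 1e-6]))   # escapes the saddle, ends near (0, ±1)
```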
Speaker

Computer vision models trained on unparalleled amounts of data have revolutionized many applications. However, more and more historical societal biases are making their way into these seemingly innocuous systems.
Speaker

Nuclear fusion power is a potential source of safe, non-carbon-emitting and virtually limitless energy. The tokamak is a promising approach to fusion based on magnetic plasma confinement, constituting a complex physical system with many control challenges. However, plasma instabilities pose an existential threat to a reactor, a problem that has not yet been solved. Since current physical understanding is not sufficiently advanced to reliably predict instabilities, a way forward is artificial intelligence and data-driven models.
Speaker

Nonsmooth regularisers are widely used in machine learning for enforcing solution structures (such as the l1 norm for sparsity or the nuclear norm for low rank). State-of-the-art solvers are typically first-order methods or coordinate descent methods, which handle nonsmoothness by careful smooth approximations and support pruning. In this work, we revisit the approach of iteratively reweighted least squares (IRLS) and show how a simple reparameterisation coupled with a bilevel resolution leads to a smooth unconstrained problem. We are therefore able to exploit the machinery of smooth optimisation, such as BFGS, to obtain local superlinear convergence. The result is a highly versatile approach which significantly outperforms state-of-the-art methods for a wide range of problems.
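As a point of reference for the IRLS idea, and not the reparameterised bilevel formulation or BFGS-based solver described above, here is a textbook reweighting sketch for l1-regularised least squares; the smoothing parameter eps and all other settings are illustrative.

```python
import numpy as np

# Textbook IRLS sketch for l1-regularised least squares,
#   min_x 0.5*||A x - b||^2 + lam*||x||_1,
# where |x_i| is handled via the smoothed weight w_i = sqrt(x_i^2 + eps).
# This is the classical reweighting scheme, not the reparameterised bilevel
# formulation or the BFGS-based solver from the talk.

def irls_lasso(A, b, lam=0.1, eps=1e-6, n_iters=100):
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        w = np.sqrt(x**2 + eps)                     # current (smoothed) weights
        # weighted ridge step: (A^T A + lam * diag(1/w)) x = A^T b
        x = np.linalg.solve(A.T @ A + lam * np.diag(1.0 / w), A.T @ b)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
x_true = np.zeros(100); x_true[:5] = 1.0            # 5-sparse ground truth
b = A @ x_true
print(np.round(irls_lasso(A, b, lam=0.01)[:8], 3))  # first entries ≈ 1, rest ≈ 0
```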
Speaker

Inspired by the proposal of tangent kernels of neural networks (NNs), a recent line of research aims to design kernels with better generalization performance on standard datasets. Indeed, a few recent works have shown that certain kernel machines perform as well as NNs on certain datasets, despite the separations between the two implied by theoretical results in specific cases. Furthermore, it has been shown that the kernels induced by convolutional neural networks perform much better than any previous handcrafted kernels. These empirical results pose a theoretical challenge to understanding the performance gaps between kernel machines and NNs in different scenarios.
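As background, the (empirical) tangent kernel of a finite-width network is the Gram matrix of parameter gradients, K(x, x') = <grad_theta f(x), grad_theta f(x')>. The sketch below computes it for a two-layer tanh network; the architecture and scalings are illustrative and not taken from the talk.

```python
import numpy as np

# Empirical tangent kernel of a two-layer tanh network f(x) = a^T tanh(W x):
# K(x, x') = <grad_theta f(x), grad_theta f(x')>, computed by stacking the
# per-sample parameter gradients.  Width and scalings are illustrative.

def tangent_kernel(X, W, a):
    Z = X @ W.T                                        # (n, width) pre-activations
    phi, dphi = np.tanh(Z), 1.0 - np.tanh(Z) ** 2
    grad_a = phi                                       # d f / d a,  shape (n, width)
    grad_W = (dphi * a)[:, :, None] * X[:, None, :]    # d f / d W,  shape (n, width, d)
    G = np.concatenate([grad_a, grad_W.reshape(len(X), -1)], axis=1)
    return G @ G.T                                     # (n, n) empirical tangent kernel

rng = np.random.default_rng(0)
n, d, width = 5, 3, 1000
X = rng.standard_normal((n, d))
W = rng.standard_normal((width, d)) / np.sqrt(d)       # NTK-style scaling
a = rng.standard_normal(width) / np.sqrt(width)
print(np.round(tangent_kernel(X, W, a), 3))
```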
Speaker

The VisualAI lab focuses on bringing together the fields of computer vision, machine learning, and human-machine interaction, as well as fairness, accountability, and transparency. In this talk, we will introduce the general goals of the lab and how to build an agent that can understand and follow human language instructions to perform tasks.
Speaker

While exciting progress has been made in understanding the global convergence of vanilla gradient methods for solving challenging nonconvex problems in statistical estimation and machine learning, their computational efficacy is still far from satisfactory for ill-posed or ill-conditioned problems. In this talk, we discuss how the trick of preconditioning further boosts the convergence speed with minimal computational overhead through two examples: low-rank matrix estimation in statistical learning and policy optimization in entropy-regularized reinforcement learning.
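To make the preconditioning idea concrete, below is a minimal sketch of scaled gradient descent for the simplest setting, symmetric low-rank factorization of a fully observed matrix, where the right-preconditioner (X^T X)^{-1} makes the convergence rate insensitive to the condition number of the target. The objective, step size, and initialization are illustrative, not necessarily the exact schemes from the talk.

```python
import numpy as np

# Minimal sketch of preconditioned ("scaled") gradient descent for symmetric
# low-rank matrix estimation, minimizing f(X) = 0.25 * ||X X^T - M||_F^2.
# The right-preconditioner (X^T X)^{-1} rescales the gradient so the step size
# and rate become insensitive to the condition number of M.  Objective, step
# size, and initialization are illustrative, not necessarily those of the talk.

def scaled_gd(M, X0, eta=0.5, n_iters=100):
    X = X0.copy()
    for _ in range(n_iters):
        grad = (X @ X.T - M) @ X                        # gradient of f at X
        X = X - eta * grad @ np.linalg.inv(X.T @ X)     # preconditioned step
    return X

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((50, 2)))
M = U @ np.diag([100.0, 1.0]) @ U.T                     # ill-conditioned rank-2 target
X0 = U * np.sqrt([100.0, 1.0]) + 0.01 * rng.standard_normal((50, 2))  # rough initialization
X = scaled_gd(M, X0)
print(np.linalg.norm(X @ X.T - M) / np.linalg.norm(M))  # relative recovery error
```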
Speaker

In this talk, we study the problem of signal recovery for group models. More precisely, given a set of groups, each containing a small subset of indices, and given linear sketches of a true signal vector whose support is contained in the union of a small number of these groups (i.e., the signal is group-sparse), we study algorithms that successfully recover the true signal from knowledge of its linear sketches alone. We derive model projection complexity results and algorithms for more general group models than the state of the art. We consider two versions of the classical Iterative Hard Thresholding (IHT) algorithm. The classical version iteratively computes the exact projection of a vector onto the group model, while the approximate version (AM-IHT) iteratively uses a head- and a tail-approximation. We apply both variants to group models and analyze the two cases where the sensing matrix is a Gaussian matrix or a model-expander matrix.
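To fix ideas, here is a minimal group-IHT sketch for the simplest case of non-overlapping groups, where the exact model projection reduces to keeping the g groups with the largest l2 energy; for general overlapping group models the projection is harder, which is where the head- and tail-approximations of AM-IHT come in. The step size and problem sizes below are illustrative.

```python
import numpy as np

# Minimal group-sparse IHT sketch, assuming *non-overlapping* groups so that
# the exact model projection reduces to keeping the g most energetic groups.
# The conservative step size 1/||A||^2 and the problem sizes are illustrative,
# not the choices analyzed in the talk.

def project_groups(x, groups, g):
    energy = np.array([np.linalg.norm(x[idx]) for idx in groups])
    keep = np.argsort(energy)[-g:]                  # g most energetic groups
    z = np.zeros_like(x)
    for k in keep:
        z[groups[k]] = x[groups[k]]
    return z

def group_iht(A, b, groups, g, n_iters=300):
    mu = 1.0 / np.linalg.norm(A, 2) ** 2            # conservative step size
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        x = project_groups(x + mu * A.T @ (b - A @ x), groups, g)
    return x

rng = np.random.default_rng(0)
m, n = 70, 100
groups = [np.arange(i, i + 5) for i in range(0, n, 5)]    # 20 disjoint groups of 5
x_true = np.zeros(n); x_true[0:5] = 1.0; x_true[50:55] = -2.0   # supported on 2 groups
A = rng.standard_normal((m, n)) / np.sqrt(m)              # Gaussian sensing matrix
x_hat = group_iht(A, A @ x_true, groups, g=2)
print(np.linalg.norm(x_hat - x_true))                     # small if recovery succeeds
```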
Speaker

Scientists and engineers are increasingly applying deep neural networks (DNNs) to modelling and design of complex systems. While the flexibility of DNNs makes them an attractive tool, it also makes their solutions difficult to interpret and their predictive capability difficult to quantify. In contrast, scientific models directly expose the…
Speaker

Random feature methods have been successful in various machine learning tasks, are easy to compute, and come with theoretical accuracy bounds. They serve as an alternative approach to standard neural networks since they can represent similar function spaces without a costly training phase. However, for accuracy, random feature methods require more measurements than trainable parameters, limiting their use for data-scarce applications or problems in scientific machine learning.
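To illustrate the basic recipe, here is a minimal random Fourier feature regression sketch in which the feature weights are sampled once and kept fixed, and only the linear readout is fit by ridge regression; the 1-D target, the number of features, and the ridge parameter are illustrative choices.

```python
import numpy as np

# Minimal random Fourier feature regression: feature weights are drawn once and
# kept fixed, and only the linear readout is fit (here by ridge regression), so
# there is no gradient-based training phase.  The 1-D target, the number of
# features D, and the ridge parameter lam are illustrative choices.

def random_fourier_features(X, W, b):
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
n, d, D = 200, 1, 100                        # n samples > D trainable readout weights
X = rng.uniform(-np.pi, np.pi, (n, d))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(n)

W = rng.standard_normal((d, D))              # fixed random frequencies
b = rng.uniform(0.0, 2 * np.pi, D)           # fixed random phases
Phi = random_fourier_features(X, W, b)       # n x D feature matrix

lam = 1e-3                                   # ridge regularization
coef = np.linalg.solve(Phi.T @ Phi + lam * np.eye(D), Phi.T @ y)

X_test = np.linspace(-np.pi, np.pi, 50).reshape(-1, 1)
y_pred = random_fourier_features(X_test, W, b) @ coef
print(np.sqrt(np.mean((y_pred - np.sin(2 * X_test[:, 0])) ** 2)))   # test RMSE
```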