# Overview


The thematic activity focuses on the mathematical challenges of machine learning. The spectacular success of machine learning in a wide range of applications raises many exciting theoretical questions in a number of mathematical fields, including probability, statistics, combinatorics, optimization, and geometry. The CRM will bring together researchers in machine learning and mathematics to discuss these problems. The principal topics include combinatorial statistics, online learning, and deep neural networks.

The main activities include a workshop on "Combinatorial Statistics" and another one on "Modern Challenges in Learning Theory," as well as regular seminars given by the invited researchers and scholars-in-residence.

**Opening keynote lecture on Monday April 16 with Yoshua Bengio.**

11:30 - 12:30

Université de Montréal, Pavillon André-Aisenstadt, room 1360

**Deep Learning for AI**

There has been rather impressive progress recently with brain-inspired statistical learning algorithms based on the idea of learning multiple levels of representation, also known as neural networks or deep learning. They shine in artificial intelligence tasks involving perception and generation of sensory data like images or sounds and to some extent in understanding and generating natural language. We have proposed new generative models which lead to training frameworks very different from the traditional maximum likelihood framework, and borrowing from game theory. Theoretical understanding of the success of deep learning is work in progress but relies on representation aspects as well as optimization aspects, which interact. At the heart is the ability of these learning mechanisms to capitalize on the compositional nature of the underlying data distributions, meaning that some functions can be represented exponentially more efficiently with deep distributed networks compared to approaches like standard non-parametric methods which lack both depth and distributed representations. On the optimization side, we now have evidence that local minima (due to the highly non-convex nature of the training objective) may not be as much of a problem as thought a few years ago, and that training with variants of stochastic gradient descent actually helps to quickly find better-generalizing solutions. Finally, new interesting questions and answers are arising regarding learning theory for deep networks, why even very large networks do not necessarily overfit and how the representation-forming structure of these networks may give rise to better error bounds which do not absolutely depend on the iid data hypothesis.

**Monday April 23 - Thursday April 26**

Workshop on Modern Challenges of Learning Theory

24 invited speakers.

Open to all scholars. Registration closed.

**Monday April 30 - Friday May 4**

Workshop on Combinatorial Statistics (by invitation only).

One minicourse by Yuval Peres (Microsoft Research).