
Understanding Linear Convolutional Neural Networks via Sparse Factorizations of Real Polynomials (and Decomposing Linear Group-Equivariant Networks)

Date and time: 21 March 2024, 14:00-15:00 CET
Speaker: Kathlén Kohn, KTH
Title: Understanding Linear Convolutional Neural Networks via Sparse Factorizations of Real Polynomials (and Decomposing Linear Group-Equivariant Networks)

Where: Room Q26, floor 3, Malvinas Väg 6A, at KTH main campus
Zoom:
https://kth-se.zoom.us/j/69560887455
Meeting ID: 695 6088 7455

Moderator: Bastien Dubail, bastdub@kth.se / Alexandre Proutiere, alepro@kth.se
Administrator: Alexandre Proutiere, alepro@kth.se

Abstract: This talk will explain that convolutional neural networks without activation functions parametrize polynomials that admit a certain sparse factorization. For a fixed network architecture, these polynomials form a semialgebraic set. We will investigate how the geometry of this semialgebraic set (e.g., its singularities and relative boundary) changes with the network architecture. Moreover, we will explore how these geometric properties affect the optimization of a loss function for given training data.
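
To make the filter-polynomial correspondence concrete, here is a minimal numerical sketch in Python/NumPy (our illustration; the names w1, w2, s1, W are hypothetical, not the speaker's notation). Identifying a filter with the polynomial whose coefficients are its taps, a two-layer linear CNN whose first layer has stride s1 computes a single strided convolution whose polynomial factors as pi_w1(x) * pi_w2(x^s1); the substitution x -> x^s1 is what makes the factorization sparse.

    import numpy as np

    # Two-layer linear CNN (no activations); all names are illustrative.
    w1 = np.array([1.0, 2.0])   # first-layer filter, applied with stride s1
    w2 = np.array([3.0, -1.0])  # second-layer filter, stride 1
    s1 = 2

    # Sparse factorization on coefficient vectors: the end-to-end filter W
    # has polynomial pi_W(x) = pi_w1(x) * pi_w2(x^s1). Substituting
    # x -> x^s1 spreads the taps of w2 s1 steps apart; multiplying the
    # polynomials is then np.convolve on their coefficient vectors.
    w2_spread = np.zeros(s1 * (len(w2) - 1) + 1)
    w2_spread[::s1] = w2
    W = np.convolve(w1, w2_spread)  # [3., 6., -1., -2.] = (1 + 2x)(3 - x^2)

    # Sanity check: applying the two layers to a signal x agrees with a
    # single convolution by W at stride s1.
    x = np.arange(8.0)
    y = np.array([w1 @ x[s1*i : s1*i + len(w1)]
                  for i in range((len(x) - len(w1)) // s1 + 1)])
    z = np.array([w2 @ y[j : j + len(w2)]
                  for j in range(len(y) - len(w2) + 1)])
    z_direct = np.array([W @ x[s1*k : s1*k + len(W)]
                         for k in range((len(x) - len(W)) // s1 + 1)])
    assert np.allclose(z, z_direct)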

We prove that, for architectures in which all strides are larger than one and for generic training data, the non-zero critical points of the squared-error loss are smooth interior points of the semialgebraic function space. This is known to fail for dense linear networks and for linear convolutional networks with stride one.
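
For orientation, in our notation rather than necessarily the speaker's: given training data (x_1, y_1), ..., (x_N, y_N), the squared-error loss of a linear network with parameters theta and end-to-end linear map W_theta is

    L(theta) = sum_{i=1}^{N} || W_theta x_i - y_i ||^2,

and the theorem says that every critical point theta of L with W_theta != 0 corresponds to a smooth interior point of the semialgebraic set of end-to-end maps realizable by the architecture.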

For linear networks that are equivariant under the action of some group, we prove that no fixed network architecture can parametrize the whole space of linear equivariant functions, but that finitely many architectures together can exhaust it.
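
A standard example of such a function space (our illustration, not taken from the abstract): for the symmetric group S_n acting on R^n by permuting coordinates, the equivariant linear maps R^n -> R^n are exactly those of the form

    W = a*I + b*J,  a, b in R,

where I is the identity and J the all-ones matrix, so this space of linear equivariant functions is two-dimensional.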

This talk is based on joint work with Joan Bruna, Guido Montúfar, Anna-Laura Sattelberger, Vahid Shahverdi, and Matthew Trager.
