Reconsidering the Frank-Wolfe Method, Part 1 (The Future of Machine Learning) | Written by Monodeep Mukherjee | May 2024

1. Non-smooth Frank-Wolfe algorithm with a dual cutting-plane approach

Authors: Guilherme Mazanti, Thibault Moquet, Laurent Pfeiffer

Abstract: An extension of the Frank-Wolfe algorithm (FWA), also known as the conditional gradient algorithm, is proposed. The standard form of FWA calls a Linear Minimization Oracle (LMO) at each iteration to solve a constrained optimization problem with a β-smooth cost function; more specifically, the oracle solves the problem obtained by linearizing the original cost function. The algorithm designed and investigated in this article, called the Dualized Level-Set (DLS) algorithm, extends FWA so that it can address classes of non-smooth costs involving support functions. The key idea behind the construction of the DLS method is a general interpretation of FWA as a cutting-plane algorithm from the dual point of view. The DLS algorithm essentially results from the dualization of a specific cutting-plane algorithm based on projections onto level sets. The DLS algorithm generates a sequence of primal-dual candidates, and the corresponding primal-dual gap is shown to converge at a rate of O(1/√t).
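The FWA loop the abstract refers to (linearize the cost, call the LMO, move toward the returned vertex) can be sketched in a few lines. The quadratic cost and probability-simplex feasible set below are illustrative assumptions; this is the standard smooth FWA that the DLS algorithm extends, not the DLS algorithm itself.

```python
# Minimal sketch of the standard Frank-Wolfe algorithm with an LMO,
# using an assumed toy setting (beta-smooth quadratic over the simplex).
import jax
import jax.numpy as jnp

def lmo_simplex(grad):
    # On the probability simplex, min_s <grad, s> is attained at the vertex
    # (standard basis vector) of the smallest gradient entry.
    return jax.nn.one_hot(jnp.argmin(grad), grad.shape[0])

def frank_wolfe(f, x0, num_iters=200):
    x = x0
    for t in range(num_iters):
        g = jax.grad(f)(x)          # gradient of the beta-smooth cost
        s = lmo_simplex(g)          # LMO call on the linearized cost
        gamma = 2.0 / (t + 2.0)     # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * s
    return x

# Toy beta-smooth cost: squared distance to a point outside the simplex.
target = jnp.array([0.6, 0.3, 0.5])
f = lambda x: 0.5 * jnp.sum((x - target) ** 2)
print(frank_wolfe(f, jnp.ones(3) / 3.0))
```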

2. Forward gradient-based Frank-Wolfe optimization for memory-efficient deep neural network training

Authors: M. Rostami, S. S. Kia

Summary: Training a deep neural network with gradient-based methods requires computing the gradient at each layer. However, computing gradients with backpropagation, or reverse-mode differentiation, incurs a large memory footprint, which makes backpropagation inefficient in memory-constrained settings. This paper analyzes the performance of the well-known Frank-Wolfe algorithm, also known as the conditional gradient algorithm, when it only has access to the forward mode of automatic differentiation for computing gradients. A detailed technical analysis shows that the proposed algorithm converges to an optimal solution at a sublinear rate when given access to a noisy estimate of the true gradient obtained via forward-mode automatic differentiation, referred to as the projected forward gradient. In contrast, the standard Frank-Wolfe algorithm fails to converge to an optimal solution when only provided access to the projected forward gradient. A numerical example demonstrates the convergence properties of the proposed algorithm.
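To make the "projected forward gradient" concrete: forward-mode automatic differentiation returns the directional derivative <grad f(x), v> in a single forward pass, and scaling a random direction v by it gives an unbiased gradient estimate without ever running backpropagation. The sketch below plugs such an estimate into a plain Frank-Wolfe step over an ℓ1 ball; it only illustrates the forward-gradient mechanism on an assumed toy problem, not the paper's proposed algorithm (the abstract notes the unmodified Frank-Wolfe update does not converge with this estimator).

```python
# Sketch of a forward-gradient estimate (via forward-mode AD) fed to a plain
# Frank-Wolfe step. Toy quadratic objective and l1-ball constraint are assumptions.
import jax
import jax.numpy as jnp

def forward_gradient(f, x, key):
    v = jax.random.normal(key, x.shape)        # random tangent direction
    _, dir_deriv = jax.jvp(f, (x,), (v,))      # forward-mode AD: <grad f(x), v>
    return dir_deriv * v                       # unbiased gradient estimate

def lmo_l1_ball(grad, radius=1.0):
    # LMO for the l1 ball: a signed vertex along the largest-magnitude coordinate.
    i = jnp.argmax(jnp.abs(grad))
    return -radius * jnp.sign(grad[i]) * jax.nn.one_hot(i, grad.shape[0])

def fw_with_forward_gradient(f, x0, key, num_iters=500):
    # Plain Frank-Wolfe driven by the noisy forward-gradient estimate;
    # shown for illustration only (not guaranteed to converge).
    x = x0
    for t in range(num_iters):
        key, sub = jax.random.split(key)
        g_hat = forward_gradient(f, x, sub)
        s = lmo_l1_ball(g_hat)
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x

A = jnp.array([[1.0, 0.2], [0.2, 2.0]])
b = jnp.array([0.3, -0.5])
loss = lambda x: 0.5 * x @ A @ x + b @ x
print(fw_with_forward_gradient(loss, jnp.zeros(2), jax.random.PRNGKey(0)))
```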

3. Gridless 2D reconstruction of lines using the sliding Frank-Wolfe algorithm

Authors: Kévin Polisano, Basile Dubois-Bonnaire, Sylvain Meignen

Abstract: To address the challenge of recovering lines in degraded images, we present a novel approach leveraging the Sliding Frank-Wolfe algorithm. Building on advances in conditional gradient methods for sparse inverse problems with differentiable measurement models, the approach is tailored to line detection tasks, in particular blurred-line deconvolution and ridge detection of linear chirps in spectrogram images. Two distinct models are proposed.
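The greedy-plus-sliding structure of the Sliding Frank-Wolfe algorithm can be illustrated on a much simpler problem than 2D line recovery. The sketch below performs 1D spike deconvolution with a Gaussian blur as the differentiable measurement model: each outer step adds one atom at the best candidate position (the LMO-like greedy step), then "slides" all amplitudes and positions by gradient descent on the data-fit term. The 1D setting and every function and constant here are illustrative assumptions standing in for the paper's two line models.

```python
# Sketch of the sliding Frank-Wolfe structure on an assumed 1D spike-deconvolution
# problem: greedy atom insertion followed by gradient-based "sliding" refinement.
import jax
import jax.numpy as jnp

grid = jnp.linspace(0.0, 1.0, 200)   # sampling grid of the observation
sigma = 0.05                         # width of the (assumed) Gaussian blur

def atom(pos):
    # Differentiable measurement model of a single atom: a blurred spike at `pos`.
    return jnp.exp(-0.5 * ((grid - pos) / sigma) ** 2)

def model(amps, positions):
    if amps.size == 0:
        return jnp.zeros_like(grid)
    return jnp.sum(jax.vmap(atom)(positions) * amps[:, None], axis=0)

def data_fit(amps, positions, y):
    return 0.5 * jnp.sum((model(amps, positions) - y) ** 2)

def sliding_frank_wolfe(y, num_atoms=2, slide_steps=200, lr_amp=1e-2, lr_pos=1e-5):
    amps, positions = jnp.zeros(0), jnp.zeros(0)
    grad_fn = jax.grad(data_fit, argnums=(0, 1))
    for _ in range(num_atoms):
        # Greedy (LMO-like) step: best new atom position on a coarse candidate grid.
        residual = y - model(amps, positions)
        candidates = jnp.linspace(0.0, 1.0, 50)
        scores = jax.vmap(lambda p: jnp.vdot(atom(p), residual))(candidates)
        amps = jnp.append(amps, 0.0)
        positions = jnp.append(positions, candidates[jnp.argmax(jnp.abs(scores))])
        # Sliding step: jointly refine all amplitudes and positions by gradient descent.
        for _ in range(slide_steps):
            g_amp, g_pos = grad_fn(amps, positions, y)
            amps, positions = amps - lr_amp * g_amp, positions - lr_pos * g_pos
    return amps, positions

# Synthetic observation: two blurred spikes of different amplitude.
y_obs = 1.0 * atom(0.30) + 0.7 * atom(0.62)
print(sliding_frank_wolfe(y_obs))
```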



