Trace norm regularization

Low-rank optimization with trace norm penalty

Authors

B. Mishra, G. Meyer, F. Bach, and R. Sepulchre

Abstract

We tackle problems of the form min_X f(X) + λ||X||_*, where f is a smooth convex function, ||X||_* is the trace (or nuclear) norm of X (the sum of its singular values), and λ ≥ 0 is a regularization parameter.
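To make the setup concrete, here is a minimal numerical sketch of the objective, assuming for illustration that f is the least-squares matrix-completion loss on observed entries (the paper treats general smooth convex f):

```python
import numpy as np

def trace_norm(X):
    # Trace (nuclear) norm: the sum of the singular values of X.
    return np.linalg.svd(X, compute_uv=False).sum()

def objective(X, M, mask, lam):
    # f(X) = 0.5 * ||mask * (X - M)||_F^2: a least-squares loss on the
    # observed entries (mask is 1 where an entry of M is known), a standard
    # smooth convex choice of f for low-rank matrix completion.
    residual = mask * (X - M)
    return 0.5 * np.sum(residual ** 2) + lam * trace_norm(X)
```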

The paper addresses the problem of low-rank trace norm (also known as nuclear norm) minimization. We propose an algorithm that alternates between fixed-rank optimization and rank-one updates. The fixed-rank optimization is characterized by an efficient factorization that makes the trace norm differentiable in the search space and the computation of the duality gap numerically tractable. The search space is nonlinear but is equipped with a particular Riemannian structure that leads to efficient computations. We present a second-order trust-region algorithm with a guaranteed quadratic rate of convergence. Overall, the proposed optimization scheme converges super-linearly to the global solution while maintaining complexity that is linear in the number of rows of the matrix. To compute a set of solutions efficiently for a grid of regularization parameters, we propose a predictor-corrector approach on the quotient manifold that outperforms the naive warm-restart approach. The performance of the proposed algorithm is illustrated on problems of low-rank matrix completion and multivariate linear regression.
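The outer loop can be sketched as follows. This is a hedged illustration, not the paper's implementation: the Riemannian trust-region inner solver is replaced here by a rank-constrained proximal-gradient stand-in (singular value thresholding truncated to rank p), and the rank-increase test uses the optimality condition that the spectral norm of grad f(X) is at most λ at the global solution.

```python
import numpy as np

def svt(X, tau, p=None):
    # Singular value thresholding: the proximal operator of tau * ||.||_*,
    # optionally truncated to rank p to stay on the fixed-rank set.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    if p is not None:
        s[p:] = 0.0
    return (U * s) @ Vt

def alternating_scheme(grad_f, shape, lam, max_rank=20, inner_iters=100, lr=0.5):
    # lr must not exceed 1/L, where L is the Lipschitz constant of grad f
    # (L = 1 for the completion loss sketched above).
    X = np.zeros(shape)
    for p in range(1, max_rank + 1):
        # Rank-increase test: at the global solution the spectral norm of
        # grad f(X) is at most lam; a larger value means descent along the
        # dominant rank-one direction of the gradient is still possible.
        if np.linalg.svd(grad_f(X), compute_uv=False)[0] <= lam * (1 + 1e-6):
            break
        # Fixed-rank stage at rank p (a stand-in for the paper's second-order
        # trust-region solver on the factorized search space).
        for _ in range(inner_iters):
            X = svt(X - lr * grad_f(X), lr * lam, p=p)
    return X
```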

Downloads

An example with an animated logo of ULg


This is an illustration of the code on an approximately low-rank image of zeros and ones: the logo of the University of Liège. 60% of the entries (pixels) are removed uniformly at random. The low-rank matrix completion code outputs a sequence of (globally optimal) solutions as the regularization parameter λ is varied. Below we show the recovery of the original image.
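A hedged reconstruction of this experiment, using a synthetic approximately low-rank 0/1 image as a stand-in for the ULg logo (the synthetic image, the grid of λ values, and the simple warm-started proximal-gradient loop are illustrative assumptions; svt is the helper defined in the sketch above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, true_rank = 60, 80, 5
# Approximately low-rank image of zeros and ones (stand-in for the ULg logo).
M = ((rng.random((n, true_rank)) @ rng.random((true_rank, m))) > 1.2).astype(float)
mask = (rng.random((n, m)) > 0.6).astype(float)   # keep ~40% of the pixels

grad_f = lambda X: mask * (X - M)  # gradient of the completion loss

X = np.zeros((n, m))
for lam in np.geomspace(10.0, 0.01, num=8):  # decreasing grid, warm-started
    for _ in range(300):                     # proximal-gradient stand-in
        X = svt(X - grad_f(X), lam)
    rank = np.linalg.matrix_rank(X, tol=1e-3)
    print(f"lambda={lam:8.3f}  rank={rank:2d}  "
          f"mean abs error={np.abs(X - M).mean():.3f}")
```

As λ decreases, the recovered rank grows and the reconstruction error drops, which mirrors the traversal of ranks shown in the figures below.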


[Figure: Recovered images at different ranks, traversing the regularization path.]

[Figure: Original image.]

[Figure: Better prediction accuracy along the regularization path.]
