General Description:
S-FISTA (smoothed FISTA) method for solving problems of the form
Assumptions:
Available Oracles:
(the function values of f,g and h are not required in the economical setting)
Usage:
out = sfista(Ffun,Ffun_grad,Gfun,Gfun_prox,Hfun,Hfun_prox,Afun,Atfun,mu,lambda_G,lambda_H,startx,[par]) [out,fmin] = sfista(Ffun,Ffun_grad,Gfun,Gfun_prox,Hfun,Hfun_prox,Afun,Atfun,mu,lambda_G,lambda_H,startx,[par]) [out,fmin,parout] = sfista(Ffun,Ffun_grad,Gfun,Gfun_prox,Hfun,Hfun_prox,Afun,Atfun,mu,lambda_G,lambda_H,startx,[par]) Input:Ffun - function handle for f Ffun_grad - function handle for the gradient of f
Gfun - function handle for g
Gfun_prox - function handle for the proximal mapping of g times a positive constant
Hfun - function handle for h (if h is extended real-valued, it is enough to specify the operation of h on its domain) Hfun_prox - function handle for the proximal mapping of h times a positive constant Afun - function handle for the linear transformation A Atfun - function handle for the adjoint of the linear transformation A mu - positive smoothing parameter lambda_G - positive scalar penalty for the function g lambda_H - positive scalar penalty for the function h startx - starting vector par - struct containing different values required for the operation of sfista
Output:out - optimal solution (up to a tolerance) fmin - optimal value (up to a tolerance) parout - a struct containing containing additional information related to the convergence. The fields of parout are: iterNum - number of performed iterations
funValVec - vector of all function values generated by the method Method Description:
Employs FISTA on the following smoothed version of the model:
where
is the Moreau envelope of g with parameter mu. The smooth part is taken as
and the nonsmooth component is
Besides "continuation", ", "gamma" and "cont_cycles", all the parameters in par are related to FISTA.
References:
Consider the problem
where Q, a and A are given by
>> randn('seed',314); >> A = randn(30,20); >> Q = randn(20); >> Q = Q * Q'; >> a = randn(20,1); Note that by its construction, Q is positive definite, and therefore the problem is convex. To solve the problem, we will run sfista with
and with mu=0.01.
>> Afun = @(x) A*x; >> Atfun = @(x) A'*x; >> f=@(x) (x-a)'*Q*(x-a); >> grad_f = @(x) 2*Q*(x-a) ; >> g = @(x) norm(x,2); >> h = @(x) norm(x,1); >> mu = 0.01 ; >> lambda_g=0.25; >> lambda_h=0.4; >> [xs,fun_xs,parout] = sfista( @(x) f(x) , @(x) grad_f(x),@(x) g(x),@(x,a) prox_Euclidean_norm(x,a),@(x) h(x),@(x,a) prox_l1(x,a), @(x) A*x , @(x)... A'*x,mu,lambda_g,lambda_h,ones(20,1)); ********************* FISTA ********************* #iter fun. val. L val. 1 270.280178 128.000000 2 156.805102 128.000000 3 97.019766 128.000000 4 65.695242 128.000000 5 47.036194 128.000000 : : :
200 11.636340 128.000000 227 11.636340 128.000000 228 11.636339 128.000000 229 11.636339 128.000000 257 11.636339 128.000000 285 11.636339 128.000000 Stopping because the norm of the difference between consecutive iterates is too small ---------------------------------- Optimal value = 11.636339 Original function value = 11.637589 Note that the "optimal value" is the function value corresponding to the best smoothed objective function value. "Original function value" is the corresponding function value without any smoothing (that is, when using g rather than g_mu). The amount of iterations employed by sfista is 299:
>> parout.iterNum ans = 299 It is possible to use continuation, meaning that the smoothing parameter is initialized as a large value that gradually decreases to the required value. Several cycles of iterations are employed, where in each cycle, FISTA is employed on the problem with a specific smoothing parameter.
>> parout.CycleIter ans = 73 2 1 2 74 The overall number of iterations when using continuation (152) is obviously smaller than the amount of iterations required in the version without continuation (299).
|
Solvers >