Gradient Descent: Cost Function
I'm trying to implement the gradient descent method for the problem of minimising the following function:
$$f(x) = \frac{1}{2}(x-m)^{T}A(x-m)-\sum\limits_{i=1}^n\log\left(x_i^{2}\right),$$
where $x \in \mathbb{R}^n$ is a vector; $m \in \mathbb{R}^n$ is a fixed vector; and $A$ is a fixed positive definite matrix.
The only application of gradient descent I have come across is linear regression! So, as a starting point for helping me to solve this, I'd like to know in what situations this cost function would be applied. Does anyone out there recognise it?
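For concreteness, here is a minimal NumPy sketch of how I evaluate this objective; the names `A`, `m`, and `f` are my own, and I assume $\log$ is the natural logarithm:

```python
import numpy as np

def f(x, A, m):
    """f(x) = 0.5 * (x - m)^T A (x - m) - sum_i log(x_i^2)."""
    d = x - m
    return 0.5 * d @ A @ d - np.sum(np.log(x ** 2))

# Small illustrative instance: a 2x2 symmetric positive definite A.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
m = np.array([1.0, -1.0])
x = np.array([0.5, 2.0])
print(f(x, A, m))
```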
gradient-descent
Your objective function is a special case of a quadratic cost function with a regularization term that uses the $\log$ of the coefficients. $L_1$ or $L_2$ regularization cost functions are more typical, but you can impose a gentler penalty with a $\log$ cost function (see the sketch after these comments).
– Aditya Dua
Nov 12 at 22:25
Thanks @AdityaDua, this response has been useful to me.
– Barton
Nov 14 at 21:52
@AdityaDua do you know what the role of the matrix A would be in this cost function?
– Barton
2 days ago
I'd suggest reading up on "generalised least squares". The matrix $A$ allows you to deal with correlated errors and unequal variances (heteroscedasticity).
– Aditya Dua
2 days ago
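To make the comparison above concrete, here is a rough sketch of how the $L_1$, $L_2$, and $\log$ terms behave on the same vector; the helper names are illustrative only, and the $\log$ term is written exactly as in the question (natural log assumed):

```python
import numpy as np

def l1_term(x):
    return np.sum(np.abs(x))   # L1: sum of |x_i|

def l2_term(x):
    return np.sum(x ** 2)      # L2: sum of x_i^2

def log_term(x):
    # The term from the question: -sum_i log(x_i^2).
    # It is large when some x_i is near zero and changes slowly as |x_i| grows.
    return -np.sum(np.log(x ** 2))

x = np.array([0.1, 1.0, 10.0])
print(l1_term(x), l2_term(x), log_term(x))
```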
1 Answer
From the question, $f$ is the sum of a convex quadratic term and a logarithmic penalty (it is smooth wherever no component $x_i$ is zero):
$$f(x) = \frac{1}{2}(x-m)^{\mathrm T} A (x-m)-\sum\limits_{i=1}^n\log\left(x_i^{2}\right),$$
where $x \in \mathbb{R}^n$; $m \in \mathbb{R}^n$ is a fixed vector; and $A$ is a fixed symmetric positive definite matrix.
Taking $\log$ to be the natural logarithm and using the symmetry of $A$, the gradient is
$$\nabla f(x) = A(x-m) - \left(\frac{2}{x_1}, \dots, \frac{2}{x_n}\right)^{\mathrm T}$$
(if $\log$ is base $10$, the second term picks up a factor of $1/\ln 10$).
Using gradient descent with step $\mu$,
$$x_{k+1} = x_k - \mu \nabla f(x_k).$$
Choose $\mu$ small enough that $f(x_{k+1}) < f(x_k)$, then loop until $f(x_k) - f(x_{k+1}) \approx 0$, i.e. until the gradient is close to zero.
This is a general way to run gradient descent on a smooth function in $n$-dimensional space. Hope it is helpful.
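Here is a minimal NumPy sketch of this loop. The fixed step size `mu`, the tolerance `tol`, and the starting point are illustrative choices only, and the iterates are assumed to stay away from $x_i = 0$, where $f$ blows up:

```python
import numpy as np

def grad_f(x, A, m):
    """Gradient: A (x - m) - 2/x (componentwise), natural log assumed."""
    return A @ (x - m) - 2.0 / x

def gradient_descent(A, m, x0, mu=1e-2, tol=1e-8, max_iter=100000):
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad_f(x, A, m)
        if np.linalg.norm(g) < tol:   # gradient close to zero: stop
            break
        x = x - mu * g                # descent step
    return x

# Illustrative instance: 2x2 symmetric positive definite A.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
m = np.array([1.0, -1.0])
x_star = gradient_descent(A, m, x0=np.array([0.5, -0.5]))
print(x_star)
```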