sklearn log_loss different number of classes
I'm using log_loss from scikit-learn:

from sklearn.metrics import log_loss
print log_loss(true, pred, normalize=False)

and I get the following error:

ValueError: y_true and y_pred have different number of classes 38, 2

This is really strange to me, since the arrays look valid:

print true.shape
print np.unique(true)
print np.unique(true).size
(19191L,)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37]
38

print pred.shape
print np.unique(pred)
print np.unique(pred).size
(19191L,)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37]
38

What is wrong with my call to log_loss? Why does it throw this error?

Sample data:

pred: array([ 0,  1,  2, ...,  3, 12, 16], dtype=int64)
true: array([ 0, 1, 2, ..., 3, 12, 16])

python scikit-learn

– Ablomis
asked Nov 9 '15 at 18:52, edited Nov 9 '15 at 19:29
  • Can you post some data for pred and true? It looks like your labels are being passed incorrectly.

    – ryanmc
    Nov 9 '15 at 19:25
  • Added to the original post

    – Ablomis
    Nov 9 '15 at 19:28
  • Log loss is meant to assess the accuracy of probabilities: it expects, for each sample, an array of probabilities over every possible label (you are passing only labels). I believe your pred variable should be an array of n-element arrays, where n is the number of labels.

    – ryanmc
    Nov 9 '15 at 20:03

3 Answers
It's simple: you are using the predicted labels, not the predicted probabilities. Your pred variable contains [ 1 2 1 3 .... ], but for log_loss it should contain something like [[ 0.1, 0.8, 0.1 ], [ 0.0, 0.79, 0.21 ], .... ]. To obtain these probabilities, use predict_proba:



pred = model.predict_proba(x_test)
loss = log_loss(y_true, pred)

– deltascience
answered Mar 22 '17 at 13:37, edited Nov 21 '18 at 9:44
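To make the difference concrete, here is a minimal self-contained sketch. The model and dataset are hypothetical stand-ins (not the asker's data): predict returns a 1-D label vector, while predict_proba returns the (n_samples, n_classes) matrix that log_loss expects.

```python
# Sketch with synthetic data: labels vs. probabilities as input to log_loss.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=200, n_features=20, n_informative=10,
                           n_classes=5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

labels = model.predict(X)        # shape (200,)   -> NOT what log_loss wants
proba = model.predict_proba(X)   # shape (200, 5) -> what log_loss wants

print(proba.shape)
print(log_loss(y, proba))
```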
Inside the log_loss method, the true array is fit and transformed by a LabelBinarizer, which changes its dimensions. So the check that true and pred have similar dimensions doesn't guarantee that log_loss will work, because true's dimensions change. If you only have binary classes, you can call log_loss this way; with multiple classes, it doesn't work.

– Hima Varsha
answered Jun 29 '16 at 8:01
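This dimension change can be reproduced directly with toy data (synthetic, not the asker's): LabelBinarizer one-hot encodes 38 labels into 38 columns, while a 1-D y_pred is treated as binary probabilities, i.e. two columns, which is presumably where the "38, 2" in the error comes from.

```python
# Sketch: the shapes log_loss ends up comparing for multi-class label input.
import numpy as np
from sklearn.preprocessing import LabelBinarizer

true = np.arange(38).repeat(3)                 # 114 samples, 38 distinct labels
binarized = LabelBinarizer().fit_transform(true)
print(binarized.shape)                         # one column per class

pred = true.copy()                             # a 1-D label vector, like the asker's
two_cols = np.column_stack([1 - pred, pred])   # how a 1-D y_pred is expanded
print(two_cols.shape)                          # only 2 columns -> mismatch
```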
From the log_loss documentation:

y_pred : array-like of float, shape = (n_samples, n_classes) or (n_samples,)

Predicted probabilities, as returned by a classifier's predict_proba method. If y_pred.shape = (n_samples,) the probabilities provided are assumed to be that of the positive class. The labels in y_pred are assumed to be ordered alphabetically, as done by preprocessing.LabelBinarizer.

You need to pass probabilities, not the predicted labels.

– ug2409
answered Mar 22 '17 at 7:37
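A sketch of the documented shape, using synthetic probabilities (no fitted model): each row is a distribution over the 38 classes, and the labels argument of log_loss pins down the class ordering.

```python
# Sketch: y_pred as an (n_samples, n_classes) probability matrix.
import numpy as np
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
n_samples, n_classes = 100, 38

proba = rng.random((n_samples, n_classes))
proba /= proba.sum(axis=1, keepdims=True)      # each row sums to 1

true = rng.integers(0, n_classes, size=n_samples)
loss = log_loss(true, proba, labels=np.arange(n_classes))
print(loss)
```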