What loss function to use when labels are probabilities?

What loss function is most appropriate when training a model with target values that are probabilities? For example, I have a 3-output model with x=[some features] and y=[0.2, 0.3, 0.5].

It seems like cross-entropy doesn't make sense here, since it assumes that a single target is the correct label.

Would something like MSE (after applying softmax) make sense, or is there a better loss function?

asked 7 hours ago

Thomas Johnson

neural-networks loss-functions probability-distribution

1 Answer

Actually, the cross-entropy loss function would be appropriate here, since it measures the "distance" between a distribution $q$ and the "true" distribution $p$.

You are right, though, that using a function called "cross_entropy" in many APIs would be a mistake, because, as you said, these functions often assume a one-hot label. You would need to use the general cross-entropy function,

$$H(p,q)=-\sum_{x\in X} p(x) \log q(x).$$
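As a minimal sketch of this general form, assuming the model emits raw scores (logits) that are turned into $q$ with a softmax: the function names and the example logits below are hypothetical, and the target is the one from the question.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_norm for z in logits]

def soft_cross_entropy(target_probs, logits):
    """H(p, q) = -sum_x p(x) log q(x), with q = softmax(logits)."""
    log_q = log_softmax(logits)
    return -sum(p * lq for p, lq in zip(target_probs, log_q))

p = [0.2, 0.3, 0.5]       # target distribution from the question
logits = [0.1, 0.4, 1.2]  # hypothetical raw model outputs
loss = soft_cross_entropy(p, logits)
print(loss)
```

Working in log space avoids taking the log of a probability that has underflowed to zero. In a real framework you would write the same sum with that framework's tensor operations so that gradients flow through it.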

Note that one-hot labels would mean that
$$
p(x) =
\begin{cases}
1 & \text{if } x \text{ is the true label}\\
0 & \text{otherwise}
\end{cases}$$

which causes the cross-entropy $H(p,q)$ to reduce to the form you're familiar with:

$$H(p,q) = -\log q(x_{\text{label}})$$
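A small numeric check of this reduction, as a sketch: the predicted distribution `q` below is made up for illustration. It also shows that, for a soft target, the loss is smallest when $q = p$, where it equals the entropy of $p$ rather than zero.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) log q(x); terms with p(x) = 0 contribute nothing."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

q = [0.1, 0.7, 0.2]        # hypothetical predicted distribution
one_hot = [0.0, 1.0, 0.0]  # true label is index 1

# With a one-hot target, only the true-label term survives.
assert math.isclose(cross_entropy(one_hot, q), -math.log(q[1]))

# With a soft target, predicting q = p gives a lower loss than this other q.
# The value at q = p is the entropy of p, which is positive here.
p = [0.2, 0.3, 0.5]
assert cross_entropy(p, p) < cross_entropy(p, q)
assert cross_entropy(p, p) > 0
```

So a nonzero loss floor is expected when training on probabilistic labels; it does not mean the model is failing to fit.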

answered 7 hours ago

Philip Raeisghasem


