machine learning - Cross Entropy in PyTorch

I'm a bit confused by the cross entropy loss in PyTorch.

Considering this example:

import torch
import torch.nn as nn
from torch.autograd import Variable

output = Variable(torch.FloatTensor([0,0,0,1])).view(1, -1)  # raw scores for 4 classes, shape (1, 4)
target = Variable(torch.LongTensor([3]))                     # index of the correct class

criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)

I would expect the loss to be 0. But I get:

Variable containing:
 0.7437
[torch.FloatTensor of size 1]

As far as I know, cross entropy can be calculated like this:

H(p, q) = -sum_x p(x) * log(q(x))

But shouldn't the result then be 1*log(1) = 0?
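
Written out with the target as the one-hot distribution p = [0, 0, 0, 1] and my output as q, I would expect:

H(p, q) = -(0*log(0) + 0*log(0) + 0*log(0) + 1*log(1)) = -1*log(1) = 0

(taking 0*log(0) = 0 by convention).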

I also tried different inputs, such as one-hot encodings, but those don't work at all, so it seems the input shape I'm passing to the loss function is fine.

I would be really grateful if someone could help me out and tell me where my mistake is.

Thanks in advance!

Question from: https://stackoverflow.com/questions/49390842/cross-entropy-in-pytorch

1 Answer

In your example you are treating the output [0, 0, 0, 1] as probabilities, as required by the mathematical definition of cross entropy. But PyTorch treats them as raw scores (logits) that don't need to sum to 1 and that first need to be converted into probabilities, which it does with the softmax function.

So H(p, q) becomes:

H(p, softmax(output))

Translating the output [0, 0, 0, 1] into probabilities:

softmax([0, 0, 0, 1]) = [0.1749, 0.1749, 0.1749, 0.4754]

whence:

-log(0.4754) = 0.7437
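
You can verify this yourself (a minimal sketch, assuming a recent PyTorch where plain tensors are used instead of Variable):

import torch
import torch.nn.functional as F

output = torch.tensor([[0.0, 0.0, 0.0, 1.0]])   # raw scores (logits), shape (1, 4)
target = torch.tensor([3])                      # index of the correct class, shape (1,)

# what nn.CrossEntropyLoss does internally: softmax the scores,
# then take the negative log of the probability of the target class
probs = F.softmax(output, dim=1)                # [[0.1749, 0.1749, 0.1749, 0.4754]]
manual = -torch.log(probs[0, target[0]])        # 0.7437

builtin = F.cross_entropy(output, target)       # 0.7437

print(manual.item(), builtin.item())

In other words, nn.CrossEntropyLoss combines LogSoftmax and NLLLoss in one step; if you already have probabilities, take their log and use nn.NLLLoss instead.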
