Docs return description of binary_confusion_matrix incorrect #183

Open
ChrisBog-MV opened this issue Sep 18, 2023 · 2 comments

Comments

ChrisBog-MV commented Sep 18, 2023

Not sure if I'm being daft here, but I think this line is incorrect:

Compute binary confusion matrix, a 2 by 2 tensor with counts ( (true positive, false negative) , (false positive, true negative) )

It says that the returned tensor contains the values:

( (true positive, false negative) , (false positive, true negative) )

But if you look at the example in the docs, it shows:

>>> input = torch.tensor([0, 1, 0.7, 0.6])
>>> target = torch.tensor([0, 1, 1, 0])
>>> binary_confusion_matrix(input, target)
tensor([[1, 1],
        [0, 2]])

For those inputs, I count:

tn = 1  # index 0
tp = 2  # indices 1, 2
fp = 1  # index 3
fn = 0

So I believe that would mean the actual tensor being returned is either [[fp, tn], [fn, tp]] or [[tn, fp], [fn, tp]]. From my own experiments, I'm pretty sure it's the latter.

Been scratching my head all morning about why my results look wrong and I think this is why.
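
A quick way to see the mismatch is to cross-check against scikit-learn, whose confusion_matrix documents the [[tn, fp], [fn, tp]] layout. This is just a sketch (it assumes torcheval and scikit-learn are installed, and takes the default threshold to be 0.5, as the docs example suggests):

import torch
from sklearn.metrics import confusion_matrix
from torcheval.metrics.functional import binary_confusion_matrix

input = torch.tensor([0, 1, 0.7, 0.6])
target = torch.tensor([0, 1, 1, 0])

# torcheval maps the 0.7 and 0.6 scores to predicted 1s, so predictions are [0, 1, 1, 1]
print(binary_confusion_matrix(input, target))
# tensor([[1, 1],
#         [0, 2]])

# sklearn's documented binary layout is [[tn, fp], [fn, tp]] -- same numbers
print(confusion_matrix(target.numpy(), (input >= 0.5).int().numpy()))
# [[1 1]
#  [0 2]]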

bobakfb (Contributor) commented Sep 18, 2023

Hi @ChrisBog-MV, thanks for pointing this out!

@JKSenthil I went through and looked at this. The issue is that we are following sklearn's convention of using

TN | FP
FN | TP

which is not standard for binary confusion matrices.

I would suggest we flip the rows and columns here with a bit of slicing:

return _confusion_matrix_compute(matrix, normalize)[[1,0]][:, [1,0]]

Tests and docstring examples will need to be updated as well. But basically we will then return

TP | FN
FP | TN

which is standard.
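
As a rough sketch of what that double indexing does (standalone, not the actual torcheval internals, since _confusion_matrix_compute is a private helper; here the same indexing is applied to a plain 2x2 tensor):

import torch

# sklearn-style layout currently returned by binary_confusion_matrix:
# [[TN, FP],
#  [FN, TP]]
matrix = torch.tensor([[1, 1],
                       [0, 2]])

# [[1, 0]] reverses the rows, [:, [1, 0]] reverses the columns,
# giving the standard [[TP, FN], [FP, TN]] layout
print(matrix[[1, 0]][:, [1, 0]])
# tensor([[2, 0],
#         [1, 1]])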

anordin95 commented Jan 6, 2024

Hi!

+1. Also, I made an example that illustrates the problem by using a unique expected count for each category (i.e. FP, FN, etc.):

confusion_matrix_example.py

import torch
import torcheval.metrics
import numpy as np

binary_confusion_matrix_metric = torcheval.metrics.BinaryConfusionMatrix()
# Intentionally construct a dataset that should evaluate to these counts:
# TP: 4; TN: 3; FP: 2; FN: 1.
predictions_and_labels = np.array([
    # False negatives:
    (0, 1),
    # False positives:
    (1, 0),
    (1, 0),
    # True negatives:
    (0, 0),
    (0, 0),
    (0, 0),
    # True positives:
    (1, 1),
    (1, 1),
    (1, 1),
    (1, 1),
])

predictions = predictions_and_labels[:, 0]
labels = predictions_and_labels[:, 1]

binary_confusion_matrix_metric.update(torch.from_numpy(predictions), torch.from_numpy(labels))
binary_confusion_matrix = binary_confusion_matrix_metric.compute()
print(binary_confusion_matrix)

# The torcheval docs indicate the return value should be:
# ( (true positive, false negative) , (false positive, true negative) )
# So, we'd expect ( (4, 1), (2, 3) ).

# However, we don't observe what we expect!
# Instead I think the docs should say:
# ( (true negative, false positive) , (false negative, true positive) )

$ python confusion_matrix_example.py

tensor([[3., 2.],
        [1., 4.]])

PS: a torch.flip(binary_confusion_matrix, dims=[0, 1]) would also do the trick and perhaps be a bit more readable.
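
For reference, a quick check of that torch.flip suggestion (just a sketch, using the same [[TN, FP], [FN, TP]] tensor printed above):

import torch

# [[TN, FP], [FN, TP]] as currently returned
matrix = torch.tensor([[3., 2.],
                       [1., 4.]])

# Reversing both dimensions yields the standard [[TP, FN], [FP, TN]] layout
print(torch.flip(matrix, dims=[0, 1]))
# tensor([[4., 1.],
#         [2., 3.]])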
