0:00
Hi everyone, my name is Isa. Today I'm going to teach you how to use a confusion matrix in Python using scikit-learn. The confusion matrix measures the quality of predictions from a classification model by counting how many predictions are true and how many are false.
0:22
Essentially, what it computes is the true positives, the false positives, the true negatives, and the false negatives. When we talk about a true positive, what we're saying is that we predicted true and it's actually true; for example, we predicted that someone is sick and that person is sick. A true negative is when we predicted false and it's actually false: we predicted that the person is not sick and they are actually not sick. Then we have the cases where our predictions are wrong. A false positive is when we predicted true but it's actually false, so we predicted that the person is sick and they are not sick. A false negative is the other false case, where we predicted false but it's actually true: we predicted that someone is not sick but the person is actually sick.
1:20
So this is what we have: a quadrant layout in the confusion matrix, with the actual values on one side and the predicted values on the other. The actual value says either that the person is not sick (negative) or that the person is sick (positive), and likewise the predicted value says that the person is not sick or that the person is sick. So if we predicted that someone is not sick and they were actually not sick, we fall into the first quadrant, the true negatives. When we predicted that the person is sick but the person is actually not sick, that's when we call it a false positive, and so on. We have these four quadrants, and that's what a confusion matrix returns.
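To make the quadrant layout concrete, here is a minimal sketch on made-up labels (the data is invented for illustration; scikit-learn puts the actual classes on the rows and the predicted classes on the columns):

```python
from sklearn.metrics import confusion_matrix

# Toy labels, invented for illustration: 0 = not sick, 1 = sick
y_actual    = [0, 0, 1, 1, 1, 0]
y_predicted = [0, 1, 1, 0, 1, 0]

# With 0 as the negative class, the 2x2 layout is:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_actual, y_predicted))
# [[2 1]
#  [1 2]]
```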
2:17
So let's make an example by training a machine learning model. What we're going to do is write code to load the breast cancer dataset and train a k-neighbors classifier to predict malignant or benign cancer. If we run this, we get an accuracy score back. I'm not going into the detail of how to run this; we have posted articles and videos about train test split and about the k-neighbors classifier. In this case I just want to show you how to compute the confusion matrix from that information.
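The training code itself isn't captured in the transcript, so here is a minimal sketch of what that setup might look like (the split parameters and the names knn, y_test, and y_pred are assumptions):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the breast cancer dataset (binary target: 0 = malignant, 1 = benign)
X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set (test_size and random_state are assumptions)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train a k-neighbors classifier and predict on the test set
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

print(accuracy_score(y_test, y_pred))  # about 0.92 in the video
```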
3:09
So we import confusion_matrix from sklearn.metrics; that's the module in which you will find it. Then, in order to compute a confusion matrix, we can say cm = confusion_matrix(...), and the parameters that you pass are the test y and the y prediction. We have already made a prediction, so we're computing the confusion matrix using the test values and the predicted values, and we return the confusion matrix here. We see that we get back an array with two rows, which is essentially the quadrant chart from before.
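That step is just one call (assuming the y_test and y_pred from the training sketch above):

```python
from sklearn.metrics import confusion_matrix

# Actual classes on the rows, predicted classes on the columns
cm = confusion_matrix(y_test, y_pred)
print(cm)
# e.g. [[ 57   7]
#       [  5 102]]  in the video
```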
3:57
So what we can do here is print those results. We say, okay, the true negatives are the cases where the person is not sick, and we see that 57 are true negatives, 7 are false positives, 5 are false negatives, and 102 are true positives. This tells us a little bit more than just computing the accuracy score of 92%; depending on our context, we can dig deeper into some other metrics that can help us make a decision about our model.
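One way to print those four counts (a sketch; ravel() flattens the 2x2 array in row order, which matches the TN, FP, FN, TP layout for a binary problem):

```python
# Unpack the four counts from the 2x2 matrix
tn, fp, fn, tp = cm.ravel()
print("True negatives: ", tn)  # 57 in the video
print("False positives:", fp)  # 7
print("False negatives:", fn)  # 5
print("True positives: ", tp)  # 102
```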
4:38
We can also plot the confusion matrix in a way that is more beautiful. Essentially, to plot it we assign the labels: we take the classes that are in KNN, knn.classes_, which in this case are benign and malignant cancers, so zeros and ones. Then we get the confusion matrix by saying cm = confusion_matrix(y_test, y_pred), comparing the test values and the predictions. Then we can return a confusion matrix display: we use the ConfusionMatrixDisplay class from sklearn.metrics to plot the confusion matrix, and all we have to do is pass the cm object and the labels as parameters. Then we call its plot method, add a title, and do plt.show(). If we run this, we end up with a beautiful confusion matrix plot that is more visual than the raw array that was returned.
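A sketch of that plotting step (the display object variable name is an assumption; ConfusionMatrixDisplay is sklearn's class for this):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# The class labels the model was trained on (0 and 1 here)
labels = knn.classes_

cm = confusion_matrix(y_test, y_pred)

# Hand the matrix and labels to the display object, then draw it
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
disp.plot()
plt.title("Confusion Matrix")
plt.show()
```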
6:15
In some cases we have multiclass classification, which essentially is when we try to predict more than just two outcomes. In this case, what we're going to do is use the wine dataset and a random forest classifier. If we want to plot this confusion matrix that has multiple classes, we will use a heat map from seaborn instead. So again we compute the confusion matrix with y_test and y_pred, and then we use sns.heatmap on it with annot=True to show the annotations. Then we can add an x label, a y label, and a title, and do plt.show(). That's what I meant by multiclass classification: we're trying to predict and compare three classes, and we get a multiclass confusion matrix here.
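A sketch of that multiclass example (the wine dataset and the split parameters are assumptions based on the three-class description):

```python
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# Load the wine dataset (three classes) and train a random forest
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)

# Draw the 3x3 confusion matrix as an annotated heat map
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
```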
7:37
One last thing to consider here is that the confusion matrix is used whenever we are trying to build a classification report. This is not a tutorial on the classification report, but I'm going to show you how the confusion matrix feeds into it. You import classification_report, and I strongly encourage you to go to my next tutorial on the classification report to understand how it works. But essentially, I just want to show you that you call classification_report(y_test, y_pred), you run this, and you will see that you get a report which shows precision and recall. Precision and recall are calculated from those true positive, false positive, and false negative counts, and this is how the confusion matrix impacts the classification report.
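That last step is one call (again assuming the y_test and y_pred from the examples above):

```python
from sklearn.metrics import classification_report

# Per class: precision = TP / (TP + FP) and recall = TP / (TP + FN),
# both built from the same counts the confusion matrix holds
print(classification_report(y_test, y_pred))
```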
8:42
much please subscribe to my channel and
8:45
visit my website and thank you and see