About sklearn.metrics.average_precision_score documentation
There is an example in the sklearn.metrics.average_precision_score documentation:

import numpy as np
from sklearn.metrics import average_precision_score

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
average_precision_score(y_true, y_scores)
# 0.83

But when I plot the precision-recall curve:

from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

precision, recall, _ = precision_recall_curve(y_true, y_scores)
plt.plot(recall, precision)

the resulting plot does not look like it has an area of 0.83 under it. Why is the area under the precision-recall curve not 0.83?

python scikit-learn scoring
asked 7 hours ago by disney82231, edited 4 hours ago by Ben Reiniger
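Since the original plot is not reproduced here, the following short sketch shows what precision_recall_curve returns for this example (the values in the comments are approximate and assume a recent scikit-learn version):

import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

# precision_recall_curve lists the points from high to low recall and
# appends the point (recall=0, precision=1) at the end.
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
print(precision)   # approx. [0.667 0.5   1.    1.  ]
print(recall)      # approx. [1.    0.5   0.5   0.  ]
print(thresholds)  # approx. [0.35  0.4   0.8 ]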
1 Answer
According to the documentation, the value is not exactly the area under the curve; it is

$$\text{AP} = \sum_n (R_n - R_{n-1}) P_n,$$

which is a rectangular approximation.

For your specific example, i.e.

     R     P
1   0.0   1.00
2   0.5   1.00
3   0.5   0.50
4   1.0   0.66

it is calculated as

$$\begin{align*}
\text{AP} &= \overbrace{(0.5 - 0.0)\times 1.0}^{(R_2 - R_1)P_2} + \overbrace{(0.5 - 0.5)\times 0.5}^{(R_3 - R_2)P_3} + \overbrace{(1.0 - 0.5)\times 0.66}^{(R_4 - R_3)P_4} \\
&= 0.5 + 0.00 + 0.33 = 0.83,
\end{align*}$$

which is the area under the red (step) curve in the figure accompanying the original answer, compared to

$$\text{AUPR} = 0.5 + \overbrace{\frac{0.5 + 0.66}{2}\times 0.5}^{\text{trapezoid area}} = 0.79,$$

which is the area under the blue (linearly interpolated) curve.

answered 4 hours ago by Esmailian, edited 36 mins ago
Beat me to it; in particular, the documentation states "This implementation is not interpolated and is different from computing the area under the precision-recall curve with the trapezoidal rule, which uses linear interpolation and can be too optimistic." – Ben Reiniger, 4 hours ago
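As a sanity check of the computation above, here is a minimal sketch (assuming a recent scikit-learn; variable names are ad hoc) that reproduces both numbers: the rectangular sum returned by average_precision_score and the trapezoidal area under the same points.

import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve, auc

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, _ = precision_recall_curve(y_true, y_scores)

# Rectangular (step-wise) sum used by average_precision_score ...
ap = average_precision_score(y_true, y_scores)

# ... reproduced by hand: sum of (R_n - R_{n-1}) * P_n over the step curve.
# recall is returned in decreasing order, hence the minus sign.
ap_manual = -np.sum(np.diff(recall) * precision[:-1])

# Trapezoidal area under the same points (linear interpolation).
aupr_trapz = auc(recall, precision)

print(ap, ap_manual, aupr_trapz)   # ~0.83, ~0.83, ~0.79

# Optional visual comparison of the two curves discussed above:
# import matplotlib.pyplot as plt
# r, p = recall[::-1], precision[::-1]        # sort by increasing recall
# plt.step(r, p, where='pre', color='red')    # rectangles summed by AP
# plt.plot(r, p, color='blue')                # linear interpolation (trapezoids)
# plt.show()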