About sklearn.metrics.average_precision_score documentation

There is an example in the sklearn.metrics.average_precision_score documentation:

    import numpy as np
    from sklearn.metrics import average_precision_score
    y_true = np.array([0, 0, 1, 1])
    y_scores = np.array([0.1, 0.4, 0.35, 0.8])
    average_precision_score(y_true, y_scores)  # 0.83

But when I plot the precision_recall_curve:

    from sklearn.metrics import precision_recall_curve
    import matplotlib.pyplot as plt
    precision, recall, _ = precision_recall_curve(y_true, y_scores)
    plt.plot(recall, precision)

I got this picture: [image: plot of the precision-recall curve]

Why is the area under the precision_recall_curve not 0.83?
















      python scikit-learn scoring






asked 7 hours ago by disney82231
edited 4 hours ago by Ben Reiniger




















          1 Answer
According to the documentation, the value is not exactly the area under the curve; it is
$$\text{AP} = \sum_n (R_n - R_{n-1}) P_n,$$
which is a rectangular approximation.

For your specific example, i.e.

    n   R     P
    1   0.0   1.0
    2   0.5   1.0
    3   0.5   0.5
    4   1.0   0.66

it is calculated as
$$\begin{align*}
\text{AP} &= \overbrace{(0.5 - 0.0)\times 1.0}^{(R_2 - R_1)P_2} + \overbrace{(0.5 - 0.5)\times 0.5}^{(R_3 - R_2)P_3} + \overbrace{(1.0 - 0.5)\times 0.66}^{(R_4 - R_3)P_4} \\
&= 0.5 + 0.00 + 0.33 = 0.83
\end{align*}$$

which is the area under the red curve as illustrated below:

[image: precision-recall plot showing the red (rectangular) and blue (trapezoidal) curves]

compared to
$$\text{AUPR} = 0.5 + \overbrace{\frac{0.5 + 0.66}{2}\times 0.5}^{\text{trapezoid area}} = 0.79$$
which is the area under the blue curve.
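The two sums above can be reproduced in a few lines of plain Python. This is only a sketch: the function names `rectangular_ap` and `trapezoidal_area` are made up for illustration and are not part of scikit-learn. The points are the (R, P) pairs from the table, with the last recall value taken as 1.0 (as in the worked sum) and 2/3 used in place of the rounded 0.66.

```python
# (recall, precision) points ordered by increasing recall.
# Assumption: the last point is (1.0, 2/3), matching the worked sum above.
points = [(0.0, 1.0), (0.5, 1.0), (0.5, 0.5), (1.0, 2 / 3)]


def rectangular_ap(points):
    """Sum of (R_n - R_{n-1}) * P_n over consecutive points."""
    return sum((r - r_prev) * p
               for (r_prev, _), (r, p) in zip(points, points[1:]))


def trapezoidal_area(points):
    """Area under the piecewise-linear curve through the points."""
    return sum((r - r_prev) * (p + p_prev) / 2
               for (r_prev, p_prev), (r, p) in zip(points, points[1:]))


ap = rectangular_ap(points)
aupr = trapezoidal_area(points)
print(f"AP = {ap:.2f}, trapezoidal AUPR = {aupr:.2f}")
# prints: AP = 0.83, trapezoidal AUPR = 0.79
```

The only difference between the two functions is the height used for each recall step: the precision at the right end of the step (rectangle) versus the mean of both ends (trapezoid).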








Comment by Ben Reiniger, 4 hours ago:
Beat me to it; in particular, the documentation states "This implementation is not interpolated and is different from computing the area under the precision-recall curve with the trapezoidal rule, which uses linear interpolation and can be too optimistic."
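The distinction quoted from the documentation can be checked directly on the question's data. The sketch below assumes scikit-learn and NumPy are installed and uses `sklearn.metrics.auc`, which applies the trapezoidal rule, as the interpolated counterpart to `average_precision_score`.

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, _ = precision_recall_curve(y_true, y_scores)

# Non-interpolated average precision, as reported in the question.
ap = average_precision_score(y_true, y_scores)

# Trapezoidal (linearly interpolated) area under the same curve.
aupr = auc(recall, precision)

print(f"average_precision_score = {ap:.2f}, auc(recall, precision) = {aupr:.2f}")
```

On this data the two numbers should come out near 0.83 and 0.79 respectively, matching the hand calculation in the answer.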











Answer by Esmailian: answered 4 hours ago, edited 36 mins ago






