
COUNT THE BASKET
Advanced Stats for Basketball
countthebasket@gmail.com

March 6, 2008

Diminishing Returns for Scoring - Usage vs. Efficiency

Posted by Eli in Stat Theory, Studies

In the wake of my last few posts on diminishing returns in rebounding, a lot of people have suggested looking at how diminishing returns applies to scoring. This is a more complex issue, but I think some of the same methods can be used to try to understand what’s going on in this part of the game of basketball. For rebounding, we were just looking at the relationship of player rebounding to team rebounding. For scoring, we have to look at the relationship of player efficiency and player usage to team efficiency. Diminishing returns for scoring is really just another way of framing the usage vs. efficiency debate which has been going on in the stats community for years. Does efficiency decrease as usage increases? By how much? What, if any, value should be placed on shot creation? Are coaches using anything near to the optimal strategies in distributing shot attempts among their players? Is Allen Iverson wildly overrated? Was Fred Hoiberg criminally underutilized? The big names in basketball stats like Dean Oliver, Bob Chaikin, John Hollinger, Dan Rosenbaum, and Dave Berri have all staked out positions in this debate. For some background, see here and here and here and here and here and so on and so on. A lot of words have been written on this topic.

The major difficulty in studying the usage vs. efficiency tradeoff is the chicken-and-egg problem - does a positive correlation between usage and efficiency mean that players’ efficiencies aren’t hurt as they attempt to up their usage, or just that in seasons/games/matchups where players are more efficient (for whatever reason) they use more possessions? For instance, if a player is facing a poor defender (which will increase his efficiency), he (or his coach) might increase his usage. But it could be that this positive correlation is drowning out the presence of a real diminishing returns effect. If players go from low-usage, low-efficiency against good defenders to high-usage, high-efficiency against poor defenders, it could still be the case that if they tried to increase their usage against average defenders, their efficiency would decrease. Defender strength is just one of the factors that can cloud things - another confound comes from game-to-game or season-to-season variation in a player’s abilities (e.g. a player being “hot” or having an “off game”, a player being injured or tired, or a player using more possessions as his skills improve from year to year).

By using the method from my last study on diminishing returns for rebounding, it’s possible to largely avoid this chicken-and-egg problem. This method looks at situations in which some or all of the players on the court were forced to increase their usage (relative to their average usage on the season). And on the other side, it looks at lineups in which some or all of the players on the court were forced to decrease their typical usage. By looking at these forced cases the method minimizes the confounds from players increasing or decreasing their usage by choice in favorable situations.

There are a few extra steps involved in applying the method to study diminishing returns in scoring, but the core ideas are the same as for studying rebounding - use player season stats to project each lineup’s performance in a manner that assumes no diminishing returns, and then compare those projections to how the lineups actually fared. The more accurate the projections are, the smaller the effect of diminishing returns.

Applying the method to scoring

First, I needed a way to project the efficiency (points per possession) of a lineup based on the season performances of the five players in the lineup, in a manner that assumed no diminishing returns. To do so I used two statistics created by Dean Oliver: individual offensive rating (ORtg) and individual possessions (Poss). Unlike a composite player rating such as PER, these stats divide usage (Poss) and efficiency (ORtg) into separate components. Poss measures how many of his team’s possessions a player uses by scoring, missing a shot that’s rebounded by the defense, or turning the ball over. ORtg measures how many points a player produces for each possession that he uses (it’s typically expressed as points per 100 possessions, but for this study I am scaling it to points per single possession). For details on calculating these stats, buy Dean’s book Basketball on Paper, or see this article.
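For readers without the book handy, here’s a crude sketch in R of the idea behind the two stats. These simplified formulas are my own stand-ins, not Dean’s actual ones, which credit assists and offensive rebounds and handle free throws much more carefully:

# Rough idea only - NOT Dean Oliver's actual Poss and ORtg formulas.
simple_poss <- function(fga, fta, tov) {
  fga + 0.44 * fta + tov              # approximate possessions a player uses
}
simple_ortg <- function(pts, fga, fta, tov) {
  pts / simple_poss(fga, fta, tov)    # points per possession used
}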

For each lineup, I used the ORtg and Poss of all five players to project the lineup’s points per possession. First I converted each player’s Poss into %TmPoss, the percent of their team’s possessions that the player typically uses when on the court.

%TmPoss = Poss/((5*MIN/tmMIN)*tmPoss)
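In R, that conversion might look something like this (the argument names are my own placeholders for the player’s individual possessions, his minutes, total team minutes, and total team possessions):

# Share of his team's possessions a player uses while he's on the court.
pct_tm_poss <- function(poss, min, tm_min, tm_poss) {
  on_court <- 5 * min / tm_min        # fraction of team time spent on the court
  poss / (on_court * tm_poss)         # player poss / team poss during that time
}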

Then I took a weighted average of the five ORtg’s, with the weights being each player’s %TmPoss.

projORtg = (P1ORtg*P1%TmPoss + P2ORtg*P2%TmPoss + P3ORtg*P3%TmPoss + P4ORtg*P4%TmPoss + P5ORtg*P5%TmPoss)/
           (P1%TmPoss + P2%TmPoss + P3%TmPoss + P4%TmPoss + P5%TmPoss)

The method assumes that high-usage players will use more possessions than their low-usage teammates, and thus have a greater impact (either good or bad) on the lineup’s overall efficiency. Importantly, this is a way of summing individual ORtg’s to arrive at a team ORtg without assuming any diminishing returns. For lineups composed of low-usage players, where the sum of the five players’ %TmPoss is less than one, the projection assumes that the players in the lineup will maintain their efficiencies (ORtg) even though they will be forced to ramp up their usages above their typical levels. And on the other side, if the lineup is a collection of high-usage players with a summed %TmPoss of more than one, the projection assumes that even though the players will have to decrease their usage, their efficiencies will not increase. I will refer to the sum of the five players’ %TmPoss (which is the denominator of projORtg) as sum%TmPoss:

sum%TmPoss = P1%TmPoss + P2%TmPoss + P3%TmPoss + P4%TmPoss + P5%TmPoss

I then centered sum%TmPoss around 0 by subtracting 1 so that high-usage lineups would have positive values and low-usage lineups would have negative values:

csum%TmPoss = (P1%TmPoss + P2%TmPoss + P3%TmPoss + P4%TmPoss + P5%TmPoss) - 1

For each lineup, I calculated how many points it scored per possession when it was on the court and called this its actORtg.

actORtg = TmPTS/TmPoss

This meant that for every lineup I had three numbers - its projected offensive rating based on the players’ season stats (projORtg), its summed possession usage based on the players’ season stats (csum%TmPoss), and its actual offensive rating in the minutes the lineup played together (actORtg). To determine whether a lineup’s actual performance measured up to its projected performance, I subtracted projORtg from actORtg to get diffORtg. A positive diffORtg indicates that a lineup was more efficient than the no-diminishing-returns model projected it would be, and a negative diffORtg indicates that a lineup was less efficient than projected.

diffORtg = actORtg - projORtg
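Putting those pieces together, here’s a sketch of the per-lineup calculation in R, taking vectors of the five players’ season ORtg’s (in points per possession) and %TmPoss’s plus the lineup’s actual points and possessions (the variable names are mine):

# projORtg, csum%TmPoss, actORtg, and diffORtg for a single lineup.
lineup_numbers <- function(ortg, pct_tm_poss, lineup_pts, lineup_poss) {
  proj_ortg <- sum(ortg * pct_tm_poss) / sum(pct_tm_poss)   # usage-weighted average
  csum      <- sum(pct_tm_poss) - 1                         # centered summed usage
  act_ortg  <- lineup_pts / lineup_poss
  c(projORtg = proj_ortg, csumTmPoss = csum,
    actORtg = act_ortg, diffORtg = act_ortg - proj_ortg)
}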

Here’s an example to see how this works, using one of Portland’s lineups (with data through 2/28 - my lineup data from this study came from BasketballValue and my player data came from Basketball-Reference):

Player             ORtg  %TmPoss
-----------------  ----  -------
Steve Blake        1.10     0.15
Brandon Roy        1.12     0.25
Martell Webster    1.05     0.17
LaMarcus Aldridge  1.07     0.24
Joel Przybilla     1.09     0.12

This lineup played together for 682 possessions and scored 694 points.

sum%TmPoss = 0.15 + 0.25 + 0.17 + 0.24 + 0.12 = 0.93

csum%TmPoss = 0.93 - 1 = -0.07

projORtg = (1.10*0.15 + 1.12*0.25 + 1.05*0.17 + 1.07*0.24 + 1.09*0.12)/(0.15 + 0.25 + 0.17 + 0.24 + 0.12) = 1.09

actORtg = 694/682 = 1.02

diffORtg = 1.02 - 1.09 = -0.07

So this was a low-usage lineup (csum%TmPoss of -0.07) that scored less efficiently than was projected (diffORtg of -0.07).
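Feeding the Portland numbers into the sketch from above reproduces the example:

round(lineup_numbers(ortg        = c(1.10, 1.12, 1.05, 1.07, 1.09),
                     pct_tm_poss = c(0.15, 0.25, 0.17, 0.24, 0.12),
                     lineup_pts = 694, lineup_poss = 682), 2)
#   projORtg csumTmPoss    actORtg   diffORtg
#       1.09      -0.07       1.02      -0.07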

From following that example, you can probably see where this is going. You might be wondering, though, why I went to all that trouble to calculate projORtg when one could just look at the relationship between sum%TmPoss and actORtg to see whether high-usage lineups tend to be more efficient than low-usage ones. That’s not a good approach because low-usage players tend to be less efficient than high-usage ones, so we have to first control for the efficiencies of the players in a lineup (which projORtg and diffORtg do) to really get at whether there’s a usage vs. efficiency tradeoff.

So I ran an ordinary least squares regression on all 8116 lineups that had played together through 2/28, using diffORtg as the outcome variable and csum%TmPoss as the predictor variable.

lm(formula = diffORtg ~ csum%TmPoss)
               coef.est coef.se
(Intercept)    -0.05     0.01
csum%TmPoss     0.43     0.07
---
n = 8116, k = 2
residual sd = 0.55, R-Squared = 0.004

For this first regression, the coefficient on csum%TmPoss was 0.43, with a standard error of 0.07. The R-squared was 0.004 and the residual standard deviation was 0.55. Finding a positive and statistically significant coefficient suggests that there is a usage vs. efficiency tradeoff, but I was afraid that lineups with very small samples could be skewing the results (e.g. lineups that played together for 2 possessions and scored 5 points for an actORtg of 2.50, or lineups that played together for 5 possessions and scored 0 points for an actORtg of 0.00).
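Since % isn’t a legal character in an R variable name, the regression itself would be run on sanitized names - something like this, assuming a data frame lineups with columns diff_ortg, csum_tm_poss, and poss (possessions played together):

# OLS on all lineups; csum%TmPoss in the output above corresponds to
# the csum_tm_poss column here.
fit_all <- lm(diff_ortg ~ csum_tm_poss, data = lineups)
summary(fit_all)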

To deal with this I tried a few different techniques. First, I removed from the regression all lineups that had played together for fewer than 50 possessions. This left 555 lineups. I ran the same regression and got a coefficient on csum%TmPoss of 0.24, with a standard error of 0.08. The R-squared was 0.02 and the residual standard deviation was 0.12.

lm(formula = diffORtg50 ~ csum%TmPoss50)
              coef.est coef.se
(Intercept)   0.00     0.01
csum%TmPoss50 0.24     0.08
---
n = 555, k = 2
residual sd = 0.12, R-Squared = 0.02

The other techniques I tried yielded similar coefficients. I ran a weighted least squares regression on all lineups, using possessions played together as the weights, and found a coefficient of 0.26 with a standard error of 0.04. And I ran an OLS regression on the 244 lineups that had played together for at least 100 possessions and got a coefficient of 0.27 with a standard error of 0.09.

lm(formula = diffORtg ~ csum%TmPoss, weights = TmPoss)
               coef.est coef.se
(Intercept)    -0.01     0.00
csum%TmPoss     0.26     0.04
---
n = 8116, k = 2
residual sd = 1.10, R-Squared = 0.01

lm(formula = diffORtg100 ~ csum%TmPoss100)
               coef.est coef.se
(Intercept)    -0.01     0.01
csum%TmPoss100  0.27     0.09
---
n = 244, k = 2
residual sd = 0.08, R-Squared = 0.04
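For reference, all three of these variants are one-liners in R under the same assumed data frame as before:

# Small-sample workarounds: filter on possessions played together, or
# weight each lineup by its possessions.
fit_50  <- lm(diff_ortg ~ csum_tm_poss, data = subset(lineups, poss >= 50))
fit_100 <- lm(diff_ortg ~ csum_tm_poss, data = subset(lineups, poss >= 100))
fit_wls <- lm(diff_ortg ~ csum_tm_poss, data = lineups, weights = poss)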

All this put together suggests a statistically significant positive slope of around 0.25 for csum%TmPoss in relation to diffORtg.

A visual presentation of the usage vs. efficiency tradeoff

To get a sense of how this data looks, I put it into a chart similar to the ones I made for my last study on diminishing returns in rebounding. I divided all the lineups (even those that played together for very few possessions) into bins based on their summed player usages (csum%TmPoss). For each bin, I then took a weighted average of all the contained lineups’ diffORtg’s, using the number of possessions each lineup played together for the weights. That yielded the diffORtg’s that were plotted on the Y-axis. High values on the X-axis represent lineups composed of higher-usage players that were forced to decrease their usage, while low values represent lineups that were low-usage and had to increase their usage. On the Y-axis, high values represent lineups that scored more points per possession than was projected from the season ORtg’s of the five players in the lineup, while low values represent lineups that were less efficient than projected. The dashed gray line represents what we would expect if there were no diminishing returns and no usage vs. efficiency tradeoff - there would be no connection between the usages of the players in a lineup and whether it overperformed or underperformed its projected efficiency level.
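For the curious, the binned values might be computed along these lines (the bin width and range here are my guesses, not necessarily what the chart used):

# Bin lineups by centered summed usage, then take a possession-weighted
# mean of diffORtg within each bin.
lineups$bin <- cut(lineups$csum_tm_poss, breaks = seq(-0.15, 0.15, by = 0.03))
binned <- sapply(split(lineups, lineups$bin), function(d)
  weighted.mean(d$diff_ortg, w = d$poss))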

The positive slope of the blue line suggests that there is a usage vs. efficiency tradeoff. In the bottom left, we see that lineups made up of low-usage players couldn’t maintain their efficiency levels when they were forced to increase their usage. And in the top right we see that high-usage lineups increased their efficiency when they were forced to decrease their usage. These binned values have a slope of 0.26, similar to those found by regressing the lineups individually. Interestingly, it looks like the usage vs. efficiency tradeoff might have a larger effect on low-usage lineups than high-usage ones. While high-usage lineups did improve their efficiency slightly when decreasing their usage, the increase wasn’t as great as the drop for low-usage lineups that were forced to increase their usage.

Interpreting the results

The big finding from these regressions was the positive, statistically significant coefficient on csum%TmPoss in relation to diffORtg. This means that lineups made up of lower-usage players (negative csum%TmPoss) fail to score at the rate projected from the players’ season ORtg’s (negative diffORtg), suggesting they can’t maintain their efficiency levels as they are forced to increase their usage. One can infer that generally, when players in a lineup are forced to increase their usage, their efficiency decreases, and when players are forced to decrease their usage, their efficiency increases. This is evidence of a usage vs. efficiency tradeoff, or diminishing returns for scoring.

As far as the specific value of the coefficient, this research suggests (based on not quite one season’s worth of data) that it’s somewhere around 0.25. That means a lineup composed of five players with a summed usage of 96% would show a 1 point per 100 possessions drop in its combined efficiency (e.g. 107 projected points per 100 possessions to 106 actual points per 100 possessions) due to having to increase its usage to 100%. And a lineup with a summed usage of 108% would show a 2 points per 100 possessions increase in its combined efficiency as a result of being forced to decrease its usage. In general, for every 1% that a lineup has to increase its usage, its efficiency decreases by 0.25 points per 100 possessions, and vice versa.
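As a quick check of that arithmetic:

# Lineup-level examples from the text, using the ~0.25 slope.
slope <- 0.25
100 * slope * (0.96 - 1)    # -1 point per 100 possessions for the 96% lineup
100 * slope * (1.08 - 1)    # +2 points per 100 possessions for the 108% lineup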

The low R-squared’s and high residual SD’s suggest that summed player usage doesn’t explain much of why some lineups are more efficient than is projected by a usage-weighted average of the ORtg’s of the five players in a lineup. One reason for this is randomness due to small sample sizes - many lineups just don’t play together for that many possessions and thus their actual ORtg’s aren’t very reliable. Two unanalyzed factors that probably explain a lot of the variance are the defensive level of the opposing lineups faced, and the teammate “fit” of the lineup. The first factor could be added to the model, though there would be some tricky issues to deal with. The second is more interesting - some low-usage lineups might be able to up their usage without a drop in efficiency because of specific positive interactions between the players matched together in those lineups. For more on this idea I recommend a recent non-basketball paper by Dean Oliver that can be found here.

Translating these findings to the player level

If a 1% increase in a lineup’s summed usage results in a drop of 0.25 points per 100 possessions, what does this translate to in terms of individual usage and individual efficiency? We can translate this pretty simply by multiplying 0.25 by five, which suggests that for each 1% a player increases his usage, his efficiency drops by 1.25 points per 100 possessions. For example, suppose we had a lineup of five players, each with a %TmPoss of 0.19 and an ORtg of 100 (for this example I’m going to use points per 100 possessions for ORtg). As the lineup increased its usage five percentage points from 0.95 to 1.00, we would expect its ORtg to drop from 100 to 98.75 (a drop of 5*0.25). This could be accomplished by each player’s usage increasing to 0.20 and each player’s efficiency dropping to 98.75. Thus on the player level, a 1% increase in usage would result in a decrease of 1.25 points produced per 100 possessions.
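Or as arithmetic:

# Lineup slope of 0.25 spread across five players.
player_slope <- 5 * 0.25    # 1.25 points per 100 possessions per 1% of usage
100 - player_slope * 1      # each player's ORtg after a 1% usage increase: 98.75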

This ratio of +1% player usage to -1.25 player ORtg (in points produced per 100 possessions) suggests a usage vs. efficiency tradeoff twice the size of the one found in a study by Dean Oliver. By looking at game-by-game stats, he found that a 1% increase in usage translated to a drop of 0.6 points produced per 100 possessions. However, I don’t think my findings necessarily conflict with his. As discussed above, a big difficulty in studying usage vs. efficiency is separating out the confounding factors that push the observed correlation between usage and efficiency upward without actually ruling out a diminishing returns effect. These are things such as players using more possessions in games when they are playing better, players using fewer possessions when they are matched up against better defenders, etc. Often these factors will completely drown out any actual diminishing returns, and a positive relationship will be found between usage and efficiency (such as in this study). Dean’s study was unique in that it was able to find a negative relationship, but it’s still very possible that the confounding factors were present, and while they weren’t enough to cancel out the diminishing returns effect, they may have obscured its true size. So the fact that my study finds a stronger negative relationship between usage and efficiency could be because it controls for more of those confounding factors by looking specifically at situations in which players were forced to increase or decrease their usage.

Conclusions and follow-up

There are a lot of important ramifications to there being a usage vs. efficiency tradeoff, many of which have been discussed previously. Obviously it has an impact on team construction, lineup construction, play calling, and defensive strategies (Dean’s book has a lot of great stuff on these issues). It also affects player valuation. How much value should be attached to shot creation as a distinct ability from shooting efficiency? John Hollinger’s PER weights it heavily, while Dave Berri’s Wins Produced gives it very little weight. My study suggests that shot creation is an important skill, but more research needs to be done to find out just how much extra credit high-usage players deserve.

Another opportunity for further research would be to alter the study to look just at shooting efficiency. My study used individual possessions to measure player usage, individual offensive rating to measure player efficiency, and team offensive rating to measure team efficiency. But one could instead use player shot attempts (including shooting fouls drawn) for player usage, true shooting percentage (points per shot attempt) for player efficiency, and team true shooting percentage for team efficiency. The main reason I didn’t do this first is that it leaves out turnovers, which many have speculated play a large role in the usage vs. efficiency tradeoff. But such a follow-up could help separate out how much of the tradeoff comes from shooting as opposed to turnovers. So I may try to look just at shooting efficiency in the future, though there would be some added difficulties to doing such a study (for one thing, the lineup data from BasketballValue wouldn’t be sufficient because it doesn’t track the shots taken and made by each lineup).

6 Comments »

  1. Once again, impressive work. I would’ve never thought to use a lineup-based approach in a million years. Regardless of your results here (or even the rebounding thing you did before), that method is going to be your enduring contribution to the field. Thanks to Aaron at BasketballValue for providing the data (criminally underused, in my opinion) and to you for going through all this work.

    Comment by edk — March 7, 2008

  2. Thanks, Ed. I don’t claim too much originality though. I’m just building off of ideas from you, Ben F., Cherokee_ACB, and Guy (and Dean Oliver of course).

    I agree that the BasketballValue lineup data is a gold mine that’s only beginning to be explored.

    Comment by Eli — March 7, 2008

  3. See here for more discussion relating to this post:

    http://sonicscentral.com/apbrmetrics/viewtopic.php?t=1679

    Comment by Eli — March 7, 2008

  4. Thanks for the research and sharing it to move the debate and understanding forward.

    The results make sense to me. High usage lineups maintain or get a small boost. Low usage lineups see a larger decline that accelerates as historical usage declines.

    The higher-usage 5-man lineups obviously tend to have more players with high usage, but it might be good to quantify and frame this. How does the rate of presence of players over a given usage level (or levels) change across the curve of all 5-man lineups? Is that changing more or less or roughly equally to the change in total usage of the 5-man lineups?

    Perhaps the data could be broken out for the top-usage guy and the rest of the team, or the top 2 and the rest of the team? The more detail, the more chance to anchor interpretations of what is going on in the data.

    I also took interest in the team average number of different 5-man lineups (about 270) and the number that played 50+ possessions (about 18.5). How much total time does the latter group have compared to the group below 50 possessions? I understand how the vast array of lineups occurs, but I want to know more about the effects of this. Is the efficiency of lineups that played 50+ possessions better than that of lineups with fewer than 50 at the league level? What does the data look like at the team level? Who is best and worst at using “standard” lineups and alternatively “transition” or “situational” or “random” lineups?

    Look forward to hearing what you can say about these things.

    Comment by Mountain — March 8, 2008

  5. To tie in better with what you’ve done (or at least try) I should say (take two)…
    The coefficient of the study with all lineups was bigger than for just 50+ possessions.
    But what does the chart for all lineups look like compared to the 50+ possession chart you presented? How do sections of the curves compare? Does team-level detail show the greater volatility of small-possession lineups fairly evenly across the league, or do good teams / coaches do much better with these 5-man lineups than weak teams / coaches? Is it easier for teams to do OK with their standard lineups and harder to separate yourself from the rest with them, or not? Is the war won more in the complicated details of the rotation game or in the main matchups?

    Comment by Mountain — March 8, 2008

  6. Hello, I don’t understand the way you are interpreting the r-squared value. That term measures the amount of variation explained by the predictor. Based on your r-squared values, this shows there is basically no correlation between the predictor and response. Correct me if I’m wrong, but I think you are mistaking the r-squared term for the p-value.

    Comment by Question — April 2, 2012
