In the wake of my last few posts on diminishing returns in rebounding, a lot of people have suggested looking at how diminishing returns applies to scoring. This is a more complex issue, but I think some of the same methods can be used to try to understand what’s going on in this part of the game of basketball. For rebounding, we were just looking at the relationship of player rebounding to team rebounding. For scoring, we have to look at the relationship of player efficiency and player usage to team efficiency. Diminishing returns for scoring is really just another way of framing the usage vs. efficiency debate which has been going on in the stats community for years. Does efficiency decrease as usage increases? By how much? What, if any, value should be placed on shot creation? Are coaches using anything near to the optimal strategies in distributing shot attempts among their players? Is Allen Iverson wildly overrated? Was Fred Hoiberg criminally underutilized? The big names in basketball stats like Dean Oliver, Bob Chaikin, John Hollinger, Dan Rosenbaum, and Dave Berri have all staked out positions in this debate. For some background, see here and here and here and here and here and so on and so on. A lot of words have been written on this topic.
The major difficulty in studying the usage vs. efficiency tradeoff is a chicken-and-egg problem: does a positive correlation between usage and efficiency mean that players' efficiencies aren't hurt as they attempt to up their usage, or just that in seasons/games/matchups where players are more efficient (for whatever reason) they use more possessions? For instance, if a player is facing a poor defender (which will increase his efficiency), he (or his coach) might increase his usage. But it could be that this positive correlation is drowning out the presence of a real diminishing returns effect. If players go from low usage and low efficiency against good defenders to high usage and high efficiency against poor defenders, it still could be the case that if they tried to increase their usage against average defenders, their efficiency would decrease. Defender strength is just one of the factors that can cloud things - another confound comes from game-to-game or season-to-season variation in a player's abilities (e.g. a player being "hot" or having an "off game", a player being injured or tired, or a player using more possessions as his skills improve from year to year).
By using the method from my last study on diminishing returns for rebounding, it’s possible to largely avoid this chicken-and-egg problem. This method looks at situations in which some or all of the players on the court were forced to increase their usage (relative to their average usage on the season). And on the other side, it looks at lineups in which some or all of the players on the court were forced to decrease their typical usage. By looking at these forced cases the method minimizes the confounds from players increasing or decreasing their usage by choice in favorable situations.
There are a few extra steps involved in applying the method to study diminishing returns in scoring, but the core ideas are the same as for studying rebounding - use player season stats to project each lineup’s performance in a manner that assumes no diminishing returns, and then compare those projections to how the lineups actually fared. The more accurate the projections are, the smaller the effect of diminishing returns.
Applying the method to scoring
First, I needed a way to project the efficiency (points per possession) of a lineup based on the season performances of the five players in the lineup, in a manner that assumed no diminishing returns. To do so I used two statistics created by Dean Oliver, individual offensive rating (ORtg), and individual possessions (Poss). Unlike a composite player rating such as PER, these stats divide up usage (Poss) and efficiency (ORtg) into separate components. Poss measures how many of his team’s possessions a player uses by scoring, missing a shot that’s rebounded by the defense, or turning the ball over. ORtg measures how many points a player produces for each possession that he uses (actually, typically it measures points per 100 possessions, but for this study I am scaling it to points per single possession). For details on calculating these stats, buy Dean’s book Basketball on Paper, or see this article.
For each lineup, I used the ORtg and Poss of all five players to project the lineup’s points per possession. First I converted each player’s Poss into %TmPoss, the percent of their team’s possessions that the player typically uses when on the court.
%TmPoss = Poss/((5*MIN/tmMIN)*tmPoss)
Then I took a weighted average of the five ORtg’s, with the weights being each player’s %TmPoss.
projORtg = (P1ORtg*P1%TmPoss + P2ORtg*P2%TmPoss + P3ORtg*P3%TmPoss + P4ORtg*P4%TmPoss + P5ORtg*P5%TmPoss)/ (P1%TmPoss + P2%TmPoss + P3%TmPoss + P4%TmPoss + P5%TmPoss)
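As a rough illustration, these two formulas can be sketched in a few lines of Python (the function names `pct_tm_poss` and `proj_ortg` are my own, not from the study):

```python
def pct_tm_poss(poss, minutes, tm_minutes, tm_poss):
    """%TmPoss: the share of his team's possessions a player uses
    while on the court, i.e. Poss / ((5 * MIN / tmMIN) * tmPoss)."""
    return poss / ((5 * minutes / tm_minutes) * tm_poss)

def proj_ortg(ortgs, usages):
    """Usage-weighted average of the five players' ORtg's (points per
    possession); this projection assumes no diminishing returns."""
    return sum(o * u for o, u in zip(ortgs, usages)) / sum(usages)
```

Both helpers take ORtg scaled to points per single possession, as in the rest of this study.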
The method assumes that high-usage players will use more possessions than their low-usage teammates, and thus have a greater impact (either good or bad) on the lineup’s overall efficiency. Importantly, this is a way of summing individual ORtg’s to arrive at a team ORtg without assuming any diminishing returns. For lineups composed of low usage players, where the sum of the five players’ %TmPoss is less than one, the projection assumes that the players in the lineup will maintain their efficiencies (ORtg) even though they will be forced to ramp up their usages above their typical level. And on the other side, if the lineup is a collection of high usage players with a summed %TmPoss of more than one, the projection assumes that even though the players will have to decrease their usage, their efficiencies will not increase. I will refer to the sum of the five players’ %TmPoss (which is the denominator of projORtg) as sum%TmPoss:
sum%TmPoss = P1%TmPoss + P2%TmPoss + P3%TmPoss + P4%TmPoss + P5%TmPoss
I then centered sum%TmPoss around 0 by subtracting 1 so that high-usage lineups would have positive values and low-usage lineups would have negative values:
csum%TmPoss = (P1%TmPoss + P2%TmPoss + P3%TmPoss + P4%TmPoss + P5%TmPoss) - 1
For each lineup, I calculated how many points it scored per possession when it was on the court and called this its actORtg.
actORtg = TmPTS/TmPoss
This meant that for every lineup I had three numbers - its projected offensive rating based on the players’ season stats (projORtg), its summed possession usage based on the players’ season stats (csum%TmPoss), and its actual offensive rating in the minutes the lineup played together (actORtg). To determine whether a lineup’s actual performance measured up to its projected performance, I subtracted projORtg from actORtg to get diffORtg. A positive diffORtg indicates that a lineup was more efficient than the no-diminishing-returns model projected it would be, and a negative diffORtg indicates that a lineup was less efficient than projected.
diffORtg = actORtg - projORtg
Here’s an example to see how this works, using one of Portland’s lineups (with data through 2/28 - my lineup data from this study came from BasketballValue and my player data came from Basketball-Reference):
Player              ORtg   %TmPoss
-----------------   ----   -------
Steve Blake         1.10   0.15
Brandon Roy         1.12   0.25
Martell Webster     1.05   0.17
LaMarcus Aldridge   1.07   0.24
Joel Przybilla      1.09   0.12

This lineup played together for 682 possessions and scored 694 points.

sum%TmPoss = 0.15 + 0.25 + 0.17 + 0.24 + 0.12 = 0.94 (using unrounded stats; the rounded figures shown sum to 0.93)
csum%TmPoss = 0.94 - 1 = -0.06
projORtg = (1.10*0.15 + 1.12*0.25 + 1.05*0.17 + 1.07*0.24 + 1.09*0.12)/(0.15 + 0.25 + 0.17 + 0.24 + 0.12) = 1.09
actORtg = 694/682 = 1.02
diffORtg = 1.02 - 1.09 = -0.07
So this was a low-usage lineup (csum%TmPoss of -0.06) that scored less efficiently than was projected (diffORtg of -0.07).
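The arithmetic of that example can be reproduced with a short Python sketch. (I use the rounded table values here, which sum to 0.93 rather than the 0.94 obtained from unrounded stats, so the last digit can differ slightly.)

```python
ortg  = [1.10, 1.12, 1.05, 1.07, 1.09]  # Blake, Roy, Webster, Aldridge, Przybilla
usage = [0.15, 0.25, 0.17, 0.24, 0.12]  # each player's %TmPoss

sum_usage  = sum(usage)        # ~0.93 from rounded inputs
csum_usage = sum_usage - 1     # negative: a low-usage lineup
proj = sum(o * u for o, u in zip(ortg, usage)) / sum_usage
act  = 694 / 682               # points scored / possessions played
diff = act - proj

print(round(proj, 2), round(act, 2), round(diff, 2))  # 1.09 1.02 -0.07
```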
From following that example, you can probably see where this is going. You might be wondering, though, why I went to all that trouble to calculate projORtg when one could just look at the relationship between sum%TmPoss and actORtg to see whether high-usage lineups tend to be more efficient than low-usage ones. The reason that's not a good approach is that low-usage players tend to be less efficient than high-usage ones, so we have to first control for the efficiencies of the players in a lineup (which projORtg and diffORtg do) to really get at whether there's a usage vs. efficiency tradeoff.
So I ran an ordinary least squares regression on all 8116 lineups that had played together through 2/28, using diffORtg as the outcome variable and csum%TmPoss as the predictor variable.
lm(formula = diffORtg ~ csum%TmPoss)
            coef.est coef.se
(Intercept) -0.05    0.01
csum%TmPoss  0.43    0.07
---
n = 8116, k = 2
residual sd = 0.55, R-Squared = 0.004
For this first regression, the coefficient on csum%TmPoss was 0.43, with a standard error of 0.07. The R-squared was 0.004 and the residual standard deviation was 0.55. Finding a positive, statistically significant coefficient suggests that there is a usage vs. efficiency tradeoff, but I was afraid that lineups with very small samples could be skewing the results (e.g. lineups that played together for 2 possessions and scored 5 points for an actORtg of 2.50, or lineups that played together for 5 possessions and scored 0 points for an actORtg of 0.00).
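For readers who want to replicate this kind of fit, the slope and intercept of a one-predictor OLS regression have a simple closed form. This is a generic sketch, not the actual code used for the study:

```python
def ols_slope(x, y):
    """One-predictor OLS slope: cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def ols_intercept(x, y):
    """Intercept: mean(y) - slope * mean(x)."""
    n = len(x)
    return sum(y) / n - ols_slope(x, y) * sum(x) / n
```

Here x would be each lineup's csum%TmPoss and y its diffORtg.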
To deal with this I tried a few different techniques. First, I removed from the regression all lineups that had played together for fewer than 50 possessions. This left 555 lineups. I ran the same regression and got a coefficient on csum%TmPoss of 0.24, with a standard error of 0.08. The R-squared was 0.02 and the residual standard deviation was 0.12.
lm(formula = diffORtg50 ~ csum%TmPoss50)
              coef.est coef.se
(Intercept)    0.00     0.01
csum%TmPoss50  0.24     0.08
---
n = 555, k = 2
residual sd = 0.12, R-Squared = 0.02
The other techniques I tried yielded similar coefficients. I ran a weighted least squares regression on all lineups, using possessions played together as the weights, and found a coefficient of 0.26 with a standard error of 0.04. And I ran an OLS regression on the 244 lineups that had played together for at least 100 possessions and got a coefficient of 0.27 with a standard error of 0.09.
lm(formula = diffORtg ~ csum%TmPoss, weights = TmPoss)
            coef.est coef.se
(Intercept) -0.01    0.00
csum%TmPoss  0.26    0.04
---
n = 8116, k = 2
residual sd = 1.10, R-Squared = 0.01

lm(formula = diffORtg100 ~ csum%TmPoss100)
               coef.est coef.se
(Intercept)    -0.01     0.01
csum%TmPoss100  0.27     0.09
---
n = 244, k = 2
residual sd = 0.08, R-Squared = 0.04
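The possession-weighted fit can be sketched the same way, with each lineup weighted by the possessions it played together. Again, this is an illustrative helper rather than the study's actual code:

```python
def wls_slope(x, y, w):
    """Weighted least squares slope for one predictor, using
    possession counts w as the weights."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx
```

Weighting by possessions downweights the tiny-sample lineups instead of dropping them outright, which is why it gives a similar answer to the 50- and 100-possession cutoffs.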
All this put together suggests a statistically significant positive slope of around 0.25 for csum%TmPoss in relation to diffORtg.
A visual presentation of the usage vs. efficiency tradeoff
To get a sense of how this data looks, I put it into a chart similar to the ones I made for my last study on diminishing returns in rebounding. I divided all the lineups (even those that played together for very few possessions) into bins based on their summed player usages (csum%TmPoss). For each bin, I then took a weighted average of all the contained lineups’ diffORtg’s, using the number of possessions each lineup played together for the weights. That yielded the diffORtg’s that were plotted on the Y-axis. High values on the X-axis represent lineups composed of higher-usage players that were forced to decrease their usage, while low values represent lineups that were low-usage and had to increase their usage. On the Y-axis, high values represent lineups that scored more points per possession than was projected from the season ORtg’s of the five players in the lineup, while low values represent lineups that were less efficient than projected. The dashed gray line represents what we would expect if there were no diminishing returns and no usage vs. efficiency tradeoff - there would be no connection between the usages of the players in a lineup and whether it overperformed or underperformed its projected efficiency level.
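The binning procedure behind the chart can be sketched like this (the bin width and helper name are my own choices for illustration):

```python
from collections import defaultdict

def binned_diffs(csum, diff, poss, width=0.05):
    """Group lineups into csum%TmPoss bins of the given width and take
    the possession-weighted average diffORtg within each bin."""
    num, den = defaultdict(float), defaultdict(float)
    for c, d, p in zip(csum, diff, poss):
        b = round(c / width) * width   # nearest bin center
        num[b] += d * p                # possession-weighted sum of diffORtg
        den[b] += p                    # total possessions in the bin
    return {b: num[b] / den[b] for b in num}
```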
The positive slope of the blue line suggests that there is a usage vs. efficiency tradeoff. In the bottom left, we see that lineups made of low-usage players couldn't maintain their efficiency levels when they were forced to increase their usage. And in the top right, we see that high-usage lineups increased their efficiency when they were forced to decrease their usage. These binned values have a slope of 0.26, similar to those found by regressing the lineups individually. Interestingly, it looks like the usage vs. efficiency tradeoff might have a larger effect on low-usage lineups than high-usage ones. While high-usage lineups did improve their efficiency slightly when decreasing their usage, the increase wasn't as great as the drop for low-usage lineups that were forced to increase their usage.
Interpreting the results
The big finding from these regressions was that there was a positive, non-zero coefficient on csum%TmPoss in relation to diffORtg. This means that lineups made up of lower-usage players (negative csum%TmPoss) fail to score at the rate projected from the players’ season ORtg’s (negative diffORtg), suggesting they can’t maintain their efficiency levels as they are forced to increase their usage. One can infer that generally, when players in a lineup are forced to increase their usage, their efficiency decreases, and when players are forced to decrease their usage, their efficiency increases. This is evidence of a usage vs. efficiency tradeoff, or diminishing returns for scoring.
As far as the specific value of the coefficient, this research suggests (based on not quite one season's worth of data) that it's somewhere around 0.25. This means that a lineup composed of five players with a summed usage of 96% would show a 1 point per 100 possessions drop in their combined efficiency (e.g. 107 projected points per 100 possessions to 106 actual points per 100 possessions) due to having to increase their usage to 100%. And a lineup with a summed usage of 108% would show a 2 points per 100 possessions increase in its combined efficiency as a result of being forced to decrease its usage. In general, for every 1% that a lineup has to increase its usage, its efficiency decreases by 0.25 points per 100 possessions, and vice versa.
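That back-of-the-envelope translation amounts to a one-line function, where the 0.25 slope is the estimate from the regressions above (the helper name is my own):

```python
def projected_diff_per_100(csum_usage, slope=0.25):
    """Expected diffORtg, in points per 100 possessions, for a lineup
    whose summed usage is (1 + csum_usage). Positive csum_usage means a
    high-usage lineup forced to cut its usage, which overperforms."""
    return slope * csum_usage * 100
```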
The low R-squared’s and high residual SD’s suggest that summed player usage doesn’t explain much of why some lineups are more efficient than is projected by a usage-weighted average of the ORtg’s of the five players in a lineup. One reason for this is randomness due to small sample sizes - many lineups just don’t play together for that many possessions and thus their actual ORtg’s aren’t very reliable. Two unanalyzed factors that probably explain a lot of the variance are the defensive level of the opposing lineups faced, and the teammate “fit” of the lineup. The first factor could be added to the model, though there would be some tricky issues to deal with. The second is more interesting - some low-usage lineups might be able to up their usage without a drop in efficiency because of specific positive interactions between the players matched together in those lineups. For more on this idea I recommend a recent non-basketball paper by Dean Oliver that can be found here.
Translating these findings to the player level
If a 1% increase in a lineup's summed usage results in a drop of 0.25 points per 100 possessions, what does this translate to in terms of individual usage and individual efficiency? We can translate this pretty simply by multiplying 0.25 by five, which suggests that for each 1% a player increases his usage, his efficiency drops by 1.25 points per 100 possessions. For example, suppose we had a lineup of five players, all with a %TmPoss of 0.19 and an ORtg of 100 (for this example I'm going to use points per 100 possessions for ORtg). As the lineup increased its usage five percentage points from 0.95 to 1.00, we would expect its ORtg to drop from 100 to 98.75 (a drop of 5*0.25). This could be accomplished by each player's usage increasing to 0.20 and each player's efficiency dropping to 98.75. Thus on the player level, a 1% increase in usage would result in a decrease of 1.25 points produced per 100 possessions.
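The player-level version of the same arithmetic can be sketched as follows (a hypothetical helper, assuming the ~0.25 lineup slope estimated in this study):

```python
LINEUP_SLOPE = 0.25              # pts per 100 poss, per 1% of lineup usage
PLAYER_SLOPE = 5 * LINEUP_SLOPE  # 1.25: the same tradeoff at the player level

def player_ortg_after(ortg_per_100, usage_change_pct):
    """Project a player's ORtg (points per 100 possessions) after his
    usage changes by usage_change_pct percentage points."""
    return ortg_per_100 - PLAYER_SLOPE * usage_change_pct
```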
This ratio of +1% player usage to -1.25 player ORtg (in points produced per 100 possessions) suggests a usage vs. efficiency tradeoff twice the size of that found in a study by Dean Oliver. By looking at game-by-game stats, he found that a 1% increase in usage translated to a drop of 0.6 points produced per 100 possessions. However, I don't think my findings necessarily conflict with his. As discussed above, a big difficulty in studying usage vs. efficiency is separating out the confounding factors that push the correlation between usage and efficiency in the positive direction without ruling out a real diminishing returns effect. These are things such as players using more possessions in games when they are playing better, players using fewer possessions when they are matched up against better defenders, etc. Often these factors will completely drown out any actual diminishing returns, and a positive relationship will be found between usage and efficiency (such as in this study). Dean's study was unique in that it was able to find a negative relationship, but it's still very possible that the confounding factors were present: while they weren't enough to cancel out the diminishing returns effect, they may have obscured its true size. So the fact that my study finds a stronger negative relationship between usage and efficiency could be because it controls for more of those confounding factors by looking specifically at situations in which players were forced to increase or decrease their usage.
Conclusions and follow-up
There are a lot of important ramifications to there being a usage vs. efficiency tradeoff, many of which have been discussed previously. Obviously it has an impact on team construction, lineup construction, play calling, and defensive strategies (Dean’s book has a lot of great stuff on these issues). It also affects player valuation. How much value should be attached to shot creation as a distinct ability from shooting efficiency? John Hollinger’s PER weights it heavily, while Dave Berri’s Wins Produced gives it very little weight. My study suggests that shot creation is an important skill, but more research needs to be done to find out just how much extra credit high-usage players deserve.
Another opportunity for further research would be to alter the study to look just at shooting efficiency. My study used individual possessions to measure player usage, individual offensive rating to measure player efficiency, and team offensive rating to measure team efficiency. But one could instead use player shot attempts (including shooting fouls drawn) for player usage, true shooting percentage (points per shot attempt) for player efficiency, and team true shooting percentage for team efficiency. The main reason I didn't do this first is that it leaves out turnovers, which many have speculated play a large role in the usage vs. efficiency tradeoff. However, it would be nice to do some follow-up studies to better understand how the tradeoff works. So I may try to look just at shooting efficiency in the future, though there would be some added difficulties to doing such a study (for one thing, the lineup data from BasketballValue wouldn't be sufficient because it doesn't track the shots taken and made by each lineup). This could help separate out how much the tradeoff impacts shooting as opposed to how much it impacts turnovers.