So for the foreseeable future, I won’t be posting any of my original basketball research on this blog. My old posts will still be here, the links page will still be here (I’ll try to keep it updated), and I still may make occasional blog posts on other topics (perhaps just linking to other people’s work).

Thanks to everyone who has posted comments or sent me emails regarding past posts. Going forward, if anyone has any thoughts or questions about old posts, feel free to comment or email and I’ll try to respond.


Here is all Dan originally said about his method of splitting adjusted plus/minus into offensive and defensive components:

“I also present offensive and defensive ratings that are based on the pure adjusted plus/minus rating plus an “efficiency” rating that measures how many points per possession are scored by both teams when a given player is one the floor. By combining these two measures, I create offensive and defensive ratings. However, given that I am using two imprecisely estimated ratings to arrive at these offensive ratings, I suspect these rating are measured with quite a bit of error.”

I think I’ve been able to piece together what he meant, and the methodology is actually very clever. Instead of doing one regression for offense and one for defense, or one large regression containing both, this method starts with the regression for total adjusted plus/minus and then combines the results of that with a second regression that tells what portion of each player’s value came from their offense versus their defense.

The first step is to run the adjusted plus/minus regression as described in my previous post. The dependent variable is MARGIN (point differential per 100 possessions) and the independent variables are all the players in the league (other than the omitted low-minute reference players). Players are coded 1 if they were on the court for the home team for that observation, -1 if they were on the court for the away team, and 0 if they were not on the court. After the regression is run the player coefficients are re-centered so that the league average adjusted plus/minus is zero.
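The on-court coding and the possession-weighted regression can be sketched in Python (the post itself does this setup in Excel and runs the regression in R). The players, stints, and numbers below are invented for illustration, lineups are shortened to two players, and the intercept and reference players are omitted for brevity:

```python
import numpy as np

# Hypothetical stints: (home lineup, away lineup, MARGIN, possessions).
players = ["P1", "P2", "P3", "P4"]
stints = [
    (["P1", "P2"], ["P3", "P4"], 10.0, 20),
    (["P1", "P3"], ["P2", "P4"], -5.0, 15),
    (["P2", "P4"], ["P1", "P3"], 3.0, 25),
]

col = {p: j for j, p in enumerate(players)}
X = np.zeros((len(stints), len(players)))
y = np.zeros(len(stints))
w = np.zeros(len(stints))
for i, (home, away, margin, poss) in enumerate(stints):
    for p in home:
        X[i, col[p]] = 1       # on the court for the home team
    for p in away:
        X[i, col[p]] = -1      # on the court for the away team
    y[i] = margin
    w[i] = poss

# Possession-weighted least squares: scale each row by sqrt(weight), then OLS.
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
```

After the fit, the coefficients would be re-centered so that the league-average adjusted plus/minus is zero, as described above.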

The second regression is similar to the first, but with two big changes. Instead of using MARGIN as the dependent variable, this one uses what Dan called “Efficiency”, but which I will call “DiffOD”.

DiffOD = 100*(HomePts/HomePoss) + 100*(AwayPts/AwayPoss)

DiffOD measures the difference between a team’s offensive strength and defensive strength. If a lineup is good offensively (high PtsScored/Poss) and bad defensively (high PtsAllowed/Poss), it will have a high DiffOD. If a lineup is good on both offense and defense, its DiffOD will fall in the middle (pulled up by the good offense but pulled down by the good defense; the same holds in reverse for a team that is bad on both ends). If a lineup is bad on offense and good on defense, its DiffOD will be very low (low PtsScored/Poss and low PtsAllowed/Poss). So DiffOD can be seen as offensive strength minus defensive strength.
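A toy numeric check makes the three cases concrete (the efficiencies are made up, chosen to be exact in floating point):

```python
def diff_od(pts_scored_per_poss, pts_allowed_per_poss):
    # DiffOD from the home team's perspective: both teams' scoring rates added.
    return 100 * pts_scored_per_poss + 100 * pts_allowed_per_poss

good_o_bad_d = diff_od(1.25, 1.25)    # 250.0 -> high
good_o_good_d = diff_od(1.25, 0.75)   # 200.0 -> middling
bad_o_good_d = diff_od(0.75, 0.75)    # 150.0 -> low
```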

The other change in this regression is that both the home team’s players and the away team’s players that were on the court are coded as 1, with all players not on the court coded as 0. The reason is that with MARGIN, a lineup of high adjusted plus/minus players is offset by an opposing lineup of high adjusted plus/minus players; with DiffOD, a lineup of good offense/bad defense players combines with an opposing lineup of good offense/bad defense players to produce an even higher DiffOD (more points per possession by both teams).

The coefficients from this regression (once they are re-centered around the league average) give each player’s adjusted DiffOD. A high DiffOD means that controlling for teammates and opponents, that player leads to his team being better offensively than they are defensively. A player with a low DiffOD leads to his team being worse offensively than they are defensively. So in theory, a high DiffOD player contributes more to his team through his offense than his defense. Another way of saying this is that most of his value comes from his offense, or that he’s better on offense than he is on defense.

So now we have each player’s adjusted plus/minus and their adjusted DiffOD. We know their overall contributions, and the proportion of those contributions that came from their offense versus their defense. Now we just have to figure out how to combine these two numbers to calculate their offensive adjusted plus/minus and defensive adjusted plus/minus. The way I figured out to do this (which I’m guessing is the way Dan did it) is actually pretty simple.

First I went back and looked at the lineup-level dependent variables, MARGIN and DiffOD:

MARGIN = 100*(HomePts/HomePoss) - 100*(AwayPts/AwayPoss)
DiffOD = 100*(HomePts/HomePoss) + 100*(AwayPts/AwayPoss)

We can look at these in terms of offensive and defensive efficiency:

MARGIN = ORtg - DRtg
DiffOD = ORtg + DRtg

MARGIN + DiffOD = 2*ORtg, so ORtg = (MARGIN + DiffOD)/2
DiffOD - MARGIN = 2*DRtg, so DRtg = (DiffOD - MARGIN)/2

Converting this back to player adjusted plus/minus and player adjusted DiffOD, we get the following formulas:

Offensive Adjusted Plus/Minus = (Adjusted Plus/Minus + DiffOD)/2
Defensive Adjusted Plus/Minus = (Adjusted Plus/Minus - DiffOD)/2

You’ll notice that adjusted plus/minus and DiffOD flip-flopped in the formula for defensive adjusted plus/minus compared to the relationship between MARGIN and DiffOD in the DRtg formula. This is because while a lineup that plays good defense will have a low DRtg, I wanted to present defensive adjusted plus/minus so that a player who is a good defender has a high (rather than low) value.
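Put as code, the split is a direct transcription of the two formulas (the player and his numbers are hypothetical):

```python
def split_apm(adjusted_pm, diff_od):
    """Split total adjusted plus/minus into offensive and defensive parts.
    The defensive part is sign-flipped so that good defenders score high."""
    offensive = (adjusted_pm + diff_od) / 2
    defensive = (adjusted_pm - diff_od) / 2
    return offensive, defensive

# A hypothetical player worth +6.0 overall with an adjusted DiffOD of +4.0:
off, dfn = split_apm(6.0, 4.0)   # off = 5.0, dfn = 1.0; they sum to 6.0
```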

Using these formulas we can take each player’s adjusted plus/minus and use their DiffOD to split it into offensive adjusted plus/minus and defensive adjusted plus/minus.

The first thing to note about the results is that they are noisy. Just as each estimate of adjusted plus/minus has a margin of error, so does each estimate of DiffOD. By combining these two estimates to get offensive and defensive adjusted plus/minus we are combining the magnitudes of these errors. Furthermore, the same players that have large standard errors for adjusted plus/minus have large standard errors for DiffOD. Thus in addition to there being a large degree of uncertainty regarding Dwight Howard’s overall contributions, there is also a large degree of uncertainty regarding how to divide up those contributions between offense and defense. These errors would be reduced by looking at a larger sample of data, which I hope to do in the future now that I’ve got the methodology down. The alternate methods suggested by Lior and Cherokee_ACB may also result in smaller standard errors.

All of these tables are for the 2007-08 season and include only players who played at least 388 minutes (the cutoff used by BasketballValue). The position designations are those used by Doug’s Stats (I haven’t altered any of them, though some are definitely inaccurate). See here for a spreadsheet with the data for all players.

**Offensive Adjusted Plus/Minus, Top and Bottom Ten:**

The top four are probably the same four that many people would rank as the four best offensive players in the league, in some order. I can’t explain Chris Quinn.

**Defensive Adjusted Plus/Minus, Top and Bottom Ten:**

Ronnie Price is almost certainly a fluke, considering his low minutes and the huge difference between his defensive rating and those of other point guards (as can be seen in the PG defense table below). The Hornets players have some interesting ratings, with West and Stojakovic (see the SF defense table below) looking great defensively while Paul looks awful.

**POINT GUARDS, sorted by Offensive Adjusted Plus/Minus:**

**SHOOTING GUARDS, sorted by Offensive Adjusted Plus/Minus:**

**SMALL FORWARDS, sorted by Offensive Adjusted Plus/Minus:**

**POWER FORWARDS, sorted by Offensive Adjusted Plus/Minus:**

**CENTERS, sorted by Offensive Adjusted Plus/Minus:**

**POINT GUARDS, sorted by Defensive Adjusted Plus/Minus:**

**SHOOTING GUARDS, sorted by Defensive Adjusted Plus/Minus:**

**SMALL FORWARDS, sorted by Defensive Adjusted Plus/Minus:**

**POWER FORWARDS, sorted by Defensive Adjusted Plus/Minus:**

**CENTERS, sorted by Defensive Adjusted Plus/Minus:**

Here are the averages for offensive, defensive, and total adjusted plus/minus by position (weighted by possessions played):

Pos   Off   Def   Tot
---  ----  ----  ----
PG    0.8  -2.0  -1.2
SG    0.5  -0.8  -0.3
SF    0.9   0.2   1.0
PF   -0.8   1.0   0.3
C    -2.1   2.5   0.5

These orderings are similar to those reported by Dan Rosenbaum in 2005. The high offensive adjusted plus/minus average for small forwards may be an anomaly.

That sounds easy enough, but it’s actually kind of complicated, and the specifics of WINVAL were never made public (Mark Cuban reportedly was paying a handsome sum to use the system for the Mavs). Thankfully, in 2004 Dan Rosenbaum spelled out the details of the methodology in an article. He called his version adjusted plus/minus, and released a series of analyses using the metric (here and here). Eventually Dan was hired to consult for the Cleveland Cavaliers, but because he had spelled out the methodology others were able to duplicate his work for future seasons. David Lewin published rankings for the 2004-05 and 2005-06 seasons, and Steve Ilardi and Aaron Barzilai have done the same for the 2006-07 and 2007-08 seasons (up-to-date ratings can be found here).

I always wanted to try to calculate adjusted plus/minus on my own, but I was intimidated. I figured that I didn’t know enough about running regressions and that I didn’t have the data, software, or computing power to run such a large analysis. But I finally sat down and tried to do it a few days ago, and I discovered that it’s not that difficult. Using Dan Rosenbaum’s description of his method, publicly available data from BasketballValue, Excel 2007, and the free statistics program R, I was able to set up and run the whole thing in less than an hour. Here’s how I did it.

I used Excel 2007 to set up the data. Unfortunately, I don’t think an earlier version of Excel (such as Excel 2003) will work because previous versions were limited to 256 columns, and we’ll be working with a big table with over 300 columns (basically one for each player in the league). I haven’t really looked into ways of getting around this issue if you don’t have Excel 2007. It’s probably possible to do the data setup in a database program, in R itself, or through some scripting. If anyone has any ideas on this please let me know.

I used R to run the regression. I used R because it’s free, but I’m sure other programs like SPSS, SAS, Stata or Minitab would work as well. I’m not going to get into the details of installing R, but I will point to the R for Windows FAQ and the download page for the latest version.

Thanks to Aaron Barzilai’s fantastic site BasketballValue, the raw data needed to calculate adjusted plus/minus is readily available in an easy to use format. The key is having play-by-play data grouped into stints by the lineups on the court. The file we want is labeled “List of each matchup of one unit against another” on this page. Download the one from the 2007-08 regular season (matchups20080417.zip). While you’re there also download the “Statistics of all players across all teams played for” file (playerstats20080417.zip). Unzip both of these files.

Open a new Excel spreadsheet. Go to the “Data” tab and select “From Text” in the “Get External Data” section. Navigate to wherever you saved the unzipped matchups200804170110.txt file, select it, and click “Import”. Make sure “Delimited” is selected and click “Next”. Make sure “Tab” is selected and click “Next”. Click “Finish” and then click “OK”. This will import 35459 rows with data in columns A through AU.

In cell AV1 type “HomePlayers” and in cell AW1 type “AwayPlayers”. In AV2 type this formula:

="P"&F2&",P"&G2&",P"&H2&",P"&I2&",P"&J2&","

Double-click on the bottom right corner of the cell to autofill this formula into all of the cells in the column. In AW2 type this formula and then autofill it down:

="P"&K2&",P"&L2&",P"&M2&",P"&N2&",P"&O2&","

These will be used to quickly check whether a given player was on the court for a particular stint.

In AX1 type “MARGIN”. In AX2 type this formula and then autofill it down:

=100*(IF(AH2=0,125002/114854,AD2/AH2)-IF(AI2=0,120811/114896,AE2/AI2))

This is basically PointsScoredHome/PossessionsHome minus PointsScoredAway/PossessionsAway, which gives us the per-possession plus/minus of the lineup that was on the court for the home team. The complications come from stints in which one of the teams had zero possessions. In those cases we substitute the league average home (or away) team points per possession, which is what 125002/114854 and 120811/114896 represent (I got those four numbers by simply summing the PointsScoredHome, PossessionsHome, PointsScoredAway, and PossessionsAway columns). I’m not completely sure this is what should be done when one lineup has zero possessions, but it was the best sense I could make of Rosenbaum’s footnote.
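In Python, the same calculation (with the league-average fallback taken straight from the column sums quoted above) might look like:

```python
# League-average points per possession, from the post's column sums.
HOME_AVG = 125002 / 114854
AWAY_AVG = 120811 / 114896

def margin(home_pts, home_poss, away_pts, away_poss):
    """Per-100-possession margin for the home lineup; when a team had zero
    possessions in the stint, substitute the league-average rate."""
    home_rate = home_pts / home_poss if home_poss else HOME_AVG
    away_rate = away_pts / away_poss if away_poss else AWAY_AVG
    return 100 * (home_rate - away_rate)
```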

At this point you may notice that this new MARGIN column has similar values to the OverallRtgHomevsAway column (AL), which is equal to OffensiveRtgHome (AJ) minus OffensiveRtgAway (AK). The reasons I didn’t just use the OverallRtgHomevsAway column are because it doesn’t have the adjustment for zero possessions detailed above, and because it contains some screwy numbers that I can’t explain. This can be seen in the very first row, where PointsScoredAway is 5 and PossessionsAway is 14, yet OffensiveRtgAway is 33.3333 (rather than 100*(5/14) = 35.7143). I don’t know if this is just an error in Aaron’s data or what, but as a result I decided to calculate things on my own.

Now that we have the dependent variable (MARGIN) for each observation, we have to set up the independent variables. Basically, every player in the league is an independent variable, other than those players who are excluded due to playing low minutes (these players become the reference players relative to which all other players’ adjusted plus/minus is calculated). So we first have to decide what our minutes cutoff will be. In his initial work Rosenbaum used 250 minutes over two seasons as his cutoff. Others have used 500 minutes in one season. For the purposes of this example I will use the cutoff used by Aaron Barzilai and Steve Ilardi of only those players who were in the top 75% in the league in minutes played. For 2007-08 this works out to a cutoff of 388 minutes, with 339 of the 467 players in the league qualifying and the other 128 serving as the reference players (I know this doesn’t come out to exactly 75%, but it’s the players that BasketballValue uses so it’s what I will use).

Here’s where that other file we downloaded comes in handy. Go to Sheet2 of the spreadsheet and then go to the “Data” tab and select “From Text” in the “Get External Data” section. Import the playerstats200804170110.txt file following the same steps as were described above to import the matchups200804170110.txt file. Once this is imported, if you scroll all the way over to the AdjustedPM column (AL), you’ll see that non-qualifying players are listed as “NULL”.

Click on AL1 and sort the column A to Z. Scroll back over and insert a column immediately to the left of column B (right-click on “B” and select “Insert”). In this new column, type “PID” in B1. In B2 type the formula ="P"&A2 and then autofill this down. Scroll back over to column AM and scroll down until you find the last row with a value (before all the NULLs). This should be row 340.

Scroll back over to column B. Click in B340, hold, and drag all the way up to B2. Hold down the Control key and hit C to copy the selected cells. Go back to Sheet1 of the spreadsheet and click in cell AY1. Go to the “Home” tab and click on the arrow below “Paste” in the “Clipboard” section. Select “Paste Special” and then select “Values” in the top section and “Transpose” in the bottom right before clicking “OK”.

This will create the column headers for each independent variable, the 339 qualifying players (it doesn’t really matter what order they’re in, since we can identify them by their BasketballValue PlayerID number). At this point you should definitely save the spreadsheet, if you haven’t been doing so already.

While we’re working with the playerstats data, we’ll set something else up that will come in handy later when running the regression in R. Go back to Sheet2 and insert a column immediately to the left of column C (right-click on “C” and select “Insert”). In C1 type "fit.adj0708 <- lm(adj0708$MARGIN ~" (not including the quotation marks). In C2 type the formula ="adj0708$"&B2&" +" (including all the quotation marks and the space in the middle), and then autofill it down.

Select column C, hit Ctrl-C to copy and then go to the "Home" tab, click the arrow below "Paste", and select "Paste Values". Click in cell C1 and drag all the way down to cell C340 (but no further). Hit Ctrl-C to copy, then go to Sheet3 of the spreadsheet, go to the "Home" tab, click the arrow below "Paste", and select "Transpose". Immediately hit Ctrl-C to copy, then open Notepad (Start -> Run -> notepad) and paste (Ctrl-V).

Scroll up to the top, and select the blank space between the tilde (~) and the “a” in adj0708$P235. Hit Ctrl-C to copy. Go to the “Edit” menu, select “Replace”, and hit Ctrl-V to paste into the “Find what” box (it will look like some blank space got pasted in). In the “Replace with” box type a space, and click the “Replace All” button. Then click “Cancel”.

Click at the very bottom of the file, which should put the cursor at the start of a blank line. Hit “Backspace” three times (which should put the cursor right after adj0708$P732), then type a comma, then a space, and then “weights=adj0708$Poss)” (not including the quotation marks). Save this text file, as we will need it later (if for some reason you can’t get this to work, you can download a completed version of the text file here).

This next step is the most computer processor-intensive of the whole process. For each stint we have to code each player as either being on the court for the home team in that stint (1), on the court for the away team (-1), or not on the court (0). This can be automated by a simple Excel formula, but then that formula has to run on over 12 million cells (339 x 35458). So before you do this, make sure you’ve saved the spreadsheet and that no other programs are running. This is probably a reason why the data setup should be done in a database program like MySQL rather than Excel, but oh well.

In AY2 type the following formula:

=IF(ISNUMBER(FIND(AY$1&",",$AV2)),1,IF(ISNUMBER(FIND(AY$1&",",$AW2)),-1,0))

Click on the bottom right corner of AY2, hold, and drag all the way over to the right until you reach the last column (NY). At this point you could double-click on the bottom right corner of NY2 to autofill down for all the columns, but I wouldn’t recommend it unless you have a very fast computer. Instead, I’d recommend splitting the columns up into chunks depending on your processor speed. Scroll back over to column AY, click in AY2 and drag over to the right about 10 cells. Then double-click on the bottom right corner of the right-most cell you selected and wait for it to autofill. This will give you a sense of how long the process will take.

While Excel is working you might see a green progress bar in the bottom right of the screen, or the screen could appear to freeze with the Excel title bar saying that the program is “Not Responding”. Just ignore this and wait until it completes (if it never does, you can try to abort by pressing the “Break” key on the keyboard (usually it’s near the top right) or by using Ctrl-Alt-Delete to kill Excel). Once the autofill is done, before clicking anywhere else (while all the cells are still selected), hold down the “Control” key and hit “C.” Then go to the “Home” tab and click on the arrow below “Paste” in the “Clipboard” section. Select “Paste Values”. This also may take a little time to complete.

If you did all that and your computer didn’t freeze, you can try a larger chunk. For me, 50-column wide chunks worked well. I selected the next 50 cells in row 2, autofilled them down (which took about 30 seconds), copied and pasted values (another 10 seconds), saved the whole file, and then went on to the next 50 columns. If 50 columns seems like pushing it, you could do 25 at a time. Overall there are 339 columns that have to be autofilled. Once you finish and save the file the filesize will be almost 50 megabytes.

If your computer doesn’t agree with all that, you could set up the data in a database program, or just download this file I created that has all the work done for you (including the step described in the next paragraph).
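The Excel formula above appends a trailing comma to each "P"&id string so that, say, P7 can’t falsely match inside P72. In a scripting language the same coding step is a plain membership test, and using sets avoids the substring pitfall entirely (the player IDs below are invented):

```python
def code_player(player_id, home_lineup, away_lineup):
    """1 if on the court for the home team, -1 for the away team, else 0."""
    if player_id in home_lineup:
        return 1
    if player_id in away_lineup:
        return -1
    return 0

home = {"P101", "P22", "P7", "P450", "P39"}
away = {"P8", "P15", "P203", "P77", "P301"}
codes = [code_player(p, home, away) for p in ("P22", "P203", "P999")]  # [1, -1, 0]
```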

Now we have all the values for the independent variables. All we have to do now is clean up the spreadsheet and save the data we need to a CSV file. First select the MARGIN column (AX), hit Ctrl-C to copy, and go to Paste Values to paste (as described above). Next, select column AY, right-click on the AY heading, and select “Insert” to insert a column. In AY1 type “Poss”. In AY2 type the formula =AH2+AI2, and then autofill down. Select this newly created column, hit Ctrl-C to copy, and go to Paste Values to paste. This created a column of cells telling us how much each stint should be weighted (which is by the total number of possessions).

Save the spreadsheet. Then select all the columns from A to AW, right-click in one of the column headings (e.g. A) and select “Delete”. Do NOT click save after this. Instead, go to “Save As” -> “Other Formats” and in the “Save as type” dropdown select “CSV (Comma delimited)”. Name the file “adj0708” and click “Save”. Excel will warn you that you can’t save multiple sheets, click “OK”. It will then warn you that there may be incompatible features, click “Yes”. You can now exit Excel (it will ask you whether you want to save changes to the CSV and you should say “No”). The CSV you created should be around 25 megabytes. Again, here’s a download link to the CSV that I created by following these steps - you can use it as a point of comparison if you want.

Now we have to import the CSV we created into R. To do this you’ll need the full path to wherever you saved the CSV file. The only trick is that when entering the path in R you should replace all the back-slashes (\) with forward-slashes (/). Enter the following command into R (replacing the path as necessary) to load the data into a table called “adj0708” (it may take a minute to read the file):

adj0708 <- read.csv(file="c:/directoryname/adj0708.csv")

Next we have to enter the details of the regression we want to run. The basic format for using the lm() function is this:

RegressionName <- lm(TableName$DependentVariableColumn ~ TableName$IndependentVariable1Column + TableName$IndependentVariable2Column + ... , weights = TableName$WeightsColumn)

Here’s where that text file we created earlier comes in handy. It contains the exact command to enter into R to run the regression. Open the file, hit Ctrl-A to select all and Ctrl-C to copy. Go to the R command prompt and hit Ctrl-V to paste. You will see it paste everything in character by character. Once it has finished, hit enter, and the regression will run. This should take under a minute.

Once that’s complete, enter the following two commands to output the results of the regression to a file called results.txt in the R files directory:

sink("files/results.txt")
summary(fit.adj0708)

Open the results.txt file, which should be in your R files directory. First you’ll see the formula for the regression. Next is the coefficients section. Select this whole section (starting at the beginning of the line below “Coefficients:” and ending just before the “—” line) and copy it. Open a new Excel spreadsheet and paste in the selection. Go to the “Data” tab and select “Text to Columns” in the “Data Tools” section. Make sure “Fixed width” is selected and click “Next”. In the Data preview section insert a line by clicking just after the right parenthesis in Pr(>|t|). Click “Next”, then click “Finish”. In A1 type “PlayerCode” and in F1 type “Signif”.

Go to Sheet2 of the spreadsheet, then go to the “Data” tab and select “From Text” in the “Get External Data” section. Navigate to the playerstats200804170110.txt file that we downloaded and unzipped earlier and click “Import”. Make sure “Delimited” is selected and click “Next”. Make sure “Tab” is selected and click “Next”. Then click “Finish” and then “OK”. Go back to Sheet1 of the spreadsheet. Right-click on the “B” column heading and select “Insert”. In B1 type “Player”. In B2 type the following formula and then autofill it down:

=VLOOKUP(VALUE(RIGHT(A2,LEN(A2)-9)),Sheet2!A:AM,2,FALSE)

Click on cell C1 and then sort Z to A. This will sort all players by their adjusted plus/minus. The magnitudes will be larger than those on BasketballValue because BasketballValue centers its results around the league average. To do the same, first right-click on the “C” column heading and select “Insert”. Type “Minutes” in C1. In C2 type this formula and then autofill down:

=IFERROR(VLOOKUP(VALUE(RIGHT(A2,LEN(A2)-9)),Sheet2!A:AM,4,FALSE),"")

Next right-click on the “E” column heading and select “Insert”. Type “Centered” in E1. In E2 type the formula =D2-SUMPRODUCT(D$2:D$341,C$2:C$341)/SUM(C$2:C$341) and then autofill it down (sumproduct(adj,min)/sum(min) is the minutes-weighted league average). This gives the final results, which should be close to the listings on BasketballValue (which are also listed in column AL on Sheet2). I’m not sure why the numbers aren’t identical - it may have something to do with the weird values in OverallRtgHomevsAway that I discussed earlier. Regardless, here’s the top ten that I got (which hopefully matches your results if you followed the steps outlined):

Player           Adj +/-  StErr
---------------  -------  -----
Amir Johnson       13.57   6.53
Dwight Howard      12.71  10.12
Ronnie Price       12.18   6.73
Antawn Jamison     11.89   5.00
Thaddeus Young     11.86   3.96
Peja Stojakovic    11.85   5.52
Chris Bosh         11.64   4.13
Kobe Bryant        11.60   5.16
Manu Ginobili      11.56   3.63
Jamal Crawford     10.08   4.71
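The minutes-weighted centering done by the SUMPRODUCT formula above can be sketched in Python (the coefficients and minutes below are invented):

```python
coefs = [13.57, 12.71, -2.0, 1.5]    # raw regression coefficients
minutes = [800, 2500, 1200, 1900]    # minutes played by each player

# Minutes-weighted league average, then subtract it from every coefficient.
league_avg = sum(c * m for c, m in zip(coefs, minutes)) / sum(minutes)
centered = [c - league_avg for c in coefs]
```

By construction, the minutes-weighted mean of the centered values is zero.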

I need to figure out how to separate this into offensive and defensive adjusted plus/minus. I’m not sure how to do this. Running a different regression using ORtg (or DRtg) as the dependent variable instead of MARGIN would seem to make sense, but the problem is that the independent variables would have to be a mix of player adjusted offensive +/-’s and opponent adjusted defensive +/-’s (but of course these would have to change for different observations). Using MARGIN as the dependent variable but doubling the number of independent variables (with each player having one entry for their offense and one for their defense) doesn’t seem to work either. Dan Rosenbaum’s original article suggested that he combined adjusted plus/minus with efficiency ratings to get separate offensive and defensive ratings, but I’m not sure exactly what he meant by that. Anyway, if anyone has any ideas, let me know. Once I figure out how to split things up I should also be able to calculate adjusted rebound rates (really adjusted on-court team offensive and defensive rebounding percentages). BasketballValue has the data needed to calculate those but again I’m not sure of the methodology.

I also hope to regress boxscore stats (and some other advanced stats) onto these adjusted plus/minus results to calculate my own weights for statistical plus/minus.

I took a look at seven popular player ratings. Two basic linear weights metrics based on boxscore stats - John Hollinger’s Player Efficiency Rating (PER), and Dave Berri’s Wins Produced (WP). Two metrics built on Dean Oliver’s individual offensive and defensive ratings - Justin Kubatko’s Win Shares (WS), and Davis21wylie’s Wins Above Replacement Player (WARP). And three plus/minus metrics based on team point differential while the player is on the court - Roland Beech’s Net Plus/Minus (Net +/-), Dan Rosenbaum’s Adjusted Plus/Minus (Adj +/-), and Dan Rosenbaum’s Statistical Plus/Minus (Stat +/-). For the purposes of comparison I looked at the per-minute (or per-possession) versions of all these metrics (e.g. WP48 instead of WP, WSAA/48 instead of WSAA, WARPr instead of WARP).

Using data from Basketball-Reference and Doug’s Stats I calculated PER and Wins Produced on my own, so the values may differ slightly from those you’ve seen elsewhere (I should note here that Wins Produced has a position adjustment that sets the average guard’s rating equal to the average big man’s rating, a feature (or bug?) which is not present in any of the other systems). For Win Shares and WARP I got this year’s ratings from Basketball-Reference and this APBRmetrics thread, respectively. For Win Shares I converted Win Shares Above Average (WSAA) to a per-48-minute rating (I was able to duplicate the calculations for Win Shares on my own but I wasn’t sure how to calculate Loss Shares). For Net +/- and Adjusted +/- I used data from BasketballValue, and I calculated Statistical +/- on my own. For all metrics other than Adjusted +/- players who played for multiple teams in the season did not have their stats combined but instead had each stint looked at separately.

First, the top and bottom 10 in each boxscore-based rating system among players who played at least 500 minutes in 07-08:

Next, the top and bottom 10 in each plus/minus-based rating system among players who played at least 500 minutes in 07-08:

Averaging how each player was ranked by all seven metrics, here is a consensus top and bottom ten, along with each player’s rank in each metric:

One thing that jumps out is that despite being rated the first or second best player in the league by five of the seven rating systems, Chris Paul is not among the consensus top ten. He dropped to 14th overall due to his very mediocre Adjusted Plus/Minus ranking (154th out of 329). Exactly why he rated so low in this metric has been the topic of some recent debate. Amare Stoudemire was another player who ranked much lower in Adjusted Plus/Minus (194th) than in the other metrics.

Another eye-popper is seeing Amir Johnson, the 21-year-old Detroit power forward who’s been riding the pine in the playoffs, ranked first in the league in Adjusted Plus/Minus. This actually isn’t as great an anomaly as might be expected - Johnson rated rather well across the board. His consensus ranking was 15th. He was rated lowest by PER (64th), but he ranked 11th in Win Shares and 20th in Statistical Plus/Minus. Obviously one has to use some caution considering he played under 800 minutes on the season, but the fact that he rated well in several metrics could be a good sign for the future.

To further examine the rating systems, for each one I wanted to see which players it liked better (or worse) than the other systems. To do so I found the difference between each player’s ranking in that system and his average ranking in the six other systems. In the chart below if a player is ranked very high in the given metric but much lower in the other six, then he will appear near the top of the list, as a player that that metric “likes” more than other metrics do (e.g. PER likes Kevin Durant a lot more than other systems, but doesn’t like Shane Battier as much as other systems). This could be seen as a list of players that the metric overrates (or underrates, if you’re looking at the bottom players) relative to other rating systems. Or, if you think the players at the top of a list tend to be underrated (statistically), then maybe that metric is the one for you.

Win Shares is highly tied in to team wins, and that can be seen clearly from how highly it rates role players from great teams and how poorly it rates stars from awful teams. One can make corresponding diagnoses for the other metrics based on these lists, in terms of systems over- or under-rating usage, rebounding, scoring, etc.

Next, expanding on the Chris Paul example from above, here are the players that the rating systems are either in greatest agreement on, or in greatest dispute over. To calculate this I just took the standard deviation of the players’ rankings in the seven metrics. All the rating systems agree that Manu Ginobili is pretty good and Acie Law is pretty bad, but they can’t agree on whether Al Jefferson is one of the best or one of the worst players in the league.

One thing that stands out on the last chart is that some of the metrics seem to group together in their evaluation of players. Net Plus/Minus and Adjusted Plus/Minus both rated Casey Jacobsen much better than the other five metrics, and they both rated Al Jefferson much worse. To quantify how much each player rating is in agreement with each other rating, I calculated the correlation coefficient between each metric for all players who played at least 500 minutes last season (here again it should be noted that Adjusted Plus/Minus was calculated using season totals even for players who changed teams during the season, unlike the other metrics which split things up).

Here we can see that Net +/- and Adjusted +/- are similar to one another (correlation of 0.73) but very different from the boxscore-based ratings. Statistical +/-, which is meant to be a boxscore-based estimation of Adjusted +/-, does estimate it better than the other boxscore metrics with a correlation of 0.49, but also correlates pretty strongly with those boxscore metrics. WARP is somewhat surprisingly very highly correlated with PER (0.93), perhaps due to the weight both place on usage.

I’ll try to put together a spreadsheet so anyone can download all this data soon. Until then I’d be interested to hear any interpretations of these charts that people have.

]]>Most of the equations in this post are not my original work but instead were taken from various sources. I’ve tried to compile them all into one place and in a fairly logical order that can benefit both a newcomer to the topic as well as those with more advanced knowledge looking for a refresher. The main sources are various posts and comments by Tangotiger, MGL, and others on The Book blog, Andy Dolphin’s appendix to The Book, and the Social Research Methods site. Throughout this post I will link to several specific pages that are of relevance. I would also recommend two excellent introductions to regression to the mean by Ed Küpfer and Sal Baxamusa.

Regression to the mean is rooted in true score theory (aka classical test theory). The basic idea is that a player’s observed performance over some period of time (as measured by a statistic like field-goal percentage) is a function of [1] the player’s true ability or talent in that area and [2] a random error component. It should not be forgotten that this is a simplified model, and it leaves a lot of stuff out (team context, for one).

Observed measure = true ability + random error

A player’s true ability can never be known; it can only be estimated. A player’s observed rate is the typical estimate that is used (i.e. we assume a player with a 40% three-point percentage is a “40% three-point shooter”), but by using regression to the mean we can get a better estimate. This is done by combining what we know about how the individual fares in a particular metric with what we know about how players generally fare in that metric.

The first step is to convert the true score model from the individual level to the group level by looking at the spread (or variance) of the distribution of many players’ stats:

var(obs) = var(true + rand)

...but since the errors are by definition random, they aren't correlated with true ability, so...

var(obs) = var(true) + var(rand)

If you look at the field-goal percentages of a group of players, some of the variation would be from the differing shooting abilities among the players, and some would come from the differing amounts of random luck each player had. As the equation shows, the overall variance (the standard deviation squared) of players’ observed rates is equal to the sum of the variance of their true rates and the variance of the random errors.
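This decomposition can be made concrete with a quick simulation (purely illustrative; the player count, talent spread, and attempt totals are made-up numbers):

```python
import random

random.seed(0)

N_PLAYERS = 2000   # simulated shooters
ATTEMPTS = 500     # shot attempts each

# Give each player a true FG% centered on .45 with some talent spread
true_rates = [random.gauss(0.45, 0.05) for _ in range(N_PLAYERS)]

# Observed FG% = true ability plus binomial luck over ATTEMPTS shots
obs_rates = [sum(random.random() < p for _ in range(ATTEMPTS)) / ATTEMPTS
             for p in true_rates]

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

mean_rate = sum(true_rates) / N_PLAYERS
var_true = var(true_rates)
var_rand = mean_rate * (1 - mean_rate) / ATTEMPTS  # binomial randomness
var_obs = var(obs_rates)

# var(obs) should come out close to var(true) + var(rand)
print(var_obs, var_true + var_rand)
```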

The next important concept is reliability. The more reliable a stat is, the less affected it is by randomness, and the better it captures a player’s true ability. If a metric stays consistent when looking at one sample of a player’s performance and comparing that to another sample from the same player (for instance, one season compared to the next), that metric is reliable (assuming that the player’s true ability was the same for both samples). Below I will use the correlation coefficient symbol “r” to represent reliability because what we’re really looking at are correlations. Reliability is a specific type of correlation though - the correlation between a measure and itself.

There are many different ways of calculating reliability. Here are some, divided into empirical and mathematical:

**Empirical methods:**

- Year-to-year correlation
- Split-half correlation
- Cronbach’s alpha
- Intraclass correlation

**Mathematical methods:**

- Derivation from var(rand) and var(obs)
- Tango’s regression equation method
- Tango’s z-scores method
- Andy Dolphin’s mathematical derivation

Year-to-year correlations (YTY r’s) are the most basic and widespread method. The idea is simple - for some group of players, find the correlation between how those players performed in that stat in one season with how they performed the next season. Assuming players’ abilities don’t change much from year to year, and that luck isn’t correlated from year to year (that is, that a player having good luck one season doesn’t make him more likely to also be lucky the next year), then if a stat has a low year-to-year correlation this must be because it is picking up more of the fluctuating randomness and less of the consistent true ability, and vice versa. This is the basis of how correlating a metric to itself gives a measure of reliability.

There are some issues with year-to-year correlations. For one thing, the assumption that player abilities are stable from one season to the next is surely false. Player improvement and decline (as well as other context changes between seasons) are added sources of season-to-season fluctuation, and thus YTY r’s may underestimate a stat’s reliability. To deal with this one can instead use split-half correlations (like this). Instead of correlating one season to the next, these split a season in half and correlate one half to the other. This could be all games from the first half of the season compared to all games from the second half, or better yet, performance in odd-numbered games compared to even-numbered games (or even odd-numbered shot attempts vs. even-numbered ones). The same idea can be used on a larger scale, comparing performance in odd-numbered years to even-numbered years. Cronbach’s alpha is basically split-half correlations to the extreme - instead of just comparing one subset of 41 games (like odd-numbered ones) to the other 41, it looks at every possible subset of half the season compared to the counterpart half, and averages all these correlations. It’s similar to intraclass correlation, a technique that Pizza Cutter often uses on his blog.

To this point I’ve been sidestepping a major issue in calculating reliability. This is the fact that reliability is dependent on sample size (here meaning the number of opportunities or attempts each player had for the given stat). The year-to-year correlation for a stat where players have only a few dozen opportunities per season will likely be much lower than the correlation for a stat where players have thousands of opportunities in a season. To see why this is so it is useful to look at the mathematical calculation of reliability.

As discussed above, reliability measures how much a metric captures true ability vs. random error. It is really just a measure of the percentage of the total observed variance that comes from variance in true ability. (Again I’ll use r to represent reliability, though here it should be noted that we’re not really dealing with a correlation.)

r = var(true)/var(obs)
r = var(true)/(var(true) + var(rand))

Here we can clearly see that reliability depends on var(true) and var(rand). var(true) is the spread of true talent in the metric’s skill area in the population. The more variation in skill, the greater the reliability. The more random error, the smaller the reliability. But we can delve deeper than this and calculate the value of var(rand). If the stat we’re looking at is a binomial opportunity rate, meaning that each opportunity/trial/attempt has two possible outcomes - success or failure, then we can use the binomial distribution to rather easily calculate var(rand) for a group of players if we know the average rate of the group and the average number of opportunities for each player (this can also be done through a more complicated process for multinomial stats - see Andy Dolphin’s appendix to The Book or this post by Ed Küpfer). Here is the formula for binomials:

var(rand) = PopMeanRate*(1 - PopMeanRate)/PopMeanOpps

The basic idea (on an individual player level) behind binomial randomness is that on each opportunity we are guaranteed an error of either PlayerTrueRate or (1 - PlayerTrueRate). For each shot attempt, a true 40% shooter will either make it (100% FG%, error of .6) or miss it (0% FG%, error of .4). After two attempts, the player with 40% skill must either be shooting 0%, 50% or 100%. As the number of opportunities increases, the observed rate will approach the true rate as the effect of randomness decreases (for a full derivation of this equation see this paper). So now we can see how sample size affects reliability. Fewer opportunities means a higher var(rand), and a higher var(rand) means a lower reliability.
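The sample-size effect is easy to see numerically. In this sketch, the talent spread var(true) = 0.0025 (a true-talent SD of 5 percentage points) is an assumed number chosen for illustration, not an empirical one:

```python
def binomial_var_rand(pop_mean_rate, pop_mean_opps):
    # var(rand) = PopMeanRate*(1 - PopMeanRate)/PopMeanOpps
    return pop_mean_rate * (1 - pop_mean_rate) / pop_mean_opps

def reliability(var_true, var_rand):
    # r = var(true)/(var(true) + var(rand))
    return var_true / (var_true + var_rand)

VAR_TRUE = 0.0025  # assumed spread of true FG% talent (SD of 5 points)
for opps in (50, 200, 1000):
    r = reliability(VAR_TRUE, binomial_var_rand(0.45, opps))
    print(opps, round(r, 3))  # reliability climbs as attempts pile up
```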

(I’m actually unsure about what the exact formula for var(rand) should be. If every player in the population has the same number of opportunities, the one listed above is fine. But once there is some variation in opportunity levels between players, it’s unclear to me whether one should use the arithmetic mean of each player’s opportunities or the harmonic mean (as discussed here), or something else. I’ve tried different things and none seem to work perfectly. This isn’t a big deal when the players have had a similar number of opportunities.)

So when using empirical correlations to measure reliability, one must be cognizant of the number of opportunities each player has had for the given stat (this important point has been emphasized by Tango and MGL on their blog). Ideally, the group of players looked at should all have had a similar number of opportunities, as the reliability of a stat for players with 100 opportunities will be different from the reliability for that same stat for players with 1000 opportunities.

Going back to where we left off above, we know that reliability is the ratio of var(true) to var(obs), but we can’t calculate var(true) directly, so some manipulation is needed.

r = var(true)/var(obs)
r = (var(obs) - var(rand))/var(obs)

This gives us a way to get reliability from var(rand), which we’ve already seen can be calculated from the population mean rate and mean opportunity level, and var(obs), which is simply the variance of the observed rates of the players in the population. Again we have to be careful that the players we look at have a similar number of opportunities, but the advantage of this method is that we don’t have to worry about dealing with multiple seasons (as in year-to-year correlations) or splitting data up by game (as in split-half correlations).
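This calculation is simple enough to sketch in code. Here it is applied to four hypothetical shooters (a toy illustration, not a validated implementation):

```python
def reliability_from_observed(obs_rates, opps):
    """r = (var(obs) - var(rand))/var(obs) for a group of players who
    each had roughly `opps` binomial opportunities."""
    n = len(obs_rates)
    mean_rate = sum(obs_rates) / n
    var_obs = sum((x - mean_rate) ** 2 for x in obs_rates) / n
    var_rand = mean_rate * (1 - mean_rate) / opps
    return (var_obs - var_rand) / var_obs

# Four hypothetical shooters at .30/.40/.50/.60 on 100 attempts each
print(reliability_from_observed([0.30, 0.40, 0.50, 0.60], 100))
```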

Once we know the reliability of a statistic, we can use this to estimate the true ability of individual players through regression to the mean. The more reliable a metric is, the more confidence we have that a player’s true talent level is near the observed rate that he produced. But if reliability is low, there is a greater chance that the observed rate is a poor representation of true ability. So we can put something like a confidence interval around observed rates, with rates accumulated from more opportunities (lower var(rand), higher reliability) having thinner intervals and rates accumulated from fewer opportunities (higher var(rand), lower reliability) having wider intervals. But regression to the mean takes this a step further by taking into account the other half of reliability, var(true), the spread of true talent in the population.

This is where things get Bayesian. We don’t just know that player X shot 40% on 100 three-point attempts. We know more about player X - he’s an NBA player, he’s a shooting guard, he’s a starter, etc. And we know more about three-point shooting - it doesn’t vary that much among starting SG’s in the NBA, starting SG’s on average shoot Y% from three, etc.

Once we pick the population to regress toward (more on that later), we have two estimates of the player’s true ability - the player’s observed rate, and the mean rate of the population. How do we weight each of these estimates to arrive at one best guess? By using the metric’s reliability. As discussed above, if the player’s observed rate was produced from a small number of opportunities (high var(rand), low reliability), we put less weight on it. But now we have an additional piece of information to use as well - the spread of talent in the population. If players vary little in the stat, large deviations from the mean are more likely to be random flukes, and thus the player’s observed rate should be given less weight. On the other hand, if there is large skill variation between players, then an extreme observed rate deserves less skepticism, and should be given more weight. As we’ve seen, the calculation of reliability takes into account both of these factors - it increases as var(true) increases and it decreases as var(rand) increases. So we weight the player’s observed rate by the stat’s reliability, and the population mean by one minus the reliability. Plugging this into the formula for a weighted average (WeightedAvg = (Measure1*Weight1 + Measure2*Weight2)/(Weight1 + Weight2)), we get the following:

Regressed rate = (PlayerObsRate*r + PopMeanRate*(1 - r))/(r + 1 - r)
Regressed rate = PopMeanRate + r*(PlayerObsRate - PopMeanRate)

In other words, we regress (1 - r) percent of the way to the mean (treating r as a percentage rather than a decimal). If the reliability is .8, then we regress 20% of the way from the player’s observed rate to the population mean to arrive at our estimate of true ability. A 50% shooter from a population with a mean of 40% would be regressed to 48% (40 + .8*(50 - 40)).
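The shortcut version is one line of code; reproducing the worked example:

```python
def regress_to_mean(obs_rate, pop_mean, r):
    # Slide (1 - r) of the way from the observed rate back to the mean
    return pop_mean + r * (obs_rate - pop_mean)

# The 50% shooter from the text: population mean .40, reliability .8
print(round(regress_to_mean(0.50, 0.40, 0.8), 2))  # 0.48
```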

I’m not sure whether Tango came up with this or whether it was Andy Dolphin or MGL. I know Andy wrote the math-heavy appendix to The Book which outlines the full method of regressing to the mean (discussed below), but I don’t remember this shortcut equation from The Book and I know I’ve seen it a lot in Tango’s blog posts (like this one).

As mentioned above, it’s not the case that there is one reliability figure for each metric. Reliability is metric specific and opportunity specific - the r for 3PT% for players with 50 three-point attempts is different than the r for 3PT% for players with 200 three-point attempts (even if we’re regressing to the same population). But when calculating reliability by the methods previously described (whether empirical or mathematical), one must use a specific opportunity level and thus arrive at a result that is specific to that level. This formula allows one to generalize from such a result and use that one r to generate other r’s for different opportunity levels for that statistic.

The first step is to calculate reliability for a specific opportunity level (it doesn’t matter the method). Call this opportunity level KnownOpps and the reliability KnownR. These are used to calculate a constant value specific to that metric that can then be put into a general formula allowing one to calculate r for any opportunity level.

constant = (1 - KnownR)*KnownOpps/KnownR
constant = KnownOpps*var(rand)/(var(obs) - var(rand))
General r = opps/(constant + opps)
General (1 - r) = constant/(constant + opps)

New r’s calculated from this equation can then be plugged into the regressed rate equation from above to regress the stat for players with different opportunity levels. The result is identical to taking a weighted average of a player’s observed rate and the population mean using opps for the player weight and constant for the population weight.

Regressed rate = (PlayerObsRate*opps + PopMean*constant)/(opps + constant)

Another way to think about this is in terms of adding a constant number of opportunities at the population mean to the player’s observed rate. If the player’s observed rate was .5 in 100 opportunities (50/100), the population mean is .4, and the constant for that metric was determined to be 60, then add 60 attempts at 40% (24/60) to the player’s 50/100 to arrive at the regressed rate of (50 + 24)/(100 + 60) = .4625.
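Here’s a sketch of the whole pipeline - derive the constant from one known reliability, generalize r to other opportunity levels, and regress by padding. The known point (r = .8 at 240 attempts) is a made-up pair chosen so the constant works out to the 60 used in the example:

```python
def regression_constant(known_r, known_opps):
    # constant = (1 - KnownR)*KnownOpps/KnownR
    return (1 - known_r) * known_opps / known_r

def r_for_opps(opps, constant):
    # General r = opps/(constant + opps)
    return opps / (constant + opps)

def regress_padded(makes, opps, pop_mean, constant):
    # Pad the player's line with `constant` attempts at the population mean
    return (makes + pop_mean * constant) / (opps + constant)

c = regression_constant(0.8, 240)  # hypothetical known point (works out to 60)
print(r_for_opps(100, c))          # r for a 100-attempt player
print(round(regress_padded(50, 100, 0.4, c), 4))  # the .4625 example
```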

Again I’m not sure who came up with this method, but Tango has used it a number of times (like here). This is a method for calculating r when your population contains players with varying numbers of opportunities for the stat in question. Basically what it does is simulate the distribution of observed rates that would be expected if all players in the population had the same true rate, and then it looks to see how much more spread there is in the actual observed rates than there is in the simulated distribution.

Calculating by this method is a little more involved. For each player in the population, one must first compute two numbers, the PlayerVarRand and PlayerZscore. Then r can be computed using the variance of the PlayerZscores of all the players.

PlayerVarRand = PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps
PlayerZscore = (PlayerObsRate - PopMeanRate)/sqrt(PlayerVarRand)
r = 1 - (1/var(PlayerZscores))
r = (var(PlayerZscores) - 1)/var(PlayerZscores)

var(PlayerZscores) is actually an estimate of var(obs)/var(rand). So we can also use this method to get var(rand) and var(true) for the population:

var(PlayerZscores) = var(obs)/var(rand)
var(rand) = var(obs)/var(PlayerZscores)

var(PlayerZscores) = var(obs)/var(rand)
var(PlayerZscores) = var(obs)/(var(obs) - var(true))
var(true) = var(obs) - var(obs)/var(PlayerZscores)
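Here’s a sketch of the z-scores method as I read it (my own reconstruction, so treat it as illustrative rather than definitive):

```python
from math import sqrt

def zscore_reliability(obs_rates, opps_list, pop_mean):
    """Tango's z-scores method: if every player shared one true rate,
    the z-scores would have variance near 1; extra spread implies skill."""
    zs = []
    for rate, opps in zip(obs_rates, opps_list):
        var_rand = rate * (1 - rate) / opps   # PlayerVarRand
        zs.append((rate - pop_mean) / sqrt(var_rand))
    m = sum(zs) / len(zs)
    var_z = sum((z - m) ** 2 for z in zs) / len(zs)
    return (var_z - 1) / var_z                # r
```

Note that with identical-talent players this returns a value near zero (var(PlayerZscores) near 1), and it can even dip slightly negative from sampling noise.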

The advantage this method has over calculating r from var(obs) and var(rand) is that it takes into account the varying number of opportunities that each player in the population had. However, I’m not completely sure how it works. Using this method one can calculate an r, but what opportunity level is this for? It can’t be the case that this r applies for all opportunity levels and that all players should be regressed the same amount. What should be plugged into KnownOpps in Tango’s regression equation in order to derive the constant? Again I don’t think that simply the arithmetic mean of all the players’ opportunities is the answer. This is also an issue when calculating r empirically from a correlation.

However, one still can adjust the r calculated from this method for different opportunity levels, and thus regress players different amounts based on their varying opportunities. This can be done by calculating player-specific r’s directly from var(true) (calculated as described above) and PlayerVarRand:

r = var(true)/(var(true) + PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps)
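As a sketch, the player-specific version is just (assuming a talent spread of var(true) = 0.0025, a made-up number for illustration):

```python
def player_r(pop_var_true, obs_rate, opps):
    # r = var(true)/(var(true) + PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps)
    player_var_rand = obs_rate * (1 - obs_rate) / opps
    return pop_var_true / (pop_var_true + player_var_rand)

print(player_r(0.0025, 0.45, 100))   # low-attempt player: heavy regression
print(player_r(0.0025, 0.45, 1000))  # high-attempt player: light regression
```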

In the appendix of The Book, Andy Dolphin described a more detailed mathematical method for regressing to the mean. The basic idea of taking a weighted average of the player’s observed rate and the population mean rate remains, but here each measure is weighted by the reciprocal of its uncertainty squared (uncertainty is basically equivalent to the standard deviation). So if the player’s rate has a larger uncertainty than the population mean, it will be weighted less (1/uncertainty^2 will be smaller), and thus will be regressed further toward the population mean.

(Uncertainty(PlayerObsRate))^2 = PlayerVarRand = PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps
(Uncertainty(PopMeanRate))^2 = PopVarTrue
PlayerWeight = 1/(Uncertainty(PlayerObsRate))^2 = 1/PlayerVarRand
PopWeight = 1/(Uncertainty(PopMeanRate))^2 = 1/PopVarTrue
Regressed rate = (PlayerObsRate/PlayerVarRand + PopMeanRate/PopVarTrue)/(1/PlayerVarRand + 1/PopVarTrue)

PopVarTrue (which is the same thing I’ve been referring to as var(true)) is calculated in a way somewhat similar to the z-scores above. First calculate PlayerVarTrue for each player using PlayerVarObs - PlayerVarRand (with an additional term of (1 - PlayerOpps/PopOpps) added in), then take a weighted average of each of these where the weights are 1/uncertainty^2 of each player’s PlayerVarTrue (for the weighted average I’ve used the Excel formula of SUMPRODUCT(Measures,Weights)/SUM(Weights) rather than the mathematical summation notation):

PlayerVarTrue = PlayerVarObs - PlayerVarRand
PlayerVarObs = (PlayerObsRate - PopMeanRate)^2
PlayerVarRand = PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps
PlayerVarTrue = (PlayerObsRate - PopMeanRate)^2 - (1 - PlayerOpps/PopOpps)*(PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps)
PlayerVarTrueWeight = 1/(2*(PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps + PopVarTrue)^2)
PopVarTrue = SUMPRODUCT(PlayerVarTrues,PlayerVarTrueWeights)/SUM(PlayerVarTrueWeights)

You may have noticed that PopVarTrue, the figure we are trying to calculate, appears within the formula we are using to calculate it. We can avoid an infinite loop by iterating - first put in any random number (smaller than one) for PopVarTrue in each player’s PlayerVarTrueWeight (using the same random number for each player). Then solve for PopVarTrue through the weighted average formula. Next, take that result, and plug it back into all the PlayerVarTrueWeights, and solve for PopVarTrue again. Take that result, and plug it back in, etc. Eventually things will stabilize (meaning that the number you get out of the weighted average formula will be the same as the number you had just plugged in to each PlayerVarTrueWeight).
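Here’s a sketch of that iteration in code. This is my own reconstruction of the appendix method - the starting guess, the fixed iteration count, and the function names are my choices, not Andy Dolphin’s:

```python
def solve_pop_var_true(obs_rates, opps_list, pop_mean, pop_opps, iters=50):
    pvt = 0.5  # any starting value below one
    for _ in range(iters):
        num = den = 0.0
        for rate, opps in zip(obs_rates, opps_list):
            var_rand = rate * (1 - rate) / opps
            # PlayerVarTrue with the (1 - PlayerOpps/PopOpps) correction
            p_var_true = (rate - pop_mean) ** 2 - (1 - opps / pop_opps) * var_rand
            weight = 1 / (2 * (var_rand + pvt) ** 2)
            num += p_var_true * weight
            den += weight
        pvt = num / den  # plug the result back into the weights and repeat
    return pvt
```

Each pass feeds the weighted-average result back into the PlayerVarTrueWeights; for well-behaved data it settles after a handful of passes.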

One can also calculate r from this method:

r = PlayerWeight/(PlayerWeight + PopWeight)
r = (1/PlayerVarRand)/(1/PlayerVarRand + 1/PopVarTrue)
r = 1/(PlayerVarRand*(1/PlayerVarRand + 1/PopVarTrue))
r = 1/(1 + PlayerVarRand/PopVarTrue)
r = PopVarTrue/(PopVarTrue + PlayerVarRand)

Whew, that was a lot of math. Once you’ve got some of that down, you can get on to the fun stuff, which is applying these formulas to real players’ stats. But that will have to wait until my next post. I hope to look at some practical results both to see what they tell us about the reliability of different basketball stats, and to explore the differences between the various methods I’ve discussed in this post. I also plan on addressing the issues of choosing a population to regress to, dealing with varying opportunity levels, determining whether a “skill” exists, and using regression to the mean as part of a projection system (like Tango does with his Marcels). Until then, if anyone can answer some of the questions I’ve posed about these different methods for regressing to the mean (or correct any of the mistakes I’ve surely made), I’d love to hear from you in the comments.

]]>Player               Team(s)  Poss  oppDRtg
-------------------  -------  ----  -------
Linas Kleiza         DEN      3943  110.8
Sasha Vujacic        LAL      2482  110.5
Vladimir Radmanovic  LAL      2981  110.3
J.R. Smith           DEN      2994  109.9
Kelenna Azubuike     GSW      3574  109.8
Carl Landry          HOU      1341  109.7
Andris Biedrins      GSW      4278  109.6
Carlos Boozer        UTA      5584  109.5
Stephen Jackson      GSW      5914  109.5
Al Harrington        GSW      4528  109.5
Carlos Arroyo        ORL      2425  109.5
Jordan Farmar        LAL      3330  109.4
Baron Davis          GSW      6635  109.4
Dikembe Mutombo      HOU      1152  109.3
Andrei Kirilenko     UTA      4386  109.3
Steve Nash           PHX      5641  109.3
Shelden Williams     ATL/SAC  1508  109.3
Maurice Evans        LAL/ORL  3329  109.2
Deron Williams       UTA      6002  109.0
Hedo Turkoglu        ORL      5910  109.0

That’s a pretty interesting list. There are a lot of players from great offensive teams. Maybe this is saying that those offenses weren’t so much great as they were lucky - they had the good fortune of facing weaker defensive lineups than other teams faced. But I don’t think this conclusion is warranted. I can think of a few other theories to explain some of the entries on this list.

First, an obvious place to look for players who faced weak defenses would be backups to offensive stars. If a team is facing the Lakers, they will probably try to have their top perimeter defenders in whenever Kobe is in the game. When he goes to the bench, their best defenders will rest too. So Kobe’s backups often have the advantage of facing lesser defenders. This theory could help explain the presence of a number of players in the table above.

Second, there are a number of players on the list from undersized or small-ball teams like Golden State, Orlando and Phoenix. One explanation for this could be that to match up with such teams, opponents often go small as well. For the purposes of that particular game, this could be the best defensive strategy. But on the season as a whole, these makeshift undersized lineups won’t fare very well defensively. So it could just be the case that teams like Golden State typically face lineups that are poor defensively in normal circumstances (i.e. when facing regular-sized teams).

Player             Team(s)  Poss  oppDRtg
-----------------  -------  ----  -------
Sergio Rodriguez   POR      1158  102.2
Dominic McGuire    WAS      1270  102.4
Earl Barron        MIA      1651  102.9
Dan Gadzuric       MIL      1033  103.2
Antoine Wright     NJN/DAL  2332  103.5
Stephon Marbury    NYK      1550  103.7
Al Thornton        LAC      4137  103.7
Francisco Elson    SAS/SEA  1547  103.8
Darrell Armstrong  NJN      1033  103.8
Jermaine O'Neal    IND      2450  103.8
Hilton Armstrong   NOH      1363  104.0
Julian Wright      NOH      1194  104.2
Jason Collins      NJN/MEM  2230  104.3
Marcus Banks       PHX/MIA  1112  104.3
Malik Allen        NJN/DAL  2019  104.3
Mardy Collins      NYK      1222  104.4
Smush Parker       MIA/LAC  1135  104.4
Bostjan Nachbar    NJN      3220  104.5
Antoine Walker     MIN      1735  104.5
Kwame Brown        LAL/MEM  1421  104.6

I find this list harder to interpret. It seems to be made up of worse players from worse teams compared to the first list. Probably a lot of it is just randomness - no matter what, some players have to face better defensive lineups. Beyond that, I’d be interested to hear any theories people might have.

In theory, one could use these kinds of adjustments to identify players whose offensive stats may have been inflated (or deflated) based on the level of defenses that they went up against. Of course, one would have to take into account alternative explanations like the small-ball theory, and consider the fact that when a player faces a lineup that’s poor overall defensively that doesn’t mean that the individual player guarding them was a poor defender.

Here is a Google Spreadsheet containing the data for all players from this past season.

]]>To do this I started with lineup data from BasketballValue. To adjust each lineup’s offensive rating, I calculated a weighted average of the season defensive ratings of all the opposing lineups that that lineup faced. These defensive ratings were weighted by the number of possessions the original lineup played against that defensive lineup. This meant that for each lineup I had its offensive rating and its average opponents’ defensive rating. I subtracted the second from the first to get an adjusted measure of the lineup’s offensive production. So if a lineup had a good offensive rating but played against poor defensive lineups, its rating was decreased, while if a lineup had a poor offensive rating but played against good defensive lineups, its rating was increased.

The adjustments I made were only one level deep. In college football ranking systems you sometimes see similar multi-level adjustments for strength of schedule that take into account a team’s record, its opponents’ records, and its opponents’ opponents’ records. The same thing could be done here - I’m adjusting each team’s offensive ratings for their opponents’ defensive ratings, but I could first adjust the opponents’ defensive ratings for *their* opponents’ offensive ratings. Theoretically, one could do this infinitely, and I think the results would ultimately be similar to what you’d get from a regression-based method like Dan Rosenbaum uses for his adjusted plus/minus. But I’m just going to do one level of adjusting, partly because it can be calculated pretty quickly with some pivot tables in Excel, and partly because you just don’t gain that much the deeper you go. This is because over the course of a season, things tend to even out, and most lineups end up facing a similar mix of good and bad opposing lineups. The variance in opponents’ defensive ratings is a lot less than the variance in lineup offensive ratings, and the variance in opponents’ opponents’ offensive ratings would be even smaller.
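The one-level adjustment is simple enough to sketch in plain code. The stint-level input format here - (lineup, opposing lineup, possessions, points scored) tuples - is invented for illustration; it is not how the BasketballValue files are actually laid out:

```python
from collections import defaultdict

def adjust_offense(stints):
    """stints: list of (lineup, opp_lineup, poss, pts_scored) tuples."""
    o_pts, o_poss = defaultdict(float), defaultdict(float)
    d_pts, d_poss = defaultdict(float), defaultdict(float)
    for lineup, opp, poss, pts in stints:
        o_pts[lineup] += pts; o_poss[lineup] += poss   # lineup's offense
        d_pts[opp] += pts; d_poss[opp] += poss         # opponent's defense
    # Season defensive rating for every lineup (points allowed per 100)
    drtg = {l: 100 * d_pts[l] / d_poss[l] for l in d_poss}
    results = {}
    for l in o_poss:
        ortg = 100 * o_pts[l] / o_poss[l]
        # Possession-weighted average of opponents' season DRtgs
        opp_drtg = sum(poss * drtg[opp] for lineup, opp, poss, _ in stints
                       if lineup == l) / o_poss[l]
        results[l] = (ortg, opp_drtg, ortg - opp_drtg)  # offDiff
    return results
```

Subtracting the weighted opponents’ DRtg from the raw ORtg gives the offDiff figure used in the rankings below.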

Below are the adjusted rankings for offensive rating, defensive rating, and point differential. I excluded lineups that played together for less than 200 offensive possessions (or 200 defensive possessions). “ORtg” is the lineup’s offensive rating (points per 100 possessions), “oppDRtg” is the weighted average of the defensive ratings of the opposing lineups faced. “offDiff” is the additional points scored per 100 possessions over what would be expected based on the quality of the defenses faced. “DRtg”, “oppORtg”, and “defDiff” are the defensive counterparts to those stats. “totDiff” is the sum of “offDiff” and “defDiff”, which represents the additional point differential per 100 possessions over what would be expected based on the quality of the offenses and defenses faced.

I uploaded a table containing every lineup’s offensive and defensive numbers to Swivel (it was too big for Google Spreadsheets). You can view it here or download it in CSV form here.

]]>Last week, the New York Times had an article on using Chernoff faces to visualize data about baseball managers. The Arbitrarian followed that up with a post that used Chernoff faces to compare some star players in the NBA. Chernoff faces are a way to display data by mapping it onto simplified human faces - you can read more about them here and here. The main reason I’m linking to these is because this method of visualizing data was invented by my great-uncle, Herman Chernoff. He’s made a lot of contributions to the field of statistics in his career, but most of them (like this) aren’t as fun as the faces that he came up with thirty-five years ago.

]]>

Q: Generally, who should have a larger role in evaluating college and minor league players: scouts or stat guys?

A: Ninety-five percent scouts, five percent stats. The thing is that — with the exception of a very few players like Ryan Braun — college players are so far away from the major leagues that even the best of them will have to improve tremendously in order to survive as major league players — thus, the knowledge of who will improve is vastly more important than the knowledge of who is good. Stats can tell you who is good, but they’re almost 100 percent useless when it comes to who will improve. In addition to that, college baseball is substantially different from pro baseball, because of the non-wooden bats and because of the scheduling of games. So … you have to pretty much let the scouts do that.

These issues seem to me to be important in basketball as well, and I think they are a good starting point for thinking about the statistical analysis of sports. Taking them in reverse order, here’s one way of framing James’ points:

James claims that the context of college baseball is so different from the majors that translating from college production to projected pro production by controlling for these context differences is a very difficult task. Behind this claim is a model of player statistical production wherein a player’s stats are the result of his abilities and the context in which he plays.

There’s actually an additional element at play, which is randomness. Tango, MGL and Andy Dolphin have done a fantastic job of exploring this factor in baseball in great detail in “The Book” and on their blog. The basic idea is that, ignoring context for the moment, player production = player ability + randomness. Randomness has a larger effect when sample sizes are smaller and when there is little variation among players in ability. To control for this, one can regress statistics to the mean, which is a way of starting with player production and separating it into ability and luck.
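Regression to the mean can be sketched in a few lines of code. To be clear, this is a minimal illustration rather than the specific method from “The Book,” and the stabilization constant and shooting percentages below are invented for the example:

```python
def regress_to_mean(observed_rate, attempts, league_rate, stabilization=750):
    """Shrink an observed rate toward the league mean.

    The weight on the observed rate grows with sample size: with few
    attempts the estimate stays near the league mean, and with many
    attempts it approaches the observed rate. The stabilization constant
    (how many attempts of league-average performance we mix in) would in
    practice be estimated from the spread of true ability for the stat.
    """
    weight = attempts / (attempts + stabilization)
    return weight * observed_rate + (1 - weight) * league_rate


# A hot shooter over a small sample gets pulled strongly toward average:
small_sample = regress_to_mean(0.450, 100, 0.355)   # estimate stays near .355

# The same observed rate over a much larger sample is trusted far more:
large_sample = regress_to_mean(0.450, 2000, 0.355)  # estimate moves toward .450
```

The point of the sketch is the weighting: identical observed production yields different ability estimates depending on sample size, which is exactly the ability-versus-luck separation described above.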

The fuller model would be player production = player ability + context + randomness. Because of the individual nature of the (offensive side of) baseball, the context effects on batting statistics aren’t that complex and are typically controlled for in advance (by adjusting for things like runners on base, pitcher quality and park effects - though James suggests that the context differences between college and the majors aren’t so easy to deal with). However, in basketball, almost all areas of the game are impacted by context in complicated ways as a result of the team nature of the sport. So while it’s important to try to control for randomness in basketball stats, I think understanding the effects of context is the more pressing issue. How will a player’s production change when put in a different role, when playing with different teammates, or when playing in a different coach’s system?

To try to answer these questions and control for context, a number of methods can be used. Measuring statistics per possession rather than per game is a way of controlling for the context of differing tempos. More generally, in a previous post I outlined a way of dealing with the issue by looking at how players’ stats change when they change teams. I haven’t followed up on that method as promised, in part because I think there may be a better way to approach things by using multilevel modeling (which you can learn about from this book or this article). Eventually I hope to post some results of this approach.
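The simplest of these context adjustments, per-possession scaling, can be shown directly. All the numbers here are made up for illustration:

```python
def per_100_possessions(total, possessions):
    """Convert a raw total (points, rebounds, etc.) into a rate per
    100 possessions, removing the effect of team tempo."""
    return 100.0 * total / possessions


# Two players each score 20 points per game, but on teams with very
# different tempos. Per-possession rates reveal the difference:
fast_team_rate = per_100_possessions(20, 105)  # fewer points per possession
slow_team_rate = per_100_possessions(20, 88)   # more points per possession
```

The player on the slower team is producing more per opportunity, even though the per-game totals are identical; tempo was context masquerading as production.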

As if untangling skill from context wasn’t hard enough, James’ first point emphasizes that skill itself can change in ways that are difficult to predict - some players improve more than others. And James suggests that in baseball, statistics are much more useful for measuring ability than for predicting change in ability.

In basketball I think this is a question for the future, since for now we still have a long way to go on measuring a player’s current skill. But it is important. Why do two players with similar college production go on to have greatly different pro careers? Are we just not looking at the right stats (e.g. maybe there are hidden indicator stats that do shed light on future improvement, such as a high free-throw percentage suggesting the potential for improved three-point shooting)? Or, as James suggests for baseball, do we have to look outside on-court stats to try to predict player improvement? This could mean looking to objective (but off-court) measures like age, quickness, strength, vertical leap, and wingspan, or even getting into harder-to-define areas such as effort, intelligence, diligence, leadership, heart, and other “intangibles.” Or one can take the scouting approach and look to sub-skills players exhibit on the court that aren’t easily quantifiable statistically but that suggest the potential for overall improvement with the right coaching (e.g. a player’s shooting form or how well they box out). At this point I’m not sure we can say just what the right mix is of these varying approaches.
