To test out the height ordering measure I came up with, and to try some of the methods described in recent posts on the Sabermetric Research blog, I decided to run some correlations to look at the relationship between a player’s height and his rebounding performance.
For the 2006-07 season, I looked at all players who played at least 200 minutes (which came out to 397 players, counting stints with different teams separately). I chose 200 minutes as the cutoff because the correlations seemed to stabilize at that level (at lower cutoffs the correlations were lower because of fluky low-minute guys, and at higher cutoffs the correlations were very similar to what they were at the 200-minute cutoff). The explanatory variables that I used were height (in inches) and height ordering (which is on a 1 to 5 scale, with 1 indicating that the player played all of his minutes as the shortest player on the court for his team). The response variables were defensive rebounding percentage (DRB%) and offensive rebounding percentage (ORB%). DRB% is an opportunity rate measuring DRB/(DRB opportunities), or more specifically, DRB/(team DRB while the player was on the court + opponent ORB while the player was on the court). ORB% is similar but uses ORB opportunities. The actual formulas, which estimate the on-court part, are as follows:
DRB% = DRB/((5*MIN/tmMIN)*(tmDRB + oppORB))
ORB% = ORB/((5*MIN/tmMIN)*(tmORB + oppDRB))
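As a rough sketch, the two formulas above can be written as a single Python function, since they differ only in which totals get plugged in. The function name, argument names, and the numbers in the example are made up for illustration and are not real player data.

```python
def rebound_pct(reb, minutes, team_minutes, team_reb, opp_reb):
    """Estimate a rebounding percentage from box-score totals.

    For DRB%: reb=DRB, team_reb=tmDRB, opp_reb=oppORB.
    For ORB%: reb=ORB, team_reb=tmORB, opp_reb=oppDRB.
    """
    # 5*MIN/tmMIN estimates the fraction of the team's rebound
    # opportunities that came while the player was on the court.
    on_court_share = 5 * minutes / team_minutes
    opportunities = on_court_share * (team_reb + opp_reb)
    return reb / opportunities


# Hypothetical totals, for illustration only.
drb_pct = rebound_pct(reb=340, minutes=2000, team_minutes=20000,
                      team_reb=2500, opp_reb=900)
print(f"DRB% = {drb_pct:.1%}")
```

Because the on-court part is estimated from minutes, this assumes rebound opportunities are spread evenly across the team's minutes.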
Last season, among players who played at least 200 minutes, Kevin Garnett led the league in DRB% at 30.7%. Earl Boykins finished last at 5.1% for his stint in Denver. For ORB%, Justin Williams was first at 17.6%, while Keith McLeod was last at 0.3%.
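The setup described above (apply a minutes cutoff, then correlate height with a rebounding percentage) could be sketched along these lines. The player rows are invented to show the mechanics and are not the 2006-07 data; the field names and helper functions are hypothetical too, and a plain Pearson correlation is assumed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_above_cutoff(players, x_key, y_key, min_minutes=200):
    """Drop players below the minutes cutoff, then correlate two fields."""
    qualified = [p for p in players if p["min"] >= min_minutes]
    xs = [p[x_key] for p in qualified]
    ys = [p[y_key] for p in qualified]
    return pearson_r(xs, ys), len(qualified)


# Made-up rows for illustration only; not real player data.
players = [
    {"min": 2400, "height": 83, "drb_pct": 0.25},
    {"min": 1800, "height": 75, "drb_pct": 0.08},
    {"min": 950, "height": 79, "drb_pct": 0.14},
    {"min": 120, "height": 72, "drb_pct": 0.30},  # low-minute stint, dropped by the cutoff
    {"min": 600, "height": 81, "drb_pct": 0.19},
]

r, n = correlation_above_cutoff(players, "height", "drb_pct")
print(f"r = {r:.3f} over {n} players")
```

Rerunning with different `min_minutes` values is one way to check where the correlation stabilizes.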
Statistical analysis of baseball is far more advanced than its basketball counterpart. But we can use that to our advantage by learning from the work done in baseball and applying it to the context of basketball. Of course not everything transfers directly due to the differing natures of the games, but more often than not the ideas, theories and methods used to analyze baseball can be adapted for use in basketball.
To that end, I’ve been reading a lot of sabermetric work recently, even though I really have no interest in learning in just which base/out states it makes sense to lay down a sacrifice bunt. I’d like to recommend some of the books and websites that I’ve found to be great sources of ideas.
Identifying a player’s position is useful for all sorts of statistical analysis of basketball, but unfortunately position in basketball is not nearly as well-defined as position in baseball. The traditional breakdown into point guard (1), shooting guard (2), small forward (3), power forward (4), and center (5) works some of the time, but breaks down at the edges. Some teams’ offensive systems don’t differentiate between the roles for the two wing positions (SG and SF), or between the two post positions (PF and C). Some players play one positional role in their team’s offense yet typically guard an opposing player that plays a different positional role in his offense (e.g. Kirk Hinrich, who plays PG for the Bulls offensively but often defends opposing SGs). Many players play different positions at different times in the same game depending on which teammates they are on the court with. For all these reasons and more, having a list saying Player X is a PG, Player Y is a PF, Player Z is a SF, etc. is bound to be lacking.
How can positions be assigned in a more objective and informative manner?
OK, here’s some of the HotZones data I promised. I’ve uploaded the team-by-team data to Swivel - offensive data from 03-04 to 06-07, and defensive data from 03-04 to 06-07. You should be able to download each as a .CSV file and easily import it into a spreadsheet or database.
I’ll have more analysis later, but for now here’s a quick look at how frequently and how well teams shot from different distances.
League percent of FGA taken from each distance:
Season     0-8 ft   8-16 ft   16-24 ft   24+ ft
-------    ------   -------   --------   ------
2003-04     40.6%     18.0%      22.9%    18.5%
2004-05     40.5%     16.4%      23.7%    19.4%
2005-06     41.1%     15.2%      23.7%    20.0%
2006-07     41.3%     14.7%      23.0%    21.0%
-------    ------   -------   --------   ------
Total       40.9%     16.1%      23.3%    19.7%
League field-goal percentage by distance:
Season     0-8 ft   8-16 ft   16-24 ft   24+ ft
-------    ------   -------   --------   ------
2003-04     53.7%     37.7%      38.8%    35.1%
2004-05     54.6%     38.3%      40.0%    36.0%
2005-06     55.3%     39.0%      40.5%    36.2%
2006-07     56.3%     39.6%      40.4%    36.2%
-------    ------   -------   --------   ------
Total       55.0%     38.6%      39.9%    35.9%
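The potential trends that pop out to me are the decline in shots being taken from 8-16 feet (from 18.0% of attempts in 2003-04 to 14.7% in 2006-07) and the increase in shots from 0-8 feet (40.6% to 41.3%) and 24+ feet (18.5% to 21.0%). As far as FG% by distance goes, teams seem to be shooting better from both 0-8 feet (53.7% to 56.3%) and 8-16 feet (37.7% to 39.6%).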
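To make the first-to-last-season comparison explicit, here is a small sketch that computes the change in percentage points for each distance, using the 2003-04 and 2006-07 rows copied from the two tables above. The variable and function names are mine.

```python
# League-wide figures copied from the two tables above (percentages).
DISTANCES = ["0-8 ft", "8-16 ft", "16-24 ft", "24+ ft"]

FGA_SHARE = {
    "2003-04": [40.6, 18.0, 22.9, 18.5],
    "2006-07": [41.3, 14.7, 23.0, 21.0],
}
FG_PCT = {
    "2003-04": [53.7, 37.7, 38.8, 35.1],
    "2006-07": [56.3, 39.6, 40.4, 36.2],
}


def change(table, start="2003-04", end="2006-07"):
    """Percentage-point change per distance between two seasons."""
    return {
        dist: round(last - first, 1)
        for dist, first, last in zip(DISTANCES, table[start], table[end])
    }


for label, table in (("Share of FGA", FGA_SHARE), ("FG%", FG_PCT)):
    print(label)
    for dist, delta in change(table).items():
        print(f"  {dist:<9} {delta:+.1f}")
```

These are simple endpoint differences, so they say nothing about whether a change is bigger than normal year-to-year noise.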
You can see that a lot of shots are taken from 0-8 feet. This is where 82games’ distance breakdowns are useful, as their categories of dunks, tips, and close shots provide divisions within 8 feet (some of the shots they classify as jumpers are also within 8 feet).
Another thing to note is that 24+ feet FG% differs slightly from three-point percentage because the HotZones data excludes shots from beyond half-court.
For reference, here are league-wide boxscore stats by season from Basketball-Reference.
To follow up on my discussion of rate stats, I’m going to look at how this theoretical foundation can help evaluate passing stats created from the starting point of assists.
The basic assist-related player stats are assists per game and assist-to-turnover ratio. Assists per game is a time-period rate, while assist-to-turnover ratio is an opportunity rate (technically it’s an opportunity ratio of successes/failures, but it can easily be transformed into an opportunity rate of Ast/(Ast + TO)).
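The transformation mentioned above is just algebra: dividing the numerator and denominator of Ast/(Ast + TO) by TO gives ratio/(1 + ratio). A minimal sketch, with function names of my own choosing and made-up totals:

```python
def assist_rate(ast, to):
    """Opportunity rate: assists as a share of assists plus turnovers."""
    return ast / (ast + to)


def assist_rate_from_ratio(ast_to_ratio):
    """Same rate, starting from an assist-to-turnover ratio (Ast/TO)."""
    return ast_to_ratio / (1 + ast_to_ratio)


# Hypothetical totals, for illustration only: 300 assists, 100 turnovers.
print(assist_rate(300, 100))
print(assist_rate_from_ratio(300 / 100))
```

Both calls give the same number, so the ratio and the rate carry the same information on different scales.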
A lot of the advanced stats in basketball are simply refinements to traditional stats to remove potential biases. So from Assists/Game we can instead shift to Assists/Minute, which controls for playing time, or to Assists/Team Possession, which controls for pace (and playing time). We can even go one step further and shift to Assists/Team Play, which also controls for offensive rebounding (possessions don’t keep track of the extra plays that result from offensive rebounds).
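The progression of denominators above can be sketched as follows. The team possession and team play totals are taken as given inputs here, since how they are estimated isn't covered in this post. The function name, argument names, and numbers are hypothetical.

```python
def assist_rates(ast, games, minutes, team_poss, team_plays):
    """Assists over each of the denominators discussed above.

    team_poss and team_plays are supplied by the caller (assumed inputs).
    Plays should be at least as numerous as possessions, because an
    offensive rebound starts a new play within the same possession.
    """
    return {
        "per_game": ast / games,
        "per_minute": ast / minutes,
        "per_team_possession": ast / team_poss,
        "per_team_play": ast / team_plays,
    }


# Made-up season totals, for illustration only.
rates = assist_rates(ast=400, games=80, minutes=2400,
                     team_poss=5000, team_plays=5600)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Each step down the list swaps in a denominator that strips out one more source of bias: playing time, then pace, then offensive rebounding.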
If we turn to the opportunity rate of Ast/(Ast + TO), a flaw is noticeable. Some turnovers have nothing to do with passing - they may be the result of a player trying to score and traveling or committing an offensive foul. In other words, turnovers are not the corresponding failures to the successes of assists. So to make a better opportunity rate for assists, we first need to determine what constitutes an assist opportunity.
Well, wouldn’t you know it, right after I published my long piece on NBA.com’s HotZones, they decided to change them. The HotZones page is still functional and linked to from NBA.com, but the latest incarnation is now called NBA Hot Spots. It doesn’t appear to have any new features, but the menus do let you access data that the HotZones menus didn’t (the 07-08 regular season and preseason, and the 06-07 playoffs). It looks like it’s just a new front-end, so all the tips from my previous post should still work, as the data is still stored in the same place on the NBA.com server. Hat tip to Hoopinion for noticing the move.
One traditional way of categorizing sports statistics is to divide them into counting stats and rate stats. A counting stat measures the accumulation of successes (or failures) in some area. Total points, field-goal makes and misses, free-throw makes and misses, assists, rebounds, turnovers, blocks, steals, and fouls are counting stats. A rate stat measures the rate or frequency of the accumulation of successes (or failures). Baseball-Reference has a good summary of the difference in the context of baseball here.
I think it can be useful to split rate stats into two subcategories - opportunity rates and time-period rates.
The folks at the official NBA site added a great feature a few years back called HotZones. It’s not on the level of MLB.com’s fantastic PITCHf/x data, but it’s useful nonetheless. It consists of season-level shot charts for every player and team, broken down into 14 zones. This is data that wasn’t previously available. ESPN has had game-by-game shot charts with their boxscores since 02-03, but in a basically unusable form. 82games has had shooting by distance since 02-03, but in addition to lacking side-to-side splits it groups shots into pretty large distance categories (what they label “Jump” shots includes some shots closer than 8 feet as well as everything beyond 8 feet).
So NBA.com’s HotZones offer a lot of valuable information, but unfortunately they are also presented in a very difficult-to-use format. They are embedded in a Flash application, which means easy linking as well as copying and pasting are out of the question. Though you can select any team or player and a variety of splits, there’s no way to see what the league average FG% is for a specific zone, or who the league leaders are. And because of errors in the Flash menus, you can’t even access a lot of the available data (as of this posting, data is present on the server but inaccessible through the Flash menus for players from past seasons who are no longer in the league, the 07-08 regular season, the 06-07 playoffs, and the 03-04 regular season). However, with a bit of digging, I was able to find some ways around these problems. What follows are instructions for how to link to HotZones pages (including those you can’t get to through the menus) and how to download HotZones data in a format that allows for easy manipulation in a spreadsheet.
Welcome to the Count The Basket blog. I started Count The Basket to make it easier to find all the basketball statistics available on the web. In this blog I will present some ways of using all that data to try to better understand the game of basketball.