Learning from Sabermetrics
Statistical analysis of baseball is far more advanced than its basketball counterpart. But we can use that to our advantage by learning from the work done in baseball and applying it to basketball. Of course, not everything transfers directly because the two games differ in nature, but more often than not the ideas, theories, and methods used to analyze baseball can be adapted to some use in basketball.
To that end, I’ve been reading a lot of sabermetric work recently, even though I really have no interest in learning in just which base/out states it makes sense to lay down a sacrifice bunt. I’d like to recommend some of the books and websites that I’ve found to be great sources of ideas.
Books
• The Book: Playing the Percentages in Baseball, by Tom Tango, Mitchel Lichtman & Andrew Dolphin
• Baseball Between the Numbers, by the guys at Baseball Prospectus
These two books cover a lot of the same hot-button sabermetric issues (when to bunt, clutch hitting, the size of pitching rotations, etc.), though they sometimes use different methods and reach different conclusions. BBTN has a wider scope, looking at some off-field issues, while The Book goes into more depth on the on-field questions. I got a lot of ideas from both, but if I had to recommend just one it would be The Book for its invaluable math-heavy appendix by Andrew Dolphin covering random variation and regression to the mean (if you’re interested in applying this to basketball check out this APBRmetrics post by Ed Küpfer).
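To give a rough feel for what regression to the mean looks like when applied to a basketball stat, here is a minimal sketch. It is not the method from The Book's appendix or from the APBRmetrics post. It just shows the common "add some league-average attempts to the sample" form of shrinkage. The function name, the league rate, and the `prior_attempts` constant are all assumptions made up for illustration. In practice that constant would have to be estimated from data.

```python
def regress_to_mean(makes, attempts, league_rate, prior_attempts):
    """Shrink an observed success rate toward the league rate.

    Equivalent to adding `prior_attempts` league-average attempts to the
    player's observed sample. `prior_attempts` is a hypothetical constant
    here, not a value taken from any published study.
    """
    if attempts < 0 or makes < 0 or makes > attempts:
        raise ValueError("need 0 <= makes <= attempts")
    if prior_attempts <= 0:
        raise ValueError("prior_attempts must be positive")
    return (makes + league_rate * prior_attempts) / (attempts + prior_attempts)


# Illustrative, made-up numbers: a player who made 45 of 100 three-pointers,
# a league average of 35%, and 200 league-average attempts added to the sample.
estimate = regress_to_mean(makes=45, attempts=100, league_rate=0.35, prior_attempts=200)
print(round(estimate, 3))
```

The smaller the observed sample is relative to `prior_attempts`, the closer the estimate sits to the league rate. A player with no attempts at all is simply estimated at the league average.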
• Baseball Hacks, by Joseph Adler
I mentioned this book previously as my inspiration for finding how to download hidden HotZones data. It’s a series of tips and tricks on how to find baseball statistical data on the web, download it, and analyze it. It includes tutorials for using databases (MySQL & Access), spreadsheets (Excel), statistics programs (R), and scripting languages (Perl). I found these very helpful - I learned more quickly from the book’s method of using these programs to actually carry out sabermetric projects than I would have by just reading through a manual. It’s hard to describe exactly what the book is like, so you might want to check out this example or take a look at the table of contents. Almost every tip in the book can be applied to finding, downloading, and analyzing basketball stats. I highly recommend it.
Blogs
If you’re more interested in information of the free variety, there are some excellent sabermetric blogs that contain a ton of content that can be adapted to basketball.
• “The Book” blog (and Tangotiger’s site)
In my eyes, this is THE one must-read, check-it-everyday stats site on the web. It’s by the authors of The Book, though most of the posts are by Tangotiger (Tom Tango) or MGL (Mitchel Lichtman). I only found it recently (after reading The Book), but since then I have been combing the archives and unearthing many great posts. Tangotiger always has wonderful ideas and analysis, but what sets this site apart are the great debates and discussions that go on in the comments. The comment section is often filled with interesting posts by great sabermetricians - sometimes challenging the original research, sometimes advancing it, and sometimes going off on tangents. But the debate is civil and cooperative, and posters often work together to get to the bottom of an issue. In future posts I’ll definitely be linking to some of the discussions that I think can be adapted to basketball, but for a starting point I recommend this massive discussion of defense-independent pitching statistics (later summarized in a 22-page PDF).
These are two other excellent sabermetric blogs by Phil Birnbaum and Pizza Cutter, respectively. They may not be as prolific as Tango, but they always have high-quality, interesting posts.