<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Regression to the Mean</title>
	<atom:link href="http://www.countthebasket.com/blog/2008/05/19/regression-to-the-mean/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.countthebasket.com/blog/2008/05/19/regression-to-the-mean/</link>
	<description>Advanced Stats for Basketball</description>
	<pubDate>Sat, 13 Mar 2010 06:43:31 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
		<item>
		<title>By: Pizza Cutter</title>
		<link>http://www.countthebasket.com/blog/2008/05/19/regression-to-the-mean/#comment-217</link>
		<dc:creator>Pizza Cutter</dc:creator>
		<pubDate>Tue, 20 May 2008 18:05:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.countthebasket.com/blog/?p=65#comment-217</guid>
		<description>Mathematically, the full version makes sense, although the population variance itself is only an estimate and is likely to depend on the sample size we're looking at.  For example, the population variance of everyone in the league after their first 10 (free throws/at bats/whatever) is going to be more muddled (small samples are of course prone to larger swings to the extremes) and probably bigger than after 250 attempts. Since there aren't a gigantic number of players in either the NBA or MLB, it's hard to say that we can "law of large numbers" that consideration away.  To have an appropriate comparison, if I'm looking at a player who has had 100 plate appearances, I should look at what the estimated population variance for 100 PAs would be, given everything else we know.</description>
		<content:encoded><![CDATA[<p>Mathematically, the full version makes sense, although the population variance itself is only an estimate and is likely to depend on the sample size we&#8217;re looking at.  For example, the population variance of everyone in the league after their first 10 (free throws/at bats/whatever) is going to be more muddled (small samples are of course prone to larger swings to the extremes) and probably bigger than after 250 attempts. Since there aren&#8217;t a gigantic number of players in either the NBA or MLB, it&#8217;s hard to say that we can &#8220;law of large numbers&#8221; that consideration away.  To have an appropriate comparison, if I&#8217;m looking at a player who has had 100 plate appearances, I should look at what the estimated population variance for 100 PAs would be, given everything else we know.</p>
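<p>A rough sketch of why the observed spread shrinks with more attempts. It assumes each player has a fixed true rate, independent attempts, and the same number of attempts for everyone. The function name and the numbers in the usage lines are hypothetical, chosen only for illustration.</p>

```python
def observed_rate_variance(n_attempts, pop_mean, pop_var_true):
    """Variance of observed rates across players after n_attempts each.

    Law of total variance for binomial outcomes:
      observed variance = true talent variance + average binomial noise
    where the average noise is E[p*(1-p)] / n, and
      E[p*(1-p)] = pop_mean*(1 - pop_mean) - pop_var_true.
    """
    if n_attempts <= 0:
        raise ValueError("n_attempts must be positive")
    avg_noise = (pop_mean * (1.0 - pop_mean) - pop_var_true) / n_attempts
    return pop_var_true + avg_noise


# Hypothetical league: mean rate 0.75, true talent variance 0.005.
var_after_10 = observed_rate_variance(10, 0.75, 0.005)
var_after_250 = observed_rate_variance(250, 0.75, 0.005)
print(var_after_10, var_after_250)
```

<p>With these made-up inputs the observed variance is about 0.023 after 10 attempts and about 0.0057 after 250, so the noise term dominates at small sample sizes and the observed figure approaches the true talent variance as attempts grow.</p>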
]]></content:encoded>
	</item>
	<item>
		<title>By: Eli</title>
		<link>http://www.countthebasket.com/blog/2008/05/19/regression-to-the-mean/#comment-216</link>
		<dc:creator>Eli</dc:creator>
		<pubDate>Tue, 20 May 2008 05:01:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.countthebasket.com/blog/?p=65#comment-216</guid>
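		<description>Ok, here's how I see the derivation of the regression equation, r = opps/(constant + opps). From Andy's method, start with r = (1/PlayerVarRand)/(1/PlayerVarRand + 1/PopVarTrue), then substitute the binomial variance PlayerVarRand = PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps and simplify.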

&lt;blockquote&gt;&lt;pre&gt;r = (1/PlayerVarRand)/(1/PlayerVarRand + 1/PopVarTrue)
r = (PlayerOpps/(PlayerObsRate*(1 - PlayerObsRate)))/(PlayerOpps/(PlayerObsRate*(1 - PlayerObsRate)) + 1/PopVarTrue)
r = PlayerOpps/(PlayerOpps + PlayerObsRate*(1 - PlayerObsRate)/PopVarTrue)&lt;/pre&gt;&lt;/blockquote&gt;

This mirrors Tango's r = opps/(opps + constant) formula. The constant is proportional to the reciprocal of PopVarTrue (which, like the constant, has a unique value for each metric). Tango's method won't be exact because it adjusts only for the player's opportunities, not for the player's rate. But it looks to me like the full version (r = PlayerOpps/(PlayerOpps + PlayerObsRate*(1 - PlayerObsRate)/PopVarTrue)) is sound.</description>
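		<content:encoded><![CDATA[<p>Ok, here&#8217;s how I see the derivation of the regression equation, r = opps/(constant + opps). From Andy&#8217;s method, start with r = (1/PlayerVarRand)/(1/PlayerVarRand + 1/PopVarTrue), then substitute the binomial variance PlayerVarRand = PlayerObsRate*(1 - PlayerObsRate)/PlayerOpps and simplify.</p>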
<blockquote><pre>r = (1/PlayerVarRand)/(1/PlayerVarRand + 1/PopVarTrue)
r = (PlayerOpps/(PlayerObsRate*(1 - PlayerObsRate)))/(PlayerOpps/(PlayerObsRate*(1 - PlayerObsRate)) + 1/PopVarTrue)
r = PlayerOpps/(PlayerOpps + PlayerObsRate*(1 - PlayerObsRate)/PopVarTrue)</pre>
</blockquote>
<p>This mirrors Tango&#8217;s r = opps/(opps + constant) formula. The constant is proportional to the reciprocal of PopVarTrue (which, like the constant, has a unique value for each metric). Tango&#8217;s method won&#8217;t be exact because it adjusts only for the player&#8217;s opportunities, not for the player&#8217;s rate. But it looks to me like the full version (r = PlayerOpps/(PlayerOpps + PlayerObsRate*(1 - PlayerObsRate)/PopVarTrue)) is sound.</p>
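<p>A minimal sketch of the two forms of r from the derivation above, to check numerically that they agree. The function names and the sample inputs are hypothetical. The variance form divides by zero when the observed rate is exactly 0 or 1, so the check avoids those values.</p>

```python
def r_from_variances(player_opps, player_obs_rate, pop_var_true):
    """First line of the derivation: inverse-variance weighting."""
    # Binomial variance of the player's observed rate.
    player_var_rand = player_obs_rate * (1.0 - player_obs_rate) / player_opps
    return (1.0 / player_var_rand) / (1.0 / player_var_rand + 1.0 / pop_var_true)


def r_simplified(player_opps, player_obs_rate, pop_var_true):
    """Last line of the derivation: r = opps / (opps + constant)."""
    # The "constant" depends on the player's own rate, which is why a
    # single fixed constant per metric is only an approximation.
    constant = player_obs_rate * (1.0 - player_obs_rate) / pop_var_true
    return player_opps / (player_opps + constant)


# Hypothetical inputs: 100 opportunities, PopVarTrue = 0.005.
for rate in (0.50, 0.75, 0.90):
    full = r_from_variances(100, rate, 0.005)
    short = r_simplified(100, rate, 0.005)
    print(rate, round(full, 4), round(short, 4))
```

<p>With these made-up numbers the implied constant is 50 for a 0.50 rate and 37.5 for a 0.75 rate, so r moves from about 0.667 to about 0.727 at the same 100 opportunities.</p>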
]]></content:encoded>
	</item>
	<item>
		<title>By: Eli</title>
		<link>http://www.countthebasket.com/blog/2008/05/19/regression-to-the-mean/#comment-215</link>
		<dc:creator>Eli</dc:creator>
		<pubDate>Tue, 20 May 2008 00:04:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.countthebasket.com/blog/?p=65#comment-215</guid>
		<description>Tango's regression equation is at best an approximation, since r really depends on both the player's opportunities (which his method adjusts for) and the player's rate (which his method ignores). I'll have to think more about its mathematical derivation and whether that makes sense. And I'm definitely going to try it out on some data in my next post and compare the results it produces to results from other methods.</description>
		<content:encoded><![CDATA[<p>Tango&#8217;s regression equation is at best an approximation, since r really depends on both the player&#8217;s opportunities (which his method adjusts for) and the player&#8217;s rate (which his method ignores). I&#8217;ll have to think more about its mathematical derivation and whether that makes sense. And I&#8217;m definitely going to try it out on some data in my next post and compare the results it produces to results from other methods.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pizza Cutter</title>
		<link>http://www.countthebasket.com/blog/2008/05/19/regression-to-the-mean/#comment-214</link>
		<dc:creator>Pizza Cutter</dc:creator>
		<pubDate>Mon, 19 May 2008 22:49:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.countthebasket.com/blog/?p=65#comment-214</guid>
		<description>A few things:

1) When working with binary outcomes, don't use Cronbach's alpha as a measure of reliability.  Use Spearman's split-half coefficient.

2) I'm still not convinced on Tom's correlation coefficient method.  At the end of the comments section, I did some testing in the same data set that I used for the split-half paper and found that there was no such constant that could be obtained.  I'm not a math major, but I've taught stats and it just doesn't seem right.  If Andy did that, and he's out there, perhaps he can enlighten.

3) Thanks for the link love.  Even though I know nothing at all about basketball.</description>
		<content:encoded><![CDATA[<p>A few things:</p>
<p>1) When working with binary outcomes, don&#8217;t use Cronbach&#8217;s alpha as a measure of reliability.  Use Spearman&#8217;s split-half coefficient.</p>
<p>2) I&#8217;m still not convinced on Tom&#8217;s correlation coefficient method.  At the end of the comments section, I did some testing in the same data set that I used for the split-half paper and found that there was no such constant that could be obtained.  I&#8217;m not a math major, but I&#8217;ve taught stats and it just doesn&#8217;t seem right.  If Andy did that, and he&#8217;s out there, perhaps he can enlighten.</p>
<p>3) Thanks for the link love.  Even though I know nothing at all about basketball.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ryan J. Parker</title>
		<link>http://www.countthebasket.com/blog/2008/05/19/regression-to-the-mean/#comment-213</link>
		<dc:creator>Ryan J. Parker</dc:creator>
		<pubDate>Mon, 19 May 2008 15:45:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.countthebasket.com/blog/?p=65#comment-213</guid>
		<description>Eli, this is an invaluable resource.

I've always wondered about the background of the method that is listed in the appendix of "The Book", and this has made things a lot clearer (and kept me busy with a lot of good reading).

I look forward to reading whatever else you have planned to write on this topic.</description>
		<content:encoded><![CDATA[<p>Eli, this is an invaluable resource.</p>
<p>I&#8217;ve always wondered about the background of the method that is listed in the appendix of &#8220;The Book&#8221;, and this has made things a lot clearer (and kept me busy with a lot of good reading).</p>
<p>I look forward to reading whatever else you have planned to write on this topic.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
