
Monday, November 29, 2010

Penn is not Mightier Than Eye in the Sky

"Stat gurus" believe that "advanced basketball statistics" provide all of the information one needs to completely understand the NBA at both a team and an individual level, while most NBA scouts and personnel directors say "The eye in the sky does not lie"--or, as Indiana Pacers scout Kevin Mackey told me, the "eyeball is number one." I have long criticized "stat gurus" for covering their work with a pseudoscientific patina that obscures the reality that they do not rigorously test their hypotheses or even provide margins of error for their supposedly flawless calculations--but it seems like this season may provide at least an informal test of some of the "stat gurus'" methods and conclusions. It is important to distinguish between baseball's advanced statistics and "advanced basketball statistics"; baseball is a station to station game consisting of discrete actions that can be much more accurately described statistically than a basketball game can, because basketball involves 10 players simultaneously moving and interacting: this is akin to the difference between writing a computer program to play checkers (computers have "solved" checkers already) and writing a computer program to play chess (computers play chess very well but are not even close to "solving" the game), though computers play chess much more proficiently than "stat gurus" are currently able to evaluate basketball productivity at an individual level.

Most of the "stat gurus" have insisted for several years that LeBron James is by far the best player in the NBA and at least some of them would also rank Dwyane Wade as the second best player. Putting James and Wade on the same team instantly caused "stat gurus'" computers to overheat (no pun intended); by their calculations, such a duo would be so "efficient" that it should dominate the NBA even accompanied by three stiffs--but of course the Miami Heat also have Chris Bosh (one of the NBA's top 15 players according to both "stat gurus" and traditional talent evaluators) plus a solid cast of role players. As I said--tongue only partially in cheek--before the season began, if the "stat gurus" are honest they have to admit that their formulas predict that the Heat will win 90 out of 82 regular season games; after all, they have been telling us for years that adding James to the Lakers would have produced the best team in NBA history. I certainly expected--and still expect--the Heat to be a very good team this season but in my Eastern Conference Preview I correctly identified some of the Heat's important weaknesses and refuted the ludicrous predictions that the Heat would cruise to 70-plus wins en route to easily capturing the NBA championship.

We are nearly one fifth into the 2011 NBA season and the Miami Heat are currently 9-8, sixth in the East and just one and a half games ahead of the Cleveland Cavaliers, a team that the "stat gurus" said would be terrible without James and that would only win 12 games according to one yahoo.

Before Miami's November 24th loss to Orlando, ESPN's "stat guru" Tom Penn declared that the Heat's fans should not be worried despite their team's pedestrian record because the Heat had posted outstanding points per possession numbers both offensively and defensively. Penn asserted that the Heat's exceptional efficiency according to these metrics strongly indicated that the Heat would prove to be an elite team. Jon Barry asked Penn why the Heat have a mediocre record if they are such an efficient team and Penn responded that the Heat's record is merely a reflection of a small sample size of games. Penn added that the Heat had lost a couple of close games and that in the long run he expects LeBron James and Dwyane Wade to carry the Heat to victories most of the time in such games.

Penn's reasoning makes no sense: why should the sample size of (at that time) 14 games be considered small in terms of the Heat's won/loss record but be considered large enough to make a meaningful judgment about the team's so-called efficiency? The reality is that what both the won/loss record and the "advanced basketball statistics" show is that the Heat have blown out inferior teams and have played well at home but they have lost against good teams and they have not played well on the road; in layman's terms, the Heat are frontrunners who have yet to display the necessary mental and physical toughness to be an elite team.
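To put a rough number on the sample-size question, here is a minimal sketch of a 95% Wilson score interval for a team's true winning percentage after a short run of games. It uses the Heat's current 9-8 record cited above; the function name `wilson_interval` and the treatment of games as independent trials with a fixed win probability are simplifying assumptions made for illustration, not a method used by Penn or anyone else mentioned here.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win probability (z=1.96 is roughly 95%).

    Assumes games are independent with a constant win probability,
    which is a simplification for a real NBA schedule.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p_hat = wins / games
    denom = 1 + z**2 / games
    center = (p_hat + z**2 / (2 * games)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / games + z**2 / (4 * games**2)) / denom
    )
    return center - half_width, center + half_width


# The Heat's record at the time of writing: 9 wins in 17 games.
low, high = wilson_interval(9, 17)
print(f"plausible win pct: {low:.3f} to {high:.3f}")
print(f"82-game pace: {82 * low:.0f} to {82 * high:.0f} wins")
```

Under these assumptions the interval is roughly 40 percentage points wide, so 17 games say very little on their own. Any per-possession efficiency figure computed from those same games carries its own uncertainty and should be quoted with a margin of error as well.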

Penn's faulty logic is an excellent example of one of the flaws of "advanced basketball statistics" that I have hammered away at for years: the "stat gurus" rarely if ever provide margins of error or any other kind of guidelines outlining the limitations of the numbers that they cite. The "stat gurus" act like everyone should accept such numbers as 100% accurate and completely incontrovertible. Five years ago, I had a protracted online debate with Bob Chaikin--a "stat guru" who is equally notorious for his sharp tongue and for his narrow, dogmatic views--about Antoine Walker's value. Chaikin, relying on nothing more than "advanced basketball statistics" generated by his basketball statistics simulator, insisted that Walker was the worst starting power forward in the NBA, while I said that just as it is obvious that Walker is not the best starting power forward it is also obvious that he is not the worst. After Walker rejoined the Boston Celtics, he performed quite well and helped lead them to the playoffs. I politely asked Chaikin to explain the dichotomy between what his simulator says and what everyone could see; just like Penn used sample size as an excuse to disregard the Heat's obvious problems, Chaikin dismissed Walker's contributions because of an allegedly small sample size (though Chaikin erroneously referred to "small sample population size"--it is surprisingly common for "stat gurus" to be uninformed about the most basic terminology and concepts regarding the science that they are allegedly practicing). I countered by asking Chaikin to define what an appropriately large sample size would be and challenging him to be willing to admit that he is wrong if Walker's production exceeds Chaikin's expectations once that sample size is filled. 
Naturally, Chaikin could not produce a valid response to this and so he resorted to childlike name calling, an all-too-common reaction in the blogosphere; the jarring thing about my discussion with Chaikin was not his insistence that he was right about everything but rather his refusal to accept even the possibility that he could be wrong. After that exchange, I realized that "stat gurus" are practicing a faith-based religion and not engaging in the scientific inquiry that they allege they are doing.

Elizabeth Kubler-Ross described the stages of grief and I have discovered the stages of dialogue between a "stat guru" and an unbiased, informed basketball analyst:

1) The "stat guru" makes a bold statement/prediction that contradicts what intelligent observers see.

2) When an intelligent observer points out flaws/limitations regarding what the "stat guru" said, the "stat guru" says that the observer is simply allowing himself to be biased by what his eyes see; the "stat guru" insists that the numbers are 100% correct and cannot be refuted.

3) When the intelligent observer then posits some rational, alternative interpretations of the numbers, the "stat guru" resorts to insulting the observer.

This is why I stopped posting at APBR Metrics, the church for the believers in the religion of "advanced basketball statistics"; instead of being a "heretic" inside their small church, I prefer to simply refute their pseudoscience in an open forum.

The intriguing postscript to this story is that Walker eventually landed in Miami, where he started all 23 playoff games in 2006 and ranked second on the team in playoff mpg as the Heat won the championship. Naturally, Chaikin and other "stat gurus" continued to insist that Walker was a terrible player even though Hall of Fame Coach Pat Riley made Walker an integral part of the Heat's rotation.

I regret that I wasted so much time trying to "convert" a rabid cult member like Chaikin but I am glad that--unlike him--I sought the truth, ultimately writing an article (The Enigmatic Antoine Walker) that presented a very balanced account of Walker's strengths and weaknesses as a player. Walker is hardly my favorite player--"stat gurus" disregard the most basic scientific practices but they love to accuse their critics of being biased--and I certainly have never in my life been a Celtics fan; what intrigues me about Walker is the seemingly visceral hate that "stat gurus" direct toward him. "Stat gurus" seem to be personally offended that any NBA coach would put Walker into a game and that is why I used Walker's return to Boston as an opportunity to try to understand why "stat gurus" hate Walker so much and to initiate a conversation about the strengths and limitations of "advanced basketball statistics"; it seems obvious that if the "stat gurus" say that Walker is the worst starting power forward in the NBA but he ends up playing a key role on a championship team then the "advanced basketball statistics" are not telling a completely accurate story. Dan Rosenbaum is one of the few "stat gurus" who appreciated the nuances of my comments; he realized that I was not trying to pump up one player but rather to critically examine "advanced basketball statistics" as a whole and he understood that "stat gurus" must effectively answer the kinds of questions I raised instead of shrieking ad hominem insults a la Chaikin.

The Walker episode shows us that "stat gurus" have been wrong before and that they refuse to admit that they are wrong even in the face of clear empirical evidence. So, don't expect Miami to win the championship this year--and after Miami is eliminated from the playoffs don't expect the "stat gurus" to admit that they were wrong about James, Wade or anything else. After all, a "true believer" never lets reality get in the way of his "faith."

-----

Postscript:

The disturbing trend is not that a few "stat gurus" talk amongst themselves at APBR Metrics but rather that they have brainwashed some mainstream media writers (Henry Abbott) and outlets (Wall Street Journal) into uncritically accepting their declarations as gospel. I am certainly not an anti-statistics Luddite; I have always loved numbers and I look forward to the day that someone comes up with a way to apply statistics as meaningfully to basketball as they have been applied to baseball but it is very important to not blindly accept anything uncritically. Albert Einstein's Theory of Relativity may be the most successful theory in history, yet physicists are still constantly testing it to make sure that it is the best possible description of reality. If "stat gurus" were truly scientists and not just cult preachers then they would welcome intelligent questions about their ideas, but "stat gurus" not only possess little evident capacity for self reflection/self criticism regarding their "advanced statistics" but they appear to have no interest in research showing that assist numbers may be inflated--and it should be obvious that no matter how good a formula is its conclusions will be meaningless if the basic data (culled from box scores) is inaccurate.

After I did my research about Chris Paul's inflated assist totals, David Biderman of the Wall Street Journal contacted me to learn more about this subject--but he did not quote me in his subsequent article and instead cited David Berri, who has since apparently become one of Biderman's chief sources (Biderman can scarcely write two words about the NBA without mentioning Berri). I asked Biderman why he did not even mention my article about Paul's assists (Biderman had initially told me that my article inspired him to write his own article) and Biderman replied that he did not have enough space in his article to do justice to my in depth analysis. In other words, Biderman and/or the Wall Street Journal are not willing to allocate enough space to provide a proper, complete discussion of the flaws regarding current NBA scorekeeping and the limitations of basketball statistical analysis but they are perfectly willing to provide a forum for Berri to repeatedly issue pithy, misleading and inaccurate sound bites!

That tells you all you need to know about "advanced basketball statistics" and about how the mainstream media functions.

27 comments:

  1. I think there needs to be a distinction between descriptive statistics and statistics which draw conclusions (can't remember the proper term). I think there has been tremendous progress in terms of descriptive statistics like TS%, rebound rate, etc. On the other hand, I think the conclusive statistics A) have really fallen short and B) are given too much credence and not put in the proper context.

    I'd like to see the "stats gurus" practice more of a stats-to-court or court-to-stats technique. What I mean by that is identifying a statistic about a player or team that stands out and then watching many games to identify what is causing it and whether it is a legitimate stat. Or doing the opposite: identifying a behavior or result on the court and then trying to identify an associated stat.

    It seems that right now there is an over emphasis on being predictive to the point that correlation is described as causal instead of associative.

    I'm new to the site and enjoyed reading many of your articles this weekend. Thanks.

  2. Excellent article! I'm learning about statistics in college now and it is so easy to be confused with all the concepts and the jargon. Stats can always be used (often misleadingly) to EXPLAIN things according to anyone's beliefs/notions but when it comes to PREDICTION, that's where these stat gurus are exposed. The failure to predict Miami's current situation is a perfect example. Prediction and application to me are the only things that count.

  3. David,
    Case in point: look at what Hollinger said were not the reasons for the Heat's struggles:
    1. Mike Miller and Udonis Haslem, because they have low PERs and thus cannot impact a game enough to push this team to greatness.
    - Hollinger forgets one little piece of information: Mike Miller is one of the premier three-point (and catch-and-shoot) shooters in the league, and Udonis can grab rebounds and shoot the midrange jumper at an efficient rate.

    2. It's not the one ball theory.
    - Really? An old school analyst would have told Hollinger before the season started that all three of these players dominate the ball. It would be hard for them to accept smaller roles offensively, and their relatively young age (prime years) would make it harder.

    3. It's not the lack of point guard and center play. He points to the high PERs of Arroyo (13) and Ilgauskas (15).
    - Hollinger forgets that both players are mediocre on the defensive end and that neither is a physical player. Scoring 20 points and giving up 40 points doesn't equal winning.

    4. It's not the supporting cast.
    - Most of these players weren't wanted by other teams in the league for a reason. (Ironically, the two players whom other teams pursued and failed to land, Mike Miller and Udonis, are injured right now.)
    =========================
    He says the reason they are struggling is the "Big 3," pointing to the significant drop in their PERs. That is a narrow-minded take on the situation. The fact is that none of these players hold themselves accountable for the Heat's mistakes. These players lack the humility or shame, suffered over successive seasons by Kevin Garnett, Paul Pierce and Ray Allen, that is needed to be successful. Their antics during free agency, after signing their contracts (on the stage) and during the summer show that right now their egos are too big for that.

    LeBron said two days ago that if he changed his game, he'd be just a role player.

  4. David,

    Although you and I have not always seen completely eye to eye on all things related to the game of basketball, when it comes to the proper role to be played by a pure form of statistical analysis applied to hoops, we are on the exact same page. Amen, to what you've written in this article, on this specific subject.

  5. Jeremy:

    I agree with you completely on both counts: it is nice to have access to stats like true shooting percentage and rebounding rate but the "stat gurus" have gone too far with their dogmatic faith in their ability to predict/explain what happens during NBA games.

    Although I favor the approach that you suggest (identifying key stats and then watching games to identify the causes of statistical trends), the sad reality is that "stat gurus" are completely disinclined to follow the scientific method of forming a hypothesis and then rigorously testing it. In fact, many "stat gurus" believe that it is pointless or even harmful to actually watch games, because (in their misguided opinions) the eye is subjective but the numbers are objective and self explanatory.

  6. Eric:

    Yes, "stat gurus" are very fond of using numbers to "explain" things after they have happened but that is quite different from accurately predicting what will happen. Also, their "explanations" frequently are biased by an incomplete understanding of basketball strategies and the execution of those strategies; one example of this that I often cite is that Pau Gasol's field goal percentage and offensive rebounding numbers have increased since he joined the Lakers in no small part because Kobe Bryant attracts so much defensive attention--but the "stat gurus" simply look at Gasol's "efficiency" and conclude that he is as good, if not better, than Bryant.

  7. Jack F:

    As I have repeatedly said, for better or worse Hollinger feels compelled to explain everything in terms of what his proprietary stats (most notably PER) say. I am sure that from his perspective the big problem for the Heat is that their three All-Stars are not as efficient as they were in previous seasons. That is certainly true but saying that the Heat are losing for this reason is about as illuminating as saying that the Heat are losing because they are scoring fewer points than their opponents. The important question is why aren't these players dominating the way that the "stat gurus" (and others) predicted--and that is the question that I addressed in my previous article about the "lukewarm Heat."

  8. @Jeremy,

    You hit it right on the head. I also draw a distinction between what I call "Supplemental Stats" (e.g., TS%, eFG%, points per possession) and emphatic stats (e.g., PER, win shares).

    I find that supplemental stats are great because they help put certain aspects of individual games/players into a proper context, whereas emphatic stats like PER tend to be little more than bold proclamations that in the end have very little value. I don't even think the +/- stat is all that helpful, to be honest.

  9. DMills:

    I think that +/- can be a useful tool if the sample size is large enough and relevant enough (i.e., consisting of a wide variety of opponents as opposed to just good teams or just bad teams) but I would not trust it as the sole or primary measure of an individual player's effectiveness. Adjusted +/- could perhaps give some indication of which five man units are most effective but the problem is getting a large enough sample size--there is a lot of "noise" if a given five man unit has only played a small number of minutes together and/or if that unit has faced a narrow range of competition (i.e., very strong teams or very weak teams). One aberrant performance could easily throw the numbers out of whack; for instance, if a five man unit has only played 100 minutes together and in one 12 minute stretch an opposing player got hot and hit four three pointers then the +/- numbers might not really give an accurate indication of how effective that five man unit could possibly be in future games.
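The arithmetic behind that example can be sketched in a few lines. The 100 minutes and the four three pointers come from the paragraph above; the unit's raw +/- of +10 and the helper name `net_per_48` are hypothetical values chosen only to make the swing visible.

```python
def net_per_48(net_points, minutes):
    """Scale a unit's raw +/- to a per-48-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return net_points * 48 / minutes


# Hypothetical five man unit: 100 minutes together, raw +/- of +10.
baseline = net_per_48(10, 100)

# Same unit, except that one opponent hit four three pointers
# (12 points) during a single 12 minute stretch.
with_hot_streak = net_per_48(10 - 12, 100)

swing = baseline - with_hot_streak
print(f"baseline:         {baseline:+.2f} per 48 minutes")
print(f"after hot streak: {with_hot_streak:+.2f} per 48 minutes")
print(f"swing:            {swing:.2f} points per 48 minutes")
```

In this made-up case one hot quarter by a single opponent is enough to flip the unit's rating from positive to negative, which is why a 100 minute sample should be read with caution.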

  10. @David,

    I was speaking only in terms of an individual player in the aftermath of a single game. The guy that used to do the Lakers post game show was a big believer in +/- to determine how effective a player was on a given night. So for example in game 7 of the NBA Finals Kobe had a +/- of 0 while Ray Allen also had a +/- of 0. Paul Pierce was a +1 while Ron Artest was a -1. Looking at that alone would cause one to think that they somehow canceled each other out. But having watched the game, nothing could be farther from the truth.

  11. I think +/- is also inherently biased against defensive players. Take for example Matt Bonner, who has a stellar +/- rating. Popovich often subs him in when the Spurs have the ball, and then subs him out when they are defending. These substitution patterns often happen at the end of quarters, at the end of games, and after the first free throw.

    So when Popovich puts Bonner in, he either gets 0 or a +2 (or 1 or 3). On the defensive possession, he gets subbed out, so the other players left on the court either get 0 or a -2 (or 1 or 3).

    In short, defensive liabilities often get the chance to go up in +/- but are often protected from going down. This substitution pattern happens a few times in a game but can affect a player's +/- by a few points.
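The effect described above can be illustrated with a tiny possession ledger. Everything in it is invented for illustration: the six possessions, their outcomes, and the helper `plus_minus` are assumptions, not real Spurs data.

```python
# Each tuple: (team_points, opponent_points, specialist_on_court).
# The offensive specialist plays only the offensive trips; a defensive
# substitute replaces him for the matching defensive trips.
possessions = [
    (2, 0, True),   # specialist in for offense, team scores
    (0, 2, False),  # specialist out, opponent scores
    (0, 0, True),   # specialist in, team misses
    (0, 3, False),  # specialist out, opponent hits a three
    (3, 0, True),   # specialist in, team hits a three
    (0, 0, False),  # specialist out, defensive stop
]


def plus_minus(possessions, specialist_on):
    """Sum the point margin over possessions matching the on-court flag."""
    return sum(
        team - opp for team, opp, on_court in possessions if on_court == specialist_on
    )


specialist = plus_minus(possessions, True)    # credited only with offense
substitute = plus_minus(possessions, False)   # charged only with defense
team_net = sum(team - opp for team, opp, _ in possessions)

print(f"specialist +/-: {specialist:+d}")
print(f"substitute +/-: {substitute:+d}")
print(f"team net:       {team_net:+d}")
```

In this toy ledger the team breaks even, yet the specialist's +/- can only go up and the substitute's can only go down, purely because of when each one is on the floor.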

    David, what do you think of the Hornets and the Pacers this year? They are playing better than either you or the stat gurus predicted.

    By the way, have you read Woj's latest on LeBron? Wow this just keeps getting better and better.

  12. Dmills:

    You are right about the limitations of +/- in terms of looking at just one game (or a small sample size in general); even the biggest proponents of +/- acknowledge this, though I have yet to see any kind of coherent statement indicating how big of a sample size is needed for +/- to be valid.

    I have mentioned +/- in some of my game recaps but I always use it to reinforce my observations and never as standalone "proof" of anything.

  13. Anonymous:

    "Adjusted +/-" (which takes into account the other nine players on the floor) is supposed to address some of the concerns that you mentioned but I agree with you that +/- can be a very "noisy" stat, particularly if it is used to evaluate individual players.

    I did not think that the Hornets would be a playoff team in the tough Western Conference; it seemed like Chris Paul's desire to play for another franchise plus the team's apparent lack of depth would be two major negatives but so far Paul's situation has not been a distraction and Coach Monty Williams has turned the Hornets into a very tenacious defensive team. That said, the Hornets have lost four of their last five after their quick start, so it will be interesting to see what their record will be by the end of the season.

    I predicted that the Pacers would add 8-10 wins to their 2010 total (32) and said that they might sneak into the playoff picture. Right now they are sixth in the East, so I cannot really say that I am shocked by how they are playing. If they are in the top four by the end of the season then I will be very surprised but I suspect that they will finish somewhere between 7th and 10th (those teams will be separated by very few wins).

  14. Sharp

    Reading through Jeremy's comment earlier, I was surprised to find that Hollinger felt that Haslem was a non factor in the Heat's success. Has the man even seen a Miami game in the last five or six years?

    His impact on the Heat is analogous to Derek Fisher's impact on the Lakers. Not the most talented guy on the team, but one of the vocal leaders and a player who has made a career out of effort and hustle.

    Hollinger is a joke, and ESPN should face criminal charges for charging people money to read his material.

  15. @David,

    I have seen some interesting stuff with people looking at the +/- of certain combinations of players on the court together. That's one usage of +/- that I've found somewhat accurate. At least in terms of the "eye in the sky" test.

  16. Dmills:

    I think that the Mavericks make use of that kind of +/- data as supplied by Winston/Sagarin.

  17. The inability of most statisticians in basketball to test their own metrics is atrocious. I have been looking at SRS, Hollinger ratings and playoff seeds to see which metric is most successful in accurately predicting playoff series. SRS slightly outperforms playoff seeds going back to 1950, and, while I am still working on Hollinger's ratings, his metrics seem to be lagging greatly behind the other metrics. (you can check out my results at basketballfantatic.blogspot.com, all new articles will be posted at blindsidescreen.blogspot.com.) I am going to start working on another metric if I get permission and results from its author
    http://courtsideanalyst.wordpress.com/
    This metric looks very promising as it not only takes into account scoring margin, but also takes into account scoring margin by other teams against the same opponents, which should limit discrepancies in cases where teams tend to blow out bad teams, tend to lose close games, etc.

    Good job comparing baseball to basketball in terms of advanced metrics. Basketball is a team sport, and it is no surprise that two players who play remarkably similar roles (James and Wade) struggle relatively when they are placed on the same team. Like you, I do not expect this to continue. You would not run into the same problem if two players who hit .300 and 40 homers were on the same team in baseball.

    Antoine Walker was a great example to use. Clearly his and Payton's addition to the Celtics that year helped them out. (I believe they went 11-1 the rest of the year.) You seem to have a good handle on stats despite your disdain for some metrics, so you may already know this, but when statisticians talk about too small a sample size, they usually mean any sample size under 30, as this is the general cut-off point for statistical significance (limiting for variables). Seeing more basketball stat gurus test out their own hypotheses would be a welcome change, and may be a necessary one as dissatisfaction and distrust grow with stats such as Hollinger's ratings.

  18. David,

    At the risk of being unoriginal, I'd like to reiterate a point made in the "comments" section of many of your articles: your pieces are truly excellent, both from a basketball standpoint and from a grammar/structure/argument standpoint--"a work of art contains verification in itself."

    The issue I have with "descriptive" statistics is not exclusive to basketball, or really to sports at all; it is much more fundamental than that.

    There is really nothing in this world that is simple. Every aspect and interaction is full of nuance and consequence and to believe otherwise is to have a very superficial view of things. For any topic to be truly understood one must eschew black and white interpretation and appreciate and acknowledge each shade of gray. This is not to imply that "true" conclusions cannot be reached, only that these conclusions must proceed from empirical evidence and logical reasoning, with an appreciation for the fundamental nature of the topic at hand; and this is easier said than done. In fact, it is extremely difficult to reach "true" conclusions, and much easier to compartmentalize an issue and analyze it in a vacuum.

    It is human nature to take the path of least resistance, so it should be of no surprise that most people resort to the latter approach--and what better way to analyze things in a vacuum than with numbers?

    Numbers, by nature, are painted with a "nuanceless" brush, making statistics the ideal tool for those who wish to "distill" reality with the ostensible intent of removing all that is extraneous to isolate a primal, unbiased reality. However, most statisticians lose sight of the fact that once reality is put through a numerical sieve it is not reality anymore. Once they begin slinging their material as reality they have effectively trapped themselves: to admit their fundamental mistake is to admit the fundamental flaws in their work, which marginalizes their self-created importance.

    I do not want anyone to misinterpret this comment as a tirade against statistics per se, as I am currently pursuing a minor in that very subject. However, people must realize that without a "philosophical" understanding of the limitations and fundamental assumptions inherent in statistics, they are in danger of upholding numbers as descriptive of, rather than supplementary to, reality.

    This is an especially insidious notion, peddled not only by "stat gurus" (as you like to say) but also by any outlet that treats-as-gospel predictive statistics with very little, if any, reality based context--a point that you highlighted very well in your article.

    I chose the word "insidious" for a reason: look at the mess that statistics-as-gospel has caused. Writers/analysts/politicians/etc., who greatly benefit from a black-and-white portrayal of things, have found in statistics the perfect ally: an extremely malleable pseudoscience that can be manipulated to “unequivocally prove” just about anything. Trafficking in statistics sells because people want reality to be spoon-fed to them from neat little bowls of numbers, without having to worry about nuance or shade.
    The presence of this condemnable dynamic is one of the reasons I find 20secondtimeout so refreshing. You have attempted to embrace all the intricacies of NBA basketball and to put each in its proper context. As far as I can tell, you are the only NBA blogger who has done so and been so thorough about it.
    This is my first comment on your website, but I look forward to leaving more in the future.
    -David-

  19. Seif-Eldeine:

    My point with the sample size issue is that basketball "stat gurus" rarely if ever mention this subject unless they think that it suits the argument that they are trying to make; this is what Penn did regarding Miami--saying that the sample size of games was too small for their won-loss record to be meaningful but large enough for their "efficiency differential" to be relevant--and this is what Chaikin did regarding Antoine Walker. The funny thing with Chaikin is that not only did he refuse to specify how large of a sample he would need to see to possibly admit that he was wrong but he also repeatedly used the wrong terminology!

    The larger concern here is not how good Miami will be or how good Walker was but rather the way that the "stat gurus" act like they are practicing science when they are really just manipulating numbers to serve their own biases (and/or talk their way into writing/consulting jobs). Several years ago, Chaikin wrote a piece for SI.com that not only contained the most basic factual errors but also flip-flopped (without any explanation) from his previously stated evaluation of Eddy Curry. When I politely pointed this out in the APBR Metrics forum Chaikin first refused to respond and then went off on some ad hominem diatribe mentioning Prozac and a bunch of other nonsense. The moderator closed the thread (without giving me a chance to reply) before Chaikin could do any more damage to his credibility or to the credibility of "advanced basketball statistics"; this moderator is the same "stat guru" who later declared that the Lakers were utilizing a new, highly complex defensive strategy when the reality--as explained to me by Lakers' assistant coach Jim Cleamons-- was far more mundane: "The only thing we’re doing is what a lot of teams have decided to do: basically, playing a man to man defense that is actually a zone; we’re sending an extra defender over in situations that we feel threatened. There’s no big secret about it; that’s what we’re trying to do: give more help when we can and we’ve been fortunate thus far."

    Some "stat gurus" (Rosenbaum, Oliver, Beech) are doing good work and understand the current strengths and limitations of basketball statistical analysis but too many "stat gurus" act like they are oracles who have received wisdom from above that dare not be questioned by mere mortals; contrast the way that Pelton made the Lakers' defense sound like an impenetrable mystery with the way that I interviewed a coach and got him to break things down in a way that a casual fan can understand.

  20. David:

    The problem with works that contain potent truths that powerful people do not want to hear is that sometimes their authors are sent to the gulag, either literally or in a metaphorical sense. That is how I ended up in the "samizdat" section of the basketball commentary world while the bleats and tweets of the "stat gurus" are faithfully broadcast to the masses via ESPN, the Wall Street Journal, etc.

    Your comment provides a cogent summary of the limitations of statistical analysis and it should be required reading for every editor/publisher who allows writers to use "stat gurus" (in any field, not just sports) as a prop to replace genuinely thoughtful analysis of the subject at hand.

  21. Penn's point, which you say makes no sense, is that point differential is a better predictor of future success than win-loss record. In other words, a team's win-loss record is more likely over time to come to reflect its efficiency differential than its efficiency differential is to come to reflect its win-loss record. Why is this so senseless?

    Also, you say that a constant tactic of stat gurus is to name call and make ad hominem attacks; but it seems to me you're more likely to find ad hominem attacks on this site than you are in, say, one of Hollinger's pieces. The fact that you put scare quotes around the phrase stat gurus is indicative of this.

    I agree that some metrics are poor predictors of future success; PER and adjusted plus minus are both terrible in this respect. However, wins produced is actually pretty highly correlated with actual team performance.

  22. Anonymous:

    If you are familiar with my work then you know that I think that point differential is a very useful statistic. Penn did not say one word about point differential. He spoke of Miami's offensive and defensive efficiency, which he defined in terms of points per 100 possessions. He said that Miami's efficiency is off the charts in both metrics and that this is a better indicator of Miami's strength as a team than the Heat's pedestrian record. I have two objections to Penn's reasoning: (1) He says that Miami's won-loss record is not important because the sample size of games is small but he ignores the fact that the offensive and defensive efficiency numbers are based on that same small sample size of games; (2) Penn completely ignores the ominous fact that the Heat padded their statistics by blowing out sub-.500 teams and that the Heat have been atrocious (both in terms of won-loss record and efficiency) against plus-.500 teams.
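
    To make both objections concrete, here is a minimal sketch of how points per 100 possessions could be computed and then split by opponent quality. Every number in it is invented for illustration and is not Miami's actual data; the `Game` record, the `efficiency` and `record` helpers and the assumption that both teams use the same number of possessions in a game are mine, not Penn's.

```python
from dataclasses import dataclass


@dataclass
class Game:
    points_for: int
    points_against: int
    possessions: int          # assumed equal for both teams in a game
    opponent_above_500: bool  # True if the opponent has a winning record


def efficiency(games):
    """Return (offensive, defensive, differential) in points per 100 possessions."""
    possessions = sum(g.possessions for g in games)
    offense = 100 * sum(g.points_for for g in games) / possessions
    defense = 100 * sum(g.points_against for g in games) / possessions
    return offense, defense, offense - defense


def record(games):
    """Return (wins, losses) for the same set of games."""
    wins = sum(g.points_for > g.points_against for g in games)
    return wins, len(games) - wins


# Hypothetical six-game sample: three routs of weak teams, three close
# losses to strong teams. These are NOT real box scores.
games = [
    Game(120, 90, 95, False),
    Game(115, 88, 92, False),
    Game(110, 85, 90, False),
    Game(92, 98, 93, True),
    Game(89, 95, 91, True),
    Game(96, 99, 94, True),
]

strong = [g for g in games if g.opponent_above_500]
weak = [g for g in games if not g.opponent_above_500]

for label, subset in (("all", games), ("vs sub-.500", weak), ("vs plus-.500", strong)):
    off, dfn, diff = efficiency(subset)
    print(label, record(subset), round(off, 1), round(dfn, 1), round(diff, 1))
```

    In this made-up sample the overall differential is comfortably positive even though the record is .500 and the differential against winning teams is negative. Note also that the record and the efficiency numbers are computed from exactly the same list of games, so neither one can be called a bigger sample than the other.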

    I predicted that the Heat would win around 60 games and ultimately be the third best team in the East (which I define not by regular season record but by playoff finish--i.e., I expect Miami to be eliminated by Boston or Orlando). I still think that Miami will win around 60 games despite the Heat's slow start but the Heat cannot be considered the best team in the East, let alone a dynasty in the making, until they prove that they can beat good teams. If Penn believes, as many "stat gurus" do, that a great efficiency differential built on routing bad teams bodes well for the Heat's championship hopes then he should have explicitly stated that, he should have explained why he believes this to be true and he should be prepared to amend his conclusion after the Heat lose to Boston or Orlando in the playoffs. I do not believe that a few routs of weak teams early in the season tell us anything about Miami's potential to win a championship but I do believe that their poor showing against good teams shows that the Heat have a lot of work to do.

    What I have said about "stat gurus" is that when I offer specific challenges/refutations to their methodologies they respond with ad hominem and/or off the subject remarks because they are unable to refute what I have said. Examples of this include my exchanges with Chaikin, Dave Berri's never retracted erroneous assertion regarding what I actually said about the Knicks' alleged improvement, the blathering of Knickerblogger on that same subject, Basketbawful's potshots when I refuted their attempt to statistically prove that Nash was better than Bryant a few years back, etc. I have never lobbed the first insult in an exchange with such people but it is only natural that after they make things personal I respond by pointing out--correctly--that these people have serious flaws in their critical thinking and/or writing skills.

    The comments that I have made about "stat gurus" purely relate to their work; I do not comment about their personal appearance or their personal character. Contrast that with a certain yahoo who has repeatedly insulted me both publicly and in private correspondence, in addition to trying to pit bloggers at Ballhype against me by falsely claiming that I was voting "thumbs down" to their posts--a ludicrous assertion that I refuted by pointing out that anyone could prove that I had not voted those posts down by looking at my public "thumbs up/thumbs down" record at Ballhype before and after those posts had been made; the yahoo left that issue alone after I had made it clear that anyone who was so inclined could easily verify that I was telling the truth and he was lying.

  23. Anonymous:

    I put "stat gurus" in quotes because it is ironic that they are viewed as gurus when so much of their work is so deeply flawed. ESPN announcers refer to Penn on air as a "stat guru" (you can be the judge if they mean to put those terms in quotes or not). Also, in a different context Mike Lupica once said that it is time for the gurus to start "guruing" and it is in that sense that I put "stat gurus" in quotes: it is time for the "stat gurus" to start "guruing" by defining sample sizes and margins of error and by making hypotheses with testable results.
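
    As a rough illustration of what stating a sample size and a margin of error could look like, here is a minimal sketch for a won-loss record. It uses the textbook normal approximation for a proportion and assumes that every game is an independent trial with the same win probability, which is a simplification that real schedules do not satisfy. The `win_pct_margin` helper and the records fed into it are hypothetical.

```python
import math


def win_pct_margin(wins, losses, z=1.96):
    """Return (win_pct, margin) using the normal approximation.

    z=1.96 corresponds to roughly a 95% interval. Games are assumed to be
    independent with a constant win probability, which is a simplification.
    """
    n = wins + losses
    p = wins / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin


# Hypothetical records: an early-season sample and a full 82-game season.
for wins, losses in ((9, 8), (45, 37)):
    p, margin = win_pct_margin(wins, losses)
    print(f"{wins}-{losses}: {p:.3f} +/- {margin:.3f}")
```

    The interval for the short sample is far wider than the one for the full season, which is the kind of caveat that should accompany any early-season number, whether it is a winning percentage or an efficiency rating.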

    Dan Rosenbaum and other authentic researchers have pointed out in detail just how flawed Wins Produced is, so I recommend that you spend less time at True Hoop (Abbott has made himself Berri's Paul the Baptist) and more time familiarizing yourself with the relevant literature on this subject.

  24. This is great, great stuff. I'm glad I found this blog.

    People, in general, believe the "stat gurus" because people, in general, will believe any math-based formula, even if the formula is so ridiculous it doesn't make any sense.

    I'm probably stepping on toes here, but if you read Dean Oliver's "Basketball on Paper," basically the guy just trashes every other rating system and puts forth a ridiculous one of his own -- resulting in Artis Gilmore being a much more "efficient" offensive player than Larry Bird, for example. It's because Oliver's "offensive efficiency" formula essentially just measures field goal percentage, or (if you combine 3s and 2s) points per shot. So of course Artis Gilmore will look better, because it's hard to miss dunks.

    I will tell you that the baseball "stat gurus" are just as obnoxious. They will tell you, over and over again, that Felix Hernandez winning the Cy Young is "a victory for sabermetrics." However, SF Giants GM Brian Sabean is well-known in the sabermetric community for not following sabermetric principles.

    Curiously, the "stat gurus" don't mention this, and don't call the Giants' World Series win "a victory for old-school thinking."

  25. Matt:

    I think that Oliver is actually more rational than most of the "stat gurus" because he at least acknowledges to some degree the limitations of "advanced basketball statistics," while Berri goes so far as to say that his work should not even be considered theoretical anymore (!)--a truly astounding contention in light of the fact that the Theory of Relativity is still being challenged experimentally despite its track record of success while Berri's track record consists of nothing more than selling books to the gullible and turning Henry Abbott into his Paul the Baptist spreading the Wages of Wins gospel to the unwashed masses.

    While baseball's "stat gurus" can also be obnoxious I do think that there is a much sounder theoretical basis to their work than there is to the work of their basketball counterparts; baseball is a station to station game of discrete actions that can be quantified, while a basketball game consists of 10 players acting (and interacting) simultaneously. As you suggest, though, the extent of the "sabermetric revolution" in baseball has probably been somewhat overstated by the media, at least in terms of producing World Series champions (in addition to the example you cited with the 2010 champion Giants, Philadelphia's Charlie Manuel--who led the Phillies to two World Series, winning in 2008--is about as "old school" as they come).

    Rick Barry once told me that the only pure basketball stat is free throw percentage; every other number is either subjective and/or misleading. For instance--as you indicated--field goal percentage does not reflect a player's shooting range or ability to create a shot for himself, so a one dimensional player who does nothing but dunk will always have a better field goal percentage than a versatile player who operates from a variety of areas on the court. I am not saying that Gilmore was one dimensional--he was far from that--but I agree with you that field goal percentage is not the best way to determine an individual player's offensive efficiency.

    Numbers and statistics can provide valuable information but only if they are placed in the proper context.

  26. David:

    You mention three "stat gurus" who you believe to be doing good work (Rosenbaum, Oliver, Beech). Where can I find their work? And do you have any recommendation on which metrics they use which are the most successful?

  27. TBF:

    Dan Rosenbaum is an economist who is much more prominent in his field than Dave Berri is in his field, though Berri is better known to the general public because he has hoodwinked Malcolm Gladwell and Henry Abbott into tirelessly promoting his work. Rosenbaum's work is easy to find if you put a little effort into it, so I will just provide one link to get you started:

    The Pot Calling the Kettle Black.

    Please note how Rosenbaum explicitly addresses the issue of whether or not NBA decision makers are more rational than "stat gurus," while Berri haughtily contends that he knows far more than NBA decision makers do; also note that Rosenbaum addresses Berri respectfully (far more respectfully than Berri deserves in my opinion, because Berri hardly addresses others respectfully) and that Berri, like Chaikin in the examples I cited in this article, refuses to discuss Rosenbaum's measured, direct criticisms of the flaws in Berri's work.

    Dean Oliver wrote a well known book called Basketball on Paper. I don't agree with everything he writes/says but I prefer his measured approach (no pun intended) to Berri's know it all attitude. Oliver played and coached college basketball, so he has a much different perspective than Berri (Berri insists that watching games is detrimental to understanding them, which makes about as much sense as an astronomer saying that using a telescope interferes with understanding the universe).

    Roland Beech runs the website 82games.com. I don't necessarily agree with all of his conclusions but I respect the way that he--like Rosenbaum and Oliver--is searching for the truth as opposed to arrogantly insisting that he has figured everything out.

    Some of the "stat gurus" try to portray me as some kind of anti-stat Luddite but that could not be further from the truth. I was a very active participant in the APBR Metrics board until I got fed up with the pettiness and ignorance of several of the people who post there. I think that the quest to better understand basketball with the proper application of statistical methods is a very noble one and I follow the field with much interest but I disagree with the "stat gurus" and their blogging acolytes who insist that they have "solved" the sport and that they better understand basketball than anyone else, including NBA front office personnel.

    If you go back and look at some of the APBR Metrics threads in which I posted you will see that Rosenbaum repeatedly says that the questions I raise about "advanced basketball statistics" are valid and that the "stat gurus" should try to answer those questions as opposed to dismissing them out of hand. When I ask these questions and when I point out that there are problems with the way that the NBA tracks assists I am not trying to say that we should throw all of the numbers out of the window but rather that we should find ways to improve how the basic box score stats are recorded and that we should rigorously test "advanced basketball statistics" instead of blindly worshiping at the feet of certain "stat gurus" just because Abbott or Gladwell has fallen in love with them.
