Economics is Not a Science, Nor is Basketball Statistical Analysis

"Economists cannot predict tomorrow's economy; they cannot agree on the state of the economy today; they cannot even arrive at a consensus on why the economy behaved as it did in the past...We may not ever be able to build a positive science of economics based on empirical knowledge but that is no reason to wrap the little we know in a pseudoscientific fog of superstition."--Walter Russell Mead, 1993.
Considering how the U.S. financial markets are crumbling right before our eyes, don't those words from 15 years ago send a chill up your spine? Economists want the general public--and especially their employers in the academic world--to think that they are practicing science, but sadly that is not the case. What does this have to do with basketball? In recent years, many people who trained academically as economists and/or statisticians have tried to use "econometric"-style models to evaluate basketball players and teams. Frankly, I'm not sure whether it would be more frightening if these guys devoted all of their energies to confusing the general public about basketball or if they abandoned that effort and resumed their failed attempts to understand our past, present and future economy. In either case, they will no doubt continue to make bold pseudoscientific declarations with complete confidence that they understand the issues better than anyone else.
Roger Lowenstein's review of Nassim Nicholas Taleb's book The Black Swan does not mention basketball at all, but the way that he debunks the idea of economics as a science provides a good template for explaining much of what is conceptually wrong with "econometric"-based basketball statistical analysis. Lowenstein summarizes Taleb's thesis simply and bluntly: "His heavy artillery is aimed at financial academics and economists, the latter because they try to forecast such stuff as market moves and interest rates. He calls them frauds. It’s as good a word as any." Taleb used to be a derivatives trader on Wall Street, so he has firsthand experience with just how poorly economists understand what they are talking about. Lowenstein says, "The academics who drive him to tears are the ones who have explained—or misexplained—his old profession. They think that markets are from Mediocristan when in fact they inhabit Extremistan."

Lowenstein explains, "Mediocristan is the terrain of the ordinary, the part of the world that conforms to the bell curve. It answers to statistics and knowable probabilities. Height resides in Mediocristan. You may find one 7-footer on your block, almost certainly not two...Personal wealth, however, is from Extremistan. For instance, the average wealth of 1,000 people will be very different if one of those people is Bill Gates. This distinction is potent. In Extremistan, past events are a faulty guide to projecting the future. Gates may be the world’s richest person, but it isn’t unthinkable that someday, someone (at Google, perhaps?) will be twice as rich. Wars also reside in Extremistan. Prior to World War II, the planet had never experienced a conflict as terrible. Then we did. Suppose you frequent a pond. Day after day you see swans—always white. Naturally (but incorrectly) you presume that all swans are white. World War II was a black swan—horrific and unpredictable. Market crashes are black swans. Winning at blackjack is not one. The odds in casino games are known. The finance profession has badly mischaracterized markets in such a way as to overlook the possibility of black swans. Business schools teach that risk is quantifiable—that markets resemble a casino. You will draw the bad queen once in a deck but never twice. That is why securities analysts presume to define the 'riskiness' of stocks in precise, arithmetic terms. They model the future on the past. But stocks, alas, are from Extremistan.

"Taleb makes much of the example of Long-Term Capital Management, a subject about which I wrote a book. The spectacular meltdown of the hedge fund, run by Nobel Prize-certified economists and intellectual heavyweights, was a primo example of faulty precision, of modeling markets according to past events. The fund’s genius managers couldn’t predict the black swan of the Russian debt default; they drew a run of bad queens, and down they went. And L.T.C.M. is merely emblematic; it is the entire profession of finance, its edifice of modern portfolio theory, and virtually every tool that financial consultants regularly rely on, that Taleb identifies as wrongheaded."
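Taleb's distinction can be demonstrated with a few lines of code. Here is a minimal sketch in Python (the specific distributions and dollar figures are made-up illustrations, not anything from the review): one outlier barely budges the average of a bell-curve sample of heights, while one Gates-sized fortune swamps the average of a heavy-tailed sample of wealths.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mediocristan: height follows a bell curve, so one extreme value
# barely moves the average of 1,000 people.
heights = rng.normal(loc=70.0, scale=3.0, size=1000)  # heights in inches
with_seven_footer = np.append(heights, 84.0)          # add one 7-footer
print(f"mean height: {heights.mean():.2f} -> {with_seven_footer.mean():.2f}")

# Extremistan: wealth is heavy-tailed, so one extreme value dominates
# the average of the same 1,000 people.
wealth = rng.pareto(a=1.5, size=1000) * 50_000        # made-up wealths in dollars
with_gates = np.append(wealth, 50e9)                  # add one Gates-sized fortune
print(f"mean wealth: {wealth.mean():,.0f} -> {with_gates.mean():,.0f}")
```

The first average moves by a rounding error; the second is multiplied many times over, which is exactly why past samples from Extremistan tell you so little about the future.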
It is precisely that kind of wrongheaded thinking--and arrogance--that leads to much of the nonsense that is spewed by people who think (or at least claim) that they are scientifically analyzing basketball (I'm not sure if these guys know better but enjoy selling books and getting a lot of publicity for themselves or if they really believe what they are saying). Unfortunately, the general public suffers not only from illiteracy (or at least poor reading comprehension) but also from "Innumeracy." If I make a skill set-based comparison of two players based on my informed opinion--i.e., an opinion based not only on watching a lot of basketball but also on interacting with professionals who make their living evaluating basketball players--every Tom, Dick and Harry thinks that his opinion is just as valid and informed. But if some guy invents a formula, gives it a catchy name and says that player x is worth 30.2 while player y is worth 28.7, then Tom, Dick and Harry are ready to bow down to those numbers as if they are the Golden Calf. Guess what--all those numbers reflect are the knowledge (or lack thereof) and bias of the person who created that formula. The numbers may be 90% correct or 90% incorrect, but most people don't understand math or statistics, so they don't feel comfortable challenging the numbers; or else they only lash out at the numbers that speak poorly of "their guy" while loving the numbers that elevate "their guy" and/or downgrade "their guy's" rival.

One tell-tale sign that these numbers are not the products of science is that you do not hear their creators speak of margin of error. If a scientific formula spits out the number 30.4 as a player value, the reality is that there is a certain probability that the actual value is somewhat higher or lower than that--but the "stats gurus" rarely if ever mention this, and they certainly don't emphasize it as much as it should be emphasized; they like to promote the idea that their numbers are "exact" while observations by seasoned professionals (scouts and other talent evaluators) are subjective. It is true that observations are subjective, but so are the stat formulas; that is why intelligent people understand that you have to combine observation with statistics and that you have to watch games in order to figure out what the numbers really mean. For instance, who is charged with a turnover is not nearly as significant as what really caused the turnover. If Pete Maravich throws a great pass to a stiff who fumbles the ball out of bounds, Maravich may get a turnover in the boxscore, but anyone who understands the game realizes that the problem is that his teammate can't catch the ball; on the other hand, if Maravich carelessly throws the ball away or makes a pass that no one could reasonably be expected to catch, then that reflects badly on him regardless of how the scorekeeper officially documented the play.
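To make the margin-of-error point concrete, here is a minimal sketch (the composite rating formula and the 82-game stat lines are purely hypothetical, invented for illustration; no "stats guru" publishes this exact formula): bootstrapping a player's game log shows that a single number like 30.4 really comes with an uncertainty band that honest analysis would report alongside it.

```python
import numpy as np

rng = np.random.default_rng(1)

def rating(games: np.ndarray) -> float:
    """Hypothetical composite rating: a weighted sum of per-game box-score stats."""
    pts, ast, tov = games[:, 0], games[:, 1], games[:, 2]
    return float((pts + 1.5 * ast - 2.0 * tov).mean())

# Made-up 82-game season: columns are points, assists and turnovers per game.
season = np.column_stack([
    rng.normal(25.0, 6.0, 82),   # points
    rng.normal(8.0, 3.0, 82),    # assists
    rng.normal(3.0, 1.5, 82),    # turnovers
])

# Bootstrap: resample the 82 games with replacement, recompute the rating
# each time, and report a 95% interval instead of a falsely "exact" number.
boot = np.array([
    rating(season[rng.integers(0, len(season), size=len(season))])
    for _ in range(10_000)
])
low, high = np.percentile(boot, [2.5, 97.5])
print(f"rating: {rating(season):.1f}  95% interval: [{low:.1f}, {high:.1f}]")
```

If two players' intervals overlap--and with season-length samples they often will--then declaring that 30.2 beats 28.7 is not science; it is false precision.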
"Stats gurus" plainly do not want to discuss or consider the fact that some of their most precious numbers--the raw data that they plug into their formulas, stats like assists, steals, blocked shots and turnovers--are subjectively recorded. During last season's playoffs, I did a detailed post demonstrating that Chris Paul's supposedly record setting playoff assist totals were in fact inflated by generous scorekeeping. Shouldn't that be of interest to the "stats gurus"? Isn't that claim something that they seriously need to investigate on their own to either confirm or reject? I provided very specific information so that anyone could watch a tape of the game and find the exact plays that I described and thus judge for themselves whether or not each of those assists should have been awarded. Yet I see no indication that the "stats gurus" are the slightest bit concerned about the fact that a lot of their basic data is seriously flawed. A lot of these guys spent a good portion of the season pumping up Chris Paul as the MVP and it is highly likely that they did so on the basis of bogus assist numbers. Based on a skill-set evaluation of Paul's game, I consider him to be the best point guard in the NBA and a top five MVP candidate but that is not the point; the point is that if you are basing your whole analysis of the NBA purely on numbers and some of the basic numbers you are using are not right then your whole analysis is bogus. If a real scientist finds out that the raw data he has gathered is flawed then he understands that he has to gather new, accurate data. Unfortunately, many of the basketball "stats gurus" are not scientists; they are "mad scientists" at best.
No one should misinterpret what I am saying to mean that I am some kind of Luddite who is against using basketball statistics; what I am against is the misuse of basketball statistics, just like I am against the misuse of media platforms by people who spout hype and biased commentary as opposed to communicating information in a fact-based, objective manner. Dan Rosenbaum and Dean Oliver are two welcome exceptions to the above critique of basketball statistical analysis; everything that I have seen of their work indicates that they understand the limitations of what their statistical analysis can show and that they are working hard to improve what their models can do, as opposed to acting like they have everything figured out already. In an insightful post titled "Using statistics in basketball: the bar is higher," Rosenbaum writes, "Statistical analysis can play a critical role in basketball decision-making, but it can also be misleading if the complexities of the game of basketball (and the statistical issues generated by those complexities) are not well understood. In other words, the bar is higher for statistical analysis in basketball than it is in baseball. Ultimately this will greatly benefit the teams that incorporate skilled statistical analysts in the right way, because the greater complexities in basketball will mean that it will be harder for other teams to ever catch up with the first teams that get this right. It will be fascinating seeing how this all plays out over the next few years."
posted by David Friedman @ 4:37 AM