"The Formula That Killed Wall Street": A Cautionary Tale for Anyone Who Places Too Much Value on Basketball Statistical AnalysisI have written several articles that are critical of the methods used in basketball statistical analysis, including a post titled Economics is Not a Science, Nor is Basketball Statistical Analysis. Some of the most prominent proponents of basketball statistical analysis are economists by trade and they apply the methodologies and techniques used in that field to try to quantify what happens on the basketball court. Let me make it quite clear that I am entirely in favor of trying to better quantify and measure the effectiveness of basketball players and teams; what I object to is the haughty contentions by some people in this field that they have already succeeded in accurately making such measurements. Basketball statistical analysis provides some interesting tools that can assist anyone who is trying to compare players and teams but there are limitations to what the numbers alone can accurately depict.
Basketball statistical analysis and new video technology have already made for a good marriage in terms of helping teams to more easily produce accurate scouting reports depicting the tendencies, strengths and weaknesses of players and teams. It is now possible to quickly break down game film (or, to be precise, game DVD) and catalog what happened on every pick and roll play, every out of bounds play, every postup and so forth. Most if not all teams are already applying some form of statistical analysis to the tendencies that emerge from game footage and this obviously represents a quantum leap forward in terms of game planning. Cavaliers assistant coach Hank Egan called this "corporate knowledge" when I interviewed him more than three years ago and he said that technological improvements have helped to increase the sophistication of defensive play in the NBA.
The problem arises when people invent formulas in which they add up some numbers, multiply other numbers by certain factors, subtract still other numbers and then produce a final number that supposedly "rates" a player's overall performance. It should be obvious that this "rating" is limited by several factors: the accuracy of the original boxscore data; whether or not the additions, multiplications and subtractions correctly value what a player does; and, perhaps most importantly of all, the fact that not everything that a player does on the court is captured numerically. I have yet to see any of these stat gurus say that Player X is rated 33.4 with a margin of error of +/- 2.5 points. The stat gurus don't even mention a margin of error because they could not begin to calculate one: they are not performing scientific measurements like a biologist or an astrophysicist--they are massaging basketball statistics in a way that they find appealing and that they believe to be correct (or that will produce conclusions that fit in with their own preconceptions and will be easy to market to book publishers or in other forms of media).
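To make the "add, multiply, subtract" recipe concrete, here is a minimal Python sketch of such a rating. Everything in it is an assumption made for illustration: the weights, the function names (game_rating, season_rating, bootstrap_margin) and the boxscore lines are invented and do not come from any published rating system. The bootstrap at the end attaches a rough margin of error to the final number, but it reflects only game-to-game variation in the sample. It says nothing about boxscore errors, about whether the weights are right, or about what the boxscore never records.

```python
import random
import statistics

# Hypothetical weights, invented for illustration only.
WEIGHTS = {"pts": 1.0, "reb": 0.5, "ast": 0.5, "stl": 1.0, "blk": 1.0, "tov": -1.0}


def game_rating(box):
    """Collapse one boxscore line into a single number via a weighted sum."""
    return sum(weight * box[stat] for stat, weight in WEIGHTS.items())


def season_rating(games):
    """Average the per-game ratings over a list of boxscore lines."""
    return statistics.fmean(game_rating(g) for g in games)


def bootstrap_margin(games, n_resamples=2000, seed=0):
    """Half-width of a rough 95% bootstrap interval for the season rating.

    Captures sampling variation across games only, not data errors,
    mis-specified weights, or contributions the boxscore omits.
    """
    rng = random.Random(seed)
    estimates = sorted(
        season_rating(rng.choices(games, k=len(games)))
        for _ in range(n_resamples)
    )
    low = estimates[n_resamples * 25 // 1000]
    high = estimates[n_resamples * 975 // 1000 - 1]
    return (high - low) / 2


# Made-up boxscore lines for a fictional "Player X".
SAMPLE_GAMES = [
    {"pts": 31, "reb": 7, "ast": 5, "stl": 2, "blk": 0, "tov": 4},
    {"pts": 18, "reb": 4, "ast": 9, "stl": 1, "blk": 1, "tov": 2},
    {"pts": 27, "reb": 6, "ast": 3, "stl": 0, "blk": 0, "tov": 5},
    {"pts": 40, "reb": 9, "ast": 6, "stl": 3, "blk": 1, "tov": 3},
    {"pts": 12, "reb": 3, "ast": 4, "stl": 1, "blk": 0, "tov": 1},
    {"pts": 25, "reb": 5, "ast": 7, "stl": 2, "blk": 2, "tov": 6},
    {"pts": 22, "reb": 8, "ast": 2, "stl": 0, "blk": 1, "tov": 2},
    {"pts": 35, "reb": 6, "ast": 8, "stl": 4, "blk": 0, "tov": 4},
]

if __name__ == "__main__":
    rating = season_rating(SAMPLE_GAMES)
    margin = bootstrap_margin(SAMPLE_GAMES)
    print(f"Player X rating: {rating:.1f} +/- {margin:.1f}")
```

Change any weight in WEIGHTS and the "rating" moves with it, yet the margin printed above will not register that choice at all, which is exactly the kind of uncertainty a single number hides.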
It is interesting that, until fairly recently, economists believed that they had created a formula that--as Wired author Felix Salmon writes in Recipe for Disaster: The Formula That Killed Wall Street--"allowed hugely complex risks to be modeled with more ease and accuracy than ever before." Salmon explains that David X. Li's "Gaussian copula function" formula "made it possible for traders to sell vast quantities of new securities, expanding financial markets to unimaginable levels. His method was adopted by everybody from bond investors and Wall Street banks to ratings agencies and regulators. And it became so deeply entrenched—and was making people so much money—that warnings about its limitations were largely ignored." Li's formula provided "a brilliant simplification of an intractable problem," enabling financial analysts to plug in some data and derive "one clean, simple, all-sufficient figure that sums up everything." The problem is that "people used the Gaussian copula model to convince themselves they didn't have any risk at all, when in fact they just didn't have any risk 99 percent of the time. The other 1 percent of the time they blew up. Those explosions may have been rare, but they could destroy all previous gains, and then some." The economic mess that we are all struggling through now is one of those "1 percent explosions."
Nassim Nicholas Taleb, whom I quoted in my Economics is Not a Science, Nor is Basketball Statistical Analysis post, told Salmon, "People got very excited about the Gaussian copula because of its mathematical elegance, but the thing never worked. Co-association between securities is not measurable using correlation. Anything that relies on correlation is charlatanism."
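For readers who want to see how much rides on that one correlation number, here is a minimal Python sketch. It uses a textbook one-factor Gaussian copula, which is a simplification and not necessarily the form of Li's formula or the models that traders actually ran. The function names (simulate_pool_losses, tail_probability) and every parameter value are assumptions chosen for illustration. Each loan defaults when a latent normal variable, built from one shared market factor plus the loan's own noise, falls below a threshold set by the assumed default probability.

```python
import math
import random
from statistics import NormalDist


def simulate_pool_losses(n_loans, p_default, rho, n_trials, seed=0):
    """Fraction of loans defaulting per trial under a one-factor Gaussian copula.

    rho is the assumed correlation between any two loans' latent variables.
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(p_default)  # default if latent < threshold
    shared_weight = math.sqrt(rho)
    own_weight = math.sqrt(1.0 - rho)
    fractions = []
    for _ in range(n_trials):
        market = rng.gauss(0.0, 1.0)  # factor shared by every loan in the pool
        defaults = sum(
            1
            for _ in range(n_loans)
            if shared_weight * market + own_weight * rng.gauss(0.0, 1.0) < threshold
        )
        fractions.append(defaults / n_loans)
    return fractions


def tail_probability(fractions, cutoff):
    """Share of trials in which at least `cutoff` of the pool defaulted."""
    return sum(f >= cutoff for f in fractions) / len(fractions)


if __name__ == "__main__":
    # Illustrative numbers only: 100 loans, each with a 2% default probability.
    for rho in (0.0, 0.1, 0.3):
        losses = simulate_pool_losses(100, 0.02, rho, n_trials=2000)
        share = tail_probability(losses, 0.10)
        print(f"rho={rho:.1f}: share of trials with >=10% defaults = {share:.4f}")
```

The average default rate is about the same for every rho, but the chance of a large cluster of defaults depends heavily on it, so a pool that looks safe under one assumed correlation can look very different under another.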
Salmon concludes, "In the world of finance, too many quants (quantitative analysts) see only the numbers before them and forget about the concrete reality the figures are supposed to represent. They think they can model just a few years' worth of data and come up with probabilities for things that may happen only once every 10,000 years. Then people invest on the basis of those probabilities, without stopping to wonder whether the numbers make any sense at all."
The situation Salmon describes is closely analogous to the current state of basketball statistical analysis: too many of its practitioners "see only the numbers before them and forget about the concrete reality the figures are supposed to represent." They do not have the wisdom or humility to admit that their simple, pretty formulas represent just one limited interpretation of raw data.
A Wired article titled Road Map for Financial Recovery: Radical Transparency Now! contains a parable that very aptly describes the flaws in the methods used by some basketball statistical analysts:
As (Christopher) Cox sees it, that massive computational power has primarily been used by financial engineers, who create abstract models of how the market should operate and make bets based on those models. "You know Borges, the writer?" Cox asks. "He wrote those fantastical short stories. He has one called On Exactitude in Science." The parable tells of a kingdom obsessed with creating a perfect map of itself—an essentially useless quest that leads them to draw a map that is the same size as the territory it is supposed to represent. Cox sees the story as a metaphor for the modern financial industry, which is so obsessed with modeling the market that it has lost sight of the data beneath those models.
Basketball statistical analysts do not yet have all of the necessary data to completely "model" the sport, nor do they fully understand how to use the data that they have. Trying to produce such a model is certainly a worthy task--but I just wish that the people who are working toward this goal would stop declaring "Mission Accomplished!" when the reality is that they are in the beginning or intermediate stages of that mission.
posted by David Friedman @ 12:07 AM