The statistic is great but imperfect.
This week ESPN rolled out a “new” statistic in its NBA toolkit. They call it Real Plus-Minus, or RPM. RPM is, essentially, the latest version of a statistic called xRAPM, which has been publicly available online from its developer, Jeremias Engelmann, for a few years. Now, xRAPM is short for “expected regularized adjusted plus-minus,” which, while very accurate, is just a ridiculous mouthful, so it’s understandable that ESPN would want a name that rolls off the tongue a bit more easily. Unfortunately, using REAL as the operative word in the chosen name implies a certainty that belies the uncertainty that remains in the RPM framework. It is this inability or unwillingness to delve into the uncertainty in the numbers that is the biggest problem with ESPN’s rollout of RPM.
It’s understandable, given the effort ESPN surely had to go through to get these numbers under the ESPN brand, that they’d want to express their confidence in them. Still, these numbers are going to be used in more and more discussions about player value, and as such, it’s important that the underlying assumptions and framework for the metric are properly understood, so that they may be used in proper context. Here’s how ESPN introduced RPM to the general population:
What is real plus-minus?
As the name suggests, real plus-minus shares a family resemblance with the +/- stat in the box score, which merely registers the net change in score (plus or minus) while each player is on the court.
RPM is inspired by the same underlying +/- logic: If a team outscores its opponents when a player is on the court, he’s probably doing something to help his team, whether or not he’s putting up big numbers.
But the familiar +/- stat has a serious flaw: Each player’s rating is heavily influenced by the play of his on-court teammates.
For example, in the basic +/- numbers, Thunder backup point guard Reggie Jackson is ranked 27th in the league. But he’s also spent the majority of his minutes playing alongside Kevin Durant, the league’s likely MVP. What we really want to know is how much of Jackson’s elite rating is attributable to his own play, and basic +/- simply can’t tell us.
But real plus-minus can.
(Emphasis mine.) This is simply not true. The reason it’s not true is that it is quite literally impossible to fully attribute the impact of an individual player to the margin of victory in a basketball season. The best we can do–and what RPM actually does–is use math to come up with a best estimate of the value of each individual player. Again, ESPN has an incentive to go for the spectacular description, so this is hardly surprising, but it is too bad. RPM, and xRAPM before it, are incredibly powerful predictive tools and probably the best estimate that presently exists of a given player’s all-in-one value in his role on his team. RPM is imperfect for perfectly reasonable reasons, but its methodology comes with caveats. The trouble is that ESPN has hidden parts of the methodology and described its assumptions insufficiently and incompletely. (Kevin Pelton did go on Zach Lowe’s Grantland podcast to explain how RPM works in somewhat more specific terms, but he didn’t get into all of the caveats I’ll cover below, and it’s not entirely clear why his explanation wasn’t part of the initial rollout. There’s also a good chance that many who read the initial RPM introduction never listened to a podcast on a totally different, though affiliated, website.)
It’s not that the method is proprietary or must stay hidden, either, as anyone with the curiosity and free time can go Googling or diving into the APBR Metrics message board archives to find out just about everything that goes into RPM. Engelmann has been very open about his process from the beginning. There’s a lot of fancy math involved that goes over my head, but here are a couple of things I have gleaned from reading and paying attention:
RPM uses data from the prior year to reduce noise.
(Ed. note: After the first season it was used, ESPN stopped using prior year data in its calculation of RPM, which reduced its predictive ability but allowed them to more credibly use it for season-end awards discussions without being accused of using data from years before and muddying the waters.)
In any metric based on adjusted plus-minus, there is bound to be some level of noise or uncertainty, due to collinearity and relatively small sample sizes. Collinearity, in this context, just means that there are often players who play together a lot, or who only ever sub in and out for one another, so it is hard to disentangle their value in +/- from one another. The small sample size results from there being a relatively small number of lineups on each team to draw upon in sussing out the value of individual players. As a result, some means of reducing the noise are necessary to get the best possible estimates of player impact.
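For the mathematically curious, here is a minimal sketch, with invented numbers rather than real lineup data, of the ridge-style shrinkage that the “regularized” in xRAPM refers to. When two players’ on-court indicators are nearly collinear, plain least squares splits credit between them wildly; the regularized fit is more conservative:

```python
import numpy as np

# Toy example (invented data, not real stints): two players whose on-court
# indicator columns agree ~95% of the time, i.e. nearly collinear.
rng = np.random.default_rng(0)
n_stints = 200
on_together = rng.integers(0, 2, n_stints)
partner = np.where(rng.random(n_stints) < 0.95, on_together, 1 - on_together)
X = np.column_stack([on_together, partner]).astype(float)

true_impact = np.array([3.0, 0.0])                # per-100 impact we pretend to know
y = X @ true_impact + rng.normal(0, 5, n_stints)  # noisy point margin per stint

def fit(X, y, lam):
    """Ridge solution of ||y - Xb||^2 + lam*||b||^2; lam=0 is plain least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

ols = fit(X, y, lam=0.0)     # credit swings wildly between the collinear pair
ridge = fit(X, y, lam=50.0)  # shrinkage splits credit more conservatively
print(ols, ridge)
```

The regularization term always pulls the coefficients toward zero, trading a little bias for a big reduction in variance, which is exactly the trade-off discussed throughout this piece.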
RPM uses just about every noise-reduction technique available. All of these techniques introduce certain influences into the numbers, and it’s important to note them. The first of these influences is that RPM uses prior-year data to inform the regression upon which RPM rests. This helps overall predictiveness, but it also makes individual player ratings representative of a player’s last two years rather than just the year that is allegedly being measured. This has important implications, given the way ESPN will likely use the data (particularly in their end-of-season awards discussions).
Currently, LeBron leads the league in RPM per 100 possessions and (barely) in Wins Above Replacement (WAR). This is partially a function of LeBron’s stellar year last year, when he hit his likely peak as a player. The guys at Talking Practice have created Individual Player Value (IPV), an all-in-one metric that is pretty similar to RPM in methodology and results, but they report numbers that are not informed by prior years. Under IPV, LeBron is third, behind Kevin Durant and Stephen Curry. It’s clear that giving LeBron credit for last year gives him an upper hand under RPM, but no mention has been made of the use of prior-year data in any article using RPM as a tool thus far. This should be mentioned whenever RPM is used to make individual player comparisons.
RPM, like xRAPM before it, uses a box-score based prior which actually makes up a large portion of the metric.
What this means is that when running the regression (i.e., a huge math equation with many variables to be solved for; in this case the variables are the RPM values of the individual players), the equation is given information about where each player should likely shake out in terms of value, given what’s known about his box-score stats. The box-score prior RPM uses is itself based on a regression of box-score stats against season lineup data to best predict results. Basically, this means that players who put up great box-score numbers will benefit under RPM, even if those numbers are somewhat hollow.
As an example, say you have a player like Carlos Boozer, who is a defensive rebounding machine but is otherwise an awful defensive player. Under RPM, Carlos’s defensive value will be somewhat artificially propped up, because defensive rebounding, on average, serves as one proxy for defensive value. In terms of overall predictive accuracy and confidence in the numbers produced, the box-score prior helps a lot, but on the margins, there will be problems like our man Carlos. You can paint a similar picture of a player who gambles for steals a lot, or someone who blocks a lot of shots but has poor defensive discipline and blows rotations routinely. The same is true of a player who puts up numbers on offense while neglecting all of the unmeasured things on that end: the box-score prior will inflate his value relative to his Platonic “True Value.” The portion of RPM determined by regression against point differential while a player is on and off the floor mitigates some of these ill effects, but it’s important to know these influences exist.
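Here is a rough sketch of how a prior like this works mechanically; all the numbers and the penalty weight are invented for illustration, not RPM’s actual values. Instead of shrinking every player’s coefficient toward zero, the penalty shrinks it toward a box-score-derived estimate, so a player whose box score flatters him gets pulled upward:

```python
import numpy as np

# Invented toy data: on-court indicators for 3 players and a noisy margin.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(300, 3)).astype(float)
true_impact = np.array([2.0, -1.0, 0.0])   # per-100 impact we pretend to know
y = X @ true_impact + rng.normal(0, 8, 300)

# Pretend the box score flatters the second player (index 1) -- lots of
# rebounds and blocks -- despite his true impact of -1.0 above.
box_prior = np.array([1.5, 2.0, 0.0])

def ridge_toward_prior(X, y, prior, lam):
    """Minimizes ||y - Xb||^2 + lam * ||b - prior||^2, i.e. shrinks toward `prior`."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * prior)

with_prior = ridge_toward_prior(X, y, box_prior, lam=100.0)
no_prior = ridge_toward_prior(X, y, np.zeros(3), lam=100.0)
# The flattering prior drags the second player's estimate upward relative
# to plain shrinkage toward zero -- the Boozer effect in miniature.
print(with_prior, no_prior)
```

The on/off regression still pulls the estimate back toward what the lineup data says, which is the mitigation described above; the prior just moves the starting point.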
RPM contains a height-based prior which boosts the defensive ratings of all taller players.
This is a final situation where something is included in RPM that improves predictive accuracy but introduces a certain amount of bias into the numbers at the individual level. The reason for including the height-based prior is simple: on average, big men tend to be much more impactful on the defensive side of the ball than smaller players. Although this is undoubtedly true on average, not all big men are good defenders, so giving the boost to every tall player will necessarily inflate some undeserving players for the sake of greater overall average accuracy.
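As a toy illustration of what a height-based prior can look like (the slope and league-average height here are invented, not RPM’s actual parameters):

```python
# Hypothetical sketch of a height-based defensive prior: every inch over a
# league-average height nudges the player's defensive prior upward, regardless
# of how he actually defends. Slope and average are made up for illustration.
def height_defensive_prior(height_in, league_avg_in=79.0, slope=0.15):
    """Return a defensive prior (points per 100) from height alone."""
    return slope * (height_in - league_avg_in)

print(height_defensive_prior(83))  # a 6'11" big man starts with a positive prior
print(height_defensive_prior(73))  # a 6'1" guard starts with a negative one
```

Every player of the same height gets the same nudge, which is precisely why a tall, inattentive defender comes out looking better than he should.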
Given the box-score and height-based priors, it’s again easy to think about Carlos Boozer, who takes frequent naps on defense but is tall and snags many defensive rebounds. RPM is going to make Boozer look better on defense than he deserves. Unfortunately, these influences can have cascading effects. Because the RPM regression necessarily ties all players together in one big equation, if Boozer is getting more credit than he deserves, someone he plays with frequently is probably getting jobbed. (Yes, I’m mad about Joakim Noah getting screwed in RPM by Carlos Boozer, you guys.)
These influences or biases, however you choose to term them, are not the end of the world, and the logic behind them is sound in terms of modeling the league, but RPM is not infallible, and ESPN would do well to more fully explain the assumptions underlying the model and their potential consequences. RPM is the best all-in-one estimate of player value (within a given role) in the public domain, and good on ESPN for bringing it to a wider audience. However, it’s not a perfect measure, and no single-number metric likely ever will be in a game as dynamic as basketball. It is a valuable tool to use alongside close attention to the games and the more standard box-score metrics.
* Multiple years are used in order to reduce the noise caused by the relatively small sample of one season’s worth of data.