You only matter as much as your denominator. That's my Nate Silver-ian message of today.
I've stressed over how to develop a pitcher side of Equivalent Fantasy Average ever since I first drew EFA up back in January. On the hitting side, it's easy(ish). Guys are all doing the same basic things, in the same basic roles. Scaling them against one another let me see what value speed guys have when compared directly to power guys and the like. It boils all fantasy value down to a single number, scaled to batting average. In that way, it's not at all dissimilar to WAR.
Yeah, it's problematic that my approach weights a batting average from a guy with 320 plate appearances the same as one from a guy with 650. (And after today's piece, I might be changing a little of my approach to the offense EFA in the future.) But by and large, it was quick and relatively easy.
Pitching has proven to be much more difficult. The best starters throw 200 innings. Some of the best closers only throw 50. That's such a huge difference that direct, value-to-value comparisons just lie to you. I could do separate EFAs for starters and relievers, but that feels like doing separate EFAs for speedsters and sluggers, and that kind of defeats the whole purpose, yeah?
On top of that, offensive fantasy stats are all "more is better." On the pitching side, though, both WHIP and ERA work in the opposite direction, which makes ranking based on standard deviations all the more complicated.
For all those reasons, I've just been kicking it with hitter EFA so far. But no more, as I had a new notion over the weekend for how to make the pitching side of the metric work.
Okay, the next thing I'm going to do after the current thing is explain my methodology. But the thing I'm going to do now is talk about me a bit. I was always really good at math in school, inasmuch as you can show me just about any four-function problem (you know, +-*/) and I can do it in my head to a certain extent. It helped me all the way through, even if it meant I was something of a Rain Man about it, knowing the answers somewhat innately. It meant that when I got to college and chose to major in journalism, I got lots of "Wait, why not math?" questions.
So I gave in, and college mathified myself. It went well for a bit, until I got to Calc 3. At that point, math went from "Hey, what's the answer to this problem?" to "Hey, how does this problem work?" Fairly simultaneously, I went from "Hey, math's easy" to "Hey, math's hard." I went back to journalism after that.
What that means is that I'm good at math and its general concepts. Deeper math - and, relevant here, deeper statistics - can get a little hazier. In short, I'm pretty decidedly not Nate Silver.
Anyway, for those reasons and a few others, I do a lot of this EFA stuff with less than full confidence. Like, I know this stuff makes sense on the surface, but I wonder if it works all the way down. So yes, the next thing I'm going to do is explain my methodology. When I do, feel free to critique. The math you can do on your calculator watch will be flawless, but the math that goes deeper than that - basically, the logic - might have some issues. I'm comfortable with that. Think of this as a Pitcher EFA rough draft, and feel free to chime in accordingly.
Okay, so remember my thesis statement. You only matter as much as your denominator. Closers pitch barely more than a handful of innings in a season. If a guy implodes, like Joe Nathan this year, or is lights out, like Huston Street, it only has so much of an impact. Yeah, given your druthers, you'd rather have a closer with 50 saves, a 1.00 ERA, and a 0.50 WHIP. But 50 saves with a 4.00 ERA and a 1.50 WHIP helps way more than it hurts, because the innings - the denominators - just aren't there.
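To put rough numbers on that denominator idea, here is a minimal sketch. The 1,200-inning, 3.50 ERA staff is an assumption I made up for illustration, not a figure from any real league; the 50-inning closer and 200-inning starter workloads come from the text above.

```python
def runs_from_era(era, innings):
    """Earned runs implied by an ERA over a number of innings."""
    return era * innings / 9


def era_from_runs(earned_runs, innings):
    """ERA = 9 * earned runs / innings pitched."""
    return 9 * earned_runs / innings


# Hypothetical rest-of-staff numbers (assumed, not from the article).
BASE_INNINGS = 1200
BASE_ERA = 3.50


def team_era_with(pitcher_era, pitcher_innings):
    """Team ERA after adding one pitcher's line to the hypothetical staff."""
    total_runs = (runs_from_era(BASE_ERA, BASE_INNINGS)
                  + runs_from_era(pitcher_era, pitcher_innings))
    return era_from_runs(total_runs, BASE_INNINGS + pitcher_innings)


# Same ERA swing (1.00 vs. 4.00), very different denominators.
closer_gap = team_era_with(4.00, 50) - team_era_with(1.00, 50)
starter_gap = team_era_with(4.00, 200) - team_era_with(1.00, 200)

print(round(closer_gap, 2), round(starter_gap, 2))
```

Under those assumptions, the gap between a lights-out closer and an imploding one is about 0.12 runs of team ERA, while the same swing across 200 innings moves it by about 0.43.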
Wins, strikeouts, and saves I calculated exactly the same way I do on the offensive side - find the mean and standard deviation of each stat, and figure out how many SDs above or below that mean each player's contribution is. (If you need a refresher, here's the original EFA piece.) From there, it's as simple as figuring out what that number of SDs means in my baseline stat. (For pitchers, EFA is given in terms of ERA, for the entire reason I created the metric - you want a presentation that resembles what we're used to looking at.)
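For anyone who wants to see the counting-stat step spelled out, here is a rough sketch. The strikeout totals and the league ERA mean and SD are invented, the choice of population standard deviation is an assumption, and the translation rule (subtract the SD count times the ERA spread from the ERA mean) is my reading of "what that means in the baseline stat," not a confirmed formula.

```python
from statistics import mean, pstdev


def z_scores(values):
    """How many SDs above or below the mean each value sits."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption
    return [(v - mu) / sigma for v in values]


def to_era_scale(z, era_mean, era_sd):
    """Translate an SD count in a more-is-better stat into an ERA-like number.

    More SDs above the mean should read as a lower (better) ERA,
    so the z-score is subtracted rather than added.
    """
    return era_mean - z * era_sd


# Hypothetical strikeout totals for four pitchers.
strikeouts = [120, 90, 60, 30]
k_z = z_scores(strikeouts)

# Hypothetical league ERA mean and SD.
ERA_MEAN, ERA_SD = 3.80, 0.90
k_era_equivalents = [to_era_scale(z, ERA_MEAN, ERA_SD) for z in k_z]

for total, equivalent in zip(strikeouts, k_era_equivalents):
    print(total, round(equivalent, 2))
```

The same two functions would apply unchanged to wins and saves; only the input list differs.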
ERA and WHIP, for those denominator problems, won't work the same way. As I said Wednesday, it's one thing to compare two .300 batting averages across a plate-appearance difference of a couple hundred; both of them have enough impact to matter. But an ERA in X innings is so much less influential than an ERA in (five times X) innings.
Then I realized - the denominator doesn't actually matter. If Dave gives up 50 runs in 80 innings, and Mike gives up 50 in 400, you know what that means for your fantasy team? Fifty. Because over the course of a fantasy season, basically every fantasy team is going to fit into a fairly defined range of innings pitched. Starter-heavy teams will hit an inning cap. Other teams will stream starters. One way or another, you get your innings.
So for ERA and WHIP, instead of using those stats, I simply used earned runs and baserunners allowed. And, so as to solve for the lower-is-better reality of these categories, I went with "how many are you below the maximum?" Ricky Nolasco has allowed the most earned runs in baseball this year, with 62 (through Tuesday). Justin Verlander has allowed the most baserunners, at 165. So I simply subtracted every player's total from those numbers, to make it a more-is-better scenario.
The reality is, the upper bound I established was meaningless. I could have figured out how far they were below a thousand runs, a million baserunners. Because it's all about distance from the mean, and changing the bound shifts every pitcher and the mean by the same amount, the actual value of the bound is immaterial. I just liked doing it the way I did it.
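A quick check of the claim that the upper bound doesn't matter, as a sketch. Only the 62 (Nolasco's league-high earned runs) comes from the text; the other three totals are made up, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def z_scores(values):
    """How many SDs above or below the mean each value sits."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption
    return [(v - mu) / sigma for v in values]


def below_cap(values, cap):
    """Flip a lower-is-better stat into more-is-better: distance below a cap."""
    return [cap - v for v in values]


# 62 is the league high cited in the text; the rest are hypothetical.
earned_runs = [62, 45, 30, 8]

z_from_max = z_scores(below_cap(earned_runs, max(earned_runs)))
z_from_thousand = z_scores(below_cap(earned_runs, 1000))
z_raw = z_scores(earned_runs)

# Any cap gives the same SD counts, which are just the raw ones with the sign flipped.
same = all(abs(a - b) < 1e-9 for a, b in zip(z_from_max, z_from_thousand))
flipped = all(abs(a + b) < 1e-9 for a, b in zip(z_from_max, z_raw))
print(same, flipped)
```

Subtracting from any constant moves the mean by that constant and leaves the spread alone, so the SD counts come out identical.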
So that way, I had five values. Theoretically, that should have been all I needed. Except then, a pitcher who has allowed zero runs in 60 innings is valued exactly as highly as one who has allowed zero runs in five innings. That artificially inflates relievers an insane amount. So, contrary to hitters, for pitcher EFA I added a sixth category, and it's that denominator. I calculated pitchers' number of SDs above or below the mean in innings pitched as well.
From that point, it was simply a matter of translating those values into an ERA-applicable number - because I'm using ERA, and lower is better again, players' individual stat corollaries occasionally fall to the negative - and averaging the now-six values.
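Putting the six categories together might look something like the sketch below. The category names, the closer's SD counts, and the league ERA mean and SD are all hypothetical, and the translation rule (ERA mean minus SD count times ERA spread) is an assumed reading of the method rather than the exact spreadsheet formula.

```python
from statistics import mean

# Hypothetical labels for the six categories described above.
CATEGORIES = (
    "wins",
    "strikeouts",
    "saves",
    "runs_below_max",
    "baserunners_below_max",
    "innings",
)


def era_corollaries(z_by_category, era_mean, era_sd):
    """Translate each category's SD count into an ERA-like number."""
    return {c: era_mean - z_by_category[c] * era_sd for c in CATEGORIES}


def pitcher_efa(z_by_category, era_mean, era_sd):
    """Average the six ERA-like numbers into one Pitcher EFA."""
    return mean(era_corollaries(z_by_category, era_mean, era_sd).values())


# Made-up closer: few wins, strikeouts and innings, a pile of saves.
closer_z = {
    "wins": -0.8,
    "strikeouts": -0.5,
    "saves": 4.5,
    "runs_below_max": 0.9,
    "baserunners_below_max": 0.9,
    "innings": -0.7,
}

ERA_MEAN, ERA_SD = 3.80, 0.90  # hypothetical league values
per_category = era_corollaries(closer_z, ERA_MEAN, ERA_SD)
efa = pitcher_efa(closer_z, ERA_MEAN, ERA_SD)

print(round(per_category["saves"], 2), round(efa, 2))
```

In this made-up case the saves corollary dips below zero, which is the "fall to the negative" effect, while the innings category pulls the closer's average back toward the pack.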
This took a while, and more than one start-over. Here, look at my computer screen:
There were like 30 pages of that.
Anyway, I think it works. The values I got made sense. Saves make this whole process interesting, as they are like steals on crack - the vast majority of players have zero, but then there are some guys with 20-some. That means that the mean on saves is fairly low, the standard deviation is a bit higher, and guys who get saves get a really good ERA corollary as a result.
Of course, that makes sense, as there are so few guys getting saves, so each one matters that much more. Also, closers almost by definition lag so far behind in counting stats like strikeouts and wins that more credit for their saves balances that out.
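To see why saves behave that way, here is a toy pool. The split of 170 pitchers with zero saves and 30 closers with 20 apiece is invented for illustration, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical pool: most pitchers have no saves, a few have 20.
saves = [0] * 170 + [20] * 30

saves_mean = mean(saves)
saves_sd = pstdev(saves)  # population SD; an assumption

closer_z = (20 - saves_mean) / saves_sd
non_closer_z = (0 - saves_mean) / saves_sd

print(saves_mean, round(saves_sd, 2), round(closer_z, 2), round(non_closer_z, 2))
```

With a low mean and an SD larger than the mean, each closer lands more than two SDs above the pack, while the zero-save crowd sits less than half an SD below it.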
Anyway, that's Pitcher EFA. As I said, feel free to chime in in the comments if you have ideas on how to improve the metric. And for the next monthly Hitter EFA, I might incorporate plate appearances as a sixth category, just to see how it works.
I set the baseline for innings in EFA at 20. In retrospect, including the innings category means I probably didn't have to do that, but guys with fewer than 20 innings aren't that relevant to begin with. Anyway, I calculated the EFA for every pitcher with 20 or more innings, but the only ones I list here are ones owned in five percent or more of Yahoo! leagues.
I did it in part because dude, there are so many freaking pitchers. But also, a subpar middle infielder, like Nick Punto or something, can still do something on a given day to be relevant. But the third or fourth tier of middle relievers won't get saves, aren't likely to get wins, and don't do enough in the other categories to make up for the chart-clogging they'd do if I listed every pitcher.
As luck would have it, exactly 200 pitchers have 20-plus innings and a 5-plus ownership percentage. I didn't expect that, but it does make for handy tabulating.
Below is the chart. I'm listing it in small groups so I can offer thoughts after each section. Here we go:
When I started this, I really had no idea how closers would be represented. But now that it's done, this makes some sense to me. Yes, many relievers can save games, making the average closer replaceable in real baseball. In reality, though, the only ones who can get saves for you are the ones who do get saves. If you have Clayton Kershaw and he gets hurt, you might not find Jake Arrieta to pick up the slack, but you could. If Greg Holland gets hurt, though, you just have to put your eggs in Wade Davis' basket and hope.
The best closers, then, should rank highly in EFA, as what they do simply isn't replaceable. That's why 13 of the top 25 are closers, and -- remembering that whole "denominator" thing -- guys like Romo, who have struggled, are still fairly worthwhile in fantasy as long as they're getting that category that no one else is.
Basically everything else here makes sense to me. Kershaw has been great, but his lack of innings hurts him. Tanaka, Hernandez, Cueto, Wainwright ... that's a list that makes sense at the top.
Nathan is ahead of Doolittle right now because he's had the closer job all year and has gotten saves throughout that time; Doolittle is catching up fast. ... This section also saw the first middle reliever sighting, as Betances' dominance overcame his lack of innings and his lack of saves. ... Davis, too. ... Buehrle's win total masks his low strikeout total.
Hawkins, who has had the closer job all year, comes in so low among closers because -- as I've belabored all season -- he doesn't strike anyone out. Dude has 13 strikeouts in 29 innings. ... Even having missed two months, Fernandez still does well. He's so good. ... Sanchez has been great overall, but his DL stint and his relatively low strikeout total are keeping him in check.
| Rank | Player | Team | EFA |
|---|---|---|---|
| 129 | Jorge De La Rosa | COL | 3.65 |
| | Rubby De La Rosa | BOS | 3.76 |
Exactly where the line falls is a matter of personal preference, but somewhere between 100 and 200 is where the starting pitchers cross into decidedly "play the matchups" territory. In some cases, it makes more sense to have one of those middle relievers than one of those hurt-your-rate starters. ... Man, there are a lot of big names down near the bottom. Verlander, Cain, Sabathia, Buchholz. Massacre down there.