North Side Baseball

Posted

http://www.espn.com/mlb/story/_/id/18114272/miller-going-war-mystery-robbie-ray

 

The bolded is the answer you want, but this entire section (and article) is worth a moment to read.

 

The first major decision a WAR model needs to make is whether it's going to measure the pitcher's skill or the pitcher's results. Baseball Reference's WAR, or bWAR, focuses on the results. It starts with an estimate of how many runs an average pitcher would have allowed under similar circumstances to Ray's -- the typical offensive output of his opponents, adjusted for the ballparks he pitched in, the role he was used in and the defense that backs him -- and compares that figure to the number of runs Ray allowed. In Ray's case, bWAR sees a National League starter pitching in a hitter's park in front of a below-average defense; it expects a pitcher in his spot to allow 5.01 runs (earned and unearned) per nine innings. Ray allowed 5.42 runs per nine, which means that over the course of 174 innings, he allowed eight more runs than an average pitcher. He was, in other words, below average. Simple.
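The "eight more runs" figure can be checked directly. This is a minimal sketch using only the numbers quoted above (5.01 expected, 5.42 actual, 174 innings). The function name is made up for illustration, and it is not how Baseball Reference computes bWAR, which takes further steps to turn runs into wins.

```python
def runs_vs_average(actual_ra9, expected_ra9, innings):
    """Runs allowed beyond what an average pitcher would allow over the same innings."""
    return (actual_ra9 - expected_ra9) * innings / 9


# Figures quoted in the article for Ray's season.
extra_runs = runs_vs_average(actual_ra9=5.42, expected_ra9=5.01, innings=174)
print(round(extra_runs, 1))  # 7.9, i.e. roughly eight runs worse than average
```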

 

But is it? Take his start against the Cincinnati Reds on July 23. Ray struck out 10 batters in five innings and walked only one, a sensational ratio. Rarely does a "bad" pitcher strike out two batters per inning with good control, but Baseball Reference's WAR would have marked him as "bad" that game because he allowed six runs. Three were unearned, thanks to a first-inning error by the third baseman. Three were earned, and that came in an inning that went like this:

 

Single
Single
Home run
Groundout
Lineout
Strikeout

 

That's a bad inning, to be sure. But rearrange the sequence so it goes like this:

 

Groundout
Lineout
Home run
Single
Single
Strikeout

 

That way, Ray would have allowed only one run. Same basic pitcher -- six batters, two singles, a home run and a strikeout -- but only one's WAR is sunk.
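To see the sequencing effect concretely, here is a rough sketch that replays both orderings. The baserunning rules are my own simplifying assumptions, not anything from the article: a single moves every runner up exactly one base, a home run scores everyone, and outs never advance runners.

```python
def runs_scored(plays):
    """Count runs in one half-inning under a deliberately crude baserunning model.

    Assumed rules (illustrative only): singles advance each runner one base,
    home runs clear the bases, and anything else is an out with no advancement.
    """
    bases = [False, False, False]  # first, second, third
    outs = 0
    runs = 0
    for play in plays:
        if play == "single":
            if bases[2]:
                runs += 1  # runner on third scores
            bases = [True, bases[0], bases[1]]
        elif play == "home run":
            runs += 1 + sum(bases)  # batter plus everyone on base
            bases = [False, False, False]
        else:  # groundout, lineout, strikeout
            outs += 1
            if outs == 3:
                break
    return runs


actual = ["single", "single", "home run", "groundout", "lineout", "strikeout"]
rearranged = ["groundout", "lineout", "home run", "single", "single", "strikeout"]
print(runs_scored(actual), runs_scored(rearranged))  # 3 1
```

Same six events, three runs in one order and one run in the other, which is the whole gap a results-based stat ends up charging to the pitcher.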

 

FanGraphs' WAR (or fWAR) doesn't care how many runs Ray allows. It cares how well he threw on a per-batter basis, and it focuses its attention on the three "fielding independent pitching" outcomes -- strikeouts, walks and home runs -- for which a pitcher has the most individual responsibility. A FIP-based WAR assumes a league-average batting average on balls in play (BABIP) for every pitcher because BABIPs tend to regress toward that league average and because exceptions are often more about luck or defense than the pitcher's skill. (It also adjusts for ballpark and other contextual factors.)

 

In the case of July 23 against the Reds, Ray's fWAR would be the same, no matter the order the third inning played out. It would be almost exactly the same, regardless of whether the third baseman botched the play in the first.
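For contrast, the standard FIP formula only looks at counts, so the order of events cannot matter. A minimal sketch: the formula itself is the commonly published one, but the constant (3.10 here) varies by season and is an assumption, and real fWAR layers park and league adjustments on top that are not shown. Running it on a single inning gives a silly-looking number; the point is only that both orderings match.

```python
from collections import Counter


def fip(hr, bb, hbp, k, innings, constant=3.10):
    """Fielding independent pitching; `constant` is season-specific (3.10 is assumed)."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


def fip_from_plays(plays, innings, constant=3.10):
    """Tally events from a play list, ignoring order, and feed the counts to fip()."""
    counts = Counter(plays)
    return fip(
        hr=counts["home run"],
        bb=counts["walk"],
        hbp=counts["hit by pitch"],
        k=counts["strikeout"],
        innings=innings,
        constant=constant,
    )


# The third inning from the article, in both orders (one inning pitched).
actual = ["single", "single", "home run", "groundout", "lineout", "strikeout"]
rearranged = ["groundout", "lineout", "home run", "single", "single", "strikeout"]
print(fip_from_plays(actual, 1) == fip_from_plays(rearranged, 1))  # True
```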

 

The question is whether Ray deserves to be treated as "normal" on those things FanGraphs WAR strips out. In his career, he has allowed a BABIP nearly 50 points higher than the league average -- and the highest BABIP in baseball during that time (minimum 250 innings). He has been significantly worse with runners on base, which means he has been more prone to sequences like the one above that lead to runs. Either Ray is still flushing the bad luck out of his system -- a reasonable, if uncertain, assumption -- or he has individual deficiencies that FanGraphs WAR doesn't pick up.

 

Each of these metrics is complicated enough, but they are, essentially, arithmetic. A fan familiar with each model would know intuitively why one likes Ray and the other doesn't and where the discrepancies lie. But the third, Baseball Prospectus' WARP, based on a stat called deserved run average, is the relative black box of the trio, as it uses a method called mixed modeling that accounts for both fixed and random effects. DRA-based WARP accounts for every factor that can be measured, from the catcher's framing, blocking and throwing ability to the size of each umpire's typical strike zone to the baserunning skills of the pitcher's opponents to the weather to each defender's abilities. It puts heavy emphasis on those things a pitcher controls, then attempts to determine the pitcher's share of responsibility for everything else. According to tests run by its creator, Jonathan Judge, it's the most predictive pitcher-evaluation model. It's also beyond the average fan's ability to dive into. Why does WARP like Robbie Ray? For most of us, the answer stalls at "because WARP likes Robbie Ray."

 

ERA: 4.90
FIP: 3.80
DRA: 2.95

 

I can easily accept any one of these three assessments of Robbie Ray, but the human brain has a hard time accepting all three at once.

Posted

Well, they don't spend much time at all talking about the nuts and bolts of the stat, apart from saying that it not only models the likely effect of lots of variables for every pitch (who is hitting, who is fielding, hopefully who the umpire is, what ballpark you are playing in, etc.) but also updates the values of each of those variables as time goes by. IMO, it is interesting and is something that is overdue to be explored, but it sounds like overfitting is a real concern (https://en.wikipedia.org/wiki/Overfitting), and I want to know more details about what variables are being used...and how.

 

I've always wondered if a stat that compensated for game situation would be useful. For instance: should a walk with the bases empty and no outs with a two-run lead in the 8th be recorded as the same result as a walk with a man on second in a tie game and 1 out in the 5th? How much more valuable is a HR with no one on, down 2 in the 9th than a walk in the same situation...or a single? Is a walk actually the best outcome for a hitter, since they didn't have to risk putting a ball in play for an out? How much can we credit the pitcher or hitter with for a particular outcome before we are adding more noise than we started with?

Posted

seems like a worthwhile endeavor with better predictive value than cruder, better understood methods, but with how it factors in framing and umpires, i partially wonder if there's more of an element of "what an era might look like if everyone was forced to throw pitches in the heart of the strike zone", which then seems to rate stuff and not pitching effectiveness or value

 

all told, it's amazing to consider how a stat could make a full 3.3 runs allowed/9(!) correction on Kyle Hendricks vs. Robbie Ray specifically; maybe exit velo gets incorporated in alternate corrective iterations
