Colin Wyers: 2009 Marcels projections

Transmogrified Tiger · October 14, 2008

http://www.editgrid.com/user/cwyers/2009_Marcels_Projections

Name           PA    AVG/OBP/SLG/OPS
DeRosa        496   .277/.357/.433/.790
Fontenot      342   .281/.363/.439/.802
Fukudome      495   .258/.360/.389/.749
Hoffpauir     240   .282/.354/.444/.798
Johnson       387   .277/.346/.406/.752
Lee           549   .299/.386/.501/.887
Pie           246   .251/.321/.386/.707
Ramirez       522   .291/.370/.524/.894
Soriano       451   .279/.346/.528/.874
Soto          481   .289/.372/.494/.866
Theriot       530   .284/.357/.368/.725

Cedeno       ~250  ~.250/~.310/~.385/~.695

Cedeno's are estimates because I don't know which "Cedenro" is which. That's the average of the two, and they're pretty close to each other either way.

I'm too lazy to do the pitchers right now, but they're there too. In short:

Z: 4.07 ERA

Lilly: 4.49

Marquis: 4.70

Dempster: 4.11

Harden: 3.70

Marmol: 4.01

Wuertz: 4.33

Samardzija: 4.21

And I can't find Wood.

third eye · October 14, 2008

I'm too lazy to do the pitchers right now, but they're there too. In short:
Z: 4.07 ERA
Lilly: 4.49
Marquis: 4.70
Dempster: 4.11
Harden: 3.70
Marmol: 4.01
Wuertz: 4.33
Samardzija: 4.21

And I can't find Wood.

ERA may not be the most useful or projectable stat, but those are some pretty hurtful numbers across the board.

daske17 · October 14, 2008

And I can't find Wood.

That's what she said?

Tracer Bullet · October 14, 2008

Doesn't Colin post here sometimes? I'd like to hear his thoughts behind some of those projections. The Lilly ERA really surprises me, given his performance the last 2 years. And I find it odd that Harden's ERA would jump 2 full runs from this year's performance with the Cubs.

Derwood · October 14, 2008

pitching numbers look pessimistic and the hitting numbers look extremely close to 2008's actuals

Little Slide Rooter · October 14, 2008

I find Marmol and Harden very surprising, almost depressing.

Transmogrified Tiger · October 14, 2008

If you look at the link, you'll see that the ERA's are regressed to the mean in a big way. Harden is one of only 18 pitchers out of 1000 projected to have an ERA under 4.

And now that I moved the pitchers into an actual xls file, I found Kerry. 4.23 ERA.

Soul · October 14, 2008

I refuse to believe all our pitchers are going to balloon up to those ERA numbers.

It'll have to actually happen before I will accept it.

nilodnayr · October 14, 2008

Here are the starters 2009 marcels with the 2008 and 2007 FIPs in parens, respectively.

Z: 4.07 ERA (4.23, 4.58)

Lilly: 4.49 (4.41, 4.16)

Marquis: 4.70 (4.61, 4.99)

Dempster: 4.11 (3.41, 4.54)

Harden: 3.70 (2.95, 3.94)

Not so pessimistic, are they?
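
For anyone not familiar with FIP (the numbers in parentheses above): it estimates an ERA-scale figure from home runs, walks, hit batters and strikeouts only. Here is a minimal Python sketch of the commonly cited formula. The constant is an assumption (it is usually quoted as roughly 3.10 to 3.20 and is set so league FIP matches league ERA), and the sample pitching line is invented, not any Cubs pitcher's actual stats.

```python
def fip(hr, bb, hbp, k, ip, constant=3.20):
    """Fielding Independent Pitching on the ERA scale.

    The default constant is an assumed value; in practice it is
    recalculated per league and season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Invented sample line: 20 HR, 60 BB, 5 HBP, 180 K in 200 IP
sample_fip = fip(hr=20, bb=60, hbp=5, k=180, ip=200)
print(round(sample_fip, 3))
```

Because balls in play never enter the formula, a pitcher whose ERA beat his FIP by a wide margin is the kind of pitcher a regression-based projection will tend to pull back.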

nilodnayr · October 14, 2008

the hitting numbers look extremely close to 2008's actuals

um, duh, they are marcels

Derwood · October 14, 2008

the hitting numbers look extremely close to 2008's actuals

um, duh, they are marcels

I don't know enough about marcels to know that this is a "duh" observation

Colin Wyers · October 14, 2008

They're Baseball Reference ID codes, if you want to know what to look up. cedenro01 is Roger Cedeno, and 02 is Ronny, IIRC.

As for the pitcher numbers - they are what they are. "ERAs" are BaseRuns per Nine adjusted to match up on the ERA scale. If I were making my own projection system, there are several things I'd do differently - I'd probably regress H a bit more heavily and K, BB etc. a bit less, among other things. These are just basic Marcels, and if you'll notice, the R scores (for reliability) are all lower for pitchers than hitters, which is why you see pitchers regressing to the mean more (or "ballooning up," as one poster put it).
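
To make the reliability point concrete, here is a minimal Python sketch of Marcel-style regression toward the mean for a rate stat. The 5/4/3 season weights and the 1200 PA of league-average ballast are the constants usually quoted for Marcel hitters; they are not stated in this thread, so treat them as assumptions. The function name and the two sample players are invented for illustration.

```python
def marcel_rate(seasons, league_rate, weights=(5, 4, 3), ballast_pa=1200):
    """Project a rate stat (e.g. OBP) from up to three seasons.

    seasons: list of (successes, pa) tuples, most recent season first.
    weights and ballast_pa are assumed constants, not taken from this thread.
    Returns (projected_rate, reliability).
    """
    weighted_successes = sum(w * s for w, (s, _) in zip(weights, seasons))
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))
    # Reliability: share of the projection that comes from the player's own record
    reliability = weighted_pa / (weighted_pa + ballast_pa)
    player_rate = weighted_successes / weighted_pa if weighted_pa else league_rate
    projected = reliability * player_rate + (1 - reliability) * league_rate
    return projected, reliability


LEAGUE_OBP = 0.330  # assumed league average

# Invented players: (times on base, PA), most recent season first
veteran = [(210, 600), (200, 600), (190, 580)]
rookie = [(40, 100)]  # a .400 OBP in only 100 PA

proj_vet, rel_vet = marcel_rate(veteran, LEAGUE_OBP)
proj_rookie, rel_rookie = marcel_rate(rookie, LEAGUE_OBP)
print(f"veteran: {proj_vet:.3f} (r = {rel_vet:.2f})")
print(f"rookie:  {proj_rookie:.3f} (r = {rel_rookie:.2f})")
```

The lower the reliability, the more of the projection is league average. A pitcher version would use its own weights and ballast, but the same mechanism would explain why the pitcher lines get pulled toward the mean harder than the hitter lines.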

As far as thoughts, I don't really have any - like I said, this is just a "clean room" reimplementation of Tango's Marcels projection system, and everything's pretty much out in the open as far as the methods he uses.

Soul · October 14, 2008

Here are the starters 2009 marcels with the 2008 and 2007 FIPs in parens, respectively.

Z: 4.07 ERA (4.23, 4.58)
Lilly: 4.49 (4.41, 4.16)
Marquis: 4.70 (4.61, 4.99)
Dempster: 4.11 (3.41, 4.54)
Harden: 3.70 (2.95, 3.94)

Not so pessimistic, are they?

It's horrible. They basically say Dempster was a fluke and Harden will collapse. Don't buy it.

What about Marmol? His ERA wasn't over 4 was it?

Transmogrified Tiger · October 14, 2008

Here are the starters 2009 marcels with the 2008 and 2007 FIPs in parens, respectively.

Z: 4.07 ERA (4.23, 4.58)
Lilly: 4.49 (4.41, 4.16)
Marquis: 4.70 (4.61, 4.99)
Dempster: 4.11 (3.41, 4.54)
Harden: 3.70 (2.95, 3.94)

Not so pessimistic, are they?

It's horrible. They basically say Dempster was a fluke and Harden will collapse. Don't buy it.

What about Marmol? His ERA wasn't over 4 was it?

the ERA's are regressed to the mean in a big way. Harden is one of only 18 pitchers out of 1000 projected to have an ERA under 4.

Mephistopheles · October 14, 2008

im starting to hate these types of things myself (not just marcel). theyre fairly useless.

nilodnayr · October 14, 2008

Here are the starters 2009 marcels with the 2008 and 2007 FIPs in parens, respectively.

Z: 4.07 ERA (4.23, 4.58)
Lilly: 4.49 (4.41, 4.16)
Marquis: 4.70 (4.61, 4.99)
Dempster: 4.11 (3.41, 4.54)
Harden: 3.70 (2.95, 3.94)

Not so pessimistic, are they?

It's horrible. They basically say Dempster was a fluke and Harden will collapse. Don't buy it.

What about Marmol? His ERA wasn't over 4 was it?

I don't know how anyone in their right mind would say that they have no reservations about Dempster for 2009 and onwards. Pitchers typically are worse as starters than they are as relievers; Dempster's transition was insane. A smart bet on Dempster coming into this season was that he'd be a league average SP and after this year a smart bet would probably be somewhere between a league average SP and what he put up this year (closer to league average). Therefore a 4.11 ERA doesn't make me say OMG ROFL IDK my BFF Jill, especially considering that the regression to the mean across the board puts his 4.11 as a pretty darn good year compared to the league. Same thing for Harden: a 3.70 is pretty darn great. People just think that if a guy puts up a 2 ERA he can do it year after year after year. Well guess what? That's really really hard to do. That 3.70 is about a quarter run above his career FIP, ohhhhh sooooo pessimistic. If a 3.70 ERA is a collapse, well then let me escort you out of the deadball era and into the real world.

And Marmol, well Marmol probably isn't going to fit into the system so well because his H/9 is just otherworldly. I'm sure that gets regressed to the mean big time for him. He has a big problem with BBs that he lives with by not allowing guys to make contact with his pitches, but there's only a certain amount of sustainability with that. There should be concern there, but everyone is blind with Marmol love. With his odd peripherals, his FIP last year was 3.62, so it's not like 4 is that insane. But like I said, I don't think he'll fit into a model well, especially one with major regression to the mean.

treebird · October 14, 2008

im starting to hate these types of things myself (not just marcel). theyre fairly useless.

marcel has got to be one of the worst, though. it's not impressive at all to say your system is accurate when you just predict everyone in the league is going to be mediocre. it's the crazy stuff (on both ends) that makes a projection system interesting. marcel is not interesting.

jersey cubs fan · October 14, 2008

im starting to hate these types of things myself (not just marcel). theyre fairly useless.

marcel has got to be one of the worst, though. it's not impressive at all to say your system is accurate when you just predict everyone in the league is going to be mediocre. it's the crazy stuff (on both ends) that makes a projection system interesting. marcel is not interesting.

I don't think a goal of a projection system should be to be interesting.

Colin Wyers · October 14, 2008

im starting to hate these types of things myself (not just marcel). theyre fairly useless.

marcel has got to be one of the worst, though. it's not impressive at all to say your system is accurate when you just predict everyone in the league is going to be mediocre. it's the crazy stuff (on both ends) that makes a projection system interesting. marcel is not interesting.

That's only true if the "crazy stuff" is adding value above and beyond the safe, boring stuff the Marcels is giving you. If it isn't, then it's just noise masking the signal.

I'm not trying to dispute the idea that there are better projection systems than Marcel. But they're not THAT much better.

As far as being useless, Meph - take Theriot's Marcel projection, versus his 2008 season line. Which is more instructive going forward? I don't know what it is that you think is more useful. Certainly regression-to-the-mean based projection systems (as well as the sim-score based PECOTA) did a better job of seeing where the Rays were headed than anyone else did.

Mephistopheles · October 14, 2008

i am not saying projection systems are wrong or the way they are found is wrong. theyre useless for the reasons tree said. the RMSE and such is low, but the distribution is incredibly inaccurate because of regression to the mean (i understand why this is true), and the way to avoid that is probability based projections.

a projection that marcel gives is basically this "im going to cover my ass and make sure i have the lowest possible rmse on various statistics, let's not worry about the distribution because i don't care"

and to answer theriot. neither one of them is useful. use a combination. the perfect one line projection system is a projection system that maps to actual stats accurately and maps to the distribution accurately.

and don't play the luck card. it's partially luck but a lot of it isn't. zambrano hasn't had a babip over .291 in his career and hasn't had one over .280 in the last four years. that's not luck. any projection system is going to get him wrong because they consider it (or a large part of it) luck. im not really ranting on marcels but the general idea that's been accepted in chone, marcels and even pecota as well

Mephistopheles · October 14, 2008

As far as being useless, Meph - take Theriot's Marcel projection, versus his 2008 season line. Which is more instructive going forward? I don't know what it is that you think is more useful. Certainly regression-to-the-mean based projection systems (as well as the sim-score based PECOTA) did a better job of seeing where the Rays were headed than anyone else did.

Actually....I did have the Rays losing 200 runs allowed this year...the real secret to the Rays success hasn't been projections outsmarting people, it's realizing historically bad bullpens don't often repeat. for theriot, marcels give .284/.357/.368, this year he hit .307/.387/.359 , so marcel is projecting a few singles less. big deal. the value between these two lines is what? a handful of runs over the course of the season? I don't give a crap about that. Im more worried about the likelihood of theriot bottoming out. that's much much much more important than a single regression based projection.

besides marcel doesn't give us the likelihood of him hitting .300/.380/.360 and neither does his last season. theres no value in regressing his statistics. just looking at his recent career, without weighting the averages the way marcel does, is much more useful than collapsing them into a single line.

Colin Wyers · October 15, 2008

i am not saying projection systems are wrong or the way they are found is wrong. theyre useless for the reasons tree said. the RMSE and such is low, but the distribution is incredibly inaccurate because of regression to the mean (i understand why this is true), and the way to avoid that is probability based projections.

a projection that marcel gives is basically this "im going to cover my ass and make sure i have the lowest possible rmse on various statistics, let's not worry about the distribution because i don't care"

and to answer theriot. neither one of them is useful. use a combination. the perfect one line projection system is a projection system that maps to actual stats accurately and maps to the distribution accurately.

and don't play the luck card. it's partially luck but a lot of it isn't. zambrano hasn't had a babip over .291 in his career and hasn't had one over .280 in the last four years. that's not luck. any projection system is going to get him wrong because they consider it (or a large part of it) luck. im not really ranting on marcels but the general idea that's been accepted in chone, marcels and even pecota as well

As regards Zambrano, I fail to see what your problem is here - I have Z forecast for a .279 BABIP, compared to an average forecast of .296 for the league. This isn't a DIPS/FIP system where BABIP is regressed 100% to the mean. In that regard, you're arguing against a strict DIPS-ERA (or FIP-ERA) approach; I know of no forecasting system that follows a strict DIPS approach like that.

Your larger point is that the spread of performance in the league is larger than the spread of talent produced by projection systems. That's true, because by definition sample data is more extreme than true-talent level - that's the "noise" we're filtering out by using regression-based projection systems. For players with a lot of PAs/IPs, the signal outweighs the noise - that's how Zambrano gets a much lower BABIP than the league norm in his projection, even after we regress.
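
A small simulation can show both halves of this argument at once: regressed projections are bunched more tightly than actual results, yet they still miss by less on average than last year's raw line does. This is a rough Python sketch with invented parameters (league OBP, talent spread, PA per season and the amount of ballast are all assumptions chosen for illustration), not a model of any real projection system.

```python
import random
import statistics

rng = random.Random(2009)  # fixed seed so the run is repeatable
LEAGUE = 0.330      # assumed league-average OBP
TALENT_SD = 0.025   # assumed spread of true talent
PA = 500            # plate appearances per simulated season
BALLAST = 350       # PA of league average added when regressing (hand-picked)

talents = [rng.gauss(LEAGUE, TALENT_SD) for _ in range(2000)]


def season(talent):
    """Simulate one season's observed OBP as PA independent trials."""
    return sum(rng.random() < talent for _ in range(PA)) / PA


year1 = [season(t) for t in talents]
year2 = [season(t) for t in talents]

# Project year 2 by shrinking year 1 toward the league mean
projection = [(obs * PA + LEAGUE * BALLAST) / (PA + BALLAST) for obs in year1]


def rmse(pred, actual):
    """Root mean squared error between two equal-length lists."""
    return (sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)) ** 0.5


spread_observed = statistics.pstdev(year2)
spread_projected = statistics.pstdev(projection)
rmse_regressed = rmse(projection, year2)
rmse_raw = rmse(year1, year2)

print("spread of actual year 2:", round(spread_observed, 4))
print("spread of projections:  ", round(spread_projected, 4))
print("RMSE, regressed:        ", round(rmse_regressed, 4))
print("RMSE, raw year 1:       ", round(rmse_raw, 4))
```

The projected spread comes out narrower than the observed spread because the binomial noise in a 500 PA season is excluded from the forecast, which is the compression the thread is arguing about.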

Mephistopheles · October 15, 2008

im not really ranting on you or marcel or any one projection system in general. just the idea. and i know exactly why the results are what they are. it's just the utility from them is non-existent. so its kind of a waste of time for someone like me and others who already have an understanding of things like certain skills being more based off of luck than others.

you're really not filtering out "noise" when you do this. the "noise" is always there. you just havent developed a system sophisticated enough to predict the "noise" so you assume the "noise" is random - or largely random. and im not meaning you as in you, im meaning you as an everyone.

a lot of the noise is predictable, if you work with probability based assessments. certain pitchers have higher likelihood of .250 babip seasons than others. you know this, and your projections sorta kinda include this. but not really. projections are what they are. more or less the middle value of production, with a 50% chance above and a 50% chance below.

if I take the last two years of a player, average the stats (which is even simpler than Marcel) and use that for a projection, we're going to get results that are pretty damn close to what you get. what's the purpose of projections? to try to construct the best team you possibly can using information we know. the problem is that the difference between my 2 year projection and your two year projection is only going to be a handful of runs, if that. We're talking five or six runs - tops. However, those five or six runs are only going to fall in the middle 45-55 percentiles for most players. So really the accuracy of one system to another really doesn't tell us anything we didn't know. What's really important are the likelihoods for large deviations from previous play - the busts and booms. Those are what win and lose divisions.

Colin Wyers · October 15, 2008

im not really ranting on you or marcel or any one projection system in general. just the idea. and i know exactly why the results are what they are. it's just the utility from them is non-existent. so its kind of a waste of time for someone like me and others who already have an understanding of things like certain skills being more based off of luck than others.

you're really not filtering out "noise" when you do this. the "noise" is always there. you just havent developed a system sophisticated enough to predict the "noise" so you assume the "noise" is random - or largely random. and im not meaning you as in you, im meaning you as an everyone.

a lot of the noise is predictable, if you work with probability based assessments. certain pitchers have higher likelihood of .250 babip seasons than others. you know this, and your projections sorta kinda include this. but not really. projections are what they are. more or less the middle value of production, with a 50% chance above and a 50% chance below.

if I take the last two years of a player, average the stats (which is even simpler than Marcel) and use that for a projection, we're going to get results that are pretty damn close to what you get. what's the purpose of projections? to try to construct the best team you possibly can using information we know. the problem is that the difference between my 2 year projection and your two year projection is only going to be a handful of runs, if that. We're talking five or six runs - tops. However, those five or six runs are only going to fall in the middle 45-55 percentiles for most players. So really the accuracy of one system to another really doesn't tell us anything we didn't know. What's really important are the likelihoods for large deviations from previous play - the busts and booms. Those are what win and lose divisions.

And a lot of noise is simply noise. Way too much time and energy is wasted trying to assign meaning to randomness.

And, sure, for players in the flat part of the aging curve, with a lot of MLB PAs, a simple three-year average is pretty close to their Marcels projection. For players in the steeper part of the aging curves, players without a lot of playing time, etc., though, they'll be different. And using aging adjustments and regression to the mean is very important for players it greatly affects.
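
A quick sketch of where a simple average and a Marcel-style line part ways. The 5/4/3 recency weights, the peak age of 29 and the per-year aging steps are the values usually quoted for Marcel; none of them appear in this thread, so treat them as assumptions, and the sample player is invented.

```python
def simple_average(rates):
    """Unweighted mean of the season rates."""
    return sum(rates) / len(rates)


def weighted_average(rates, weights=(5, 4, 3)):
    """Recency-weighted mean; rates are listed most recent season first."""
    used = weights[:len(rates)]
    return sum(w * r for w, r in zip(used, rates)) / sum(used)


def age_multiplier(age, peak_age=29, young_step=0.006, old_step=0.003):
    """Aging factor applied to a projected rate; all constants are assumed."""
    if age < peak_age:
        return 1 + (peak_age - age) * young_step
    return 1 - (age - peak_age) * old_step


improving = [0.360, 0.330, 0.300]  # invented OBPs, most recent season first
flat = simple_average(improving)
recent_heavy = weighted_average(improving)

print(round(flat, 3), round(recent_heavy, 3))
for age in (24, 29, 34):
    print(age, round(recent_heavy * age_multiplier(age), 3))
```

For a 29-year-old with three steady seasons the two methods land in nearly the same place; the gap opens up for young players, old players and players whose recent seasons differ a lot from their earlier ones.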

Beyond that, either I don't grok what you're talking about or you're Lyndon LaRouching it. Are you talking about something like PECOTA percentiles? (They've never been tested, incidentally.) It's real easy to talk about a proposed alternative that doesn't really exist. Yes, these are median projections. I think that's the most useful when comparing two players, unless you have some sort of evidence that certain players (or certain types of players) are more/less likely to vary greatly from their median forecasts, for reasons that aren't simply captured by the reliability score of the forecast.

Brett · October 15, 2008

Lee with the highest OPS on the team? I just don't see that happening with the way he's trending.

In fact, I think a lot of the hitters look as overly-optimistic as the pitchers look overly-pessimistic.
