Wednesday, June 9, 2010

Mariners baseball: Believe big, achieve little



Bill James revolutionized baseball with the development and incorporation of advanced metrics (better known as 'sabermetrics' ... because ancient dinosaur cats are sweet). As the pioneer of baseball analytics, James pored over historical data, running models and numbers to try to find patterns and percentages hidden between the lines of information. He has churned out more than 24 books on baseball statistics since 1977 and single-handedly created a revolution not only in Major League Baseball, but throughout the sports world. James envisioned an analytical player model beyond batting average, home runs, and RBI, archaic statistics left over from the trading card era (if it didn't fit on a baseball card, it didn't matter) that overvalued the wrong things.

But why does it actually matter how often a player gets a hit? Why does it matter how many runs a player drives in? How does that compare to a player who scores runs but doesn't drive them in? How does any of that stuff actually contribute to wins? Those are the very types of questions James asked himself while trying to define a statistical generation. What really matters to a batter, simplified to its Jamesian core, is how often he gets on base (OBP), and beyond that, how often he gets on base combined with how often he gets an extra-base hit (OPS). Extra-base hits drive in runs and have the potential to drive in multiple runs. Singles don't. Sorry, Ichiro.

James wanted to know the inner workings of a baseball team, not from the transparent "let's score more than the other team" standpoint, but from a statistical foundation built around probabilities of success and the variables that contribute to winning more. If you add X more runs-created at a certain position, how much will that contribute to your overall winning percentage? One game per year? Ten? If your best player naps in the dugout every fourth inning, will he finally hit a home run? Or is it every fifth inning?

Toiling questions, no doubt.

It's with Bill James in mind that I've watched the Seattle Mariners struggle through the first third of the season. They can't hit, they can't win close games, they can't win PERIOD, and I've spent a lot of brain-scratching days trying to understand the why of that enigma. The Mariners' painful inability to produce runs is their biggest catalyst for failure, obviously, but I want to look even deeper into the statistics, using some of the James sabermetric methodology, to grasp the true nature of Seattle's failures. We know they can't hit, but who, specifically, isn't producing? We know they aren't winning, but are they actually playing better than the statistics say they should?

Scary thought, huh?

The hardest part about sabermetrics is that they're disconnected from everyday language. They're distant, foreign, even a bit arrogant at times, and that's why they've had a hard time replacing the trading card trio (Avg., HRs, RBI) that 99.9% of baseball fans have come to accept as scripture.

But the two most important takeaways from Bill James are his Baseball Pythagorean Theorem and the Runs-Created Approach. Bear with me as I try to de-nerdify math.

(MATH BEAR, ARGHHHH!)


Baseball Pythagorean Theorem


James discovered that by using a wonky version of the Pythagorean Theorem (a² + b² = c²), he could accurately predict winning percentages for Major League Baseball teams with less than 2% variance. He'd unlocked the Holy Grail of baseball statistics: An accurate winning predictor.

For James, his a² + b² = c² turned into a slightly more complicated formula that takes the scoring ratio (runs scored/runs conceded) and divides and squares and adds ... well, shit, I'll just show you what it does:
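
For reference, the commonly cited form of James's Pythagorean expectation is written out here. The symbols RS (runs scored), RA (runs conceded), and r (the scoring ratio) are labels assumed for this sketch, and the exponent of 2 follows the "squares" description in this post:

```latex
\[
\text{Win\%} \;=\; \frac{RS^{2}}{RS^{2} + RA^{2}} \;=\; \frac{r^{2}}{1 + r^{2}},
\qquad r = \frac{RS}{RA}
\]
```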



Remember, Bill James figured this out by himself, this 98% historically-accurate baseball formula, locked away in some sweaty love nest with a pile of acid papers stuck to the roof of his mouth. "Dude ... dude ... check this out, my hands are purple, and they're made of elves ... also, the Pittsburgh Pirates are going to win 37% of their games this year ... whoa."

And my mom thinks LSD's a bad thing.

For the Mariners, in this case, their current scoring ratio is an abysmal 0.84232. They've conceded 241 runs while only scoring 203. Not going to win a lot of games with that ratio. So what does that mean in terms of projected winning percentage based on their current success?

42%

For those of you keeping score at home, rifling through your Baseball Prospectus while your mom makes you grilled-cheese sandwiches, that's actually GOOD NEWS for Mariners fans! Hooray! Good news, everyone!

The Mariners are currently 22-34 with a winning percentage of .393 (39%). So, yeah, get excited Mariners fans, the team is absolutely terrible right now, eight games out of first place just a third into the season, but they're going to be roughly 3% better the remainder of the year! Awesome!

Unless the variance is 2% lower ... then they'll basically be this bad forever.

Awesome.
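
The arithmetic in this section can be checked with a short sketch. It assumes the classic exponent-2 version of the Pythagorean formula, and the function and variable names are made up for illustration. With the run totals and record quoted above, the projection comes out to roughly 41.5%, which rounds to the 42% figure.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Projected winning percentage from runs scored and runs conceded.

    Assumes the classic exponent of 2; other exponents are sometimes used.
    """
    ratio = runs_scored / runs_allowed  # the "scoring ratio" from the post
    return ratio**exponent / (1 + ratio**exponent)


# Figures quoted in the post for the Mariners through 56 games.
runs_scored, runs_allowed = 203, 241
wins, losses = 22, 34

scoring_ratio = runs_scored / runs_allowed
projected = pythagorean_win_pct(runs_scored, runs_allowed)
actual = wins / (wins + losses)

print(f"Scoring ratio:      {scoring_ratio:.5f}")
print(f"Projected win pct:  {projected:.3f}")
print(f"Actual win pct:     {actual:.3f}")
print(f"Gap (pct. points):  {(projected - actual) * 100:.1f}")
```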

Runs-Created Approach

Earlier this year, I pored over box scores trying to crack the Bavasi Code of why the Mariners aren't scoring a lot of runs, putting together some wicked pie charts in my article "The Mariners and delicious pie." What I realized is that the issue extended beyond the Mariners' inability to simply "get hits." It came down to their inability to create runs and maximize their run-scoring opportunities.

Defining a player's individual contributions to his team is the ultimate goal for every general manager ... well, every general manager other than former Seattle GM Bill Bavasi, whose ultimate goal was to destroy the Mariners like Godzilla rampaging through Tokyo. Baseball is unique in that it's a team sport constructed entirely around individual success, which makes the weight of statistical analysis so much more valuable than in sports like football or rugby. Tweaking one part of a line-up can change the scope of your season, because a player creating more runs directly contributes to the overall winning percentage of your team (see Pythagorean Theorem above).
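
To put a rough number on that link between extra runs and wins, here is a minimal sketch that feeds a line-up upgrade through the exponent-2 Pythagorean formula. The season totals (700 runs scored, 700 conceded) and the 50-run upgrade are hypothetical values chosen only for illustration, not figures from any real team.

```python
def pythagorean_win_pct(runs_scored, runs_allowed):
    """Classic exponent-2 Pythagorean projection."""
    return runs_scored**2 / (runs_scored**2 + runs_allowed**2)


def wins_gained(runs_scored, runs_allowed, extra_runs, games=162):
    """Projected extra wins over a season from adding extra_runs of offense."""
    before = pythagorean_win_pct(runs_scored, runs_allowed)
    after = pythagorean_win_pct(runs_scored + extra_runs, runs_allowed)
    return (after - before) * games


# Hypothetical: a break-even team adds 50 runs at one position.
gained = wins_gained(runs_scored=700, runs_allowed=700, extra_runs=50)
print(f"Projected wins gained: {gained:.1f}")
```

Under those assumptions the upgrade is worth somewhere between five and six wins, which is one way to answer the "One game per year? Ten?" question.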

Bill James developed runs-created as a way to define individual player contributions. He took the real meat and potatoes of baseball statistics, not the trading card trio, but real statistics (total bases, hits, walks, and at bats), and developed a formula that focused on what each player is doing to maximize run-scoring opportunities for his team:
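
The basic version of James's runs created is commonly written as (hits + walks) × total bases / (at bats + walks), which uses exactly the four inputs named above. The sketch below assumes that basic version (more elaborate variants exist, and the post does not say which one produced its numbers), and the stat line is invented for illustration.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    opportunities = at_bats + walks
    if opportunities == 0:
        return 0.0  # no plate appearances counted, so nothing created
    return (hits + walks) * total_bases / opportunities


# Hypothetical stat line, not a real player's numbers.
example = runs_created(hits=60, walks=20, total_bases=90, at_bats=200)
print(f"Runs created: {example:.2f}")
```

The numerator rewards getting on base and advancing runners together, which is why a hitter with few walks and few extra-base hits scores poorly even with a respectable batting average.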



I know it's accurate because it shows former Mariner Eric Byrnes contributed the fewest runs-created to the team before getting cut and joining a beer league softball team. James probably had that predicted in an Excel file somewhere.

But let's look at the rest of the Mariners, the non-beer league ones, and see how this pathetic team stacks up, who's really contributing and who should be cleaning up elephant poo at the Woodland Park Zoo. Wow, nice rhyme, Erik.


Seattle Mariners     Games   Runs-Created
Ichiro Suzuki           56          41.20
Franklin Gutierrez      54          33.09
Jose Lopez              55          19.82
Chone Figgins           56          18.83
Josh Wilson             30          15.36
Casey Kotchman          49          14.31
Mike Sweeney            27          14.04
Milton Bradley          37          12.29
Jack Wilson             26           7.33
Rob Johnson             29           6.96
Michael Saunders        20           6.05
Ken Griffey, Jr.        33           5.05
Josh Bard                8           4.62
Eliezer Alfonzo          5           3.89
Ryan Langerhans         14           3.86
Adam Moore              19           3.75
Matt Tuiasosopo         22           2.50
Eric Byrnes             15           1.18


How about that, Ichiro contributes the most to the Mariners, no surprise there. But after Ichiro and Franklin Gutierrez, there's a dramatic drop-off in the team's productivity. Consider how high Mike Sweeney and Josh Wilson are in their limited number of games compared to the regular starting line-up and it's no surprise that the team finally started winning when Jack Wilson went on the DL and Ken Griffey, Jr. retired. Scoring more runs leads to more wins. Rocket science, eh?

But let's see how the Mariners compare to the rest of the AL West ...

I took the nine hitters with the highest number of games played from each AL West team (the Mariners plus the Texas Rangers, Los Angeles Angels, and Oakland Athletics) to determine the average "runs-created" per archetypal starting line-up.
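
For the Mariners, that selection can be sketched from the runs-created table above: sort by games played, keep the top nine, and average their runs-created. The list layout and variable names are assumptions for this example. The result lands at roughly 18.5.

```python
# (player, games, runs_created) rows copied from the Mariners table above.
players = [
    ("Ichiro Suzuki", 56, 41.20), ("Franklin Gutierrez", 54, 33.09),
    ("Jose Lopez", 55, 19.82), ("Chone Figgins", 56, 18.83),
    ("Josh Wilson", 30, 15.36), ("Casey Kotchman", 49, 14.31),
    ("Mike Sweeney", 27, 14.04), ("Milton Bradley", 37, 12.29),
    ("Jack Wilson", 26, 7.33), ("Rob Johnson", 29, 6.96),
    ("Michael Saunders", 20, 6.05), ("Ken Griffey, Jr.", 33, 5.05),
    ("Josh Bard", 8, 4.62), ("Eliezer Alfonzo", 5, 3.89),
    ("Ryan Langerhans", 14, 3.86), ("Adam Moore", 19, 3.75),
    ("Matt Tuiasosopo", 22, 2.50), ("Eric Byrnes", 15, 1.18),
]

# The nine hitters with the most games stand in for the "starting line-up".
lineup = sorted(players, key=lambda row: row[1], reverse=True)[:9]
average = sum(rc for _, _, rc in lineup) / len(lineup)

for name, games, rc in lineup:
    print(f"{name:<20}{games:>4}{rc:>8.2f}")
print(f"Average runs-created: {average:.2f}")
```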


MLB Team              Runs-Created Avg.
Los Angeles Angels                26.67
Texas Rangers                     25.42
Oakland Athletics                 21.98
Seattle Mariners                  18.54


Bam.

That's the thing about statistics, they're depressing. I'm depressed now. Because of baseball. And for the rest of you bastards still holding out hope that the team's just been unlucky, is just about to turn the corner, is still going to make the playoffs this year (yadda-yadda-yadda), well guess what, Bill James thinks you're an idiot. So does math. So do I.

Bill James, math, and I think you're an idiot.

Go Mariners.

View Part II here!

6 comments:

  1. erik, i sure hope you did this instead of working on a chart for dave

  2. This article rules. Huzzah for mathematics!

  3. If there is an award for Best Sabermetrics Article of the Month, this should win it. Great article, from a Yankees fan in CT

  4. Dude...Eye opening and hilarious. Thank you for taking such a huge disappointment this year and applying math to it.

    The Seattle Mariners: Building to the future since 2002.

  5. Great article, but where does the pitching aspect enter in?

  6. The saber tooth tiger was not a dinosaur, it was a mammal. And Ichiro's OPS is .828. Otherwise, an interesting read.
