Thursday, November 05, 2009

The Ultimate Outlier


The man who "called his shot" was, and is still, more than baseball.

Jason Perry –

One might think that all that can be said about Babe Ruth has been said. After all, it has been over 61 years since he died, over 74 years since he played his last game, over 82 years since he hit 60 home runs in one season, and over 95 years since he played his first major league game. But it is a testament to the uniqueness of this player that there are still new ways to look at what he accomplished. I will attempt to do just such a thing in this essay – discuss a new way of looking at the accomplishments of The Sultan of Swat. The author of the accompanying article had the chance to delve into new territory, but instead he rehashed widely known statistics, as if teaching an introductory course to people who had never heard of The Great Bambino. The statistics used in the article are common and commonly available. Any and every baseball fanatic knows their way around the usual bunch of statistics in this article: home runs, RBI per game, batting average, slugging average, on base percentage, at-bats per home run, runs per game and OPS. Anybody and everybody could have looked up these statistics freely from the very same source the author used.

Here was a chance, in 2007, with Barry Bonds chasing immortality and Hank Aaron, to look anew at the player who single-handedly changed the way baseball was played. His last name became an adjective, Ruthian, to describe the most incredible feats and achievements in any and all human endeavors. His name is universally recognized for excellence and is often used in terms like "The Babe Ruth of Ornithology" and "The Babe Ruth of Woodsmen". The Rajah of Rap was an outlier of outliers. Alone he skewed the mean of hundreds of players a season, and thousands over his career. Even when he's already been passed, as in Bonds chasing Aaron, he remains the one and only Home Run King. Let's take a look, and while we're at it, see how Bonds and Hammerin' Hank Aaron fare, as well.

Firstly, we will not take the entire career of The Colossus of Clout into consideration because his career can be divided into four distinct phases: his time as a pitcher and pinch hitter (1914-1917), the time when he first went into the outfield and revolutionized the game (1918-1926), the glory days of his career when more and more players followed his lead (1927-1933), and then the last part when he was merely human in a Ruthian league (1934-1935). It is true, as pointed out in the article, that in 1927 he out-homered every team in his league. What isn't mentioned is that he out-homered all but two teams in both leagues (the National League Saint Louis Cardinals hit 84 home runs and the Chicago Cubs hit 74 home runs to the Wazir of Wham's 60). But more importantly to my point, his teammate Larrupin' Lou Gehrig himself out-homered 7 of the total 16 teams of both leagues, and the Cubs' Hack Wilson out-homered 4 of them. These two players were the first to follow The Terrible Titan into outlier status, and into modern baseball, and they did so in 1927. So that year is the official beginning of Ruthian baseball, as opposed to what is now known as The Deadball Era. In fact, in 1920 The Prodigious One out-homered every team in the American League except the rest of his Yankee team, and every team in the National League except Philadelphia. He commonly hit more home runs than most other teams during the first half of his career. So his feat of 1927 is not without precedent. To further solidify the point, Hack Wilson was known as the Little Babe Ruth and the National League Babe Ruth, and he set a record in 1930 that even The King of Crash couldn't touch – 191 RBI in a single season. That extreme deviation from the mean stands alone today, unlike most of The King of Swing's. 1927 marks the beginning point when the Maharajah of Mash no longer stood in a class by himself; he was being joined in his outlier status by other players.

Therefore, the second part of his career (1918-1926) is the subject of my statistical analysis, and the crux of how I think the accompanying article could have used statistics better. During these 9 years the Wali of Wallop was truly the outlier of outliers – he warped the normal distribution curve of more than 230 ballplayers per league per year all by himself. And he did it even though he was a rotation pitcher as well as an outfielder for the first two of those years.

A look at the data for that set of years shows two anomalies, 1922 and 1925, when The Big Bam did not put up his usual numbers. 1925 was the year of The Great Bellyache, when intestinal troubles aggravated by a profligate lifestyle sidelined The Bulky Monarch and he did not lead the league in anything except bellyaches. 1922 was a year he was beset by hangovers and injuries, yet he still led in Slugging Percentage, OPS and OPS+ in limited playing time. The players who led the league in homers those years, Ken Williams and Bob Meusel, did so with what were by far the highest seasonal home run totals of their careers – in other words, not only were they outliers for the league and in baseball history up to that point, but those seasons were also outliers within their own careers even afterwards. As is always the case, only an outlier of outliers can begin to compare to His Eminence, and then in only a fragment of the whole. Let's take a new look at the data for The Kid of Crash, and then we'll go on to Bonds and The Hammer.

The method of comparison is simply arithmetic: the sets of data include the AL totals and averages, George Herman's totals and averages, and the totals and averages of the league minus Ruth. This compares the league with him against the league without him: The Circuit Smasher vs. the American Circuit (1918-1926).
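The with-and-without comparison can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function names and the toy numbers are hypothetical, chosen to keep the arithmetic visible, and are not the league data behind this essay.

```python
def with_and_without(league_hr, league_player_seasons, star_hr, star_seasons):
    """Return (mean HR per player-season with the star, mean without him)."""
    mean_with = league_hr / league_player_seasons
    # Remove the star's homers and his player-seasons from the league totals.
    mean_without = (league_hr - star_hr) / (league_player_seasons - star_seasons)
    return mean_with, mean_without


def pct_increase(mean_with, mean_without):
    """Additional output, in percent, that the star brings to the league mean."""
    return 100.0 * (mean_with - mean_without) / mean_without


# Toy league (hypothetical numbers): 10 player-seasons, 100 homers, one star with 55.
mean_with, mean_without = with_and_without(100, 10, 55, 1)
print(mean_with, mean_without, pct_increase(mean_with, mean_without))
```

Feeding in real league and player totals for 1918-1926 would give the league-with and league-minus-Ruth means in the same way.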

The mean of home runs each batter hit is 1.6, yet The Bambino averages 38.6, and without him the league average batter would have a mean of only 0.6. He is the 1, yo! Adding him to the mix increased output by 133% – not up to 133% but an additional 133%! That is one man's accomplishment compared to the skilled efforts of 2165 players. In other words, adding The Man to the league added the equivalent of 2881 men. The numbers confirm what historians have been saying: The King of Clout at this time was bigger than the league – he transformed the game during this period like Prometheus of old, sweeping change in every direction.
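The step from a percentage to "equivalent players" is a single multiplication, sketched here in Python. The function name is hypothetical; the inputs are the 133% and 2165 players quoted above. With the rounded percentage the result is about 2879, so the small gap from 2881 is presumably rounding in the percentage.

```python
def equivalent_players(pct_increase, league_players):
    """Players it would take to supply the star's added output.

    pct_increase is the additional output in percent (133 means +133%).
    """
    return league_players * pct_increase / 100.0


# Figures quoted in the essay: an additional 133% across 2165 players.
ruth_equiv = equivalent_players(133, 2165)
print(round(ruth_equiv))
```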

Perhaps that number is too overwhelming to comprehend. The beauty of this Big Baby is that he causes ripples in the entire statistical fabric of the game. During this period Homeric Herman hit 1 home run every 14.9 plate appearances (PA), while the league without him took 130.8 PA to hit a homer. The difference The Sultan of Sweat made to the league is measured here at 8.77%, and applying that to the total players reveals The Slacker's achievements can be replaced by the simple addition of another 190 players.

Going down another level, The Big Babe clobbered 38.6 homers per season during this stretch, while the average player, pro-rated to the same number of PA, would eke out 4.4 home runs. George can replace 9 players, the entire lineup.
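For anyone who wants to check the ratios, here is a minimal Python sketch using only the figures quoted above (14.9 and 130.8 PA per homer, 38.6 and 4.4 homers per season). The function names are hypothetical. Both ratios come out a little under 9, which is where "replace 9 players" comes from.

```python
def rate_ratio(league_pa_per_hr, star_pa_per_hr):
    """How many times more often the star homers per plate appearance."""
    return league_pa_per_hr / star_pa_per_hr


def players_replaced(star_hr_per_season, avg_hr_per_season):
    """How many pro-rated average players it takes to match the star's homers."""
    return star_hr_per_season / avg_hr_per_season


print(round(rate_ratio(130.8, 14.9), 2))        # 8.78
print(round(players_replaced(38.6, 4.4), 2))    # 8.77
```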

Still further, comparing OPS+, a statistic that normalizes players to the league and parks to try to get a player's true value, we set the league average at 100, while Herman holds out at 212 – roughly twice the threat of the average player to have a successful plate appearance. He's as good as two players at bat at the same time.

This ripple effect through the statistics is called Fractal Distribution. I just made that up. Oh yes, Bonds and Henry. They are but an afterthought after Dunne's Babe, but compare we must.

The Hammer's best 9 consecutive seasons are from 1956 to 1964. The mean of home runs hit is 4.0, and Aaron averages 36.2. Without Bad Henry the league average batter would have a mean of 3.9 homers. This means Aaron in the mix increased output by an additional 2.75%. Hank added the equivalent of 74 players to the league. During this period Hank hit 1 home run every 18.5 PA, while the league without him took 43.9 PA to hit 1 homer. The difference to the league Hank made here is measured at 1.74%. Applying that to the total players reveals the achievements of The Hammer can be replaced by the addition of 47 players. Going down another fractal level gives us stats of 38.2 home runs a season by Aaron, and the pro-rated average player hitting 11.75 homers. Hank can replace 3.25 players, or a bit more than a third of the lineup. One last iteration to OPS+, and Hank reaches 163 – roughly one and a half times the threat of the average player to have a successful at bat.

So we can see that while Aaron compares favorably on an individual basis, the league around him had improved tremendously over 41 years.

Bonds' best 9 consecutive years came during 1996-2004, 40 years later. The mean of home runs hit is 4.6, and Bonds averages 45.7. Without Bonds the league average batter would have a mean of 4.5 homers. This means Bonds in the mix increased output by an additional 1.57%. Bonds added the equivalent of 82 players to the league. During this period Bonds hit 1 home run every 13.5 PA, while the league without him took 37 PA to hit 1 homer. The difference to the league Bonds made here is measured at 1.09%. Applying that to the total players reveals the achievements of Bonds can be replaced by the addition of 57 players. Going down another fractal level gives us stats of 45.7 home runs a season by Bonds, and the pro-rated average player hitting 16.6 homers. Bonds can replace 2.75 players, or a bit less than a third of the lineup. The last level down is OPS+, and Bonds reaches 211 – almost catching The Big Bambino. So close, yet so far.

These and other statistics can be used as interpretive comparisons of data, serving as a window onto the impact outliers can have on the rest of the data. We can then consider how far the ripples can ripple through culture and time. That is the beauty of numbers. The comparisons here clearly show one of these three extraordinary ballplayers is still an outlier among outliers. The King of Sluggers remains on his throne.

But Bonds has something to say. Bonds played in a Ruthian Game while The Wazir of Wham played in a human game. Consider the first statistic we investigated, home runs per player. We estimated there that adding The Big One to the mix was like adding 2881 players to the league. The number of players Bonds competed against had increased by 3041. So we have a correlation of that statistic: when more than 2881 players are added, along with 81 years, another Ruthian player emerges: Bonds. The fact remains that he did approach within 1 of his goal in OPS+, 211 to 212 – a statistic designed to find true offensive value. The fraction of a player by which Bonds falls short would be noticeable over the course of a season, but in a single at bat, any random at bat, Bonds' performance has a greater than 99.5% chance of being Ruthian. This correlation is exactly the result predicted by the manipulation of the home runs per player statistic when tied to OPS+ as done in the above Fractal Distribution. Even still, The Yankee from Olympus was the one who set the conditions, and Bonds merely the statistic that filled them.
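The 99.5% figure is just the ratio of the two OPS+ values, as this small Python sketch shows. The function name is hypothetical, and the code only computes the ratio; reading that ratio as a "chance of being Ruthian" is this essay's interpretation, not something the arithmetic proves.

```python
def ops_plus_ratio(player_ops_plus, benchmark_ops_plus):
    """Ratio of one player's OPS+ to a benchmark player's OPS+."""
    return player_ops_plus / benchmark_ops_plus


# OPS+ values quoted in the essay: Bonds 211, Ruth 212.
ratio = ops_plus_ratio(211, 212)
print(f"{ratio:.4f}")  # 0.9953
```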

If this were biology we could use this as an example of a single mutation that is so fit that it becomes dominant – like a colony of bacteria evolving an immunity to an antibiotic, but in this case the game of baseball evolving into a Ruthian Game. Unfortunately, therein lay the road to The Steroid Era, a foreseeable outcome, sociologically, of the evolving attempt to become The Hero, to be like George Herman “Babe” Ruth.

One last outlying stat: as I demonstrated within the essay, George Herman Ruth was also an outlier in number of nicknames attributed to one person. One last nickname (or two): Old Nigger Lips. That was one of the nicknames you didn't say to his face, but rather shouted from the anonymous gaggle of players in the opposing dugout – as the Chicago Cubs did in the 1932 World Series, prompting the legendary Called Shot home run to deepest center field. It didn't pay to get Tarzan mad (the nickname his teammates had for him).
