Analyzing xBABIP: Calculated Using Statcast Data (Part 1)

May 30, 2016; Chicago, IL, USA; Chicago Cubs first baseman Anthony Rizzo (44) hits an RBI single against the Los Angeles Dodgers during the fifth inning at Wrigley Field. Mandatory Credit: David Banks-USA TODAY Sports

What is xBABIP and how can we use it to evaluate players? In Part 1, we analyze the batters whose xBABIPs fully support their BABIPs, for better or worse.

Statcast has given us a lot of data to play with, and those smarter than I am are putting it to good use. Andrew Perpetua released his new expected batting average on balls in play results, titled xBABIP, on FanGraphs back in May. And as always, I want to dive into these to determine whether we can siphon any underlying value out of them in analyzing a player’s potential fantasy value. In Part 1, I’m going to be analyzing the batters whose xBABIPs fully support their BABIPs. For better or worse.

EXPLAINING XBABIP

BABIP is batting average on balls in play, and the calculation is pretty simple: out of all balls put into the field of play (so strikeouts and home runs are essentially excluded), what is that player’s batting average? League-average BABIP is right around .300. Here is the formula.

BABIP = (H – HR)/(AB – K – HR + SF)
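For anyone who wants to run the formula themselves, here is a minimal Python sketch of it. The function name `babip` and the stat line passed to it are made up for illustration; they are not any real player’s numbers.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play, BABIP is undefined")
    return (h - hr) / balls_in_play


# Hypothetical stat line (not a real player): 50 hits, 8 home runs,
# 180 at-bats, 40 strikeouts, 2 sacrifice flies.
example = babip(h=50, hr=8, ab=180, k=40, sf=2)
print(f"{example:.3f}")  # 42 / 134, prints 0.313
```

Home runs come out of both the numerator and the denominator, which is why a power surge by itself does nothing for BABIP.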

So Perpetua is taking that same concept but using strictly Statcast data to create a BABIP based on a player’s batted ball abilities. He does a pretty good job of explaining his process, and the kinks that haven’t quite been worked out, in the article linked in the opener. But here are a few points I’ll highlight.

xBABIP looks to strip the luck component out of BABIP. Ever get into an argument with someone over whether X Player’s BABIP is luck or skill? Well, this attempts to separate the two so we can determine if an above-average BABIP can be credited to a player for just being a damn good hitter.

Physics complicates things, and Perpetua admits as much. The longer the ball hangs in the air, the more time a fielder has to make a play. He hasn’t quite worked out how to measure this with 100% accuracy.

His version of xBABIP is a pure batting stat and “only measures the player’s ability to bat the ball…conducive to reaching base safely.” This means two things:

He isn’t incorporating player speed, so a fast player will typically post a higher BABIP even if his xBABIP is lacking.

It doesn’t take shift data into account yet, so players whose BABIPs are held down by successful shifts against them might show higher xBABIPs.

THE APPLICATION OF XBABIP IN EVALUATING PLAYERS

Now, those two subpoints are being worked on, but I also wonder if he even should address them. A big point of contention in pitcher evaluation, and the reason we have both xFIP and FIP, is the home run ball and whether the pitcher can control it.

Some evaluators pick a side on which is the better stat, but the truth is you should use both. For some pitchers it’s more appropriate to use FIP, like Jordan Zimmermann or Gio Gonzalez, who have produced a below-average FIP for pretty much their entire careers. Other pitchers and other circumstances call for xFIP.

I think that if we can hone xBABIP, we can apply it with the same discretion we use with xFIP and FIP: BABIP might be more appropriate for your Dee Gordons and xBABIP for your David Ortizes. Or not. We really don’t know yet.

But what I love about this new metric being able to calculate BABIP using only batted ball data (Statcast specifically) is that I finally have data to point to when we make assumptions about BABIP.

For example, a lot of people have said Nick Castellanos is about to crash back to earth. To which I responded in a thread, “Regression is coming but people who point to his BABIP being unsustainably high…that’s what happens when you have a 31.1% Line Drive rate. You are going to have a way above average BABIP at that point, especially when you are only making Soft Contact 10% of the time.”

Well, in that instance, I was simply using the FanGraphs-provided batted ball data to assume that a guy who hits line drives at that rate will have a great BABIP. But now with xBABIP, I actually have a number to point to and say, “This is where his BABIP should be because he is hitting those line drives at that particular pace with the quality of contact he has shown.”

So in defense of Castellanos, whom I selected as my No. 1 breakout third baseman earlier this year, I’m interested to see what his xBABIP is. At the point of writing this sentence, I have no idea what it is yet. I said in that same comment, “He’ll fall somewhere between what he is doing now and what he did last season…” so let’s see if I’m right. But first, in looking through Perpetua’s spreadsheet, here are some things I found.

PLAYERS WITH VIRTUALLY NO DIFFERENCE

The chart below highlights players with at least 100 PAs that I felt were interesting enough to include. So no, Danny Espinosa is not on this list, because his low BABIP is due to him being a terrible batter and he doesn’t have much fantasy relevance. But if you want to search particular names, Perpetua provides a download of the full list in his article.

Name                 PA    BABIP   xBABIP   Dif
Jonathan Lucroy      119   0.353   0.344    0.009
Yoenis Cespedes      117   0.299   0.290    0.008
Marcus Semien        122   0.206   0.198    0.008
Melvin Upton         125   0.333   0.328    0.006
Christian Yelich     136   0.375   0.370    0.005
Gregory Polanco      140   0.315   0.310    0.005
George Springer      148   0.302   0.297    0.005
Melky Cabrera        141   0.327   0.324    0.004
Yunel Escobar        142   0.327   0.324    0.004
Brandon Drury        114   0.325   0.321    0.004
Matt Holliday        120   0.274   0.270    0.004
Starlin Castro       119   0.333   0.330    0.003
Elvis Andrus         123   0.327   0.323    0.003
Neil Walker          118   0.253   0.250    0.003
Anthony Rizzo        142   0.244   0.241    0.003
Miguel Cabrera       135   0.340   0.338    0.002
Josh Donaldson       154   0.292   0.291    0.001
Edwin Encarnacion    147   0.281   0.281    0.000
David Freese         124   0.385   0.386   -0.001
Brandon Crawford     132   0.282   0.284   -0.001
Joe Panik            125   0.263   0.265   -0.002
Jonathan Schoop      113   0.275   0.278   -0.003
Adrian Beltre        141   0.274   0.277   -0.003
Manny Machado        139   0.375   0.379   -0.004
Josh Reddick         137   0.360   0.364   -0.004
Chris Carter         131   0.282   0.286   -0.004
Nolan Arenado        145   0.262   0.266   -0.004
Troy Tulowitzki      136   0.197   0.202   -0.004
Kevin Pillar         140   0.321   0.326   -0.005
Jason Kipnis         131   0.346   0.352   -0.007
DJ LeMahieu          127   0.337   0.345   -0.009
Jose Altuve          150   0.314   0.324   -0.009
Jose Abreu           152   0.273   0.282   -0.009
Colby Rasmus         133   0.260   0.269   -0.009
Robinson Cano        147   0.286   0.295   -0.010
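If you download the full list and want to pull out this kind of group yourself, a rough Python sketch might look like the following. The four rows are copied from the table above; the .010 cutoff and the function name `supported` are my own assumptions for illustration, not Perpetua’s method.

```python
# A few rows copied from the table above: (name, BABIP, xBABIP).
rows = [
    ("Jonathan Lucroy", 0.353, 0.344),
    ("Anthony Rizzo", 0.244, 0.241),
    ("Troy Tulowitzki", 0.197, 0.202),
    ("Robinson Cano", 0.286, 0.295),
]

# Assumed cutoff for "virtually no difference"; pick whatever suits you.
THRESHOLD = 0.010


def supported(rows, threshold=THRESHOLD):
    """Return (name, diff) for players whose |BABIP - xBABIP| <= threshold."""
    result = []
    for name, babip, xbabip in rows:
        diff = round(babip - xbabip, 3)
        if abs(diff) <= threshold:
            result.append((name, diff))
    return result


for name, diff in supported(rows):
    print(f"{name:<18}{diff:+.3f}")
```

Because this subtracts the already-rounded three-decimal values, a computed difference can land .001 away from the Dif column, which was presumably calculated from unrounded numbers.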

Alright, so there are a lot of names here. Some of them aren’t a surprise: guys like Miguel Cabrera, Manny Machado, Nolan Arenado and Jose Altuve are all justifying their BABIPs with some delicious, pure-hitting Statcast data. But there are also some players justifying their BABIPs with negative consequences.

Most notably (and this was EASY to see if you even looked at his batted ball data), Troy Tulowitzki has been a dumpster-patch kid this year in the majors. His .197 BABIP, supported by his xBABIP of .202, is easily the worst in baseball. When you are only hitting line drives 8.2% of the time, you deserve to be the worst in baseball. Until that changes, don’t expect much difference in batting average unless a string of good luck arises.

Also, Marcus Semien has had some fantasy relevance this year by hitting nine home runs while playing shortstop, a shallow position. But much like Tulowitzki, Semien hits line drives at only a 9% clip, which fully supports his .206 BABIP.

One name I was surprised to see on this list was Anthony Rizzo. Rizzo has been putting up some great numbers so far, so you don’t even think to look at his metrics to analyze why his BABIP is so low. But his .244 BABIP being supported by a .241 xBABIP was shocking until you realize his line drive percentage has dropped by over 7% from the previous season. Fear not, though.

All of his Batted Ball Distribution and Batted Ball Quality metrics are stable (if not even better). As his FB% falls, his LD% will increase and thus his home run pace will fall, and his BABIP will rise.

Beyond the guys whose xBABIPs fully support their BABIPs so far this season, which players have made a huge jump from their career norms or last season to this season? That is, which players are making enough better contact for us to conclude that their jump in BABIP is real and not just luck?

One name that falls into that category is David Freese. His BABIP of .385 is fully supported by his xBABIP of .386. Contact rates do even out over time, so continue to monitor his performance. A name I really liked to see is Josh Reddick.

Reddick is one of those underrated fantasy players who puts up good stats each season but no one ever puts any respect on his name. His .360 BABIP and .364 xBABIP suggest that his .316 average this season hasn’t been a fluke. Reddick is hitting fewer fly balls and more line drives, which fully supports the increase. But it also might mean a dip in home runs for Reddick this season.

On the flip side of this coin, which players are showing BABIPs that don’t measure up to previous seasons? Well, Tulowitzki gets the crown on this one, but look at a guy like Neil Walker as well. This guy has actually been excelling this year with the long ball (remember, it doesn’t factor into BABIP), but his contact rates are showing that he deserves the .253 BABIP he is sporting at the moment. That’s what happens when you add 13% to your fly ball rate, but hey, the increase in home runs doesn’t hurt.

CONCLUSION AND PART 2 PREVIEW

So how are you going to use xBABIP? Well, it’s like I said in the intro, and it’s what Andrew Perpetua’s goal was in creating xBABIP: to analyze players and “tease apart the luck and skills aspects of balls in play.” And Perpetua will be the first to tell you that no, it’s not perfect yet, but it is something.

Let’s never forget about our favorite rebuttal, either. Small sample sizes abound when you’re only looking at fewer than 150 plate appearances. Even between the numbers he posted in the article and the numbers shown on FanGraphs today, there were substantial changes in BABIP. So make sure you’re using it appropriately and not drawing any hard-and-fast conclusions about players this early.
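To put a rough number on that sample-size warning, here is a back-of-the-envelope Python sketch. It assumes every ball in play is an independent coin flip with the same hit probability, which is a simplification of mine and not anything from Perpetua’s article, and it uses the textbook binomial standard error. The function name `babip_standard_error` is made up for illustration.

```python
import math


def babip_standard_error(p, n):
    """Binomial standard error of a rate p observed over n balls in play.

    Assumes independent balls in play with a constant hit probability,
    which real hitters only approximate.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)


# League-average BABIP (.300) over an assumed 150 balls in play.
se = babip_standard_error(0.300, 150)
print(f"one standard error: +/- {se:.3f}")
```

Under those assumptions, one standard error on a .300 BABIP over 150 balls in play works out to roughly .037, so a swing of 30 or 40 points this early can be nothing but noise.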

Part 2 will be focusing on batters that have xBABIPs that are outpacing their BABIPs. I’m sure we will see a lot of slow-footed sluggers and left-handed batters who are shifted on a lot. And I’m sure in there, we will see a lot of batters who are getting really unlucky this season. Stay tuned.