Most of us love digging into players’ underlying stats, trying to distinguish skill improvements from luck to gain an edge in our leagues. What stats do you look at? Are those the stats you should be looking at?
I recently became skeptical about how useful a hitter’s swinging-strike% is for understanding their K%. So I made good use of a pandemic Saturday night and calculated the correlation between hitters’ plate discipline stats and their K% and BB%.
No doubt I’m not the first to do this, nor will I be the last. But I didn’t know what underlying stats correlate best, and presumably some of you could be interested as well.
TL, gimme results:
I looked at the r^2 correlation between each of K% and BB% and these plate discipline stats: Contact%, SwStr%, CSW%, Swing%, O-Swing%, O-Contact%, Z-Swing%, Z-Contact%.
K% correlates best with Contact%, with an r^2 of 0.74
K_estimate = -0.917 x Contact% + 0.900
Maybe there’s a reason the fantasy legends writing the BaseballHQ Forecaster use Contact% in their player profiles and analysis.
BB% correlates best with O-Swing%, with an r^2 of 0.55
BB_estimate = -0.388 x O-Swing% + 0.208
Interesting enough, gimme more details:
The data used are all hitter-seasons from 2017 to 2021 in which the hitter had at least 100 plate appearances. That comes to 1672 hitter-seasons in total.
The correlation was calculated between the hitters’ K%, BB%, and each of the underlying plate discipline metrics for the same season. That means these are not meant to be predictive, but the underlying metrics may be stickier year to year than K% and BB%.
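For anyone who wants to reproduce this kind of calculation, here is a minimal sketch in plain Python. It is not the code used for the results in this post: the rows are made-up placeholders rather than real hitter-seasons, and the names `r_squared`, `fit_line`, and `MIN_PA` are purely illustrative. It applies the 100 plate appearance cutoff, then computes the squared Pearson correlation and the least-squares line for one metric against K%.

```python
# Made-up placeholder rows, NOT real hitter-seasons. Rates are fractions.
rows = [
    {"pa": 600, "contact": 0.62, "k": 0.34},
    {"pa": 550, "contact": 0.85, "k": 0.14},
    {"pa": 400, "contact": 0.75, "k": 0.22},
    {"pa": 120, "contact": 0.80, "k": 0.18},
    {"pa": 40, "contact": 0.50, "k": 0.10},  # below the cutoff, dropped
]

MIN_PA = 100  # minimum plate appearances for a hitter-season to count


def _sums(xs, ys):
    """Return mean_x, mean_y, and the centered sums Sxy, Sxx, Syy."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy ** 2 / (sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


qualified = [r for r in rows if r["pa"] >= MIN_PA]
contact = [r["contact"] for r in qualified]
k_rate = [r["k"] for r in qualified]

r2 = r_squared(contact, k_rate)
slope, intercept = fit_line(contact, k_rate)
print(f"n={len(qualified)}  r^2={r2:.2f}  "
      f"K_estimate = {slope:.3f} x Contact% + {intercept:.3f}")
```

Swapping in a different metric column, or BB% in place of K%, repeats the same calculation for each pair of stats.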
More details can be seen in the full tabulated results below.
R^2 Correlations of Hitter Discipline and K and BB rates
Metric | K% | BB% |
---|---|---|
Contact% | 0.74 | 0.05 |
SwStr% | 0.55 | 0.04 |
CSW% | 0.6 | 0.01 |
Swing% | 0 | 0.49 |
O-Swing% | 0 | 0.55 |
O-Contact% | 0.63 | 0.024 |
Z-Swing% | 0.002 | 0.11 |
Z-Contact% | 0.62 | 0.02 |
A couple of things to point out. Contact% is clearly the best correlated with K%, outperforming both Z-Contact% and O-Contact%. My previous go-to, SwStr%, isn’t horrible, but I’ve updated my custom FanGraphs dashboard to leave it out.
It’s interesting to see that Swing% has zero correlation with K%. Intuitively that makes sense, since aggressive swingers come in both the Javier Baez type (2021: 57% Swing%, 62% Contact%, 34% K%, 13% Barrel%) and the Jeff McNeil type (2021: 56% Swing%, 85% Contact%, 14% K%, 4% Barrel%).
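As a quick sanity check, the Contact% fit from the summary can be applied to those two 2021 lines. The sketch below assumes the fit is written with a negative slope (more contact, fewer strikeouts) and that rates are expressed as fractions; `k_estimate` is just an illustrative name.

```python
def k_estimate(contact_rate):
    """Estimated K% (as a fraction) from Contact% (as a fraction).

    Assumes the negative-slope form of the fit: more contact, fewer strikeouts.
    """
    return -0.917 * contact_rate + 0.900


# 2021 figures quoted above: (Swing%, Contact%, actual K%)
hitters = {
    "Javier Baez": (0.57, 0.62, 0.34),
    "Jeff McNeil": (0.56, 0.85, 0.14),
}

for name, (swing, contact, actual_k) in hitters.items():
    est = k_estimate(contact)
    print(f"{name}: Swing% {swing:.0%}, "
          f"estimated K% {est:.1%}, actual K% {actual_k:.0%}")
```

Both estimates land within about two percentage points of the actual K%, even though the two hitters swing at nearly the same rate.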
Roughly speaking, BB% correlates with the opposite set of underlying stats from K%: the swing-rate metrics rather than the contact metrics. The correlations aren’t as strong as the one between Contact% and K%, but we see that the more a player swings, the less likely they are to walk, particularly when they swing at pitches outside the zone.
Now, what if we combine different underlying metrics to make a multivariate predictive stat?!
I’ve got some interesting and probably-too-complicated ideas to pursue, but those will have to wait for another Saturday night. Or a month off of work.