When someone says to you, “This team consistently scores goals,” what do you think? What characteristics do you associate with these teams? Do they rarely get shut out? Do they win more games? If goals behave as if random, is consistency a skill or just luck? The truth is that there hasn’t been much analysis of consistency, momentum, and other somewhat measurable qualities that tend to get thrown into the intangibles pile.
Shot quality seems to be an endless debate that brings out some of the worst and best analysis. Some people, like DTM about Heart and Manny Perry, have contributed expected goals models and all-encompassing player models. Ryan Stimson has contributed valuable tactical and micro statistic analysis to the shot quality debate. However, many have resorted to binning shots and players, which often leads to dangerous and bad analysis. Any time you hear “inner-slot shot for%” or “scoring chance for%” used in place of corsi, fenwick, or expected goals, I suggest you ignore the analysis at hand.
Binning, when used correctly, can still be valuable. For example, we bin shots by type in expected goals. Some forms of machine learning, notably Markov chains, rely on binning sequences to predict outcomes. The fact is, there is no use in throwing out shots, shrinking your sample size, and pretending non-scoring chances can’t be goals.
Throughout this post, I will be using Perry’s expected goals and scoring chances from 2008-2016, excluding the shortened season. I will try to set a bar for future analysis of consistency and its value in hockey, along with its relationship to shot quality. This is directly related to the percentage of shots that are scoring chances. The first thing I must address is the repeatability of scoring chance percentage (scoring chances for / unblocked shot attempts for). Eric Tulsky wrote a while back that it was not a skill. The offensive repeatability, on the player level, using Perry’s scoring chance model is .18. That figure does not adjust for changes in team or usage, which most likely have an effect. This is enough of a correlation to go on in our analysis, since we will only be addressing team-level offense today.
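As a rough illustration of the definition above, here is a minimal sketch of how scoring chance percentage and its repeatability could be computed. It assumes repeatability means a Pearson correlation between the same players' percentages in consecutive seasons, which the post does not spell out. The function names and the sample numbers are hypothetical and are not taken from Perry's data.

```python
from math import sqrt


def scoring_chance_pct(scoring_chances_for, unblocked_attempts_for):
    """Share of unblocked shot attempts that are scoring chances."""
    if unblocked_attempts_for == 0:
        raise ValueError("no unblocked shot attempts")
    return scoring_chances_for / unblocked_attempts_for


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (scoring chances, unblocked attempts) for the same four
# players in two consecutive seasons. Made-up numbers for illustration.
season_1 = [(40, 210), (55, 250), (22, 140), (61, 300)]
season_2 = [(38, 200), (47, 260), (30, 150), (58, 280)]

pct_1 = [scoring_chance_pct(sc, ff) for sc, ff in season_1]
pct_2 = [scoring_chance_pct(sc, ff) for sc, ff in season_2]

# Assumed definition of repeatability: season-over-season correlation.
repeatability = pearson_r(pct_1, pct_2)
print(round(repeatability, 2))
```

With real data, each list would hold one entry per player who appears in both seasons.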
Kerry Whisnant, a physics professor at Iowa State, did some research into consistency’s value in baseball. He found that a team is more likely to win games if it is more consistent. The stat that drives that consistency is SLG%. A team with a slugging percentage .080 higher than that of a team with the same runs scored per game throughout the season will win one more game. That might not sound like a big deal, but wins cost millions of dollars in Major League Baseball as well as in the National Hockey League.
When I saw this, I was instantly interested in how it could apply to hockey. I felt that scoring chance percentage could easily play the role of slugging percentage. Slugging percentage, for those who don’t know, is total bases divided by at-bats, so it rises with the share of hits that go for extra bases. When there is a runner on second base, there is a higher likelihood that he will score than if he were on first. I first took a look at how scoring chance percentage affected game-to-game variance. Because goals behave as if random, I used expected goals instead. I was surprised to find that the average variation gets higher as scoring chance percentage increases. This means that teams that take a higher percentage of high-quality shots to get to their expected goal mark will be less consistent.
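The comparison described above can be sketched roughly as follows. This assumes "variance" is measured as the sample standard deviation of a team's per-game expected goals, and that the relationship is summarized with a squared Pearson correlation. Both are assumptions about the method, and the teams and numbers are invented for illustration.

```python
from statistics import mean, stdev


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


# Hypothetical team-seasons: scoring chance percentage plus per-game xG.
teams = {
    "Team A": {"sc_pct": 0.17, "game_xg": [2.4, 2.6, 2.5, 2.3, 2.7]},
    "Team B": {"sc_pct": 0.20, "game_xg": [1.9, 3.1, 2.2, 2.9, 2.4]},
    "Team C": {"sc_pct": 0.23, "game_xg": [1.5, 3.6, 2.0, 3.3, 2.1]},
    "Team D": {"sc_pct": 0.19, "game_xg": [2.8, 2.1, 2.5, 2.2, 2.9]},
}

sc_pcts = [team["sc_pct"] for team in teams.values()]
# Game-to-game spread of expected goals, one value per team.
xg_spread = [stdev(team["game_xg"]) for team in teams.values()]

fit = r_squared(sc_pcts, xg_spread)
print(round(fit, 2))
```

A real version would use one row per team-season and a full schedule of games rather than five.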
The .33 r^2 is a large enough correlation that it must be taken seriously. Two issues that might arise from relying on more high-quality shots are the difficulty of getting them and the skill required. There is no question that these scoring areas are hard to get to. Take a look at the graph below. We would all like to take our shots from that yellow area, but defenses can build systems that keep you from reaching that ice frequently. The amount of skill you put on the ice will also vary throughout the season due to injuries and usage changes.
Now we need to look at how variance might affect goals scored. As you can see from the chart below, teams with lower variance are more likely to outperform their expected goals (dGF60 = GF60 - xGF60). The correlation of .18 is not the strongest in the world, but again, it is a correlation that cannot be ignored. Predicting goals is hard enough, let alone predicting who will outperform your model.
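For readers who want the delta stat spelled out, here is a small sketch of dGF60 as defined above. It assumes the usual per-60 convention (count times 60 divided by minutes played). The function names and sample inputs are hypothetical.

```python
def per_60(count, minutes):
    """Convert a raw count into a rate per 60 minutes of ice time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return count * 60.0 / minutes


def delta_gf60(goals_for, expected_goals_for, minutes):
    """dGF60 = GF60 - xGF60; positive means the team beat its model."""
    return per_60(goals_for, minutes) - per_60(expected_goals_for, minutes)


# Hypothetical team-season: 180 goals on 165.5 expected goals in 4000 minutes.
print(round(delta_gf60(180, 165.5, 4000), 4))
```

A positive value marks a team that outperformed its expected goals, and a negative value marks a team that underperformed.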
When looking directly from scoring chances to delta goals scored, that correlation sinks, as expected, to .13. That is still a fairly decent correlation, however. On top of that, the average scoring chance percentage for a team that outperforms the expected goals model is 19%. The average for a team underperforming the model is 20%.
These are, of course, all in-sample correlations. Predicting future scoring chance percentage would be very hard, considering that coaches have a large effect on the stat. A reasonable use for this knowledge would be evaluating whether or not you expect your team to regress. Upgrading from shooting percentage to delta shooting percentage is a great start, but if we apply our knowledge of consistency as well, we can see the picture more clearly.
Above is this year’s delta fenwick shooting percentage (actual - expected), our expected delta fenwick shooting percentage given the knowledge we have of tactics and their effect on consistency, and our adjusted delta fenwick shooting percentage (dFSh% - xdFSh%) for all teams. With scoring down, the average team will undershoot its expected shooting percentage (average dFSh% is -0.3), but some teams are a long way off their target. A good window in which to expect almost all teams is within .5 of their adjusted dFSh%. Boston, Buffalo, Carolina, Dallas, and Florida should all regress up towards the mean for the rest of the season, while Washington, Columbus, both New York teams, and Minnesota should trend downwards.
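The adjustment above can be sketched as follows. It treats xdFSh% as an input, since the model behind it is not given here, and it reads the .5 window as an adjusted value between -0.5 and +0.5 percentage points, which is an assumption. The team labels and numbers are hypothetical.

```python
def adjusted_dfsh(actual_fsh, expected_fsh, expected_dfsh):
    """Adjusted dFSh% = dFSh% - xdFSh%, where dFSh% = actual - expected."""
    dfsh = actual_fsh - expected_fsh
    return dfsh - expected_dfsh


def regression_call(adjusted, window=0.5):
    """Label which way a team is expected to move, if at all."""
    if adjusted < -window:
        return "up"      # shooting well below its adjusted target
    if adjusted > window:
        return "down"    # shooting well above its adjusted target
    return "hold"        # inside the assumed +/- window


# Hypothetical teams: (actual FSh%, expected FSh%, xdFSh%), in percent.
teams = {
    "Team A": (5.1, 6.0, -0.2),
    "Team B": (7.2, 6.1, -0.3),
    "Team C": (5.8, 6.0, -0.3),
}

calls = {
    name: regression_call(adjusted_dfsh(*values))
    for name, values in teams.items()
}
print(calls)
```

Widening or narrowing the `window` argument changes how many teams get flagged, so the .5 threshold is a judgment call rather than a tested cutoff.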
There seems to be a large difference between hockey and baseball. Baseball is full of one-on-one battles, and the defense’s ability to prevent scoring chances is a lot weaker in baseball than in hockey. There is probably a line at which this reverses, but the NHL is nowhere near it. The highest scoring chance percentage since the start of the 08-09 season belonged to the New York Rangers, at an insane 30%. That team had the second-largest underperformance of the expected goals model in that time span. Given the information at hand, I wonder if shot quality is even a good thing. How much more does quantity matter?
This is a rather new topic on the hockey scene, and I hope to see more research into it. From a micro statistic lens, understanding how teams reach the home plate area, and the most successful ways to do it, would be a good start. It would also be interesting to add in passing and zone entry data. From a macro statistic lens, using Markov chains to understand consistency’s effect on goals and wins would be interesting. In the end, this is just another argument in the shot quality debate.