I debuted the Annual Rate Charts (ARCs) on Twitter over a month ago, and since I’m unleashing the Tableau dashboards to the public, I wanted to ensure proper documentation was available.
You can find the link to the interactive Tableau Charts HERE!
What are Annual Rate Charts?
Annual Rate Charts (ARCs) are a way to measure the production and expected production of a player relative to their time on the ice. Production is quantified using the Goals Above Replacement (GAR) and Expected Goals Above Replacement (xGAR) models from Evolving-Hockey. If you want an in-depth analysis of how GAR is calculated, documentation is available here. In short, GAR is a single number that captures the contribution of a player across different game situations. GAR is subdivided into several categories, including even strength offense and defense, power play, penalty kill, takeaways and faceoffs. The number for each player represents how many more (positive) or fewer (negative) goals that player contributes relative to a replacement-level player at the same position.
Expected GAR (xGAR) is also a single number assigned to each player, calculated from the player’s on-ice performance (including rates, quality, shooting and goaltending). It is the number of goals above or below replacement level that the player would be expected to contribute, based on their on-ice actions. In other words, xGAR represents the performance of a player, while GAR represents the results of that performance. When I tested the relationship between these values, I found that xGAR captures approximately 86% of the variation in GAR, which is quite substantial for a model involving human subjects.
Annual Rate Charts (ARCs) differ from many other publicly available GAR and xGAR visualizations because the values are expressed relative to the amount of time a player is on the ice, whereas most popular visualizations display GAR data in aggregate form. There are strengths and weaknesses to both rate and aggregate data, and both are useful in their own way.
Strengths and Weaknesses
Aggregate data displays the total contribution of a player. This is useful in evaluating and comparing the total impact that a player has relative to the total impact of his peers. However, aggregate data is limited in communicating player value relative to time on ice, because player value is inflated with more minutes. This weakness of aggregate data is the primary strength of rate data. Rather than looking at total contribution, rate data is normalized by looking at a player’s contribution per 60 minutes of ice time. The weakness of rate data is the opposite of that of aggregate data: players with smaller samples can post inflated rates compared to players with larger samples.
This weakness can be partially mitigated by removing players below a minimum threshold, but it still favors players in limited roles. This is something I address in my treatment of rate data by categorizing players by role, which includes top-6/middle-6/bottom-6 for forwards and top-4/bottom-4 for defensemen.¹ Long story short, defensemen are split at the median, and forwards at the 33rd and 66th percentiles, of even strength time on ice per game for their position on their team in that particular season.
For instance, the GAR/60 leader among forwards was Kailer Yamamoto at 1.1 GAR/60. His aggregate GAR was only 8, which is half of McDavid and Draisaitl’s. At face value, it looks like Yamamoto only contributes half as many Goals Above Replacement compared to McDavid, which is true over the course of the entire season. What aggregate data doesn’t take into account is the fact that Yamamoto only played 430 minutes at even strength compared to the 1224 and 1105 minutes of Draisaitl and McDavid, respectively. Am I suggesting that Yamamoto is better than Draisaitl and McDavid? Absolutely not. What I am suggesting is that rates do a better job than aggregate data at illustrating the relative value of a player, especially those that are in limited roles.
How is it calculated?
ARCs use the xGAR and GAR data provided by Evolving-Hockey. The data is provided in aggregate form and includes several categories: GAR, Even Strength Defense (EVD), Even Strength Offense (EVO), Shorthanded GAR (PK), Powerplay GAR (PP), Takeaway GAR and Draw GAR. For the purposes of this project, the EVD, EVO, PK, PP and overall GAR values were used. These values are available for the 2007-2020 seasons (n=11066 skaters). The first thing to do was generate a per 60 minute rate for each player, which was calculated by dividing the value by its respective TOI (in minutes) and multiplying by 60, for example:
EVO/60 = (EVO/EV_TOI)*60
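As a quick sanity check on the rate calculation, here is a minimal Python sketch. It assumes TOI is recorded in minutes, so a per-60 rate is the aggregate value divided by minutes played, then multiplied by 60. The function name and the sample numbers are hypothetical and are not taken from the Evolving-Hockey data.

```python
def per_60(value: float, toi_minutes: float) -> float:
    """Convert an aggregate value into a per-60-minute rate (TOI in minutes)."""
    if toi_minutes <= 0:
        raise ValueError("toi_minutes must be positive")
    return value / toi_minutes * 60


# Hypothetical skater: 5.0 EVO accumulated over 600 even strength minutes.
evo_per_60 = per_60(5.0, 600.0)
print(round(evo_per_60, 3))
```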
This was done for all 5 categories (as well as the 5 expected categories) to generate 10 different GAR rates per 60. I could have just stopped with the rate charts and published the results, but I wanted to display a more intuitive number based on the rates for that specific season and position.¹ After organizing the GAR rate data per year and by position (D or F), the range of values was determined for that given year. The next step was to use the relative range as the basis for the 0-100 scale. This was calculated by using the following formula:
EVO_RATING=(Player EVO/60+ABS(Lowest EVO/60))/(Highest EVO/60+ABS(Lowest EVO/60))*100
For instance, Dmitry Orlov had an EVO rate of 0.03584 per 60 minutes, while the lowest rate was -1.12465 and the highest was 1.194. If we plug these values into the Excel formula:
EVO_RATING=(0.03584+1.12465)/(1.194+1.12465)*100
=(1.16049)/(2.31865)*100
=0.500502*100
EVO_RATING=50.05
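The same arithmetic can be sketched in Python. This is only an illustration of the calculation, not the spreadsheet itself, and the function name is my own. Subtracting the lowest rate is equivalent to adding its absolute value whenever the lowest rate is negative, which is the case here.

```python
def percent_relative_range(rate: float, lowest: float, highest: float) -> float:
    """Place `rate` on a 0-100 scale between the lowest and highest rate."""
    if highest == lowest:
        raise ValueError("highest and lowest must differ")
    return (rate - lowest) / (highest - lowest) * 100


# Orlov's EVO/60 against the lowest and highest rates quoted above.
orlov = percent_relative_range(0.03584, -1.12465, 1.194)
print(round(orlov, 2))  # 50.05
```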
This value is the percent relative range value for that year and position. It ranges from 0 to 100, with 100 being the highest rate and 0 being the lowest. I chose this method over percentiles or rankings because those can be extremely misleading in their conclusions.
Percent Relative Range vs. Percentiles
How is this method superior to simple rank ordering or percentile rankings of players? Take the following data as an example:
Player A – 10
Player B – 10
Player C – 9
Player D – 9
Player E – 8
Player F – 7
Player G – 5
Player H – 5
Player I – 1
Player J – 0
Range = 10
Mean = 6.4
Median = 7.5
Now ask yourself the following questions:
- Is the distance between the 1st (10) and 3rd player (9) the same as the 8th (5) and 10th (0)?
- What if I told you that Player F was in the 40th percentile, meaning that 40% of all players did worse? Is that an accurate depiction of their contribution?
- Is it fair to call Player H the 3rd-worst player?
As you can see from the example above, data that is not normally distributed (and GAR/60 is not) can be problematic for these methods and can be used to produce misleading conclusions. Using percent relative range would give Player F a grade of 70, which is a far better representation of their contribution relative to their peers. Rather than comparing the rank or position of a player relative to other players, percent relative range compares the production of a player relative to other players’ production.
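To make the comparison concrete, here is a small Python sketch that scores the ten toy players both ways. It assumes one common definition of percentile, the share of players with a strictly lower score. Other definitions exist and would shift the numbers slightly. The function names are my own.

```python
# Toy scores from the example above.
scores = {"A": 10, "B": 10, "C": 9, "D": 9, "E": 8,
          "F": 7, "G": 5, "H": 5, "I": 1, "J": 0}

low, high = min(scores.values()), max(scores.values())


def percentile_below(value: float) -> float:
    """Share of players (in percent) with a strictly lower score."""
    return 100 * sum(v < value for v in scores.values()) / len(scores)


def percent_relative_range(value: float) -> float:
    """Position of `value` between the lowest and highest score, on 0-100."""
    return 100 * (value - low) / (high - low)


for name, value in scores.items():
    print(name, value, percentile_below(value), percent_relative_range(value))
```

Under these assumptions Player F comes out at the 40th percentile but 70 on the relative range, and Player H comes out at the 20th percentile but 50 on the relative range.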
Going back to the Orlov example, there are 241 defensemen in the database (for 2019-20) who meet the 200-minute minimum. Orlov ranks 126th out of 241, which puts him in the 48th percentile, meaning that 48 percent of all defensemen had a lower EVO GAR than he did. Using percent relative range would give him a grade of 50, which is slightly above the median of the total range of values (0.0525). This means that while some may conclude that he was a bottom-half defender in the league relative to other players, I would contend he was better than average (barely), relative to other players’ ratings.
This is a very trivial example, but look at the top-5 in EVO/60 this year: Tennyson, Ellis, Dumoulin, Scandella, and Hague. It may sound impressive to say that X player ranked in the top-5, but there was a significant drop from the 0.99 EVO/60 for Ellis (who ranked 2nd) to 0.69 EVO/60 for Scandella (who ranked 4th). Both players would still rank in the 90th percentile, but Ellis performed at a much higher rate than Scandella and that fact would be lost when lumping all players based on percentile. They are both amazing players offensively, but a simple ranking or percentile does not adequately capture the nuance of their relative production. Using my calculation, Ellis would receive a score of 91 and Scandella would receive a score of 78, which I believe to be a better representation of their offensive contribution at even strength.
Charting and Conclusions
Once armed with the percent relative range data for each player in their 5 GAR and 5 xGAR categories, I used Tableau to produce the dashboards you can find here. I also adjusted the values to range from 1 to 99, so as to provide a more aesthetically pleasing form (a la EA Sports NHL ratings).
I hope that you find these charts useful in identifying players with upside that would be otherwise unrecognized by traditional aggregate charts. I also hope you are equally satisfied with a grading scale based on the range of values, as opposed to alternative methods. If you want to create your own charts and graphs using this data, subscribe to Evolving-Hockey on Patreon.
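The exact adjustment from 0-100 to 1-99 is not spelled out here, so the following Python sketch shows just one plausible way to do it: a linear rescale that maps 0 to 1 and 100 to 99. The function name is hypothetical. Simply clamping the values to the 1-99 interval would be another reasonable option.

```python
def rescale_1_to_99(rating: float) -> float:
    """Linearly map a 0-100 rating onto the 1-99 interval (assumed method)."""
    if not 0 <= rating <= 100:
        raise ValueError("rating must be between 0 and 100")
    return 1 + 98 * rating / 100


for rating in (0, 50, 100):
    print(rating, rescale_1_to_99(rating))
```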
Footnotes:
1. The process for assigning roles was probably the most time-intensive part of the visualization. It would be simple to pick an arbitrary number of minutes per game (e.g. 15.5) and classify defensemen who played more than 15.5 as top-4 and those who played less as bottom-4. The problem is that each team played a different number of even strength minutes per game, which means that the cutoff should be different for each team. To account for this, I used the following calculation in Tableau:
IF [POS]="D" THEN
    IF ([Toi Ev]/[GP]>[Median D]) THEN "Top-4"
    ELSE "Bottom-4" END
ELSEIF [POS]="F" THEN
    IF ([Toi Ev]/[GP])>[Median F top-6] THEN "Top-6"
    ELSEIF ([Toi Ev]/[GP])>[Median F Bottom-6] AND ([Toi Ev]/[GP])<=[Median F top-6] THEN "Middle-6"
    ELSE "Bottom-6" END
ELSEIF [POS]="G" THEN ""
END
with Median D defined as:
{ FIXED [Team],[Season],[POS]: MEDIAN([Time per Game])}
with Median F top-6 defined as:
{ FIXED [Team],[Season],[POS]: PERCENTILE([Time per Game],0.66)}
with Median F Bottom-6 defined as:
{ FIXED [Team],[Season],[POS]: PERCENTILE([Time per Game],0.33)}
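For readers who do not use Tableau, the same logic can be sketched in Python for a single team, season and position group. This is an approximation, not a port: the ice times below are made up, the function names are my own, and Python's "inclusive" quantile method is assumed to be close to Tableau's PERCENTILE, whose interpolation may differ.

```python
from statistics import median, quantiles


def role_cutoffs(toi_per_game: list[float], pos: str) -> tuple[float, ...]:
    """Cutoffs for one team, season and position group."""
    if pos == "D":
        return (median(toi_per_game),)
    # 99 percentile cut points: index 32 is the 33rd, index 65 the 66th.
    pct = quantiles(toi_per_game, n=100, method="inclusive")
    return (pct[32], pct[65])


def assign_role(pos: str, toi: float, cutoffs: tuple[float, ...]) -> str:
    """Mirror the nested IF/ELSEIF structure of the Tableau calculation."""
    if pos == "D":
        return "Top-4" if toi > cutoffs[0] else "Bottom-4"
    if pos == "F":
        low, high = cutoffs
        if toi > high:
            return "Top-6"
        if toi > low:
            return "Middle-6"
        return "Bottom-6"
    return ""  # goalies get no role


# Hypothetical even strength minutes per game for one team.
defense = [19.1, 18.4, 17.0, 16.2, 15.5, 14.8, 13.9, 12.7]
forwards = [16.0, 15.0, 14.0, 13.0, 12.0, 11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0]

d_roles = [assign_role("D", t, role_cutoffs(defense, "D")) for t in defense]
f_roles = [assign_role("F", t, role_cutoffs(forwards, "F")) for t in forwards]
print(d_roles)
print(f_roles)
```

With this made-up roster, the eight defensemen split four and four, and the twelve forwards split into three groups of four.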