Interviewing to be a Hockey Analyst with the Leafs: Prepared Questions

Ruben Flam-Shepherd
11 min read · Oct 22, 2018

This is the third of a four-part blog series detailing my experience applying/interviewing to be a Hockey Analyst with the Toronto Maple Leafs. You can find the first part here and the second part here. In this part I go through the questions I was asked to answer in preparation for the in-person interview. I hope this series sheds some light on the process of how interviewing for an analytics position with an NHL team works (or at least how one such process worked).

After the phone interview and the subsequent invitation to an in-person interview, I received an email outlining three questions that the Leafs’ hiring team asked me to prepare answers for ahead of time.

The answers that follow are a result of the work I did during the two-and-a-half days between receiving the above email and my scheduled interview.

Question 1. What is the closest of these cities to the geographic centre of all of the birthplaces of the Team Europe players from the 2016 World Cup of Hockey: Salzburg, Nuremberg, Zurich, or Prague? Show your work.

To answer this I scraped player names from the 2016 World Cup of Hockey team rosters listed on nhl.com using the Python Selenium package. As a by-product of having written several hockey scrapers, I had already stored player birthplaces in a local database. I converted these birthplaces and the candidate ‘closest cities’ (as provided in the email) to latitude and longitude coordinates using the geocoder module and the Google Maps API.

To get a rough approximation of the geographic center I averaged the latitude and longitude coordinates of all player birthplaces. However, these coordinates describe points on the curved surface of the Earth rather than on a flat plane, so calculating the true geographic center takes a bit more work. To that end I used the nvector module to average the three-dimensional vectors represented by the latitude and longitude pairs; the average of these vectors gives the true geographic center. I then used the geopy module to calculate the distance between each candidate ‘closest city’ and the geographic center, identifying the city that was the shortest distance away. Finally, everything was visualized using plotly.
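
The vector-averaging idea can be sketched with the standard library alone. This is a minimal illustration, not the code from the interview answer: it assumes a spherical Earth and uses the haversine formula in place of nvector and geopy, the function names are hypothetical, and the candidate city coordinates are approximate values added for illustration.

```python
import math

# Approximate (latitude, longitude) of the candidate cities, in degrees.
# These values are illustrative assumptions, not taken from the article.
CANDIDATES = {
    "Salzburg": (47.81, 13.04),
    "Nuremberg": (49.45, 11.08),
    "Zurich": (47.38, 8.54),
    "Prague": (50.08, 14.44),
}


def to_unit_vector(lat_deg, lon_deg):
    """Convert a latitude/longitude pair to a 3-D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def geographic_center(points):
    """Average the unit vectors of all points and map the mean back to lat/lon."""
    vectors = [to_unit_vector(lat, lon) for lat, lon in points]
    n = len(vectors)
    x = sum(v[0] for v in vectors) / n
    y = sum(v[1] for v in vectors) / n
    z = sum(v[2] for v in vectors) / n
    lat = math.atan2(z, math.hypot(x, y))
    lon = math.atan2(y, x)
    return math.degrees(lat), math.degrees(lon)


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) pairs on a spherical Earth."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(h))


def closest_city(center, cities):
    """Return the name of the city nearest to the given center."""
    return min(cities, key=lambda name: haversine_km(center, cities[name]))
```

Calling `closest_city(geographic_center(birthplaces), CANDIDATES)` with a list of geocoded birthplaces would reproduce the last two steps described above. Because geopy's default distance is computed on an ellipsoid, its results can differ slightly from this spherical sketch.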

Components of solution. Blue boxes represent the Python module used for a given step.

The city closest to the geographic center of the Team Europe players’ birthplaces ended up being Nuremberg, Germany:

Birthplaces of Team Europe players (blue), the geographic center (black) and comparison cities (green). The closest comparison city is Nuremberg, Germany (red).

As I had not hard-coded my solution to be specific to Team Europe, I was able to easily examine equivalent solutions for Team North America, Team Canada, and Team USA, which I also included:

Birthplaces of Team North America players (blue), the geographic center (black) and comparison cities (green). The closest comparison city is Kansas City, USA (red).
Birthplaces of Team Canada players (blue), the geographic center (black) and comparison cities (green). The closest comparison city is Montreal, Canada (red).
Birthplaces of Team USA players (blue), the geographic center (black) and comparison cities (green). The closest comparison city is Hamilton, Canada (red).

Question 2. Auston Matthews is eligible to sign a new contract on July 1, 2018. What is the largest contract that you would offer him (give both the number of years and AAV) and why?

I structured my answer to this question much like an academic journal article:

Abstract

In order to best assess the largest contract that should be offered to Auston Matthews, aggregate player statistics from the 2001–2002 NHL season onwards were collected. To control for era-effects (i.e., differential scoring rates in different years), all players in a single season were ranked within the statistical categories of interest (goals, goals/60 min, minutes on ice/season, points, and points/60 min). Using these relative ranks, rookie seasons were compared and a cohort of players with similar rookie statistics to Matthews was identified. The first non-entry-level contracts of these rookies were then used to create a regression model to identify the largest contract (term & AAV) that should be offered to Matthews.

Methodology

Player season statistics were scraped from nhl.com/stats. Seasons with available statistics were identified from a drop-down menu on the web page, which was used to drive subsequent scraping iterations. Pages displaying individual player information (height, weight, etc.) were also scraped from nhl.com. These pages were accessed by suffixing a standard URL with the player’s unique identifying number (for example, this is Auston Matthews’ page). Player season and individual page information were stored in a database once scraped.

To correct for era-effects (i.e., differential scoring rates in different years), players (rookie and non-rookie) were ranked within individual statistical categories for each season. For example, in the 2016–2017 season Auston Matthews tied for second in goals scored, for which he would have been assigned a rank of 2.5. Rookie seasons were then collected by identifying a player’s first season in the NHL (note: in my rush to answer this I misidentified rookie seasons, which should’ve been defined as a player’s first 20-or-more game season). Only rookie seasons occurring on or after the 2001–2002 season were used, as this was the first cohort of Restricted Free Agents (RFAs) coming off entry-level contracts to be signed after the implementation of the salary cap in 2005–2006.
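
The tie handling in that example (two players sharing second place each receive 2.5) is an average rank. Below is a minimal pure-Python sketch of the idea; the function name and the goal totals are hypothetical placeholders, and the original analysis used pandas, where `Series.rank(method="average", ascending=False)` gives the same kind of result.

```python
from collections import defaultdict


def average_ranks(totals):
    """Rank players from highest to lowest total.

    Tied players share the mean of the positions they jointly occupy,
    so two players tied for 2nd and 3rd place each get a rank of 2.5.
    """
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    positions = defaultdict(list)
    for position, (_, value) in enumerate(ordered, start=1):
        positions[value].append(position)
    return {
        player: sum(positions[value]) / len(positions[value])
        for player, value in totals.items()
    }


# Hypothetical single-season goal totals, for illustration only.
season_goals = {"Player A": 44, "Player B": 40, "Player C": 40, "Player D": 38}
ranks = average_ranks(season_goals)
```

Ranking each season separately in this way is what lets a rookie year from one era be compared with a rookie year from another.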

Statistical categories ranked in this way included: goals, goals/60 min (to correct for playing time differences), time on ice/season (ignoring defensemen), points, points/60 min, and shots/60 min. Due to the limited nature of the scraping (only aggregate data, i.e. player season statistics, were stored), analysis of more advanced statistics was not possible.

All of the above steps were done using various Python modules: manipulation of the web browser and scraping of statistics was done with Selenium; database access and creation was done using sqlite3; data wrangling and ranking of statistics was done using pandas.

To account for changes in the salary cap from season-to-season, the Average Annual Value (AAV) of a contract is expressed as the percentage of the salary cap (Cap Hit Percentage; CH%) in the year that the contract was signed.
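
As a worked example of this normalisation, the sketch below converts an AAV into a cap hit percentage. The function name is hypothetical, and the $75,000,000 salary cap is an assumption (the 2017–2018 upper limit) chosen because it reproduces the 16.67% quoted for Connor McDavid's $12,500,000 AAV later in this answer.

```python
def cap_hit_percentage(aav, salary_cap):
    """Express a contract's average annual value as a percentage of the salary cap."""
    if salary_cap <= 0:
        raise ValueError("salary_cap must be positive")
    return 100.0 * aav / salary_cap


# McDavid's AAV as quoted in this article; the cap figure is an assumption.
mcdavid_ch_pct = cap_hit_percentage(12_500_000, 75_000_000)
```

Dividing by the cap in the signing year puts contracts signed under very different cap ceilings on a single scale.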

Results

Table 1. Top 10 rankings of goals (relative to all players in given season) by rookie NHL players from 2001–2002 onwards.

Unfortunately (or fortunately) for the Toronto Maple Leafs, since the 2001–2002 NHL season no rookie has achieved a higher ranking in goals scored than Auston Matthews (Table 1), nor, aside from Alex Ovechkin and Patrik Laine, has any other rookie placed in the top 10. These two players present obvious comparison points, though Patrik Laine is also still on an entry-level contract. At 4th and 5th among rookies, Sidney Crosby and Artemi Panarin can also be included. However, it should be noted that these players placed significantly further down the league-wide goal scoring ranks (13th and 23.5th, respectively). In fact, only 4 rookies have finished in the top 20 in goals scored in a season in the last 17 years.

Table 2. Top 15 rankings of points·hr⁻¹ (relative to all players in given season) of rookie NHL players from 2001–2002 onwards.

A comparison of rookie point rate (pts/hr; Table 2) reveals that Matthews’ 2.87 pts/hr placed him 16th league-wide for the 2016–2017 season, 8th best in the context of rookie seasons. While Connor McDavid may not be a direct comparable (his 3.39 pts/hr ranked best of any rookie), his recently signed 8-year contract has the highest AAV ($12 500 000) and CH% (16.67) of any in the league and can represent the upper bound of any contract offered to Auston Matthews. Bracketing Matthews are Jason Spezza, Evgeni Malkin, and Mitch Marner, all reasonable comparison points given their obvious similarities in pts/hr rankings and similar draft positions (both Spezza and Malkin went 2nd overall in their draft years). Relevant to the Leafs, it should be noted that Mitch Marner is eligible for a new contract at the same time as Auston Matthews.

Higher in the rankings, previously mentioned players include Crosby, Panarin, Ovechkin, and Laine. Jake Guentzel is ignored as his low point totals (33) are reflective of the small number of games he played (40). Patrick Kane and Jeff Skinner are reasonable comparison points given their very similar point totals (72 and 63 respectively vs. Matthews’ 69) and draft positions (1st and 7th respectively vs. Matthews’ 1st). Ryan Nugent-Hopkins was not included due to weak performances in subsequent seasons (and an AAV reflecting that fact). The curious case of Michael Ryder is dealt with below.

Table 3. Top 15 rankings of shots·hr⁻¹ (relative to all players in given season) of rookie NHL players from 2001–2002 onwards.

Moving on to rankings of shot rates (shots/hr; Table 3), one can appreciate the obscene number of shots Ovechkin took (425), not only relative to other rookies, but to the entire league at the time (no one had more shots in the 2005–2006 season). However, this is slightly offset by the amount of playing time he received: his shot rate, while still markedly higher than any other rookie’s, is not 50% greater (as his number of shots was). Matthews’ 11.59 shots/hr is good for 5th in the ranking; however, many of the players with similar shot rates (namely Meier, Gallagher, Brunner, Burnett and Booth) were likely afforded much less playing time than Matthews (judging by their raw number of shots) and thus will not be considered comparable players.

Further down this ranking, Gabriel Landeskog and Nathan MacKinnon have similar shot totals (270 and 241 respectively) and draft pedigrees (2nd and 1st respectively) to Matthews’. However, their performances in subsequent seasons decreased significantly (perhaps reflective of the Colorado Avalanche’s performance as a whole). Assuming Matthews continues his torrid scoring pace, Landeskog and MacKinnon will not be suitable comparison points. Jeff Carter is ignored as, in addition to the slightly poorer draft pedigree (11th overall) and his much poorer goals rank (36th), he signed an 11-year contract after his entry-level deal, which is no longer possible under the current CBA.

Michael Ryder presents an interesting case. There are several convergences between his and Matthews’ statistics, including goal ranking (Table 1), points/hr ranking (Table 2), and shots/hr ranking (Table 3). However, three factors allow us to exclude him as a comparable player: the short term of his first contract after his entry-level deal (1 year), his low draft position (216th overall), and the fact that he never established himself as an elite offensive talent (his 63-point rookie season was the highest total he ever achieved).

Table 4. Top 19 rankings of TOI for rookie seasons from 2001–2002 onwards.

To identify players that received similar workloads to Matthews, total time on ice per rookie season was compared (TOI/season; Table 4). Indeed, many familiar names are seen. New names that could serve as comparison points include Dany Heatley, Nicklas Backstrom, Jack Eichel, Ryan Malone, Paul Stastny, and Anze Kopitar. Ryan Malone is ignored due to incomparable offensive numbers. Jack Eichel will likely sign a contract soon (certainly in the next year) and will serve as an interesting comparison point to Matthews due to their similar draft pedigrees. The remaining players will be included as their performances in subsequent seasons revealed higher-percentile offensive numbers comparable to Matthews’ rookie performance.

Conclusion

Figure 1. Contract length (years) versus salary cap hit (%) for players comparable to Auston Matthews

Using the similar players identified above, a regression model was created by plotting the contract length versus the cap hit percentage (Fig. 1). The contract used for this was the first one signed after the entry level deal for each player. Contract length was used as the current consensus in the field* is that the longer the contract, generally, the higher the cap hit % of the contract. This is because the later part of the contract encroaches upon the player’s more expensive unrestricted free agency (UFA) years. Moreover, players don’t sign longer deals unless they feel they are being adequately compensated, and teams don’t sign players to longer deals unless they are confident that the player will maintain a high level of play for the length of the contract. Supporting this notion, a (weakly) positive correlation was observed between contract length and cap hit percentage (r-squared of 0.44).
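
For readers unfamiliar with this kind of fit, here is a minimal sketch of an ordinary least-squares line with its r-squared, written in plain Python. The function name and the four data points are hypothetical and made up purely for illustration; they are not the comparable-player contracts from Fig. 1, and this is not necessarily how the original fit was produced.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x, with r-squared."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


# Made-up contract lengths (years) and cap hit percentages, for illustration only.
contract_years = [1, 3, 5, 7]
cap_hit_pcts = [5.0, 9.0, 10.0, 14.0]
slope, intercept, r_squared = fit_line(contract_years, cap_hit_pcts)
```

In this framing the intercept plays the role of a base cap hit percentage and the slope is the increase per additional contract year; a suggested cap hit for a given term is `intercept + slope * years`.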

Based on the salaries of the similar players identified, the regression suggests paying Matthews a base amount equal to 9.3% of the cap, increasing by 0.5% for each year of the contract. This would amount to a cap hit percentage of 13.2% given an 8-year contract. Assuming a growth of 4.7% in the salary cap, this would equate to a contract with an AAV of $12 564 000 over 8 years, starting in the 2019–2020 season. Obviously, this figure represents a proposed maximum; every effort should be made to lower this number during salary negotiations.

However, given the similarities between Matthews and several elite players in the league (Crosby, Malkin, Ovechkin, and McDavid), Matthews’ historic rookie season, and the premium that the NHL places on scoring goals, awarding him a contract with a cap hit percentage of 16% is not entirely unreasonable (assuming his scoring pace continues).

As an alternative, Matthews could be signed to a shorter-term ‘bridge’ contract at a lower cap hit percentage (as per Fig. 1). This is not advisable because 1) a premium will likely be paid later due to the player’s more powerful bargaining position as a UFA, and 2) the media speculation surrounding a subsequent contract may (emphasis on may) sour the player’s relationship with the team. With a longer-term contract, the cap hit percentage shrinks as the salary cap increases year-over-year (as it is generally forecast to do*). For example, Alex Ovechkin’s contract, with an initial cap hit percentage of 16.8%, is now a much more palatable 13.2% (though 13-year contracts are no longer allowed).

*Volman, Awad, and Fyffe (2014) Stat Shot: The Ultimate Guide to Hockey Analytics. ECW Press

Question 3. Take us through a software project of yours: why you created it, what it does, how it works, how you’d improve it, etc.

For this question I walked through my hockey-stats package, which I’ve mentioned in previous articles. This is the package that I used to scrape the player season data that I had on hand to answer Question 1 (above). As this got fairly technical I’ll spare you the gory details. If you’re interested in a general overview of how I structured my answer, you can find the actual PDF containing the original, unedited answers that I presented during the in-person interview here. In it I present a series of figures for this question that give you an idea of how I logically progressed through my answer.

After a frantic two-and-a-half-day sprint to get out the highest quality answers I could, on the morning of the interview I printed and bound 6 booklets with these answers and began the train ride down to the ACC…

Stay tuned for the final post in this series, in which I sit in a room with Kyle Dubas and Darryl Metcalfe for two-and-a-half hours…

Ruben Flam-Shepherd

Analyst/Software Dev. Hockey stats, tech, biology, programming, purveying verisimilitudes. Follow me on Twitter at https://twitter.com/rubenflamshep