
Hawks, Doves, & Owls: Let's play some football! - Chp. 10


by Jess Behrens

© 2005-2018 Jess Behrens, All Rights Reserved

I'm going to shift gears here for a second to take what I've learned and test it. In my experience doing research, one of the best ways to test a theory or a method is to break it. And, since I'm working with sports data here, what better way to do that than to apply this method to a completely different sport?

It turns out that NCAA College Football maintains many of the same basic stats I use in the basketball model I've been describing, except for one. It also has a post season, albeit a very different one that is spread out over a much longer time & larger space, & that is 'seeded' much less formally than the basketball tournament. Obviously, the game couldn't be more different from basketball, what with the whole tackling thing, set plays/downs, much more equipment, an entirely different set of players (most of the time) on 'offense' & 'defense', much different scoring, etc., etc., etc.

Thus, the college football bowl season is the perfect candidate to test both the evolutionary game theoretic & traditional game theoretic (Key Player) components of this methodology. In addition to the difference in rules, etc. listed above, bowl season is dramatically different from the Men's Basketball Tournament. The biggest difference is, of course, the fact that the bowl games don't lead to another game, except for the 4 team playoff for the national championship. How does the lack of earning another game with a win affect the psychology of the teams in the Bowls? It's something that all of the pundits talk about each year, whether they call it the motivation factor or absentee coach factor or whatever. Obviously, psychology is a huge part of what I call 'the Hawk' species type. Wouldn't this psychological difference have a huge impact on how the bowls play out, especially among the Hawks? What about all of the future NFL-ers who are choosing not to play in their bowl games in record numbers?

Enough about why the football bowl games are the perfect foil for a method designed to predict the basketball tournament. I began this work by gathering the stats I normally use for basketball & replaced the missing one with the Football Power Index. Since I want the tournament & football data sets to be as similar as possible, I selected a subset of 68 teams (34 bowl games) from the 41 bowl games played during the 2016-2017 bowl season. This required me to 'play NCAA selection committee' and eliminate 7 games. Of course, doing this is yet another difference from basketball. The selection committee uses a very rigid and standardized method for selecting and seeding teams, which of course has a huge impact on how the tournament plays out. My method involved making sure every bowl featuring a ranked team was included and then flipping a coin for the others.
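The selection step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`name`, `has_ranked_team`), the placeholder bowl list, and the use of a seeded random sample as a stand-in for the coin flips are all hypothetical and not taken from the actual data.

```python
import random

def select_bowls(bowls, target=34, seed=None):
    """Keep every bowl with a ranked team, then fill remaining slots at random.

    `bowls` is a list of dicts with hypothetical keys 'name' and
    'has_ranked_team'. Random sampling stands in for the coin flips.
    """
    rng = random.Random(seed)
    ranked = [b for b in bowls if b["has_ranked_team"]]
    others = [b for b in bowls if not b["has_ranked_team"]]
    needed = target - len(ranked)
    if needed < 0 or needed > len(others):
        raise ValueError("cannot reach the target number of bowls")
    return ranked + rng.sample(others, needed)

# Placeholder data: 41 bowls, the first 20 arbitrarily flagged as 'ranked'.
bowls = [{"name": f"Bowl {i}", "has_ranked_team": i < 20} for i in range(41)]
selected = select_bowls(bowls, target=34, seed=2016)
print(len(selected), "bowls,", 2 * len(selected), "teams")
```

Fixing the seed makes the random part of the selection repeatable, which matters if the same 68 teams need to be regenerated later.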

With all of this done, I ran the numbers just like I do for basketball. First, I standardize the numbers using my method; I use that output to create the indexes; finally, I rank the indexes. The networks are then generated using the win & loss queries. All weights are converted to their Bayesian equivalents & centralities are calculated in Gephi. Key Player statistics & such are then calculated using the code I wrote & every team is given a 'Species Type' using the method I put together (Key Player is among the variables used here).

Output from all of this was then exported as .csv's and the Monte Carlo/evolutionary game theory simulations were run in Python 3 (Anaconda). Plots were subsequently done in Seaborn. I will now run through the method I used to 'type' the tournament as an example of what was done in previous chapters but not spelled out.

Linear vs. Less Linear

As was first described in Chapter 6, the first consideration in 'typing' tournament results is to determine whether a given year's simulation results should be classified as linear or less linear. Figures 1 & 2 show the 2016-2017 football bowl games classified as linear & less linear, respectively. The r-value for linear years minus the football data is r = 0.91. In Figure 1, r = 0.89, which is very similar to the basketball only data. By comparison, Figure 2, which is the less linear tournament years plus the 2016-2017 bowl data, has r = 0.65, while r = 0.55 in less linear tournament years without the football data. Thus, the 2016-2017 football data is most likely linear, given that the change in r is much larger when it is grouped with the less linear years (0.10) than with the linear years (0.02).

Figure 1. Hawks vs. Owl, Linear Tournament Years Plus 2016-2017 Football Bowl Data

Figure 2. Hawks vs. Owl, Less Linear Tournament Years Plus 2016-2017 Football Bowl Data
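The comparison behind that call can be written as a tiny decision rule: add the new season to each group, see how far r moves, and assign the season to the group it disturbs least. The sketch below uses only the four r-values reported above; the function name and the tie-breaking behavior are assumptions made for illustration.

```python
def closer_fit(r_linear, r_linear_plus, r_less, r_less_plus):
    """Return the group whose r shifts least when the new season is added.

    Ties fall to 'less linear'; that choice is arbitrary in this sketch.
    """
    shift_linear = abs(r_linear_plus - r_linear)
    shift_less = abs(r_less_plus - r_less)
    label = "linear" if shift_linear < shift_less else "less linear"
    return label, shift_linear, shift_less

# r-values reported in the text: linear 0.91 -> 0.89, less linear 0.55 -> 0.65.
label, shift_linear, shift_less = closer_fit(0.91, 0.89, 0.55, 0.65)
print(f"linear shift = {shift_linear:.2f}, "
      f"less linear shift = {shift_less:.2f} -> {label}")
```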

The next step is to determine if the 2016-2017 football Hawks & Owls separate energetically or if these two types overlap. Figure 3 shows the box plots for each tournament year 2005-2018 & the football data (labeled as 9999 to differentiate it). Figure 3 is admittedly crowded, and I apologize for that. I wanted to give the reader an idea of how the years split, which is something I didn't do in Chapter 7, when the topic of separation was first covered.

Figure 3. Total Energy, Hawks & Owls, with Conf. Intervals, All Tournament Years & Football Data

The output from the football simulations is on the right, labeled 9999. As you can see, the Hawks & Owls overlap significantly. Given that the 2016-2017 Football Bowl data is also linear, this makes its 'type' linear without separation. From Chapter 8, we know that these years favor Owls to win the whole thing, and include 2006, 2009, 2011, & 2014. That leads to the first real testable question: What was Clemson's (the 2016-2017 National Champions) species type? What about Alabama?
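The two-step typing used above (linear or less linear, then separation or overlap) can be sketched as follows. This assumes 'separation' is judged by whether the Hawk and Owl total energy intervals overlap; the interval bounds below are made-up numbers for illustration, not values read off Figure 3.

```python
def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def tournament_type(is_linear, hawk_interval, owl_interval):
    """Combine the linearity call with the Hawk/Owl overlap check."""
    linearity = "linear" if is_linear else "less linear"
    overlap = intervals_overlap(hawk_interval, owl_interval)
    separation = "without" if overlap else "with"
    return f"{linearity} {separation} separation"

# Hypothetical interval bounds, chosen only so that the two intervals overlap.
football_type = tournament_type(True, (0.40, 0.70), (0.55, 0.85))
print(football_type)
```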

Hawks & Owls

Sure enough, it turns out that Clemson is an Owl & Alabama is a Hawk. In fact, that year's 4 team championship included 3 Hawks (Ohio State, Alabama, & Washington). Clemson was the only Owl. All very, very similar to this last season's tournament, in which Kansas, Michigan, & Loyola were all Hawks with Villanova the only Owl. This past year was also a linear year without separation. So, that's 1 for 1. But how did the Hawks, in general, do? In Chapter 4, I posted a table that showed Hawks are significantly weak in the first round, and significantly strong in the Elite 8. Given that the bowl series is 1 game (except for the 4 team championship), it's fair to wonder if Hawk under performance is repeated in the bowls. When you drop the Alabama/Washington game, because both teams are Hawks, you end up with 11 out of 14 Hawks losing their bowl games! That's not quite p<0.05, but it is close (p=0.053).
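The text does not say which test produced p=0.053. Since the later tables refer to 'the Poisson', one possible reading is a Poisson upper tail with a mean of 7 (half of the 14 Hawks), counting strictly more than 11 losses. That reading is an assumption, not something stated in the chapter, but it comes out near 0.053. The sketch below also computes an exact binomial tail for comparison, which gives a somewhat smaller value.

```python
from math import comb, exp, factorial

def poisson_upper_tail(k, mean):
    """P(X > k) for a Poisson variable with the given mean."""
    cdf = sum(exp(-mean) * mean**i / factorial(i) for i in range(k + 1))
    return 1.0 - cdf

def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for a Binomial(n, p) variable."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

hawks, losses = 14, 11
# Assumed null: a Hawk is as likely to lose as to win, so 7 expected losses.
p_poisson = poisson_upper_tail(losses, mean=hawks / 2)
p_binomial = binomial_upper_tail(losses, hawks)
print(f"Poisson  P(X > 11 | mean = 7) = {p_poisson:.3f}")
print(f"Binomial P(X >= 11 | n = 14)  = {p_binomial:.3f}")
```

Either way the result sits close to the 0.05 line, so the conclusion in the text (suggestive, not conclusive) does not depend much on the choice of test.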

Species Fitness Plots

Figures 4 & 5 show the population & individual fitness plots for linear years without separation, including the 2016-2017 bowl data. There's nothing really remarkable about either of them. They both look like the plots for linear without separation tournament years that I posted in Chapter 8. Figure 4 (All Species Types p<0.001; r-values: Hawks 0.79, Owls 0.94, Dove-Owls 0.81, Doves 0.89) shows virtually no difference between the Hawk & Owl population fitness lines, & Figure 5 (All Species Types p<0.001; r-values: Hawks -0.19, Owls -0.21, Dove-Owls -0.11, Doves 0.58) clearly shows the Owls centered within the much more stochastic Hawk scatter plot. I include these two plots to make the point that even though it uses a different statistic (the Football Power Index), the football data conforms to what is expected from the basketball only data.

Figure 4. Population Fitness by Species, Linear w/out Separation Tournament Years with Football Data

Figure 5. Individual Fitness by Species, Linear w/out Separation Tournament Years with Football Data

Predicting the Bowl Games

The bowls are different enough from the tournament in structure that it's worth looking into how one would predict the bowl games. Obviously, the real way to do this, which I will do, is to explore all of the variables, including species type, with a random/conditional forest. However, a quick look at some of the Network centrality measures and their relationship to winning and losing a bowl game can be done fairly quickly and without extensive discussion.

Table 1 shows 4 of these measures and their effectiveness according to the Poisson. Only one provides significant insight into the outcomes of the bowl games, and that is the Net Key Player score for each team. In 25 out of 35 games (34 bowls plus the national championship), the team with the higher Net Key Player score wins (p<0.05). Of note is the fact that neither the Win Network Key Player Score nor the Loss Network Key Player Score provides any significant predictive power in and of itself. Only when these two scores are combined (Win Key Player - Loss Key Player) does the measure provide insight. Of interest is that, unlike in Basketball, the Weighting ratio I use to modify the Key Player metric and described in Chapter 9 provides no additional accuracy.

Table 1. Effectiveness of Network Centrality & Key Player Measures at Predicting Bowl Games
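The Net Key Player pick described above is simple enough to sketch: subtract each team's Loss Key Player score from its Win Key Player score and take the team with the larger difference. The team names, scores, and dictionary layout below are hypothetical placeholders, not values from the 2016-2017 data.

```python
def net_key_player(win_kp, loss_kp):
    """Net Key Player = Win Key Player - Loss Key Player."""
    return win_kp - loss_kp

def count_correct_picks(games):
    """Count games where the team with the higher Net Key Player won.

    `games` is a list of (team_a, team_b, winner_name) tuples, where each
    team is a dict with hypothetical keys 'name', 'win_kp', and 'loss_kp'.
    Games with identical net scores are skipped in this sketch.
    """
    correct = 0
    decided = 0
    for team_a, team_b, winner in games:
        net_a = net_key_player(team_a["win_kp"], team_a["loss_kp"])
        net_b = net_key_player(team_b["win_kp"], team_b["loss_kp"])
        if net_a == net_b:
            continue
        decided += 1
        pick = team_a["name"] if net_a > net_b else team_b["name"]
        if pick == winner:
            correct += 1
    return correct, decided

# Placeholder games with made-up scores.
games = [
    ({"name": "Team A", "win_kp": 12, "loss_kp": 3},
     {"name": "Team B", "win_kp": 9, "loss_kp": 4}, "Team A"),
    ({"name": "Team C", "win_kp": 5, "loss_kp": 6},
     {"name": "Team D", "win_kp": 8, "loss_kp": 2}, "Team C"),
    ({"name": "Team E", "win_kp": 7, "loss_kp": 7},
     {"name": "Team F", "win_kp": 4, "loss_kp": 4}, "Team F"),
]
correct, decided = count_correct_picks(games)
print(f"{correct} of {decided} decided games picked correctly")
```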

If we tear into this a bit more, it becomes clear that the success seen in Table 1 is focused on the bowl games that, uh, 'don't matter'. I think they matter, but that's the take by the media. In fact, in the games involving ranked teams, the Key Player metric is just 50/50. Despite the lack of significance in these games, this pattern does make some sense. After all, the major bowls involve teams that have done the 'best' over the course of the season. And the professionals involved in inviting these teams do a good job - teams in these games are typically Hawks/Owls in the 2016-2017 bowl data set. Thus, these are games that usually don't occur until the Sweet 16 or Elite 8 in the tournament. Given these parameters, it makes sense that the results would be 50/50 with respect to a metric designed, first and foremost, to predict the first round.

But never fear! There is a solution. I separated out the participants in 11 of the top 2016-2017 Bowl Games to search for patterns: Orange, Peach, Fiesta, Cotton, Rose, Sugar, Outback, Holiday, Cactus, Alamo, & the National Championship Game. Table 2, like Table 1, looks at the relationship between Network Centrality Measures & predictive success using the Poisson. As you can see, much like Table 1, there is only 1 strong predictor, making these results strong candidates for examination using random/conditional forests. The one metric that did work is, not surprisingly, the same metric that provides insight into the later rounds of the NCAA Tournament - something I call Final Game Key. This statistic is simply a count of the number of win queries a given team falls on that fulfill one simple criterion - at least 75% of the teams in the query made at least the Final Four. There are literally hundreds of them in the vector of ranked indexes I use to generate the network, many of which include 2 single ranks or 2 small ranges. Figure 6 is an example of one of these 'Final Game' queries with significance to the 2016-2017 Football postseason (Wins = 9999 are for 2016-2017 Football Teams). If you're wondering who had the most in 2016-2017, no, it wasn't Clemson, although they did have 8. It was Oklahoma State, with a rather pedestrian 11 - given that 2008 Kansas has the most at 134. As you can see, the Poisson for the Final Game Key doesn't quite reach statistical significance; however, it is close. Also, to get to this statistic, you have to throw out 3 games - the teams in those games had an identical number of Final Game Keys.

Table 2. Effectiveness of Network Centrality & Key Player Measures at Predicting Major Bowl Games

Figure 6. An Example of a 'Final Game' Query
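A rough sketch of the Final Game Key count follows. The data layout is an assumption made for illustration: each query is represented by the past tournament teams that fall on it (flagged by whether they reached at least the Final Four) plus the current teams that also fall on it. How the real queries are stored, and all names below, are hypothetical.

```python
def query_qualifies(query, threshold=0.75):
    """True when at least `threshold` of the past teams reached the Final Four."""
    past = query["past_teams"]  # dict: team label -> reached Final Four or better
    return bool(past) and sum(past.values()) / len(past) >= threshold

def final_game_key(team, queries, threshold=0.75):
    """Count the qualifying queries that `team` falls on."""
    return sum(
        1 for q in queries
        if team in q["current_teams"] and query_qualifies(q, threshold)
    )

def pick_by_final_game_key(team_a, team_b, queries):
    """Pick the team with more Final Game Keys; ties are thrown out (None)."""
    keys_a = final_game_key(team_a, queries)
    keys_b = final_game_key(team_b, queries)
    if keys_a == keys_b:
        return None
    return team_a if keys_a > keys_b else team_b

# Placeholder queries with made-up membership.
queries = [
    {"past_teams": {"P1": True, "P2": True, "P3": True, "P4": False},
     "current_teams": {"Team A", "Team B"}},   # 75% reached: qualifies
    {"past_teams": {"P5": True, "P6": False},
     "current_teams": {"Team B"}},             # 50% reached: does not qualify
    {"past_teams": {"P7": True, "P8": True},
     "current_teams": {"Team A"}},             # 100% reached: qualifies
]
print(final_game_key("Team A", queries), final_game_key("Team B", queries))
print(pick_by_final_game_key("Team A", "Team B", queries))
```

Returning `None` on ties mirrors the 3 games that had to be thrown out because both teams had an identical number of Final Game Keys.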

Table 3 includes the same results shown in Table 1 minus the major bowls highlighted in Table 2. As you can see, the significance of the Net Key Player Metric has improved and is now p<0.01. As with Table 1, the Weighted Key Player approaches significance, but doesn't do as well as the unweighted version.

Table 3. Effectiveness of Network Centrality & Key Player Measures at Predicting Minor Bowl Games

Well, that's it. It's amazing, and counterintuitive, but this method seems to work when applied to data from a different sport's post season: data that has been organized along entirely different lines, and in which the players' motivation is entirely different from that in basketball.


#BetweennessCentrality #KeyPlayer #NCAATournament #NCAA #MensCollegeBasketball #NetworkAnalysis #EvolutionaryGameTheory #MonteCarloSimulations #SpeciesFitnessPlots #SpeciesCompetitionPlots #football #NCAABowlGames #20162017NCAABowlSeason #Seaborn #Matplotlib