by Jess Behrens
© 2005-2018 Jess Behrens, All Rights Reserved
Now begins the fun part - the results! Up to this point, I've been presenting my methods while explaining a little about the figures & tables I'll be using. From here on, I'll be taking a different tack - presenting some of the evidence that suggests Evolutionary Game Theory, built on network analysis, is a good tool for understanding some of the more seemingly stochastic elements of the NCAA tournament. I'm going to present evidence that I've distilled from my exploratory analysis of the tournament data set and simulation output.
The first major structural factor impacting tournament results is whether a given year is a Linear or a Less Linear year. Table 1, which was first included in Chapter 4, shows the number of wins by team species type as well as the Poisson significance of those wins across all Tournament years. As Table 1
Table 1. Poisson Significance of Tournament Results by Round & Species Type
clearly shows, both the Hawk & Owl types dominate the Doves & DoveOwls. In fact, they are the only two species types to have won a tournament championship. The Hawks are, relatively speaking, stronger when you look across all 14 Tournaments simultaneously, having won 6 against an expected 2.06. However, the Owls have won more tournaments. The difference is that the Hawk species type is relatively rare; it's uncommon to have a season in which the experience, as measured in the Win & Loss Networks, produces a Hawk. However, since these two types dominate the later rounds of the tournament, one of the easiest ways to understand if Evolutionary Game Theory is a good tool for understanding the tournament is to ask if a given tournament structure predisposes one or the other to winning the tournament championship. Furthermore, the Poisson significance of the number of wins by species type is a good way to evaluate if the simulation results presented here correlate with either Hawks or Owls in a given tournament structure type.
Figure 1 shows the Hawk vs. Owl population competition figure first presented in Chapter 5. The relationship is obviously a very strong and linear one (p<0.001; r-value -0.85). In searching for patterns
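For readers who want to see the arithmetic behind a number like the Hawk championship figure, here is a minimal sketch. It assumes the 'Poisson significance' is the upper-tail probability of seeing at least the observed number of wins given the expected number; the tables may use a different convention, and the function name poisson_upper_tail is made up for illustration.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected).

    Assumption: 'significance' means this upper-tail probability.
    """
    # Sum P(X = k) for k = 0 .. observed - 1, then take the complement.
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# Hawk championships quoted in the text: 6 observed vs. 2.06 expected.
p_hawk_titles = poisson_upper_tail(6, 2.06)
print(f"P(at least 6 titles | expected 2.06) = {p_hawk_titles:.4f}")
```

Under that assumption the probability comes out at roughly 0.02, which is consistent with the Hawk titles being treated as significant.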
Figure 1. Hawks vs. Owl, All Tournament Years
that would help explain the relative strength of Owls vs. Hawks in a given year, I manually grouped all possible combinations of the tournament simulation data and re-ran the linear model on each. I wish there was a fancy way I could tell you I found the results I'm about to present, but there isn't.
What I discovered is that some of the tournament years have a much stronger linear relationship than even Figure 1 shows when they are grouped together without the remaining years. Figure 2 shows the
Figure 2. Hawks vs. Owls, Linear Tournament Years
same linear model (p<0.001; r-value -0.91) competition plot as seen in Figure 1, but only includes 2006, 2007, 2008, 2009, 2011, 2014, 2017, & 2018. Furthermore, when the other years - 2005, 2010, 2012, 2013, 2015, & 2016 - are grouped together, as seen in Figure 3, the linear model still holds, but is nowhere
Figure 3. Hawks vs. Owls, Less Linear Tournament Years
near as strong (p<0.001; r-value -0.55). The simulation results in these 'Less Linear' years, represented as the blue dots in Figure 3, are much more stochastic.
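The grouping exercise described above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are toy (x, y) pairs standing in for the Hawk and Owl values plotted in the figures, "most linear" is taken to mean the largest absolute r-value, and the names pearson_r and strongest_grouping are hypothetical rather than anything from the actual analysis.

```python
import math
from itertools import combinations


def pearson_r(points):
    """Pearson correlation for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


def strongest_grouping(by_year, group_size):
    """Try every grouping of `group_size` years; return the most linear one.

    Assumption: 'most linear' means the largest absolute r-value.
    """
    best_years, best_r = None, 0.0
    for years in combinations(sorted(by_year), group_size):
        # Pool the simulation points from every year in this grouping.
        pooled = [pt for year in years for pt in by_year[year]]
        r = pearson_r(pooled)
        if abs(r) > abs(best_r):
            best_years, best_r = years, r
    return best_years, best_r


# Toy data keyed by placeholder "years"; NOT the real simulation output.
toy = {
    "A": [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7)],
    "B": [(0.4, 0.6), (0.5, 0.5)],
    "C": [(0.1, 0.2), (0.5, 0.9), (0.3, 0.1)],
}
best_years, best_r = strongest_grouping(toy, 2)
print(best_years, round(best_r, 2))
```

In the toy data, years A and B sit on the same downward line while C is scattered, so the search picks out the A and B pairing. With 14 real tournament years the same loop would cover every subset of a given size.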
Does this increased stochasticity relate to the relative strength of the Hawks & Owls? Indeed it does. Tables 2 & 3 use the same Poisson significance format as seen in Table 1, but for the Linear & Less
Table 2. Poisson Significance of Tournament Results by Round & Species Type, Linear Years
Linear years separately. Table 2 seems to indicate that in Linear years, the Owls are at an advantage in that the Poisson value for the number of times they win the tournament is significant while the Hawks' wins are not.
So what do the Less Linear years look like? As Table 3 shows, the opposite is true. Hawks are at an
Table 3. Poisson Significance of Tournament Results by Round & Species Type, Less Linear Years
advantage in these more stochastic years. Their wins are significant while the Owls fail to reach significance. In fact, the Hawks appear to be relatively stronger in these off years than the Owls in the Linear years (Table 2 Hawk p < 0.15 vs. Table 3 Owl p < 0.4).
Of note is that both Owl wins in Less Linear years involved last-second heroics and are among the most memorable games ever. In 2010, Butler, a Hawk, lost to Duke, an Owl, when Gordon Hayward's last-second heave barely rimmed out. In 2016, Kris Jenkins of Villanova, the Owl, hit a last-second beauty to prevent overtime and win the school's first National Championship since Rollie Massimino strode the sidelines, sending the Hawk, North Carolina, & their fans into a funk that would last....a year. All's well that ends well, right Roy?
These results raise the question: Why? What causes this shift? I don't have a firm answer, but the population and fitness plots for these two groupings of tournaments do make suggestions. Figures 4 &
Figure 4. Population Fitness by Species, Linear Tournament Years
5 show the best linear fit for total population energy, or 'Fitness', by species type vs. each species' percentage of total population. Figure 4 displays data from Linear years (Fig 4. - All species p<0.001; r-values: Hawks 0.91, Owls 0.97, Doves 0.84, DoveOwls 0.81) while Figure 5 displays the same data, but for Less
Figure 5. Population Fitness by Species, Less Linear Years
Linear years (Fig 5. - All species p<0.001; r-values: Hawks 0.76, Owls 0.88, Doves 0.84, DoveOwls 0.85). One difference in the two plots is the Dove line. In Figure 4, its slope is much steeper than that of the DoveOwl line, meaning that Dove fitness increases at a faster rate as their percentage of the population increases. In Figure 5, the slopes of the two lines, Doves & DoveOwls, are more or less equivalent. Another difference in these plots is where the Owl & Hawk population fitness lines cross. As was noted in Chapter 5, the point where lines cross on these fitness plots represents the point where the fitness for both species types is equivalent, or stable. The lines cross at a population share of about 0.23 in Figure 4 and at around 0.53 in Figure 5. The dramatic increase in this share would indicate that a stable relationship between Hawks & Owls is, effectively, out of reach as no species type reaches the 0.5 threshold in any year. However, this would mean that at lower percentages, the Owls would have more fitness, which contradicts the fact that Hawks have the advantage in these years. Finally, Figures 4 & 5 show the same linear pattern as seen in Figures 2 & 3. The Hawk & Owl r-values in Figure 5 are smaller than those found in Figure 4, which means those relationships are less linear.
Turning to the individual plots provides potentially better explanations. Figures 6 & 7 show the same fitness plots found in Figures 4 & 5, but for the average energy ('Fitness') by species type. Essentially,
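The crossing point of two fitted fitness lines is just the solution of two linear equations. Here is a minimal sketch; the slopes and intercepts are hypothetical values chosen only so the lines meet near 0.23, since the fitted coefficients themselves aren't listed in this post, and crossing_point is an illustrative name.

```python
def crossing_point(slope_a, intercept_a, slope_b, intercept_b):
    """Population share x at which two fitted fitness lines are equal."""
    if slope_a == slope_b:
        raise ValueError("parallel lines never cross")
    # slope_a * x + intercept_a == slope_b * x + intercept_b, solved for x
    return (intercept_b - intercept_a) / (slope_a - slope_b)


# Hypothetical coefficients, NOT the fitted values behind Figures 4 & 5.
hawk_line = (1.0, 0.50)  # (slope, intercept)
owl_line = (3.0, 0.04)
x_star = crossing_point(*hawk_line, *owl_line)
print(f"lines cross at a population share of {x_star:.2f}")
```

Comparing the returned share against the largest share any species actually reaches in a given year shows whether the crossing is attainable, which is the 0.5 threshold argument made above.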
Figure 6. Individual Fitness by Species, Linear Years
Figures 6 & 7 take the total energy for each species accumulated over a simulation iteration and divide it by the number of individuals in each species for a given simulation. Figure 6 shows the linear model
Figure 7. Individual Fitness by Species, Less Linear Years
of Average Individual fitness by population percent for Linear years (Fig. 6 - All Species except Hawk p<0.001, Hawks p<0.57; r-values: Hawks -0.0064, Owls -0.21, Doves 0.54, DoveOwls 0.05) while Figure 7 does the same for Less Linear years (Fig. 7 - All species p<0.001; r-values: Hawks -0.12, Owls -0.32, Doves 0.50, DoveOwls 0.20).
The first major difference in these plots is that the Hawk regression line in Figure 6 is not significant and has an r-value that is essentially 0. What this means, effectively, is that despite the fact that Hawk population fitness is significant and positive with respect to percent in linear years (Figure 4), there is no relationship between fitness and percent of population at the individual level. The Hawk distribution in Figure 6 is completely stochastic. Could this lack of a relationship between individual fitness and percent of total population mean that individual Hawk teams are weaker in Linear years?
Figure 7, on the other hand, displays a significant relationship for individuals of all species with respect to percent of total population. One thing that is unique about Figure 7 is that the DoveOwl line is positive enough that it crosses both the Hawk & Owl lines, meaning that there is a percentage level where both Hawks & Owls are in equilibrium with DoveOwls. Interestingly, the DoveOwl Average Energy in Figure 7, shown in purple, rarely (2 simulations out of 14,000) crosses above the Hawk or Owl regression line, whereas it does so much more often in Figure 6. This would indicate that results for the DoveOwls, who are demonstrably stronger than their Dove competitors, are much more stochastic in Linear years.
I have a hypothesis about what all of this means: In Less Linear Years, the Hawks and Owls are better able to control the DoveOwls precisely because both have a significant relationship with the percent of total population. In Linear Years, the lack of a relationship between individual hawks and the percent of total population weakens the Hawks & loosens that control. Finally, that lack of control flattens the DoveOwl regression line.
The Linear/Less Linear split also shows up in the number of upsets among highly seeded teams. Table 4 shows the Poisson significance of wins, by round, for 1, 2, & 3 seeds in Linear & Less Linear Years. The number of expected losses, per year, is simply the number of losses during the years that span 2005-2018 divided by the number of years (14). Note: I am counting the 2010 game between Villanova and
Table 4. Poisson Significance of First Round Loss by Seeds 1-3, Linear & Less Linear Years
Robert Morris as a loss for a 2 seed. For those who don't remember, that game went into overtime with Villanova winning by 3. Including that game increases the total number of losses by 1 through 3 seeds to 14 in the years running from 2005-2018. While this may seem like cheating, the inclusion of that game does not change the data in Table 4 for first round losses by seeds 1-3 in Less Linear years. It does affect the magnitude of that significance, but with or without that game, the Poisson is still significant at the p<0.05 level (p<0.04). Also, in order to calculate the expected number of upsets in a given year, I've assumed that the actual upsets over the past 14 tournaments should be spread evenly across those years. If you include this year's upset of Virginia, and the Villanova overtime game described above, that's 14 upsets of teams seeded 1-3 over 14 years, or an expected rate of 1 per year.
If one goes through the Less Linear years (2005, 2010, 2012, 2013, 2015, & 2016) and checks the first round scores, they'll see that there are several more candidates for inclusion as 'upsets' than what are listed here. Table 5 shows a few of the Loss Queries containing the major teams who lost early linked with several who won very close games. Every one of the 14 upsets listed in Table 4, as well as some of those other significantly close games, are in Table 5. Of note are the two 2018 teams, Virginia & North Carolina, whose early losses made headlines. In 2013, Gonzaga was down by 6 points late in the game to Southern, before coming back and winning.
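The expected-upset arithmetic can be written out as a short sketch. The 14 upsets, 14 tournaments, and the 8/6 split of Linear and Less Linear years come from the text; everything else is an assumption for illustration. In particular, scaling the yearly rate by the number of years in a group is a guess at how the expected counts were built, the per-group observed counts live in Table 4 and aren't reproduced here (observed_count is a placeholder), and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_upsets(total_upsets: int, total_years: int, years_in_group: int) -> float:
    """Spread the observed upsets evenly over all years, then scale to a group."""
    return total_upsets / total_years * years_in_group


def tail_probabilities(observed: int, expected: float):
    """Return (P(X >= observed), P(X <= observed)) for X ~ Poisson(expected)."""
    at_most = sum(poisson_pmf(k, expected) for k in range(observed + 1))
    at_least = 1.0 - at_most + poisson_pmf(observed, expected)
    return at_least, at_most


# From the text: 14 upsets of 1-3 seeds over 14 tournaments,
# with 8 Linear years and 6 Less Linear years.
linear_expected = expected_upsets(14, 14, 8)
less_linear_expected = expected_upsets(14, 14, 6)
print(linear_expected, less_linear_expected)

# Placeholder count, NOT taken from Table 4; substitute the real value.
observed_count = 10
at_least, at_most = tail_probabilities(observed_count, less_linear_expected)
print(f"P(X >= {observed_count}) = {at_least:.4f}, P(X <= {observed_count}) = {at_most:.4f}")
```

Returning both tails makes it easy to check either direction: more upsets than expected (a small upper tail) or fewer than expected (a small lower tail).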
Table 5. Major Upset Loss Queries
As noted earlier, in 2010, Villanova went to overtime with Robert Morris. Tennessee beat Winthrop on a buzzer beater in 2006. Of course, as we all know, Virginia was dominated by UMBC this past year. Most of these upsets, as shown in Table 4, occur in Less Linear years. However, those that occur in Linear years appear to be structurally related to the Less Linear losses. Thus, it's not that stochastic upsets don't occur during Linear years, it's that the structural factors impacting these major upsets occur more often in Less Linear years. From Table 4, we can also see that big upsets are significantly lower (p>0.95) than the Poisson would predict. In later posts, it will become clear that the primary driver behind the stochastic nature of big upsets (seeds 1-3) is the relationship between the Dove-Owls & the Owls.
Thus, the linear relationship between Owls & Hawks, the subject of this post, is only one of the factors which impact the number of upsets in a given tournament. In fact, as I will note in a later post, the relationship between the Dove-Owls and Owls is the most important factor in the first weekend of the tournament every year, regardless of any other factors. All of this stems from the fact that the tournament is actually best understood as 3 separate 2 game tournaments.
One final note about Table 5: while falling in these queries has a major impact on these teams' roles in the Loss network, these queries are not the only places the teams appear in that network. Furthermore, what isn't shown here is how strongly nested these teams are in the Win network. Every team's risk reflects a balance between its Win & Loss network locations.
Predicted Average Energy (or Success)
To help complicate matters even more than they already are, I've included Tables 6 & 7, which show the predicted 'energy' or 'success' for Hawks & Owls during the Linear & Less Linear years. These
Table 6. Average Predicted Energy, Hawks & Owls, Linear Tournament Years
predictions were calculated from the total population energy simulations shown in Figures 4 & 5. So, the chain of events is:
1. Use the network to classify teams as Hawks, Owls, Dove-Owls, or Doves.
2. Run the evolutionary game theory simulations using the percent of total population represented by the totals calculated in Step 1.
3. Calculate the linear best fit regression model using the simulations from Step 2 for each species type.
4. Plug the percentages from Step 1 into the model developed in Step 3 to calculate the predicted total population energy (or success) for each species and divide by the number of individuals within that species for the given tournament year.
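The last two steps of that chain can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares line for the Step 3 model; the simulation points, population share, and team count are hypothetical stand-ins for the Step 1 and Step 2 outputs, and the function names are made up.

```python
def fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_average_energy(sim_points, observed_share, n_individuals):
    """Steps 3 and 4: fit the line, predict total energy, divide by head count."""
    slope, intercept = fit_line(sim_points)
    predicted_total = slope * observed_share + intercept
    return predicted_total / n_individuals


# Hypothetical Step 2 output for one species: (population share, total energy).
sim_points = [(0.1, 2.0), (0.2, 3.0), (0.3, 4.0)]
# Hypothetical Step 1 output: the species holds a 0.25 share, made up of 7 teams.
avg = predicted_average_energy(sim_points, observed_share=0.25, n_individuals=7)
print(f"predicted average energy per team: {avg:.3f}")
```

Running this once per species and per tournament year would give one predicted per-team energy for each cell of a table laid out like Tables 6 & 7.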
Table 7. Average Predicted Energy, Hawks & Owls, Less Linear Tournament Years
What makes Tables 6 & 7 difficult is that in virtually every year, regardless of tournament linearity, the Hawks are predicted to do better on a per individual basis than the Owls. The only exception is this year, 2018, when the Owls did slightly better. However, as Tables 2 & 3 show, according to the Poisson, the Owls definitely do better than the Hawks in Linear years and vice versa in Less Linear years. So why aren't the Owls predicted to do better than the Hawks on average in the Linear years? The answer may be associated with the T-Test statistic included in Tables 6 & 7. The distribution of Hawk/Owl energy (or success) is significantly different in Linear years, but is not significantly different in Less Linear years. Also, the 'difference' columns do not attain significance when compared (p<0.19). More about all of this later.
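For anyone unfamiliar with the statistic, here is a sketch of one common form of the T-Test, Welch's unequal-variance version. Which variant sits behind Tables 6 & 7 isn't stated, so treat the choice as an assumption; the energy values are made up and welch_t is an illustrative name.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    # Squared standard error contributed by each sample.
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Made-up per-year predicted energies, NOT the values in Tables 6 & 7.
hawk_energy = [1.0, 2.0, 3.0, 4.0, 5.0]
owl_energy = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(hawk_energy, owl_energy)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Turning the statistic into a p-value requires the t distribution, which the standard library doesn't provide; a statistics package would be the usual route for that step.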
That's it for Linear years. There are other factors affecting the relative strength of Hawks & Owls as well, two others to be precise. The first of those, Separation, will be the subject of my next post!