
# Hawks, Doves, & Owls: Predicting Tournament Entropy with Evolutionary Game Theory - Chp. 23

by Jess Behrens

"In popular science, entropy is just viewed as the gradual increase of chaos, randomness, disorder and lack of control in a system. This popular view is often argued to apply to human society to show the need for control. But what resonates with me personally is what entropy really is in thermodynamics. At the most basic level, entropy is directly related to the number of ways you can arrange a system."

--Amanda, Student at the Oxbow School

I found this quote in an online presentation. The student did not give her last name, but she nails what entropy is, especially with respect to the NCAA Tournament. Counting only the 63 games played after the First Four, there are 2^63, or roughly 9.2 quintillion, possible ways to fill out a bracket. So tournament entropy really is, as Amanda points out, about the number of ways you can arrange a system.
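The bracket count is easy to check. The sketch below assumes only that each of the 63 games after the First Four has two possible outcomes; the function and constant names are mine, for illustration only.

```python
import math

# A 64-team single-elimination bracket needs 64 - 1 = 63 games to crown a champion.
GAMES_AFTER_FIRST_FOUR = 63


def bracket_count(games: int = GAMES_AFTER_FIRST_FOUR) -> int:
    """Number of distinct brackets when every game has two possible winners."""
    return 2 ** games


total = bracket_count()
print(f"{total:,}")      # 9,223,372,036,854,775,808
print(math.log2(total))  # 63.0
```

The base-2 logarithm of the count, 63, is the information-theory way of saying the same thing: if every bracket were equally likely, it would take 63 yes/no answers to pin one down.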

That is why being able to quantify the entropy within a system comes first; predicting it is secondary, though still important. Everyone I talk to about this project asks whether I can predict tournament upsets. It's definitely possible. But, as I covered in Chapter 21, not all upsets are 'Upsets'. In fact, trying to understand how the tournament will play out using seeding is futile, which is why I've put so much effort into identifying a simpler way of classifying Tournament teams.

This series works through the five steps listed below. Chapter 21 covered entropy & how it varies depending on how you classify the teams into categories (steps 1 & 2). Chapter 22 used evolutionary game theory simulations to group the tournament years into 2 separate clusters that have different optimal Game Theory strategies (steps 3 & 4).

1. Introduce the idea of entropy and why it is the ideal tool for quantifying tournament results.

2. Cover the difference in entropy by tournament seed vs. Evolutionary Game Theory (EGT) Strategy.

3. Show how the Tournaments group together into 2 clusters (Seaborn clustermap) based on regression results from EGT simulations.

4. Examine the 2nd Round Game Theory equilibrium state in these two clusters.

5. Successfully predict the entropy in each of the Tournaments using EGT simulation totals within a multivariate regression.

This post covers step 5: how it is possible to predict the total adjusted entropy by tournament year for the two clusters identified in Chapter 22 (steps 3 & 4), and what that prediction means.

First, though, I have to explain 2 important points. The first is that the following results do not include the 2016 Tournament entropy score. If you recall, 2 teams, SMU & Louisville, had exceptionally good seasons in 2015-2016 but sat out the postseason because of bans tied to NCAA rules violations. The method I have developed here requires that each of those teams be included in the calculations as well as in both networks. The overall impact of these two teams, the 'space' they took up within my multi-dimensional system, was too large to leave them out. One of the major differences between how I create these networks and how other people do this sort of work is that teams are not completely separate from each other in my system; I treat them as entities competing for spots within a shared system. Leaving out two teams with records & statistics like SMU & Louisville would be a bit like calculating the location of Earth in its orbit while ignoring the gravitational pull of Saturn & Uranus. So both teams stayed in the networks, but because neither had an opportunity to help 'create' the entropy score by playing Tournament games, 2016 was left out of the regression results shown below, even though it falls in Cluster B.

Second, I need to introduce an important concept that I haven't yet addressed. One of the independent variables in this regression analysis is actually a combination of 3 EGT Simulation energy totals. I call this combination the 'Remainder'. Mathematically speaking, it is:

Remainder = Owl Total Energy - Hawk Total Energy + Dove Total Energy
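As a minimal sketch, the same formula as a function. The argument names and the sample numbers are hypothetical placeholders, not values from my simulations.

```python
def remainder(owl_energy: float, hawk_energy: float, dove_energy: float) -> float:
    """Remainder = Owl Total Energy - Hawk Total Energy + Dove Total Energy."""
    return owl_energy - hawk_energy + dove_energy


# Made-up totals, purely to show the sign of each term.
print(remainder(owl_energy=10.0, hawk_energy=4.0, dove_energy=3.0))  # 9.0
```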

I first looked at the 'Remainder' as a concept because of the unique role that Dove-Owls play in my version of the Hawk/Dove game. They are an important group in the tournament. With 68 teams, there are enough participants to include teams who are clearly good enough to win a game, but who won't go much farther. The fact that they are good enough to win a game demands that they be on the invite list; with over 300 Division I basketball teams, this isn't an argument for reducing the size of the tournament. In the EGT simulation math, these teams play like Owls when they meet a Dove and like Doves when they meet either an Owl or a Hawk. Thus, they aren't completely dependent on sharing, like Doves are, but they are more dependent on sharing than Owls. Since Evolutionary Game Theory simulations focus on identifying which population benefits the most rather than 'picking the winner', the notion of 'remaining energy', or energy that is left on the table for future use, is vitally important to the success of Dove-Owls & Owls as populations within this system. Because Dove-Owls are more dependent on sharing, the 'remaining' energy matters most to them.
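The conditional behavior of Dove-Owls can be sketched as a small lookup. This is an illustration of the rule described above, not the simulation code itself; the enum and function names are hypothetical. The text above does not cover a Dove-Owl meeting another Dove-Owl, so the sketch raises an error for that case rather than guessing.

```python
from enum import Enum


class Strategy(Enum):
    HAWK = "Hawk"
    DOVE = "Dove"
    OWL = "Owl"
    DOVE_OWL = "Dove-Owl"


def effective_strategy(player: Strategy, opponent: Strategy) -> Strategy:
    """Return the strategy a player actually uses against a given opponent."""
    if player is not Strategy.DOVE_OWL:
        return player  # Hawks, Doves, and Owls always play their own strategy
    if opponent is Strategy.DOVE:
        return Strategy.OWL  # a Dove-Owl plays like an Owl against a Dove
    if opponent in (Strategy.OWL, Strategy.HAWK):
        return Strategy.DOVE  # and like a Dove against an Owl or a Hawk
    raise ValueError("Dove-Owl vs. Dove-Owl is not specified in this sketch")


print(effective_strategy(Strategy.DOVE_OWL, Strategy.DOVE).value)  # Owl
print(effective_strategy(Strategy.DOVE_OWL, Strategy.HAWK).value)  # Dove
```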

Keep in mind that this evolutionary game theory 'game' was developed to illustrate how cooperation & altruistic behavior arise in biological populations. If evolution is all about survival of the fittest, why do many species share food & reproductive opportunities rather than fighting to the 'death' to keep them for themselves? The Hawk/Dove game was first developed to show how Doves, who don't put up a fight when challenged while feeding, can outcompete Hawks (or other predatory birds) as a population by sharing their food. So the lesson of the Hawk/Dove game is the well-known axiom that winning the war matters more than winning a single battle.
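For readers who haven't seen the game before, here is a minimal sketch of the textbook two-strategy Hawk/Dove game. It is not my four-strategy version with Owls & Dove-Owls. The resource value `v` and fight cost `c` are illustrative symbols and numbers, and the payoffs are the standard ones: Hawks fight each other for (v - c) / 2, a Hawk takes everything from a Dove, and two Doves split the resource.

```python
def payoff(strategy: str, opponent: str, v: float, c: float) -> float:
    """Standard Hawk/Dove payoff to `strategy` ("H" or "D") against `opponent`."""
    table = {
        ("H", "H"): (v - c) / 2,  # both fight: split the value, share the cost
        ("H", "D"): v,            # the Dove retreats, the Hawk takes it all
        ("D", "H"): 0.0,          # the Dove retreats and gets nothing
        ("D", "D"): v / 2,        # two Doves share
    }
    return table[(strategy, opponent)]


def expected_payoff(strategy: str, hawk_share: float, v: float, c: float) -> float:
    """Average payoff against a population with the given fraction of Hawks."""
    return (hawk_share * payoff(strategy, "H", v, c)
            + (1 - hawk_share) * payoff(strategy, "D", v, c))


def stable_hawk_share(v: float, c: float) -> float:
    """Hawk fraction at which neither strategy does better (v / c, capped at 1)."""
    return min(1.0, v / c)


v, c = 2.0, 8.0  # illustrative: fighting costs four times what the food is worth
share = stable_hawk_share(v, c)
print(share)                              # 0.25
print(expected_payoff("H", share, v, c))  # 0.75
print(expected_payoff("D", share, v, c))  # 0.75
```

When fighting costs more than the resource is worth, Hawks cannot take over: at the stable mix both strategies earn the same, and in this example three quarters of the population are Doves.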

Thus, linking the 'Remainder' to an accurate prediction of adjusted tournament entropy (2nd - 5th rounds) would provide evidence that tournament teams exhibit altruistic behavior, even if unintentionally. Said another way, it would suggest that teams may be helping one another even when they aren't actually competing on the floor (i.e. they are in different parts of the tournament bracket). Since I showed in Chapter 22 that the two clusters have different pure game theory equilibria, successfully predicting the total adjusted entropy for each cluster separately would potentially tie together game theory equilibria & altruism in population dynamics (evolutionary game theory). It would suggest that, as Amanda said above, 'the number of ways you can arrange a system' may be directly tied to altruism and competition, rather than just competition.

Table 1 shows the output from 4 separate regression analyses. Two regression equations, labeled Regression 1 & Regression 2, were used to predict the entropy scores derived from 2 separate 'binning' mechanisms: EGT strategy & Tournament seed. So, 2 regression equations x 2 binning mechanisms = 4 total regression analyses. Within each of these 4, the regression equation was fit to 3 different data aggregations ('All Tournament Years', 'Cluster A', & 'Cluster B') for a total of 12 individual regressions.

Table 1. Predicting Tournament Entropy, Linear Regression Analysis Tabulation

Table 1 lists each of the 4 analyses as a column, while each of the three aggregation 'levels' is shown as a separate row. Significant values have been highlighted in bold. Note: R-Values included as independent variables were developed from the 10,000 Evolutionary Game Theory Simulations that did not consider tournament entropy.
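The layout of Table 1 can be enumerated directly. This sketch only counts the combinations; the label strings are taken from this post, and the variable names are mine.

```python
from itertools import product

EQUATIONS = ("Regression 1", "Regression 2")
BINNINGS = ("EGT Strategy Entropy", "Seed Entropy")
AGGREGATIONS = ("All Tournament Years", "Cluster A", "Cluster B")

# Columns of Table 1: one analysis per (equation, binning) pair.
analyses = list(product(EQUATIONS, BINNINGS))

# Cells of Table 1: each analysis is repeated for every aggregation level.
fits = list(product(EQUATIONS, BINNINGS, AGGREGATIONS))

print(len(analyses), len(fits))  # 4 12
```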

Regression 1 included only 1 independent variable: the R value from the EGT Simulation Regressions where Dove Total Energy was used as a predictor of Owl Total Energy. It is significant when predicting EGT Strategy Entropy across all tournament years, but is especially accurate when predicting Cluster B (Adj. R-squared = 0.919, p<=0.001). Regression 1 is not significant when applied to Cluster A. Of note is the fact that Cluster B is purely Hawk dominant from a game theory perspective. Regression 1 loses all significance when predicting Seed Entropy.
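For anyone who wants to reproduce this kind of number on their own data, here is a minimal sketch of an ordinary least squares fit that reports adjusted R-squared, written with NumPy. The function name is hypothetical and the sample inputs are made up; none of it comes from Table 1.

```python
import numpy as np


def adjusted_r_squared(x, y) -> float:
    """Fit y = b0 + b . x by least squares and return the adjusted R-squared."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a flat list as a single predictor
    n, p = x.shape      # n observations, p independent variables
    design = np.column_stack([np.ones(n), x])  # add the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    # Penalize R-squared for the number of predictors used.
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)


# Made-up example: a predictor that barely tracks the outcome.
print(adjusted_r_squared([0, 1, 2, 3], [1, -1, 1, -1]))  # about -0.2
```

With one predictor this mirrors the shape of Regression 1; passing a two-dimensional array with three columns would mirror Regression 2.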

Regression 2 is more complex & includes 3 variables:

1. The R value from the EGT Simulation Regressions where Dove Total Energy was used as a predictor of Owl Total Energy.

2. The R value from the EGT Simulation Regressions where Hawk Total Energy was used as a predictor of Dove-Owl Total Energy.

3. The R value from the EGT Simulation Regressions where the Remainder Total was used as a predictor of Dove-Owl Total Energy.

Regression 2 is not significant when predicting All Tournament Years grouped together. It does have a significant F-statistic for Cluster B years, but only variable 1 (from above) remains significant. Where Regression 2 really shines is in Cluster A. Including the two additional variables, one of which is built on the Remainder total described earlier in this post, produces a significant regression (Adj. R-squared = 0.833, p<=0.05). Again, of note is the fact that Cluster A years have a different game theory equilibrium, Owl/Hawk, than Cluster B. As with Regression 1, Regression 2 is of no use in predicting Seed Entropy. In fact, its adjusted R-squared values are negative (-0.672 & -0.765), meaning that once the penalty for extra variables is applied, the model does worse than simply predicting the average.
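A negative adjusted R-squared can look like a mistake, so it is worth writing out the standard definition. Here n is the number of tournament years in a fit and p is the number of independent variables (3 for Regression 2); the symbols are the usual textbook ones, not notation from Table 1.

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1},
\qquad
\bar{R}^2 < 0 \iff R^2 < \frac{p}{n - 1}
```

With 3 variables and only a handful of tournament years in each cluster, p / (n - 1) is large, so even a moderate raw R-squared can turn into a negative adjusted value.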

What Table 1 suggests is that the Hawk dominant years, Cluster B, are not as dependent on the impact of Dove-Owls & the benefits the EGT sharing strategies receive from that 'sharing', while the Owl/Hawk dominant years, Cluster A, are heavily impacted by 'sharing'. This, of course, is what I expected. While a lot more work needs to be done to establish how 'sharing' occurs & what, exactly, that means to the teams playing a game, these results point to a degree of population based altruism within the NCAA Tournament. If that is the case, then the number 'of ways you can arrange' the NCAA Tournament may be limited by a degree of cooperation within this competitively based system.


©2018 by jessbehrens.com.