
Hawks, Doves, & Owls: The Loss Network & Girvan-Newman Clusters - Chp. 24

Updated: Jun 4, 2019

by Jess Behrens

© 2005-2018 Jess Behrens, All Rights Reserved

If we consider, again, the homophily graph first displayed in Chapter 20, reproduced here as Figure 1, it is clear that links & triads in the Loss Network are closer to random (1:1) than those in the Win Network. I wanted to know why this is the case, to be sure that using the Loss Network won't confuse the machine learning algorithms I use to classify the tournament.

Figure 1. Out vs. In Connections by Evolutionary Game Theory Strategy

One way to examine the Loss Network is to use Girvan-Newman clustering, which identifies clusters by removing links from a network step by step. The algorithm starts by calculating the betweenness centrality of every link in the network & removing the link(s) with the highest score. Any node(s) that are completely disconnected from the network by that removal are given the same cluster identifier. The betweenness centrality is then re-calculated and the process is repeated. Thus, cluster 1 contains the node(s) that fall out of the network first, cluster 2 those that fall out second, etc., until all of the nodes have been identified. In this way, the cluster identifier indicates how central a given node, or set of nodes, is to the network, with the first cluster being least central and the last being most central. Because this is the Loss Network, the cluster id presented here is a relative measure of 'risk' of loss in the first round, with a low id conveying low risk and a high id conveying high risk.
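The peeling procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for this blog: it assumes the network is undirected and unweighted, treats links whose betweenness ties for the top score as removed together, and the function names (`edge_betweenness`, `fallout_cluster_ids`) and the toy teams A-D are made up for the example.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours.
    Returns {frozenset({u, v}): score}.
    """
    scores = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # Breadth-first search from source, counting shortest paths.
        dist = {source: 0}
        n_paths = {source: 1.0}
        preds = {source: []}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    n_paths[w] = 0.0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting each link on a shortest path.
        dependency = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                share = n_paths[v] / n_paths[w] * (1.0 + dependency[w])
                scores[frozenset((v, w))] += share
                dependency[v] += share
    # Every pair of nodes was counted once from each end, so halve the totals.
    return {edge: score / 2.0 for edge, score in scores.items()}


def fallout_cluster_ids(edges):
    """Assign each node the step at which it loses its last link."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # assumption: self-links are ignored
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    cluster_id = {}
    next_id = 1
    while any(adj.values()):
        scores = edge_betweenness(adj)
        top = max(scores.values())
        # Remove every link tied for the highest betweenness.
        for edge, score in scores.items():
            if top - score < 1e-9:
                u, v = tuple(edge)
                adj[u].discard(v)
                adj[v].discard(u)
        # Nodes left with no links "fall out" and share the next cluster id.
        fallen = [n for n in adj if not adj[n] and n not in cluster_id]
        if fallen:
            for n in fallen:
                cluster_id[n] = next_id
            next_id += 1
    return cluster_id


# Toy network: a triangle A-B-C with team D hanging off C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_clusters = fallout_cluster_ids(toy_edges)
```

In the toy network, the C-D link carries the most shortest paths, so D falls out first, and the triangle breaks apart together on the next pass. The peripheral node gets the low id and the tightly connected core gets the high one, which is the ordering described above.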

As you can see in Figure 1, the Owls are very nearly 1:1 as far as out vs. in strategy triad & link formation is concerned. They don't seem to have any preference for forming links or triads with other Owls vs. the other strategy types. As I said in Chapter 20, this visual cue is reaffirmed by the fact that the Owl ratio falls within its 95% confidence interval, while the other three types fall well outside their respective confidence intervals.

I was concerned because I didn't know what this meant. The Girvan-Newman clusters may provide some insight. A working null hypothesis about the Loss Network, and one that is used to classify the teams into EGT strategies, is that the Owls are the least connected, with the Hawks second, Dove-Owls third, & Doves most connected. Thus, if the null hypothesis holds, the majority of early Girvan-Newman clusters should be Owls, followed by Hawks, followed by Dove-Owls, with Doves being the last to go. Likewise, teams that exit the tournament early should dominate the last clusters, while teams that play late into the tournament should be among the first to fall out.

The Girvan-Newman clusters in the Loss Network seem to conform to this hypothesis, as Figure 2 shows (FYI: 0.5 are teams who barely lost, but still lost, which is why they are grouped with the zeros). The first 499 Girvan-Newman clusters are single-team clusters, which means more than half of the teams are separated from each other by a single link. The remaining teams fall out all at once as the final cluster, cluster 500.

Figure 2. Girvan-Newman Cluster Tabulation by Tournament Wins & EGT Strategy, Loss Network

As Figure 2 shows, the early clusters, 1-210, are clearly dominated by Owls & teams who win at least one game. Significance is measured using the Poisson distribution & expected observations are taken as a percent of the total 934 teams by type. Of note is the fact that 'Hawks', while well represented within these early clusters, are not significant at p < 0.05. That conforms to expectations given that they are categorized as Hawks (using the Key Player metric & not related to the algorithm presented here) because they have additional risk of a first round loss. In the Hawk-Dove game, Hawks are the only strategy to have an extra 'cost' term. The second group, clusters 211-499, also conforms to expectations, with first round losses & Doves being significantly high. Also conforming to expectation in this second group is the fact that Owls are significantly low while Hawks are not significant either way. Cluster 500 continues to reinforce these expectations, with first round losses purely dominating the group; Doves, Dove-Owls, & Hawks are either not at all significant or nearly significantly low; & Owls are low at extremely significant levels.
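For readers who want to see how a Poisson significance check of this kind works, here is a minimal sketch. It assumes that the expected count for a strategy within a cluster group is that strategy's share of all 934 teams multiplied by the number of teams in the group, and that a one-sided Poisson tail is compared against 0.05. The function names are made up for the example, and the counts in the usage lines are hypothetical placeholders, not the values behind Figure 2.

```python
import math

TOTAL_TEAMS = 934  # total number of teams, taken from the text above


def expected_count(type_total, group_size, population=TOTAL_TEAMS):
    """Expected teams of one type in a group if type and group were unrelated."""
    return group_size * type_total / population


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam (lam must be > 0)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_tails(observed, lam):
    """Return (P(X <= observed), P(X >= observed)) under Poisson(lam)."""
    lower = sum(poisson_pmf(k, lam) for k in range(observed + 1))
    upper = 1.0 - sum(poisson_pmf(k, lam) for k in range(observed))
    return lower, upper


# Hypothetical illustration only: 150 teams of one strategy overall,
# 50 of them observed in a group of 210 teams (clusters 1-210 are single-team).
lam = expected_count(type_total=150, group_size=210)
p_low, p_high = poisson_tails(observed=50, lam=lam)
significantly_high = p_high < 0.05
significantly_low = p_low < 0.05
```

A small upper tail says the strategy shows up in the group more often than its population share would predict; a small lower tail says it shows up less often. A strategy that is 'not significant either way' is one where neither tail drops below the threshold.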

The early clusters (1-210) contain all 6 Owl champions as well (Figure 3). In fact, if you look only at the tournament champions, 6 of the first 7 who fall out of the loss network are Owls. Only 2018 Villanova is a Hawk. I know saying 2018 Villanova is a Hawk contradicts one of my earlier posts - as I've been writing this blog over the past 6 months, I've continued to discover new queries that improve the 'picture' provided by both the loss & win networks. None of the early trends I identified have changed.

Figure 3. Tournament Champions, Girvan-Newman Cluster ID & EGT Strategy

However, the EGT strategy designations of some of the teams have changed. This is to be expected, and has had only a small effect on the simulations. Evolutionary Game Theory measures interactions by population. Thus, if an individual team is recategorized as an 'owl' from a 'hawk', the effect is only measurable if another team does not become a 'hawk' from an 'owl'. It's the percentage of the overall population by strategy that is important. While the overall population distribution has changed some since my early posts, the relative percentages have been very stable, especially among the doves & dove-owls.
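The point about population percentages can be shown with a tiny example. This is only an illustrative sketch: the four teams and their labels are invented, and `strategy_shares` is a made-up helper, not part of the simulations.

```python
from collections import Counter


def strategy_shares(labels):
    """Fraction of the population playing each strategy."""
    counts = Counter(labels.values())
    total = sum(counts.values())
    return {strategy: n / total for strategy, n in counts.items()}


# Invented teams and labels, for illustration only.
before = {"Team A": "hawk", "Team B": "owl", "Team C": "dove", "Team D": "dove-owl"}

# Team A becomes an owl while Team B becomes a hawk: the changes offset.
offsetting = {**before, "Team A": "owl", "Team B": "hawk"}

# Only Team A changes: the hawk share shrinks and the owl share grows.
one_way = {**before, "Team A": "owl"}

shares_before = strategy_shares(before)
shares_offsetting = strategy_shares(offsetting)
shares_one_way = strategy_shares(one_way)
```

Here `shares_offsetting` matches `shares_before` even though two teams changed labels, while `shares_one_way` does not. Only the second kind of change moves the population mix that the simulations depend on.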

So, as you can see, Figures 2 & 3 seem to confirm that even though the Owl strategy is nearly random for links & triads in the Loss Network, the teams fall out of the Loss Network as expected given how each of the strategy types functions within evolutionary game theory.


#Homophily #KeyPlayer #BetweennessCentrality #NCAATournament #NCAA #EvolutionaryGameTheory #NetworkAnalysis #MensCollegeBasketball