Social Network Analysis

Worksheet 8: Conditional Uniform Graph Distributions

Author

Termeh Shafie

Introduction

In this session, we will use conditional uniform graph distributions to simulate random networks. These random networks are draws from the null model, and together they form the null distribution against which we compare our observed features. We can then judge whether an observed feature of interest differs significantly from what the null model produces. Most of the examples here are those presented in the lecture.

Packages needed

library(statnet)
library(igraph)
library(ggraph)
library(intergraph)
library(patchwork)
library(networkdata)

Object types

We will primarily be working with matrix, network and graph objects. Pay attention to which type you have, since some functions only work with graph (igraph) objects and others only with network or matrix objects. We keep this clear by using the suffixes g, net and mat in object names.
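As a rough illustration of moving between the three object types, the sketch below converts a small made-up adjacency matrix into a graph object, then into a network object, and back again. The matrix and all object names (toy_mat, toy_g, toy_net, and so on) are hypothetical and only serve the example; the conversion functions asNetwork() and asIgraph() come from intergraph.

```r
library(igraph)
library(network)
library(intergraph)

# Hypothetical 3-node directed cycle: 1 -> 2 -> 3 -> 1
toy_mat <- matrix(c(0, 1, 0,
                    0, 0, 1,
                    1, 0, 0), nrow = 3, byrow = TRUE)

# matrix -> graph (igraph) object
toy_g <- graph_from_adjacency_matrix(toy_mat, mode = "directed")

# graph -> network object, and back to a graph object
toy_net <- asNetwork(toy_g)
back_g  <- asIgraph(toy_net)

# graph / network -> plain adjacency matrix
g_mat   <- as_adjacency_matrix(back_g, sparse = FALSE)
net_mat <- as.matrix(toy_net)

class(toy_g)
class(toy_net)
```

Both g_mat and net_mat should contain the same ties as toy_mat, so a round trip does not change the network.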

The Coleman Data

Load a dataset and extract adjacency matrix

We are going to use the data set coleman, which ships with the sna package and is therefore available once statnet is loaded. To get information about it, type ?coleman and select Colemans High School Friendship Data. This opens a help file describing the data set; read the description so you know what you are working with. To load the data in your session:

data(coleman, package = "sna")

As described in the help file, the data set is an array with 2 observations on the friendship nominations of 73 students (one for fall and one for spring). We will start by focusing on the fall network here, and create the adjacency matrix for the network:

fall_mat <- coleman[1,,] 
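To see what the indexing with [1,,] does, here is a minimal sketch on a made-up array with the same layout (time point x sender x receiver). The array toy_arr, its dimension names and its two ties are hypothetical stand-ins, not the Coleman data.

```r
# Hypothetical toy array: 2 time points, 3 actors
toy_arr <- array(0, dim = c(2, 3, 3),
                 dimnames = list(c("fall", "spring"), NULL, NULL))
toy_arr["fall", 1, 2]   <- 1   # in fall, actor 1 nominates actor 2
toy_arr["spring", 2, 1] <- 1   # in spring, actor 2 nominates actor 1

# Fixing the first index and leaving the other two empty
# returns the full adjacency matrix of the first time point
toy_fall <- toy_arr[1, , ]

dim(toy_arr)    # three dimensions: time, sender, receiver
dim(toy_fall)   # two dimensions: sender, receiver
toy_fall
```

The same pattern with 2 in the first position would give the second time point.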

Q1: How can you check whether the network is directed or undirected?

Q2: How can you calculate the number of ties you have in the fall network?

Visualize the network

Create a graph object from the adjacency matrix and visualize the network:

fall_g <- graph_from_adjacency_matrix(fall_mat, "directed")
fall_p <- ggraph(fall_g, layout = "nicely") + 
  geom_edge_link(edge_colour = "#666060", 
                 end_cap = circle(9,"pt"), n = 2, 
                 edge_width = 0.4, edge_alpha = 1, 
                 arrow = arrow(angle = 15,
                               length = unit(0.1, "inches"), 
                               ends = "last", type = "closed"))  +
  geom_node_point(fill = "#525252", colour = "#FFFFFF", 
                  size = 5, stroke = 1.1, shape = 21) + 
  theme_graph() + 
  ggtitle("fall friendship network") +
  theme(legend.position = "none")
fall_p

Uniform graph distribution given expected density: \({\cal{U}}|E(L)\)

Calculate the density of the Coleman fall network. Density is the number of present ties divided by the total number of possible ties in the network. For a directed network on n nodes without self-loops, the number of possible ties is n(n-1). We can use the adjacency matrix to calculate this:

sum(fall_mat)/(dim(fall_mat)[1]*(dim(fall_mat)[1]-1))
[1] 0.04623288

but we can also use the graph object and call edge_density() from igraph:

edge_density(fall_g)
[1] 0.04623288
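The same number can be checked by hand. The sketch below plugs in the 73 students from the data description and the 243 ties returned by sum(fall_mat); the variable names are only illustrative.

```r
n_nodes <- 73     # students in the fall network
n_ties  <- 243    # value of sum(fall_mat)

# Possible ties in a directed network without self-loops: n(n-1)
possible_ties <- n_nodes * (n_nodes - 1)
possible_ties     # 5256

# Density = present ties / possible ties
dens <- n_ties / possible_ties
round(dens, 8)
```

The rounded result should agree with the density printed by both approaches above.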

To generate one random graph with the same density on average as the observed fall network, we write:

sim1_mat <- rgraph(n = dim(fall_mat)[1], m = 1, 
                 tprob = edge_density(fall_g), mode = "digraph")
sim1_g <- graph_from_adjacency_matrix(sim1_mat, "directed")

Make sure you understand all arguments included. The random network need not have exactly the same number of edges as the observed network, but its expected density equals the observed density:

sum(sim1_mat)
[1] 267
sum(fall_mat)
[1] 243
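To see what "the same density on average" means, one can draw many random networks and look at their edge counts. The following is a base-R sketch under the assumption that each possible tie is present independently with probability equal to the observed density; it is not the sna implementation of rgraph, and the helper draw_bernoulli_digraph, the seed and the number of replications are all chosen only for illustration.

```r
set.seed(1)                    # arbitrary seed, only for reproducibility
n <- 73                        # number of nodes in the fall network
p <- 243 / (n * (n - 1))       # observed density

# Hypothetical helper: adjacency matrix of one random directed graph
# in which every off-diagonal tie is present with probability p
draw_bernoulli_digraph <- function(n, p) {
  mat <- matrix(rbinom(n * n, size = 1, prob = p), nrow = n)
  diag(mat) <- 0               # no self-loops
  mat
}

# Edge counts of 1000 simulated networks
edge_counts <- replicate(1000, sum(draw_bernoulli_digraph(n, p)))

mean(edge_counts)              # should be close to the observed 243
range(edge_counts)             # individual draws vary around that value
```

Single draws such as the one above can land well away from 243, while the average over many draws settles near it.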

Now we can plot the random network we generated next to the observed one to compare them:

sim1_p <- ggraph(sim1_g, layout = "nicely") + 
  geom_edge_link(edge_colour = "#666060", 
                 end_cap = circle(9,"pt"), n = 2, 
                 edge_width = 0.4, edge_alpha = 1, 
                 arrow = arrow(angle = 15,
                               length = unit(0.1, "inches"), 
                               ends = "last", type = "closed"))  +
  geom_node_point(fill = "#525252", colour = "#FFFFFF", 
                  size = 5, stroke = 1.1, shape = 21) + 
  theme_graph() + 
  ggtitle("random network") +
  theme(legend.position = "none")
fall_p + sim1_p # 'patchwork' required for this