data(package = "networkdata")Social Network Analysis
Worksheet 2: Exploring Functions and Graph Theory
Exploring packages and functions
(Interactive Session)
Handling network data
Inbuilt network data
The networkdata package includes around 1000 datsets and more than 2000 networks. Throughout the course will use several examples using data from this package. Spend some time exploring datasets in this package (you will be asked to choose and work on one of them for you empirical study).
- Freeman’s datasets from http://moreno.ss.uci.edu/data (not available anymore)
- Movie networks from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/T4HBA3
- Covert networks from https://sites.google.com/site/ucinetsoftware/datasets/covert-networks
- Animal networks from https://bansallab.github.io/asnr/
- Shakespeare’s plays networks build with data from https://github.com/mallaham/Shakespeare-Plays
- Some networks from http://konect.uni-koblenz.de/
- Tennis networks compiled from https://github.com/JeffSackmann (please give credit to him if you use this data)
- Star Wars Character Interactions (Episode 1-7) from https://github.com/evelinag/StarWars-social-network
Graph Theory in R with igraph
We’ll use the igraph package to explore key graph-theoretical concepts.
library(igraph)Warning: package 'igraph' was built under R version 4.5.2
Undirected Graph
We’ll use an undirected graph with 8 nodes.
g <- graph_from_literal(
1 -- 2,
1 -- 3,
2 -- 4,
3 -- 5,
4 -- 5,
5 -- 6,
6 -- 7,
6 -- 8,
7 -- 8
)
plot(g, vertex.label.cex = 1.2, vertex.size = 20)Degree & Degree Distribution
deg <- degree(g)
deg1 2 3 4 5 6 7 8
2 2 2 2 3 3 2 2
# Degree distribution
dist <- degree_distribution(g)
plot(dist, type = "h", main = "Degree Distribution", xlab = "Degree", ylab = "Frequency")Identify nodes with the highest and lowest degree. How does the distribution look?
Graph Diameter
diameter(g)[1] 4
What is the longest shortest path in the graph?
Shortest Paths
Find shortest paths from node 1 to all others:
sp <- distances(g, v = 1)
sp 1 2 3 4 5 6 7 8
1 0 1 1 2 2 3 4 4
Do you understand the output? What is the shortest path from node 1 to node 6?
Adjacency Matrix
adj_matrix <- as_adjacency_matrix(g, sparse = FALSE)
adj_matrix 1 2 3 4 5 6 7 8
1 0 1 1 0 0 0 0 0
2 1 0 0 1 0 0 0 0
3 1 0 0 0 1 0 0 0
4 0 1 0 0 1 0 0 0
5 0 0 1 1 0 1 0 0
6 0 0 0 0 1 0 1 1
7 0 0 0 0 0 1 0 1
8 0 0 0 0 0 1 1 0
Use the adjacency matrix to compute the degree of each node. Compare with degree(g).
Cutpoints (Articulation Points)
articulation_points <- articulation_points(g)
articulation_points+ 2/8 vertices, named, from 3a13d41:
[1] 6 5
Bridges (Critical Edges)
bridge_edges <- which(is.na(edge_connectivity(g)))
E(g)[bridge_edges]+ 0/9 edges from 3a13d41 (vertex names):
# Alternatively use:
bridges(g)+ 1/9 edge from 3a13d41 (vertex names):
[1] 5--6
Visualize Cutpoints and Bridges
V(g)$color <- ifelse(V(g) %in% articulation_points, "red", "skyblue")
E(g)$color <- ifelse(E(g) %in% bridges(g), "red", "black")
plot(g, vertex.size = 20, vertex.label.cex = 1.2)Which nodes and edges are critical to keeping the graph connected?
Directed Graph
Now let’s work with a directed graph.
g_dir <- graph_from_literal(
A -+ B, A -+ C,
B -+ D,
C -+ D,
D -+ E,
E -+ F,
F -+ C
)
plot(g_dir, vertex.label.cex = 1.2, vertex.color = "lightcoral", edge.arrow.size = 0.5)In-Degree and Out-Degree
degree(g_dir, mode = "in") # incoming linksA B C D E F
0 1 2 2 1 1
degree(g_dir, mode = "out") # outgoing linksA B C D E F
2 1 1 1 1 1
Strongly Connected Components
components(g_dir, mode = "strong")$membership
A B C D E F
1 2 3 3 3 3
$csize
[1] 1 1 4
$no
[1] 3
Directed Paths and Diameter
diameter(g_dir, directed = TRUE)[1] 4
Explore how cycles and direction affect path lengths.
Exercises
Create some other small unidrected and directed graphs and see how the above measure vary on them
import the Florentine marriage and business network from the
networkdatapackage and compute the appropriate measures from above on it
library(networkdata)
data("flo_marriage")
data("flo_business")