Network analysis (NA) is a set of techniques that allow researchers to describe relations among actors and to analyze the social structures that emerge from the recurrence of these relations. Its main purpose is to explain social phenomena as well as possible, and when it is applied to social relations it is usually called social network analysis (SNA). Here, the term social network refers to the expression of a social relationship among individuals, families, households, villages, communities, regions, and so on. Besides contributing to the explanation of quantitative concepts, SNA helps phenomena to be better understood by enabling the quantitative measurement of qualitative concepts.
In this study, application examples are worked through as an introduction to social network analysis, using synthetic data generated with the R programming language.
In SNA, the data set is generally converted into matrix format before analysis. However, programming languages such as R and Python also support other forms of data.
Network analysis can be applied in many different subject areas. Prominent areas include:
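As a minimal sketch of the matrix format and one alternative form, the same small network can be written as an adjacency matrix or as an edge list. The actor names (Ali, Ayse, Can) are made up for illustration; the functions come from the igraph package.

```r
library(igraph)

# Hypothetical actors; a 1 in row i, column j means i is tied to j
actors <- c("Ali", "Ayse", "Can")
adj <- matrix(c(0, 1, 1,
                1, 0, 0,
                1, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(actors, actors))

# Build an undirected graph from the matrix form
g_mat <- graph_from_adjacency_matrix(adj, mode = "undirected")

# The same ties written as an edge list (one row per tie)
el <- as_edgelist(g_mat)
print(el)

# Going back from the edge list gives a graph with the same ties
g_el <- graph_from_edgelist(el, directed = FALSE)
```

The matrix form grows with the square of the number of actors, while the edge list only grows with the number of ties, which is one reason edge lists are common for sparse networks.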
- Marketing Analytics
- Social relationships
- Pharma
- Banking Transactions
- Supply Chain
- Telecom
Components of a Network
Node (Vertex): Nodes, or vertices, are the units that are related to each other.
Edge: An edge is a set of two nodes and is drawn as a line linking them.
The components of a network are presented in Figure 1.
Figure 1: Components of a Network
In the following sections, the R code blocks are given step by step, followed by the results obtained.
Loading Libraries
sapply(c("dplyr", "tibble", "tidyr", "ggplot2", "ggraph", "igraph", "tidygraph", "networkD3"), require, character.only = TRUE) # networkD3 provides dendroNetwork(), used in the dendrogram examples
Undirected Networks
set.seed(1)
edge <- sample(1:3, 6, replace = TRUE) # simple random sampling (SRS) with replacement
g1 <- graph(edges = edge, n = 3, directed = FALSE) # consecutive pairs of values form the edges
plot(g1)
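To make the role of the edge vector explicit, here is a small sketch with a fixed (not sampled) vector. It uses make_graph(), the current igraph name for the graph() constructor used in this article; the vector values are chosen only for illustration.

```r
library(igraph)

# Consecutive pairs of the vector form the edges: (1,2), (2,3), (3,1)
edge <- c(1, 2, 2, 3, 3, 1)
g <- make_graph(edges = edge, n = 3, directed = FALSE)

as_edgelist(g)        # one row per edge
ecount(g)             # number of edges equals length(edge) / 2
sum(which_loop(g))    # number of self-loops (pairs such as (2,2))
```

When the vector is sampled with replacement, as in the block above, a pair can repeat a value or occur more than once, so self-loops and multiple edges may appear in the plot.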
Directed Networks
set.seed(3)
edge <- sample(1:3, 6, replace = TRUE) # simple random sampling (SRS) with replacement
g1 <- graph(edges = edge, n = 3, directed = TRUE)
plot(g1)
Determining the Number of Vertices or Nodes
set.seed(5)
edge <- sample(1:5, 10, replace = TRUE)
g1 <- graph(edges = edge, n = 15, directed = TRUE) # n = 15 creates 15 vertices; those not used by any edge stay isolated
plot(g1)
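The effect of the n argument can be checked without plotting. The following sketch uses a fixed edge vector (an assumption for illustration) and counts the vertices that end up with no edge.

```r
library(igraph)

# Two edges touch vertices 1..3, but n = 6 asks for six vertices in total
g <- make_graph(edges = c(1, 2, 2, 3), n = 6, directed = TRUE)

vcount(g)                            # total number of vertices
isolates <- which(degree(g) == 0)    # vertex ids with no incident edge
isolates
```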
Creating the edge list with vertex names in directed networks
set.seed(7)
edge<-c(sample(LETTERS[1:5], 10, replace=T))
g1 <- graph(edges=edge, directed=T)
plot(g1)
g1
Printing g1 shows the vertex names and the edges created by the R code block above.
IGRAPH 6ea3d94 DN-- 3 5 --
+ attr: name (v/c)
+ edges from 6ea3d94 (vertex names):
[1] B->C D->B B->C C->B D->C
Isolated vertices can be added to a network by providing a list of their names.
set.seed(9)
edge<-c(sample(LETTERS[1:5], 10, replace=T))
isolated_edge<-c(sample(LETTERS[6:20], 8, replace=F))#SRS without Replacement
name <- graph(edge, isolates=isolated_edge)
plot(name, edge.arrow.size=.5, vertex.color="red", vertex.size=20,
vertex.frame.color="brown", vertex.label.color="black",
vertex.label.cex=0.8, vertex.label.dist=3, edge.curved=0.2)
Determining Directed Networks Manually
plot(graph_from_literal(Artvin--+Trabzon, Trabzon+--Ankara, Trabzon+--Mardin, Trabzon+--İzmir),
edge.curved=0.8, vertex.color="red", edge.arrow.size=0.5, vertex.size=50)
Determining Undirected Networks Manually
plot(graph_from_literal(Artvin---Trabzon, Trabzon---Ankara, Trabzon---Mardin, Trabzon---İzmir),
edge.curved=0.8, vertex.color="gray", edge.arrow.size=0.5, vertex.size=50)
Identifying Mutual Directions Manually
plot(graph_from_literal(Artvin+-+Trabzon, Trabzon+-+Ankara, Trabzon+-+Mardin, Trabzon+-+İzmir),
edge.curved=0.8, vertex.color="gray", edge.arrow.size=0.5, vertex.size=50)
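The edge operators of graph_from_literal() can also be inspected without a plot. This is a small sketch with placeholder vertex names (A, B, C) rather than the city names used above.

```r
library(igraph)

# -+ points from left to right; +-+ creates one edge in each direction
g <- graph_from_literal(A -+ B, B +-+ C)

is_directed(g)     # the graph is directed because '+' is used
as_edgelist(g)     # A->B plus both directions between B and C
which_mutual(g)    # flags edges that have a reciprocal partner
```

A mutual tie is stored as two directed edges, which is why the edge count is larger than the number of lines drawn between vertices.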
Network Attributes
E(name)# Edges
+ 5/5 edges from 87b24d1 (vertex names):
[1] C->E C->C C->D C->D E->B
V(name)# Vertices or Nodes
+ 12/12 vertices, named, from 87b24d1:
[1] C E D B J O Q H N K P F
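Beyond listing them, E() and V() can be used to read and set attributes. The sketch below builds a small named graph by hand; the colour and the weights are hypothetical values chosen only to show the mechanism.

```r
library(igraph)

g <- make_graph(c("C", "E", "C", "D", "E", "B"), directed = TRUE)

V(g)$name                    # vertex names
V(g)$color <- "red"          # set a vertex attribute for all vertices
E(g)$weight <- c(1, 2, 3)    # set an edge attribute (hypothetical weights)

vertex_attr_names(g)         # attributes stored on vertices
edge_attr_names(g)           # attributes stored on edges
```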
Implementation Examples of Network Analysis
Version 1
set.seed(11)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T),
C=sample(200:204,400, replace=T))
graph <- graph_from_data_frame(data) # creates an igraph graph from a data frame (graph.data.frame() is the older name); columns A and B define the edges, C becomes an edge attribute
plot(graph)
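How the columns of the data frame are interpreted can be seen on a much smaller table. The sketch below uses a hypothetical four-row edge table with the same column roles as the synthetic data above.

```r
library(igraph)

# Hypothetical edge table: the first two columns are the endpoints,
# any remaining column becomes an edge attribute
df <- data.frame(from = c(1, 1, 2, 1),
                 to   = c(2, 2, 3, 3),
                 C    = c(200, 201, 200, 204))

g <- graph_from_data_frame(df, directed = TRUE)

edge_attr_names(g)   # includes "C"
E(g)$C               # attribute values, one per row of df
count_multiple(g)    # how many times each edge occurs
```

With 400 rows drawn from only a few possible endpoint values, most edges in the synthetic data are repeated many times, which is why the plots look dense.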
Version 2
set.seed(13)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T),
C=sample(200:204,400, replace=T))
graph1 <- graph_from_data_frame(data) # graph_from_data_frame() creates igraph graphs from one or two data frames; it has two modes of operation, depending on whether the vertices argument is NULL or not
ggraph(graph1) +
geom_edge_link(aes(colour = factor(C))) +
geom_node_point()
Version 3: Layout “Linear”
set.seed(15)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
graph1 <- graph_from_data_frame(data) # vertices argument left as NULL, so vertex names are taken from columns A and B
ggraph(graph1, layout = 'linear', circular = TRUE) +
geom_edge_link(aes(colour = factor(C))) +
geom_node_point()
Version 4: Layout “Star”
The star layout is a simple layout generator that places one vertex in the center of a circle and the rest of the vertices equidistantly on the perimeter.
set.seed(17)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("red",5),rep("blue", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="star") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()
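The geometry of the star layout can be checked directly on the coordinates that igraph computes. This sketch calls layout_as_star() on a small star graph built with make_star(); the graph size is an arbitrary choice for illustration.

```r
library(igraph)

g <- make_star(6, mode = "undirected")      # vertex 1 is the hub
xy <- layout_as_star(g, center = V(g)[1])   # matrix with one (x, y) row per vertex

xy
r <- sqrt(rowSums(xy^2))   # distance of each vertex from the origin
r
```

The center vertex sits at the origin and all other vertices share the same distance from it, which matches the description of the layout above.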
Version 5: Layout “Circle”
The circle layout places vertices on a circle, in the order of their vertex ids.
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("green",5),rep("blue", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="circle") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()
Version 6: Layout “Grid”
This layout places vertices on a rectangular grid, in two or three dimensions.
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("green",5),rep("blue", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="grid") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()+
ggtitle("Grid Layout")
Version 7: Layout “Sphere”
The sphere layout places vertices on a sphere, approximately uniformly, in the order of their vertex ids.
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("green",5),rep("blue", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="sphere") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()+
ggtitle("Layout Sphere")
Version 7: Layout “KK”
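The KK layout places vertices on the plane using the Kamada-Kawai algorithm, which is based on a physical model of springs. Its kkconst argument is a numeric scalar, the Kamada-Kawai vertex attraction constant; the typical (and default) value is the number of vertices.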
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("green",5),rep("blue", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="kk") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()+
ggtitle("Layout KK")
Version 7: Layout “FR”
The FR layout places vertices on the plane using the force-directed layout algorithm by Fruchterman and Reingold.
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("brown",5),rep("blue", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="fr") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()+
ggtitle("Layout FR")
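Force-directed layouts such as FR start from random positions, so the seed matters for reproducible figures. The sketch below computes the layout twice on a small ring graph (an arbitrary example graph) with the same seed and compares the coordinates.

```r
library(igraph)

g <- make_ring(10)

set.seed(19)
xy1 <- layout_with_fr(g)
set.seed(19)
xy2 <- layout_with_fr(g)

dim(xy1)               # one (x, y) row per vertex
all.equal(xy1, xy2)    # same seed, same coordinates
```

This is one reason every block in this article calls set.seed() before drawing: without it, repeated runs would give differently arranged plots.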
Version 8: Layout “MDS”
The MDS layout positions vertices by multidimensional scaling of a distance matrix defined on the vertices of the graph.
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("brown",5),rep("green", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="mds") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()+
ggtitle("Layout MDS")
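To illustrate what "a distance matrix defined on the vertices" can look like, the sketch below computes shortest-path distances on a small ring graph and passes them to cmdscale(), the classical MDS function from base R. It assumes the igraph default, in which layout_with_mds() uses the unweighted shortest-path matrix when no distance matrix is supplied.

```r
library(igraph)

g <- make_ring(6)

d <- distances(g)                  # shortest-path lengths between all vertex pairs
xy_manual <- cmdscale(d, k = 2)    # classical MDS on that matrix (stats package)
xy_igraph <- layout_with_mds(g)    # igraph's MDS layout

d[1, 4]                            # path length between opposite vertices of the ring
dim(xy_igraph)                     # one (x, y) row per vertex
```

The two coordinate sets need not be identical, since an MDS solution is only defined up to rotation and reflection.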
Version 9: Layout “LGL”
The LGL layout is a layout generator for larger graphs.
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
y<-c(rep("brown",5),rep("green", 5))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout="lgl") +
geom_edge_fan(color="gray50", width=0.8, alpha=0.5) +
geom_node_point(color=y, size=10) +
theme_void()+
ggtitle("Layout LGL")
Version 10: Layout “Arc”
set.seed(19)
data<-tibble(A=sample(1:6,400, replace=T),B=sample(1:10,400, replace=T), C=sample(200:204,400, replace=T))
graph1 <- graph_from_data_frame(data)
ggraph(graph1, layout = 'linear') + # vertices on a line, edges drawn as arcs
geom_edge_arc(color = "brown", width=0.7) +
geom_node_point(size=5, color="gray50") +
theme_void()+
ggtitle("Layout Arc")
Version 11: Layout “Dendrogram”
set.seed(1453)
data1 <- tibble(from=sample(1:15,30, replace=T),to=sample(1:15,30, replace=T)) # tibble() replaces the deprecated data_frame()
letter<-sample(LETTERS[1:15],30, replace=T)
data1<-as.matrix(data1)
rownames(data1)<-letter
hc1 <- hclust(dist(data1), "ave")
dendroNetwork(hc1)
Version 12: Layout “Dendrogram” with Link Type “Diagonal”
set.seed(1453)
data1 <- tibble(from=sample(1:15,30, replace=T),to=sample(1:15,30, replace=T)) # tibble() replaces the deprecated data_frame()
letter<-sample(LETTERS[1:15],30, replace=T)
data1<-as.matrix(data1)
rownames(data1)<-letter
hc1 <- hclust(dist(data1), "ave")
dendroNetwork(hc1, height = 600, linkType = "diagonal")
Version 13: Layout “Dendrogram” with Link Type “Elbow”
set.seed(1461)
data1 <- tibble(from=sample(1:15,30, replace=T),to=sample(1:15,30, replace=T)) # tibble() replaces the deprecated data_frame()
letter<-sample(LETTERS[1:15],30, replace=T)
data1<-as.matrix(data1)
rownames(data1)<-letter
hc1 <- hclust(dist(data1), "ave")
dendroNetwork(hc1, linkType = "elbow", textColour = c("red", "green", "blue")[cutree(hc1, 3)],treeOrientation = "vertical")
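The textColour argument above relies on cutree(), which cuts the clustering tree into a fixed number of groups. The sketch below shows that step on its own using only base R; the small matrix and the row labels are made up for illustration.

```r
# Base R only: dist(), hclust() and cutree() are in the stats package
set.seed(1461)
m <- matrix(sample(1:15, 20, replace = TRUE), ncol = 2)
rownames(m) <- paste0("obs", 1:10)    # hypothetical row labels

hc <- hclust(dist(m), method = "average")
groups <- cutree(hc, k = 3)           # cluster id (1 to 3) for every row

table(groups)                         # cluster sizes
leaf_colours <- c("red", "green", "blue")[groups]   # one colour per leaf
leaf_colours
```

Indexing the colour vector by the cluster ids gives every leaf of the dendrogram the colour of its cluster.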
Conclusion
This study aimed to raise awareness of network analysis (NA or SNA) by first introducing its basic components and then analyzing synthetic data.
We hope it will be useful.
Stay with science and technology.
Yours respectfully.
References
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/491572/socnet_howto.pdf
https://dash.harvard.edu/bitstream/handle/1/4276348/Christakis_SocialNetwkVisual.pdf?sequence=2
https://cran.r-project.org/web/packages/tidygraph/index.html
https://sna.stanford.edu/lab.php?l=1
https://sna.stanford.edu/lab.php?l=5
https://www.sciencedirect.com/topics/social-sciences/network-analysis
https://www.slideshare.net/bodacea
https://www.sagepub.com/sites/default/files/upm-binaries/35208_Chapter1.pdf
https://www.rdocumentation.org/packages/networkD3/versions/0.4/topics/dendroNetwork
https://sites.google.com/a/umn.edu/social-network-analysis/terminology
Static and dynamic network visualization with R
https://igraph.org/r/doc/layout_with_kk.html
Kamada, T. and Kawai, S.: An Algorithm for Drawing General Undirected Graphs. Information Processing Letters, 31/1, 7–15, 1989.
https://igraph.org/r/doc/layout_on_grid.html
https://igraph.org/r/doc/layout_with_lgl.html
Fruchterman, T.M.J. and Reingold, E.M. (1991). Graph Drawing by Force-directed Placement. Software – Practice and Experience, 21(11):1129-1164.
https://igraph.org/r/doc/layout_with_fr.html
https://igraph.org/r/doc/layout_on_sphere.html
https://igraph.org/r/doc/layout_in_circle.html