## Statistical Analysis of Networks (syllabus)

I teach a course on the statistical analysis of networks that surveys the main tools used in network analysis. The overarching goal of the course is that, upon completion, students will be capable of developing research questions from a network perspective and incorporating network-based tools in their own research. The course is primarily methodological, dedicating the majority of the time to working through the mechanics of network-based tools. Students will also gain experience with R, a commonly used software program for network data management, analysis, and visualization.

Even if you are not taking the course, feel free to browse the lecture notes and labs. As always, I welcome any feedback!

**Lecture Notes and Labs**

**Introduction to Network Analysis**

The introductory lecture covers the basics of network analysis: what it is, why we do it, etc. The learning goals of this lecture are to 1) understand the difference between individual and network approaches to research and 2) become familiar with the basic data elements in network analysis.

__Files__:

Introduction lecture (pdf)

R script for Introduction lecture (.R)

Raw course survey data (.csv)

Course survey data instrument (pdf)

R script to clean raw course survey data (.R)

Adjacency matrices for course survey data in .csv format (Spend time net) (Talk about course net) (Trust net)

R data file with cleaned course survey data and network objects (.rdata)

**Network Data Structures and Sources of Network Data**

These lectures cover the basics of using matrices to represent networks and introduce fundamental concepts of graph theory. Various sources of network data are also discussed, with special attention given to the Prison Inmate Network Study (PINS) and the Boston Special Youth Project (SYP) Affiliation data.
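As a rough illustration of the matrix representation (a standalone Python sketch, not taken from the R lab scripts), the function below turns an edge list into a 0/1 adjacency matrix. The node labels, the example ties, and the function name `adjacency_matrix` are made up for this example.

```python
def adjacency_matrix(nodes, edges, directed=True):
    """Build a 0/1 adjacency matrix (a list of rows) from an edge list."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for sender, receiver in edges:
        i, j = index[sender], index[receiver]
        matrix[i][j] = 1
        if not directed:
            matrix[j][i] = 1  # undirected ties are stored symmetrically
    return matrix


# Hypothetical three-node example: A is tied to B, and B is tied to C.
nodes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]

undirected = adjacency_matrix(nodes, edges, directed=False)
directed = adjacency_matrix(nodes, edges, directed=True)
```

In the undirected version the matrix is symmetric; in the directed version row `i`, column `j` records a tie sent from `i` to `j`.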

__Files__:

Data structures lecture (pdf)

Sources of data lecture (pdf)

R script for Data structures lecture (.R)

**NOTE: This site is under construction as I am currently teaching the course. All material appearing after this note is being revised. Everything above, though, is the new stuff!**

**Matrices, Graph Theory, and R**

This lecture covers the basics of using matrices to represent networks. Fundamental concepts of graph theory are also introduced. The accompanying R scripts provide an introduction to R (Lab 1) and an introduction to network analysis in R (Lab 2).

__Lecture Slides__

graphs_matrices.pdf

__R Scripts__

syntax_lab_1.r

syntax_lab_2.r

__Data Files Referenced in Scripts__

lab_2_undirected_example.csv

lab_2_directed_example.csv

**Centrality**

This lecture provides an introduction to centrality measures for individual nodes as well as centralization measures for graphs. In particular, the following centrality measures are described: degree, closeness, and betweenness. The accompanying R script (Lab 3) provides an overview of these measures using the sna package in R.
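The three measures can be sketched in a few lines of plain Python. This is an illustrative implementation for a small, connected, undirected graph, not the code in Lab 3 and not the sna package's functions. Closeness here is (n - 1) divided by the sum of shortest-path distances, and betweenness uses Brandes' accumulation and is left unnormalized. The star-shaped example graph is invented.

```python
from collections import deque


def degree(graph):
    """Number of neighbors of each node."""
    return {v: len(nbrs) for v, nbrs in graph.items()}


def closeness(graph):
    """(n - 1) / sum of shortest-path distances; assumes a connected graph."""
    n = len(graph)
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[source] = (n - 1) / sum(dist.values())
    return result


def betweenness(graph):
    """Unnormalized betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order, dist = [], {s: 0}
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):  # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # each unordered pair was counted from both endpoints
    return {v: b / 2 for v, b in score.items()}


# Hypothetical star graph: "hub" is tied to three otherwise unconnected nodes.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
deg, clo, btw = degree(star), closeness(star), betweenness(star)
```

In the star, the hub scores highest on all three measures, which is the usual textbook case of maximal centralization.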

__Lecture Slides__

centrality.pdf

__R Scripts__

syntax_lab_3.r

**Global Structure from Local Structure**

This lecture examines dyadic and triadic structures in networks. The accompanying R script (Lab 4) provides an overview of the dyad.census and triad.census functions in the sna package in R. The discussion of undirected networks examines dyads and triads using the Noordin Terrorist Network data, available on Sean Everton's website. The discussion of directed networks examines dyads and triads in a discussion network created in one of my courses. In addition, the lecture introduces inferential statistics for social networks through conditional uniform graph models, as used in Andrew Papachristos's network study of gang homicide.
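To make the census idea concrete, here is a small Python sketch (not the sna functions themselves). It counts mutual, asymmetric, and null dyads in a directed adjacency matrix, and it classifies the triads of an undirected network by how many of their three possible ties are present. The full directed triad census distinguishes 16 types and is not reproduced here; both example matrices are invented.

```python
from itertools import combinations


def dyad_census(matrix):
    """Count mutual, asymmetric, and null dyads in a directed 0/1 matrix."""
    counts = {"mutual": 0, "asymmetric": 0, "null": 0}
    labels = ("null", "asymmetric", "mutual")
    for i, j in combinations(range(len(matrix)), 2):
        ties = matrix[i][j] + matrix[j][i]  # 0, 1, or 2 ties within the pair
        counts[labels[ties]] += 1
    return counts


def undirected_triad_census(matrix):
    """Counts of triads with 0, 1, 2, and 3 ties in a symmetric 0/1 matrix."""
    counts = [0, 0, 0, 0]
    for i, j, k in combinations(range(len(matrix)), 3):
        counts[matrix[i][j] + matrix[i][k] + matrix[j][k]] += 1
    return counts


# Hypothetical directed network: 0 <-> 1, 1 -> 2, node 3 isolated.
directed_net = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Hypothetical undirected network: triangle 0-1-2 plus a tie between 2 and 3.
undirected_net = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

dyads = dyad_census(directed_net)
triads = undirected_triad_census(undirected_net)
```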

__Lecture Slides__

global.pdf

__R Scripts__

syntax_lab_4.r

__Data Files Referenced in Scripts__

lab_4_friendship_noordin_terrorist.csv

lab_4_talked_with_network.csv

**Brokerage**

This lecture examines brokerage structures in networks. Specifically, it covers raw brokerage and the types of mediation structures discussed in Gould and Fernandez (1989). The accompanying R script (Lab 6) provides an overview of the brokerage function in the sna package in R. The first part of the lab examines fictitious networks designed to illustrate brokerage in small graphs. The second part of the lab examines brokerage using an inmate network (directed graph) and the Noordin Terrorist Network data (undirected graph). The second network is available on Sean Everton's website.
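The five brokerage roles can be illustrated with a short Python sketch. It is an assumption-laden toy rather than the sna brokerage function: node `b` is counted as a broker whenever `a -> b -> c` exists and there is no direct `a -> c` tie, and the role depends only on the group memberships of the three nodes. The node names, groups, and ties are invented.

```python
def classify(group_a, group_b, group_c):
    """Role of broker b in a -> b -> c, based on group membership."""
    if group_a == group_b == group_c:
        return "coordinator"      # A -> A -> A
    if group_a == group_c:
        return "itinerant"        # A -> B -> A (also called consultant)
    if group_a == group_b:
        return "representative"   # A -> A -> B
    if group_b == group_c:
        return "gatekeeper"       # A -> B -> B
    return "liaison"              # A -> B -> C


def brokerage_roles(edges, group):
    """Count brokerage roles per node in a directed edge list."""
    edge_set = set(edges)
    nodes = sorted(group)
    roles = ("coordinator", "itinerant", "representative", "gatekeeper", "liaison")
    counts = {b: dict.fromkeys(roles, 0) for b in nodes}
    for a in nodes:
        for b in nodes:
            for c in nodes:
                if len({a, b, c}) < 3:
                    continue  # brokerage needs three distinct nodes
                # b brokers only if a reaches c through b and not directly
                if (a, b) in edge_set and (b, c) in edge_set and (a, c) not in edge_set:
                    counts[b][classify(group[a], group[b], group[c])] += 1
    return counts


# Hypothetical chain a1 -> a2 -> a3 -> b1 -> c1 across three groups.
group = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "c1": "C"}
edges = [("a1", "a2"), ("a2", "a3"), ("a3", "b1"), ("b1", "c1")]
roles = brokerage_roles(edges, group)
```

Here `a2` acts as a coordinator, `a3` as a representative, and `b1` as a liaison. Raw brokerage for a node is the sum of its five role counts.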

__Lecture Slides__

brokerage.pdf

__R Scripts__

syntax_lab_6.r

__Data Files Referenced in Scripts__

get_along_with_network.csv

get_along_with_attribute.csv

get_along_with_attributes.csv

noordin_terrorist_friendship_network.csv

noordin_terrorist_education_attribute.csv

**Bipartite Graphs and Projection**

These lectures examine data that are two-mode, that is, data in which objects in one modality (e.g. people) are connected through a second modality (e.g. clubs). The accompanying R scripts (Labs 7 & 8) provide an overview of how to create adjacency matrices from two-mode networks and how to analyze bipartite graphs. In addition, projection of two-mode networks to one-mode networks (i.e. unipartite graphs) and weighted-edge networks are discussed.
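Projection is simple to show with matrices: if A is the people-by-clubs affiliation matrix, the product of A with its transpose gives a people-by-people matrix whose off-diagonal entries count shared clubs. The plain-Python sketch below uses an invented three-person, three-club example and is not taken from Labs 7 or 8.

```python
def project(affiliation):
    """One-mode projection of the rows: entry (i, j) counts shared columns."""
    n = len(affiliation)
    return [
        [sum(a * b for a, b in zip(affiliation[i], affiliation[j])) for j in range(n)]
        for i in range(n)
    ]


def binarize(weighted):
    """Drop the diagonal and keep a 0/1 tie wherever the weight is positive."""
    n = len(weighted)
    return [
        [1 if i != j and weighted[i][j] > 0 else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical affiliation matrix: rows are people, columns are clubs.
affiliation = [
    [1, 1, 0],  # Ann belongs to clubs 1 and 2
    [1, 1, 1],  # Bob belongs to all three clubs
    [0, 0, 1],  # Cy belongs to club 3 only
]

person_weighted = project(affiliation)            # shared-club counts
person_binary = binarize(person_weighted)         # unweighted one-mode network
club_weighted = project([list(col) for col in zip(*affiliation)])  # club by club
```

The diagonal of the weighted projection holds each person's number of memberships. Binarizing discards the tie weights, which is why weighted-edge networks are treated as a topic of their own.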

__Lecture Slides__

bipartite_networks.pdf

weighted_networks.pdf

__R Scripts__

crj_605_networks_syntax_lab_7.r

syntax_lab_8.r

__Data Files Referenced in Scripts__

asu_ccj_faculty_coauthor_network.csv

**Exponential Random Graph Models (ERGMs)**

These lectures examine the logic of generative models for network data. The first lecture introduces the exponential random graph model (ERGM) as a flexible tool for specifying network configurations that generate global network structures. The accompanying R script (Lab 9) provides an overview of different dependence specifications of the model (e.g. dyadic independence) and working with node-level attributes using the ergm package. The script "roster.to.adjacency.r" takes network data in *roster* format and converts it to a sociomatrix. The second lecture discusses model degeneracy, model adequacy, and goodness of fit for ERGMs. The accompanying R script (Lab 10) examines functions for goodness of fit and simulation in the ergm package. A comprehensive archive of materials, discussion, syntax, papers, etc. for the ergm package is available at the statnet website.
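The simplest dyadic independence specification gives a feel for what an ERGM coefficient means. When the only term in the model is the number of edges, every tie is an independent coin flip, and the maximum likelihood estimate of the edges coefficient is the log-odds of the network density. The Python sketch below computes that value directly for an invented four-node directed network. It does not call the ergm package, and richer models with dependence terms cannot be estimated in this closed form.

```python
import math


def edges_only_coefficient(matrix, directed=True):
    """Log-odds of density: the MLE of the coefficient in an edges-only ERGM."""
    n = len(matrix)
    ties = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    dyads = n * (n - 1)
    if not directed:
        ties //= 2   # each undirected tie appears twice in a symmetric matrix
        dyads //= 2
    density = ties / dyads
    return math.log(density / (1 - density))


def tie_probability(theta):
    """Inverse logit: the tie probability implied by the coefficient."""
    return 1 / (1 + math.exp(-theta))


# Hypothetical directed network with 3 of the 12 possible ties present.
net = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

theta = edges_only_coefficient(net)
probability = tie_probability(theta)
```

A negative coefficient therefore means that ties are less likely than not, and converting it back through the inverse logit recovers the observed density.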

__Lecture Slides__

ergm_intro.pdf

ergm_gof.pdf

__R Scripts__

syntax_lab_9.r

syntax_lab_10.r

roster.to.adjacency.r

__Data Files Referenced in Scripts__

roster_risk_dem_data.csv

**Stochastic Actor-Based Models (SABMs)**

These lectures examine the logic of simulation-based models for network dynamics. Specifically, the stochastic actor-based model (SABM) is an approach to modeling change in network panel data. The accompanying R script (Lab 11) provides an overview of the RSiena package. The evolution of an advice network among 75 MBA students is examined (see http://www.stats.ox.ac.uk/~snijders/siena/ for details). The second lecture discusses co-evolution of networks and behavior. The accompanying R script (Lab 12) examines co-evolution using the RSiena package. A comprehensive archive of materials, discussion, syntax, papers, etc. for the RSiena package is available at the Siena website.
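The core of the simulation logic is the actor's "mini-step": an actor considers toggling one outgoing tie (or changing nothing) and chooses among these options with multinomial logit probabilities based on an evaluation function. The Python sketch below illustrates that single step with only outdegree and reciprocity effects. It is not RSiena code, and the parameter values and the three-actor network are invented rather than estimated.

```python
import math


def evaluation(matrix, i, beta_density, beta_recip):
    """Actor i's evaluation of a network: outdegree and reciprocity effects."""
    out_ties = sum(matrix[i])
    reciprocated = sum(matrix[i][j] * matrix[j][i] for j in range(len(matrix)))
    return beta_density * out_ties + beta_recip * reciprocated


def ministep_probabilities(matrix, i, beta_density, beta_recip):
    """Probability of each option open to actor i in one mini-step.

    Key j != i means "toggle the tie i -> j"; key i means "change nothing".
    """
    scores = {}
    for j in range(len(matrix)):
        candidate = [row[:] for row in matrix]
        if j != i:
            candidate[i][j] = 1 - candidate[i][j]  # create or dissolve i -> j
        scores[j] = evaluation(candidate, i, beta_density, beta_recip)
    total = sum(math.exp(s) for s in scores.values())
    return {j: math.exp(s) / total for j, s in scores.items()}


# Hypothetical network: actor 1 nominates actor 0; no other ties exist.
net = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]

# Illustrative values: ties are costly (-1) but reciprocated ties are rewarded (+2).
probs = ministep_probabilities(net, i=0, beta_density=-1.0, beta_recip=2.0)
```

With these values actor 0 is most likely to reciprocate actor 1's nomination, next most likely to do nothing, and least likely to send an unreciprocated tie to actor 2.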

__Lecture Slides__

sabm_intro.pdf

sabm_coevolution.pdf

__R Scripts__

syntax_lab_11.r

syntax_lab_12.r

__Data Files Referenced in Scripts__

mba-advice1.csv

mba-advice2.csv

mba-advice3.csv

mba-performance.csv

friend1.csv

friend2.csv

friend3.csv

alcohol.csv

smoke.csv

drugs.csv