---
title: "Coevolution of Networks and Behavior with Stochastic Actor-Based Models (SABMs) in R Lab"
date: "CRJ 605 Statistical Analysis of Networks"
output:
html_document:
df_print: paged
theme: cosmo
highlight: haddock
toc: yes
toc_float: yes
code_fold: show
---
```{r,echo=FALSE,eval=TRUE,message=FALSE,warning=FALSE}
library(devtools)
library(network)
library(sna)
library(RSiena)
```
###
*This lab examines the coevolution of longitudinal networks and behavior using the `RSiena` package. Information about the Siena program can be found at:* http://www.stats.ox.ac.uk/~snijders/siena/
***
##**Getting Started**
As discussed in the [Coevolution of Networks and Behavior](https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/crj_605_networks_11_sabm_slides.pdf), we are trying to build a model that accurately represents the preferences among actors that generated the observed network between discrete time points. In addition, we would like to incorporate the changes that occur in behavior as a consequence of network position. As a result, we need to examine a *coevolution* model of the network and behavior. In other words, we want to examine the simultaneity of **network dynamics** and **behavior dynamics**.
Let's start by working with our advice network again. Recall that the network is 75 MBA students measured at three waves. At each wave, the students were asked to whom they ask for advice. In the previous lab, [Introduction to SABMs lab](https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/crj_605_networks_11_sabm_a_lab.html), we looked at how we could incorporate actor covariate terms by using the **performance ratings** of each student conducted at each wave. Since the performance ratings were different at each wave, we treated this is a "time-varying covariate". So, we used the `varCovar()` function (instead of the `coCovar()` function).
But, what if performance is influenced by whom one asks for advice? That is, what if performance is a *dependent* variable? In this case, we would want to model the coevolution of performance and the advice network.
Let's go ahead and load the data and prepare the objects for analysis.
###**Advice Network**
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# clear the workspace and load the libraries.
rm(list = ls())
library(sna)
library(network)
# Load the advice networks.
advice1 <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/mba-advice1.csv",as.is=T,header=T,row.names=1))
advice2 <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/mba-advice2.csv",as.is=T,header=T,row.names=1))
advice3 <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/mba-advice3.csv",as.is=T,header=T,row.names=1))
```
Take a look at a few plots.
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# We canlook at a compressed version of the networks over all three waves.
# This can be accomplished by adding the sociomatrices, and then recoding the counts to be binary.
compressed <- advice1 + advice2 + advice3 #add the matrices.
compressed[compressed > 1 ] <- 1 #recode values to be binary.
set.seed(605)
gplot(compressed, edge.col = "grey60", vertex.col="darkred", vertex.cex=1.5, main="Advice Compressed")
# Now, build a plot of the networks.
op <- par(mfrow=c(2,2), mai = c(0,0,0.7,0))
set.seed(605)
coords <- gplot(advice1, edge.col = "grey60", vertex.col="coral",label = seq(1,75), vertex.cex=1.5, label.pos = 5, label.cex=0.5)
title("Advice Network (t1)")
gplot(advice2, edge.col = "grey60", vertex.col="lightblue", coord = coords,label = seq(1,75), vertex.cex=1.5, label.pos = 5, label.cex=0.5)
title("Advice Network (t2)")
gplot(advice3, edge.col = "grey60", vertex.col="lavender", coord = coords,label = seq(1,75), vertex.cex=1.5, label.pos = 5, label.cex=0.5)
title("Advice Network (t3)")
gplot(compressed, edge.col = "grey60", vertex.col="darkred", coord = coords,label = seq(1,75), vertex.cex=1.5, label.pos = 5, label.cex=0.5)
title("Advice Compressed")
par(op)
```
*What does the comparison with the compressed network tell us?*
###**Performance**
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# load the performance.
performance.data <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/mba-performance.csv",as.is=TRUE,header=TRUE,row.names=1))
# look at the summary statistics.
summary(performance.data)
# Look at histograms of the performance data.
op <- par(mfrow=c(2,2))
hist(performance.data[,"perf1"] , col="coral" , main="Performance at time 1", ylim=c(0,30), xlim=c(20,30),xlab="")
abline(v=mean(performance.data[,"perf1"]), col="black")
hist(performance.data[,"perf2"] , col="lightblue", main="Performance at time 2", ylim=c(0,30), xlim=c(20,30),xlab="")
abline(v=mean(performance.data[,"perf2"]), col="black")
hist(performance.data[,"perf3"] , col="lavender" , main="Performance at time 3", ylim=c(0,30), xlim=c(20,30),xlab="")
abline(v=mean(performance.data[,"perf2"]), col="black")
plot(c(0, 1), c(0, 1), ann = F, bty = 'n', type = 'n', xaxt = 'n', yaxt = 'n')
text(x = 0.5, y = 0.5, paste("Plots of\n Performance\n among 75 MBAs"),cex = 1.5, col = "black")
par(op)
```
*What do the plots tell us about changes in performance use over time?*
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# This might be a bit easier to see by looking at within person change over time.
op <- par(mfrow=c(2,2))
hist(performance.data[,"perf2"] -performance.data[,"perf1"], col="red" , main="Change from t1 to t2", ylim=c(0,25), xlim=c(-5,5),xlab="")
abline(v=mean(performance.data[,"perf2"] -performance.data[,"perf1"]))
plot(c(0, 1), c(0, 1), ann = F, bty = 'n', type = 'n', xaxt = 'n', yaxt = 'n')
text(x = 0.5, y = 0.5, paste("Plots of\n Change to\n Performance\n among 75 MBAs"),cex = 1.5, col = "black")
hist(performance.data[,"perf3"] -performance.data[,"perf2"], col="blue" , main="Change from t2 to t3", ylim=c(0,25), xlim=c(-5,5),xlab="")
abline(v=mean(performance.data[,"perf3"] -performance.data[,"perf2"]))
par(op)
```
*What do these the plots tell us about changes in performance use over time?*
***
##**Coevolution of Advice and Performance**
As we have seen, for an RSiena analysis, we have to build several objects. The objects will serve as variables:
* A data object to examine (using `sienaDependent()` and `sienaDataCreate()`).
* A set of effects to estimate (using `getEffects` and `includeEffects()`).
* A model object that will have the terms we want to estimate (using `sienaAlgorithmCreate` or `sienaModelCreate()`).
* Then, we estimate the model using `siena07()`.
###**Step 1: Building the object to analyze using the `sienaDependent()` and `sienaDataCreate()` functions**
Since we are examining coevolution, we will have two dependent variables:
* The advice network
* The performance measure
So, we will need to create two dependent variables using the `sienaDependent()` function. *Note* that this is a different approach from defining *performance* using the `varCovar()` function.
Let's go ahead and create our dependent variables. We will create our advice dependent variable the same way. But, for the performance dependent variable, we need to use the `type=` argument in the `sienaDependent()` function. Specifically, we need to state `type="behavior"` as the default is `type="oneMode"`.
After we have defined each of these dependent variables, we will bind them together using the `sienaDataCreate()` function.
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# Load the RSiena package.
library(RSiena)
# Look at the type= argument.
?sienaDependent
# Build the advice array.
advice <- sienaDependent(
array( # define that it is an array.
c(advice1,advice2,advice3), # define the three networks.
dim=c(75,75,3) # 75 x 75 x 3 are the dimensions.
)
)
advice
# Build the performance variable.
# NOTE: Since "sienaDependent" requires a matrix, all we have to do is put each wave together using cbind().
performance <- sienaDependent(
cbind(
performance.data[,"perf1"], # perfomance wave 1.
performance.data[,"perf2"], # perfomance wave 2.
performance.data[,"perf3"]), # perfomance wave 3.
type="behavior" # Define it as a behavior.
)
performance
# Bind data together for Siena analysis:
CoEvolutionData <- sienaDataCreate(advice,performance)
CoEvolutionData
```
###**Step 2: Defining the set of effects to estimate using the `getEffects()` function**
Now, we create an effects object for model specification using the `getEffects()` function. Recall that, by default, the `getEffects` function will estimate the rate of change in each period (i.e. the **rate** function). In this example, we will see two rates estimated, the rate of change from t1:t2 (rate 1) and t2:t3 (rate 2). Also, the model automatically adds the *outdegree* and *reciprocity* terms as these are necessary for estimation.
In addition, since we have included a behavioral dependent variable, the `getEffects()` function includes the rate of chance for performance and terms for the linear and quadratic change in performance over time. As with the rate effects for the network, the rate and shape terms for behavior give a sense of how performance is changing over the periods.
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# Now, create the effects object.
CoEvolutionEffects <- getEffects(CoEvolutionData)
```
###**Step 3: Create the model using the `sienaAlgorithmCreate` and `sienaModelCreate()` functions**
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# Create a model that has particulars about estimation, then we estimate the model.
CoEvolutionModel <- sienaModelCreate(projname = "Advice & Performance", useStdInits=TRUE, seed=605)
CoEvolutionModel
# Print the report to look at the Jaccard index.
print01Report(CoEvolutionData, modelname = "Advice & Performance Example")
```
###**Step 4: Estimate the model using the `siena07` function**
Now we are ready to estimate the model! To do this, we pass to the `siena07()` function the model information, the data, and the effects. Recall that we have included the basic effects using the `getEffects()` function. Let's take a look at this very simple model to see how the network and the behavior are changing over the periods.
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# Estimate the model.
CoEvolutionResults <- siena07(
CoEvolutionModel, # the model estimation information.
data=CoEvolutionData, # the data object we created above.
effects=CoEvolutionEffects # the effects object we created above.
)
# Look at the results.
CoEvolutionResults
```
###**Interpreting the Output**
We estimate a **coevolution model**, we get two sets of estimates:
* Network Dynamics
* Behavior Dynamics
####**Network Dynamics**
The output shows the estimates for the **rate** function and the estimates for the **objective** function. For each coefficient, we see an estimate and a standard error. The **rate** estimates correspond to the estimated number of opportunities for change per actor for each period. The "Other parameters" section shows the **objective** estimates, or the **eval** estimates, which correspond to the relative attractiveness of a particular network state for each actor.
####**Behavior Dynamics**
The output shows the estimates for the **rate** function and the estimates for the **objective** function. For each coefficient, we see an estimate, a standard error, and a convergence *t*-ratio.
The **rate** estimates correspond to the estimated number of opportunities for each actor to change their *behavior* for each period. Keep in mind that these are not units of change (e.g. standard deviation change), but opportunities in the simulation to make a change. Recall that the model is simulating *micro-steps*, and in each micro-step an individual is provided the opportunity to make a decision. The **rate** parameter estimate gives a sense of how many opportunities are provided to each individual. For example, the estimate for period 1 is `3.7827`, meaning that each actor had nearly 4 opportunities to change his/her behavior or maintain their current behavior. In slight contrast, the estimate for period 2 is `2.4910`, meaning that each actor had just over 2 opportunities to change his/her behavior or maintain their current behavior. The *shape* parameters show that over the three waves, performance declined (negative linear coefficient) at an increasing rate (negative quadratic shape). As before, the significance of the effects can be evaluated in a manner similar to a regression coefficient by looking at the ratio of the estimate to the standard error (where a ratio of 1.96 indicates a *p*-value of *0.05*).
*Are these estimates for changes in performance consistent with what we observed in our plots of change in performance above?*
***
###**Specifying Coevolution Terms**
So far, our model has not specified any dependence terms between the advice network and the performance ratings. In other words, we are allowing these variables to evolve **independently**. Let's take a look at including terms where:
* The behavior influences the network (e.g. `egoX`, `alterX`, and `sameX` or `simX`)
* The network influences the behavior (e.g. `indeg`, `outdeg`, and `avSim`)
####**Behavior Effects on the Network**
Now we want to include the effects of `performance` on tie behavior. Recall that for actor covariates, we call these *interaction terms* because the outdegree depends on a covariate. We use the same syntax as we did in the prior lab for specifying these terms.
Let's use the terms we used from the prior lab:
* Individuals with a particular attribute are more likely to *send* ties, a *sender* effect. This effect is called `egoX`.
* Individuals with a particular attribute are more likely to *receive* ties, a *receiver* effect. This effect is called `alterX`.
* Individuals with a particular attribute are more likely to nominate others with the same or similar attribute, a *homophily* effect. This effect is called `sameX` or `simX`.
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# Recall that we can specify all three on 1 line.
CoEvolutionEffects <- includeEffects(
CoEvolutionEffects, # the existing effects object.
# now the terms:
egoX, # performance influences tie sending.
altX, # performance influences tie receiving.
simX, # homophily in performance.
interaction1="performance" # define the variable of interest.
)
```
####**Network Effects on Behavior**
Now we want to include the effects of the advice network on performance ratings. Recall that for actor covariates, we call these *interaction terms* because the outdegree depends on a covariate. The notion of *interaction* is the same here, but we want to see how behavior depends on a tie (rather than *visa versa*).
Here are some common terms:
* Tie *sending* influences behavior. This effect is called `outdeg`.
* Tie *receiving* influences behavior. This effect is called `indeg`.
* The behavior of those whom you are tied to influences your behavior. This effect is called `avSim`.
Let's add these terms to our existing effects object. The difference is that we have to use the `name=` argument to tell it what the dependent variable is (i.e. that it is not the network).
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
CoEvolutionEffects <- includeEffects(
CoEvolutionEffects, # the existing effects object.
name = "performance", # define the behavior as what is being influenced.
# now the terms:
indeg, # tie sending influences behavior.
outdeg, # tie receiving influences behavior.
avSim, # influence from those in network.
interaction1="advice" # define the network as what is driving behavior.
)
```
Now we are ready to estimate the model (again)!
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# Estimate the model.
CoEvolutionResults <- siena07(
CoEvolutionModel, # the model estimation information.
data=CoEvolutionData, # the data object we created above.
effects=CoEvolutionEffects # the effects object we created above.
)
# Look at the results.
CoEvolutionResults
```
####**Interpreting the Output (again)**
For the **Network Dynamics**, we added effects for performance influencing tie receiving (`eval performance alter`), tie sending (`eval performance ego`), and homophily (`eval performance similarity`). The results indicate that individuals:
* higher performance ratings are more likely to receive ties (a positive `eval performance alter` coefficient).
* lower performance ratings are more likely to send ties (a negative `eval performance ego` coefficient).
* similar performance levels are more likely to seek advice (a positive `eval performance similarity` coefficient).
*What is the overall interpretation for the effects of performance on advice seeking?*
What about the **Behavior Dynamics**? Recall that we added effects for performance to be influenced by the average performance rating for ego's network (`eval performance average similarity`), ego's indegree (`eval performance indegree`), and ego's Outdegree (`eval performance outdegree`). The results indicate that individuals:
* change their performance rating to align with the average of their advice network (a positive `eval performance average similarity` coefficient). Note that this sounds odd. What it is picking up is that individual's performance rating is influenced by the average performance rating of their advice network.
* who receive more ties increase their performance rating (a positive `eval performance indegree` coefficient). But, it is fairly small and not significantly different from zero.
* who send more ties increase their performance rating (a positive `eval performance outdegree` coefficient). But, it is also fairly small and not significantly different from zero.
*What is the overall interpretation for the effects of advice seeking on performance ratings?*
*What would be the next step in building this model?*
***
##**Coevolution of Networks**
As we saw before, we can model the coevolution of a network with behavior. We can also model the coevolution of the two networks together. Another way of thinking about this is that we are interested in multiplexity: or the overlapping of multiple networks.
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# ############## #
# Load the data. #
# talked about the class.
talk1 <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/crj_605_networks_spring_2019_talk_course_w1_adjancency.csv",as.is=TRUE,header=TRUE,row.names=1))
talk2 <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/crj_605_networks_spring_2019_talk_course_w2_adjancency.csv",as.is=TRUE,header=TRUE,row.names=1))
talk <- sienaDependent(array(c(talk1,talk2),dim=c(dim(talk1)[1],dim(talk1)[1],2)))
# trust with damaging information.
trust1 <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/crj_605_networks_spring_2019_trust_w1_adjancency.csv",as.is=TRUE,header=TRUE,row.names=1))
trust2 <- as.matrix(read.csv("https://www.jacobtnyoung.com/uploads/2/3/4/5/23459640/crj_605_networks_spring_2019_trust_w2_adjancency.csv",as.is=TRUE,header=TRUE,row.names=1))
trust <- sienaDependent(array(c(trust1,trust2),dim=c(dim(trust1)[1],dim(trust1)[1],2)))
# ############## #
# Create the objects for Siena.
# Dependent variables.
TalkTrustCoevolveData <- sienaDataCreate(talk,trust)
TalkTrustCoevolveData
# Now, create the effects object.
TalkTrustCoevolveEffects <- getEffects(TalkTrustCoevolveData)
# Create a model that has particulars about estimation, then we estimate the model.
TalkTrustCoevolveModel <- sienaModelCreate(projname = "Talk & Trust", useStdInits=TRUE, seed=605)
TalkTrustCoevolveModel
# Print the report to look at the Jaccard index.
print01Report(TalkTrustCoevolveData, modelname = "Talk & Trust Example")
# Estimate the model.
TalkTrustCoevolveResults <- siena07(
TalkTrustCoevolveModel, # the model estimation information.
data=TalkTrustCoevolveData, # the data object we created above.
effects=TalkTrustCoevolveEffects # the effects object we created above.
)
# Look at the results.
TalkTrustCoevolveResults
```
*What is the interpretation for each effect in the model?*
###**Coevolution Terms**
Now we want to include terms for how each network influences the other. First, we need to include structural effects for each network (i.e. `transTrip` and `cycle3`) and then include a term for how trust influences talking and *visa versa* using the `crprod` term. The `crprod` term is the effect if an *i->j* tie in network A on an *i->j* tie in network B. In other words, is a trust tie more likely from *i* to *j* if *i* has talked with *j* about the course? This effect is sometimes called a "main effect" or **entrainment** in that the networks are "growing together".
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
# Add the structural effects for each specific network.
TalkTrustCoevolveEffects <- includeEffects(TalkTrustCoevolveEffects,transTrip,cycle3,name="trust")
TalkTrustCoevolveEffects <- includeEffects(TalkTrustCoevolveEffects,transTrip,cycle3,name="talk")
# Include main effects of the dependent networks on each other's evolution:
TalkTrustCoevolveEffects <- includeEffects(TalkTrustCoevolveEffects,crprod,name="talk",interaction1="trust")
TalkTrustCoevolveEffects <- includeEffects(TalkTrustCoevolveEffects,crprod,name="trust",interaction1="talk")
# Now, re-estimate the model.
TalkTrustCoevolveResults <- siena07(TalkTrustCoevolveModel, data=TalkTrustCoevolveData, effects=TalkTrustCoevolveEffects)
TalkTrustCoevolveResults
```
*What is the interpretation for each effect in the model?*
There are additional effects for these models. For example:
* `crprodRecip` is the tendency for an *j->i* tie in network A on an *i->j* tie in network B. This is reciprocity across the networks.
* `crprodMutual` is the tendency for an *i->j* and an *j->i* tie in network A (or *j<->i*) on an *i->j* tie in network B. This is reciprocity in one network leads to ties in another network.
###**Goodness-of-Fit in Siena**
As with ergms, we saw how we can assess the goodness of fit for a model by simulating networks from a fitted model and then examining the structural properties of the simulated and observed networks. The `sienaGOF()` function can be used for this purpose. To use the function, we have to use the `returnDeps=` argument in the `siena07()` function. This argument, when set to `TRUE`, will keep the simulated data from the estimation to be used by the `sienaGOF()` function.
Let's take our previous example of the advice network and re-estimate the model, telling it to keep the simulations.
```{r,echo=TRUE,eval=TRUE,message=FALSE,include=TRUE}
GOF.CoEvolutionResults <- siena07(CoEvolutionModel, data=CoEvolutionData, effects=CoEvolutionEffects, returnDeps=TRUE)
GOF.CoEvolutionResults
# Now, use the sienaGOF() function to simulate the networks.
gof.indegree <- sienaGOF(GOF.CoEvolutionResults,IndegreeDistribution,verbose=TRUE,varName="advice")
gof.outdegree <- sienaGOF(GOF.CoEvolutionResults,OutdegreeDistribution,verbose=TRUE,varName="advice")
gof.triads <- sienaGOF(GOF.CoEvolutionResults,TriadCensus,verbose=TRUE,varName="advice")
# Now plot them.
triad.keys <- c("003","012","102","021D","021U","021C","111D","111U",
"030T","030C","201","120D","120U","120C","210","300")
op <- par(mfrow=c(2,2), mai = c(0,0,0.7,0))
plot(gof.indegree,key=0:8)
plot(gof.outdegree,key=0:8)
plot(gof.triads,key=triad.keys,center=TRUE,scale=TRUE)
par(op)
```
*What do the results for the goodness of fit tell us about our model?*
***
####***Questions?***####
```{r,echo=FALSE,eval=FALSE}
#END OF LAB
```