In the Introduction to Stochastic Actor-Based Models (SABMs) lecture you were introduced to the basic logic of the Stochastic Actor-Based model. This lab draws on that content, showing you how to estimate SABMs using the RSiena package. For WAY MORE details about RSiena, please review the RSiena manual.

**SABM for the PINS Get Along With Networks**

Recall from the Introduction to Stochastic Actor-Based Models (SABMs) lecture that we are trying to build a model that accurately represents the preferences among actors that generated the observed network between discrete time points. Let’s take two waves of data from the PINS study to examine how get along with nominations change over two waves. There are two network objects available on the SNA Textbook website:

# load the network package so it recognizes the objects as networks
library( network )

# define the path location for the file
loc1 <- "https://github.com/jacobtnyoung/sna-textbook/raw/main/data/data-pins-ga-panel-w1-net.rds"
gaNetT1 <- readRDS( url( loc1 ) )

loc2 <- "https://github.com/jacobtnyoung/sna-textbook/raw/main/data/data-pins-ga-panel-w2-net.rds"
gaNetT2 <- readRDS( url( loc2 ) )

# look at the network for wave 1
gaNetT1

##  Network attributes:
##   vertices = 73 
##   directed = TRUE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 213 
##     missing edges= 0 
##     non-missing edges= 213 
## 
##  Vertex attribute names: 
##     depression.w1 depression.w2 race smoker.w1 smoker.w2 socdist.w1 socdist.w2 vertex.names 
## 
## No edge attributes

# look at the network for wave 2
gaNetT2

##  Network attributes:
##   vertices = 73 
##   directed = TRUE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 186 
##     missing edges= 0 
##     non-missing edges= 186 
## 
##  Vertex attribute names: 
##     depression.w1 depression.w2 race smoker.w1 smoker.w2 socdist.w1 socdist.w2 vertex.names 
## 
## No edge attributes

Let’s take a look at the plots of each network to see what patterns we can discern over time.

# set the margins
par( mfrow=c( 1,2 ), 
     mar=c( 0.1, 0.5, 2, 0.5 ) )

# set the seed
set.seed( 605 )

# plot the first wave
gplot(
  gaNetT1, 
  edge.col = "grey60", 
  vertex.col="coral",
  label = seq( 1, dim( as.matrix( gaNetT1 ) )[1] ),  # set the labels to be 1 to the number of nodes
  vertex.cex=1.5, 
  label.pos = 5, 
  label.cex=0.5
  )
title( "Get Along With Network (t1)" )

# plot the second wave
gplot(
  gaNetT2, 
  edge.col = "grey60", 
  vertex.col="lavender",
  label = seq( 1, dim( as.matrix( gaNetT2 ) )[1] ),  # set the labels to be 1 to the number of nodes
  vertex.cex=1.5, 
  label.pos = 5, 
  label.cex=0.5
  )
title( "Get Along With Network (t2)" )

What do the plots show us about the evolution of the network over time?

This is a bit hard to see. One way we can get at this is to force the individuals to “stay” in the same position in the plot and see how individual actors’ indegree and outdegree change over time.

# Set the coordinates
coords <- gplot( gaNetT1 )

Now, we can plot individuals based on their position in the first wave:

# set the margins
par( mfrow=c( 1,2 ), 
     mar=c( 0.1, 0.5, 2, 0.5 ) )

# plot the first wave
gplot(
  gaNetT1, 
  edge.col = "grey60", 
  vertex.col="coral",
  label = seq( 1, dim( as.matrix( gaNetT1 ) )[1] ),  # set the labels to be 1 to the number of nodes
  vertex.cex=1.5, 
  label.pos = 5, 
  label.cex=0.5,
  coord = coords
  )
title( "Get Along With Network (t1)" )

# plot the second wave
gplot(
  gaNetT2, 
  edge.col = "grey60", 
  vertex.col="lavender",
  label = seq( 1, dim( as.matrix( gaNetT2 ) )[1] ),  # set the labels to be 1 to the number of nodes
  vertex.cex=1.5, 
  label.pos = 5, 
  label.cex=0.5,
  coord = coords
  )
title( "Get Along With Network (t2)" )

Take a look at a few specific nodes. What do the plots show us about the evolution of the network over time?

Now, we want to think about modeling the evolution of this network over time. To do that, we need to look through the RSiena package.

The `RSiena` package

The RSiena package performs simulation-based estimation of stochastic actor-based models for longitudinal network data collected as panel data. Dependent variables can be single or multivariate networks, which can be directed, non-directed, or two-mode. There are also functions for testing parameters and checking goodness of fit.

If you have not already done so, install the package using install.packages( "RSiena" ). If you have not installed the RSiena package recently, make sure you update using the update.packages( "RSiena" ) command.

Note that if you are using the MacOS, that RSiena will require the XQuartz software. Please download the XQuartz software at: https://www.xquartz.org/ before running RSiena.

# Load the RSiena package.
library( RSiena )

# Take a look at the help.
help( package="RSiena" )

For an RSiena analysis, we have to build several objects. The objects will serve as variables:

A data object to examine (using sienaDependent() and sienaDataCreate()).
A set of effects to estimate (using getEffects and includeEffects()).
A model object that will have the terms we want to estimate (using sienaAlgorithmCreate or sienaModelCreate()).
Then, we estimate the model using siena07().

Step 1: Building the object to analyze using the `sienaDependent()` and `sienaDataCreate()` functions

First, we want to create an object that is an array of the networks that will be the dependent variable. To do this, we use the sienaDependent() function. To see how this function works, look at the help with ?sienaDependent.

In our example, we will create an object called getalong which is the two networks constructed as an array. The array will have dimensions 73 x 73 x 2. That is, the number of individuals in each network represented over two time points ( i.e. n x n x t). If, for example, we had a network with fifty individuals and four waves of data, then our array would be 50 x 50 x 4. Think of the object we have created as a cube that has 73 rows and 73 columns and each slice of the cube is a cross-section of the network.

Also, our gaNetT1 and gaNetT2 objects are of class network. For the sienaDependent() function, we need to coerce them to matrices using the as.matrix() function.

# take a look at the help
?sienaDependent

# create the objects of class matrix
gaMatT1 <- as.matrix( gaNetT1 )
gaMatT2 <- as.matrix( gaNetT2 )

# build the network object to examine
getalong <- sienaDependent(
  array(                        # define that it is an array
    c( gaMatT1, gaMatT2 ),      # define the two networks (matrices)
    dim=c( 73, 73, 2 )          # n x n x t are the dimensions
    )
  )

# we see that getalong is a one mode network, with 2 observations, and 73 actors
getalong

## Type         oneMode             
## Observations 2                   
## Nodeset      Actors (73 elements)

# getalong is an object of class "sienaDependent"
class( getalong )

## [1] "sienaDependent"

Now that we have created our dependent variable (i.e. getalong which is an object of class sienaDependent), we can create an object for RSiena to examine. We do this using the sienaDataCreate function. To see how this function works, look at the help with ?sienaDataCreate. This may seem excessive now, but when we examine more networks simultaneously, the program architecture will make more sense.

# take a look at the help
?sienaDataCreate

# build the object to analyze
getalong.data <- sienaDataCreate( getalong )
getalong.data

## Dependent variables:  getalong 
## Number of observations: 2 
## 
## Nodeset                  Actors 
## Number of nodes              73 
## 
## Dependent variable getalong   
## Type               oneMode    
## Observations       2          
## Nodeset            Actors     
## Densities          0.041 0.035

# getalong is of class "siena"
class( getalong.data )

## [1] "siena"

Step 2: Defining the set of effects to estimate using the `getEffects()` function

Now, we create an effects object for model specification using the getEffects() function. Basically, we are going to create an object with some effects, then as we continue to build our model, we can add or remove effects from that object. To see how this function works, look at the help with ?getEffects.

By default, the getEffects function will estimate the rate of change in each period (i.e. the rate function). In this example, we will see one rate estimated, the rate of change from t1:t2 (rate 1). If we had more waves (i.e. time points), then we would more rates. Also, the model automatically adds the outdegree and reciprocity terms as these are necessary for estimation.

# now, create the effects object
getalong.effects <- getEffects( getalong.data )

The effectsDocumentation() function will bring up a list of terms we can include in the model. For example, effectsDocumentation( getalong.effects ) will call an .html file that lists different terms we can use. We will come back to this in a bit.

Step 3: Create the model using the `sienaAlgorithmCreate` and `sienaModelCreate()` functions

Now that our dependent variable is defined (i.e. getalong.data) and the effects we want to estimate on that object are defined (i.e. getalong.effects), we can create a model object using the sienaAlgorithmCreate() or sienaModelCreate() functions. To see how these functions work, look at the help with ?sienaAlgorithmCreate and ?sienaModelCreate. These functions allow us to specify various properties of the estimation algorithm and the model.

# take a look at the help
?sienaAlgorithmCreate
?sienaModelCreate

# create a model that has particulars about estimation, then we estimate the model
getalong.model <- sienaModelCreate( 
  projname = "GetAlong",            # the output on model fit and convergence will be stored in a "GetAlong.txt" file
  seed=605                          # set the seed to reproduce model results
  )

## If you use this algorithm object, siena07 will create/use an output file GetAlong.txt .

Taking Stock of the Steps (so far) and Checking Network Stability

We have gone through three steps in detail. But, note that we really only have to run several lines of command to get us to where we are right now. That is, ready to estimate a model. Here are all the lines we needed to get to the estimating stage:

getalong         <- sienaDependent(  array( c( gaMatT1, gaMatT2 ), dim=c( 73, 73, 2 ) ) )
getalong.data    <- sienaDataCreate( getalong )
getalong.effects <- getEffects( getalong.data )
getalong.model   <- sienaModelCreate( projname = "GetAlong", seed=605 )

In addition, we need to check whether there is sufficient stability in our network prior to estimating a model. If there is a lot of instability (i.e. there are dramatic changes between cross-sections), then it may provide difficult to model the evolution of the network. We can get a sense of how much change there is by examining the Jaccard index, which is a description of network change. A general rule of thumb is that the values should be above 0.3. The print01Report() function will return this information as a text file. Let’s take a look:

# take a look at the help
?print01Report

# the report will have the file extension .out.
print01Report( getalong.data, modelname = "Get Along Example" )

Step 4: Estimate the model using the `siena07` function

Now we are ready to estimate the model! To do this, we pass to the siena07() function the model information, the data, and the effects. To see how this function works, look at the help with ?siena07. Now, let’s estimate our model.

# take a look at the help
?siena07

# estimate the model
getalong.results <- siena07(
    getalong.model,           # the model estimation information
    data=getalong.data,       # the data object we created above
    effects=getalong.effects  # the effects object we created above
)

# print the results
getalong.results

## Estimates, standard errors and convergence t-ratios
## 
##                               Estimate   Standard   Convergence 
##                                            Error      t-ratio   
## 
## Rate parameters: 
##   0       Rate parameter       6.3817  ( 0.6676   )             
## 
## Other parameters: 
##   1. eval outdegree (density) -2.1360  ( 0.0779   )   -0.0171   
##   2. eval reciprocity          1.7035  ( 0.1745   )    0.0083   
## 
## Overall maximum convergence ratio:    0.0304 
## 
## 
## Total of 1911 iteration steps.

Interpreting the Output

The output shows the estimates for the rate function and the estimates for the objective function. For each coefficient, we see an estimate and a standard error.

For the objective function coefficients, we also see the Convergence t-ratio reported. NOTE: these do not represent conventional t-ratios in the sense of a t-statistic assessing the size of the parameter estimate. Rather, they represent tests of the lack of convergence for each estimate, so small values indicate good convergence. Absolute values less than 0.10 indicate excellent convergence and absolute values less than 0.15 indicate reasonable convergence.

The rate estimates correspond to the estimated number of opportunities for change per actor for each period. Recall that the model is simulating micro-steps, and in each micro-step an individual is provided the opportunity to make a decision. The rate parameter estimate gives a sense of how many opportunities are provided to each individual. For example, the estimate is 6.3817, meaning that each actor had just over 6 opportunities to change a tie or maintain their current tie configuration between t0 and t1 (i.e. wave 1 and wave 2). Put differently, to get from the first network to the second network, there are about 6 x 73 = 438 “decisions” made over the actors.

The “Other parameters” section shows the objective estimates, or the eval estimates, which correspond to the relative attractiveness of a particular network state for each actor. Recall that individuals are given a choice about what to do. These estimates represent the extent to which individuals preferred a particular state.

The eval outdegree term is negative, suggesting that individuals prefer not to send ties. Put differently, the agent, when given the opportunity in a micro-step, says “huh, what can I do? I can send a tie or not. I prefer not to”.
The eval reciprocity term is positive, indicating that individuals prefer to reciprocate ties. Note that this preference is for maintaining an existing reciprocated relationship, for creating a reciprocated relationship from an asymmetric relationship where alter has nominated ego (i.e. ego wants to reciprocate), and for dissolving an asymmetric relationship where alter did not nominated ego after ego nominated alter (i.e. alter didn’t reciprocate).

The significance of the effects can be evaluated in a manner similar to a regression coefficient by looking at the ratio of the estimate to the standard error (where a ratio of 1.96 indicates a p-value of 0.05).

Step 5: Adding Terms to the Model

Now that our model is created, we may want to add some additional terms. When the effects object has been created, we can add to it using the includeEffects() function. This function allows us to take the existing effects object and add a specific term that we would like to estimate. To see how this function works, look at the help with ?includeEffects. To see the effects, use the effectsDocumentation() function to generate an .html file that shows all of the effects.

Triadic Structures

Let’s add the following effects to our model:

An effect for a preference for transitivity (i->j is preferred if i->k and k->j exists). This effect is called transTrip.
An effect for a preference for 3-cycle triads (i->j is preferred if k->i and j->k exists). This effect is called cycle3.

# use the includeEffects() function to add these terms

getalong.effects <- includeEffects(
  getalong.effects,                 # the object to add term to
  transTrip                         # the name of the term
  )

##   effectName          include fix   test  initialValue parm
## 1 transitive triplets TRUE    FALSE FALSE          0   0

getalong.effects <- includeEffects( getalong.effects, cycle3 )

##   effectName include fix   test  initialValue parm
## 1 3-cycles   TRUE    FALSE FALSE          0   0

Now that we have added new terms, we re-estimate the model.

# estimate the model
getalong.results2 <- siena07( getalong.model, data = getalong.data, effects = getalong.effects )

# print the results
getalong.results2

## Estimates, standard errors and convergence t-ratios
## 
##                               Estimate   Standard   Convergence 
##                                            Error      t-ratio   
## 
## Rate parameters: 
##   0       Rate parameter       6.9358  ( 0.7485   )             
## 
## Other parameters: 
##   1. eval outdegree (density) -2.3144  ( 0.0834   )   0.0455    
##   2. eval reciprocity          1.6550  ( 0.1870   )   0.0797    
##   3. eval transitive triplets  0.4923  ( 0.0841   )   0.0296    
##   4. eval 3-cycles            -0.2834  ( 0.1811   )   0.0758    
## 
## Overall maximum convergence ratio:    0.1392 
## 
## 
## Total of 2109 iteration steps.

What is the interpretation of the transTrip and cycle3 estimates? Well, recall that individuals are given a choice about what to do. These estimates represent the extent to which individuals preferred a particular state.

The eval transTrip term is positive indicating that i is more likely to form an outgoing friendship with j if i and j both perceive k to get along with each other. Note the use of the term perceive here. This is an important assumption of the model. Actors are looking over the network and making decisions based on their beliefs about ties in the network. These does not have to be accurate, but the do reflect the idea that actors have beliefs about all ties among actors in the network.
- The eval cycle3 term is negative indicating that i is less likely to form an outgoing friendship with j if j is perceived to get along with k is perceived to get along with i.

Degree Effects

Let’s add some additional effects to the model:

An effect for seeking relationships with popular others (i->j is preferred if k->j exists. This effect is inPop.
An effect for seeking relationships with active others (i->j is preferred if j->k exists). This effect is outPop.

# use the includeEffects() function to add these terms
getalong.effects <- includeEffects( getalong.effects, inPop )

##   effectName            include fix   test  initialValue parm
## 1 indegree - popularity TRUE    FALSE FALSE          0   0

getalong.effects <- includeEffects( getalong.effects, outPop )

##   effectName             include fix   test  initialValue parm
## 1 outdegree - popularity TRUE    FALSE FALSE          0   0

# estimate the model
getalong.results3 <- siena07( getalong.model, data = getalong.data, effects = getalong.effects )

# print the results
getalong.results3

## Estimates, standard errors and convergence t-ratios
## 
##                                  Estimate   Standard   Convergence 
##                                               Error      t-ratio   
## 
## Rate parameters: 
##   0       Rate parameter          6.9749  ( 0.7828   )             
## 
## Other parameters: 
##   1. eval outdegree (density)    -2.1636  ( 0.1579   )   -0.0273   
##   2. eval reciprocity             1.7722  ( 0.2167   )   -0.0053   
##   3. eval transitive triplets     0.5517  ( 0.1152   )   -0.0141   
##   4. eval 3-cycles               -0.2272  ( 0.2360   )   -0.0156   
##   5. eval indegree - popularity   0.0017  ( 0.0484   )   -0.0695   
##   6. eval outdegree - popularity -0.0751  ( 0.0549   )   -0.0091   
## 
## Overall maximum convergence ratio:    0.2030 
## 
## 
## Total of 2210 iteration steps.

What is the interpretation of the inPop and outPop estimates?

The eval inPop term indicates whether increases in received get along with nominations (i.e. indegree, and called inPop because it represents “popularity”) make individuals more likely to send a get along with nomination. It is positive, but very small and not significantly different from zero.
The eval outPop term indicates whether increases in sent get along with nominations (i.e. outdegree, the term outPop is used for consistency with the terms) make individuals more likely to send a get along with nomination. It is negative, but very small and not significantly different from zero.

Working with Covariates

So far, we have specified terms that are primarily structural in the sense that they are concerned with endogenous network processes. However, we can also include covariates in the model to capture differential tie behavior based on the attributes of the actors. In the ERGMs, we used the nodefactor and nodecov terms to capture differential tie behavior based on covariates. We want to use the same logic here: how does a covariate influence whether actors send a tie?

In RSiena, covariates can be either:

Time-varying in that the variable can vary across waves (e.g. smoking behavior).
Constant in that the variable does not change over time because it is immutable (e.g. sex) or because we only have two waves of data.

Time-varying covariates are incorporated into the sienaDataCreate() function by defining them with the varCovar() function and constant covariates are incorporated using the coCovar() function.

Let’s take a look at how we incorporate actor covariate terms by using the social distance scale from the PINS survey. The variable is calculated by taking the average of four items asking participants to state their level of agreement with statements about people of another race: 1) “I would share a table with them”; 2) “I would cooperate with them if needed”; 3) “I would accept them as personal friends”; and 4) “I would avoid them if I could” (reverse-coded). Responses to each statement were on a 5-point Likert scale ranging from “strongly agree” to “strongly disagree”. Therefore, higher values indicate greater feelings of social distance from people of another race.

Since we only have two wave of data, social distance scale is treated as a “constant covariate”. Why? Since we are only observing changes from t0 to t1 (i.e. period 1), we are looking at how your t0 covariate influences your t1 tie behavior. So, we need to use the coCovar() function.

IF we had more waves, we could use the varCovar function.

Identifying the covariate

First, we need to pull off the social distance variable for the gaNetT1 object. This is a vertex attribute called socdist.w1. Note that if we had three waves of data, we would want the measure of the covariate from the first and second wave. And so on for more waves of data.

# see the description of the coCovar() function
?coCovar

# pull off the social distance scale
socDist <- gaNetT1 %v% "socdist.w1"

# now, use the coCovar() function to define the object
socDistCovar <- coCovar( socDist )

Rebuilding the objects

Now that we have the social distance covariate, let’s rebuild the objects we want. Since the getalong object is data used in our analysis, we need to redefine the object created by the sienaDataCreate() function. Also, since we will be creating a new data object for analysis, we need to redefine the effects we want to estimate using the getEffects() function.

# add the social distance data and create a new object
GA.SD.data <- sienaDataCreate( Network = getalong, socDistCovar )
GA.SD.data

## Dependent variables:  Network 
## Number of observations: 2 
## 
## Nodeset                  Actors 
## Number of nodes              73 
## 
## Dependent variable Network    
## Type               oneMode    
## Observations       2          
## Nodeset            Actors     
## Densities          0.041 0.035
## 
## Constant covariates:  socDistCovar

# we also need a new effects object 
GA.SD.effects <- getEffects( GA.SD.data )

# let's use the terms we already had in the prior model
# mote: we can just include all the effects in one statement, not four lines
GA.SD.effects <- includeEffects( GA.SD.effects, transTrip, cycle3, inPop, outPop )

##   effectName             include fix   test  initialValue parm
## 1 transitive triplets    TRUE    FALSE FALSE          0   0   
## 2 3-cycles               TRUE    FALSE FALSE          0   0   
## 3 indegree - popularity  TRUE    FALSE FALSE          0   0   
## 4 outdegree - popularity TRUE    FALSE FALSE          0   0

Note that the includeEffects() function includes the argument include= which can be used to remove terms. For example, say we estimate a model and want to remove the cycle3 effect. We would use the following command: includeEffects( GA.SD.effects, cycle3, include=FALSE ). If we did that and wanted to add it back in, we could use: includeEffects( GA.SD.effects, cycle3, include=TRUE ) (note that the include=TRUE argument is not necessary as the default is TRUE for this argument).

Identifying terms for covariates

Now we want to include the effects of social distance on tie behavior. For actor covariates, we call these interaction terms because the outdegree depends on a covariate.

The typical effects we might be interested in are:

Individuals with a particular attribute are more likely to send ties, a sender effect. This effect is called egoX.
Individuals with a particular attribute are more likely to receive ties, a receiver effect. This effect is called alterX.
Individuals with a particular attribute are more likely to nominate others with the same or similar attribute, a homophily effect. This effect is called sameX or simX.

Let’s take a look at how this looks in the syntax and what the effects mean.

GA.SD.effects <- includeEffects( 
  GA.SD.effects, 
  egoX,                         # sender effect: more social distance, more ties sent
  altX,                         # receiver effect: more social distance, more tie received
  simX,                         # homophily: similar social distance, more likely to send them a tie
  interaction1 = "socDistCovar" # outdegree depends on performance.
  )

##   effectName              include fix   test  initialValue parm
## 1 socDistCovar alter      TRUE    FALSE FALSE          0   0   
## 2 socDistCovar ego        TRUE    FALSE FALSE          0   0   
## 3 socDistCovar similarity TRUE    FALSE FALSE          0   0

Now that we have our effects object described, let’s estimate the model.

# set up the estimation
GA.SD.model   <- sienaModelCreate( projname = "SocDist", seed=605 )

## If you use this algorithm object, siena07 will create/use an output file SocDist.txt .

# estimate the model
GA.SD.results <- siena07( GA.SD.model, data=GA.SD.data, effects= GA.SD.effects )

# take a look at the output
GA.SD.results

## Estimates, standard errors and convergence t-ratios
## 
##                                   Estimate   Standard   Convergence 
##                                                Error      t-ratio   
## 
## Rate parameters: 
##   0       Rate parameter           6.9484  ( 0.7697   )             
## 
## Other parameters: 
##   1. eval outdegree (density)     -2.1820  ( 0.1630   )    0.0275   
##   2. eval reciprocity              1.7853  ( 0.2190   )    0.0216   
##   3. eval transitive triplets      0.5429  ( 0.0919   )    0.0769   
##   4. eval 3-cycles                -0.2133  ( 0.1966   )    0.0748   
##   5. eval indegree - popularity    0.0113  ( 0.0484   )    0.0330   
##   6. eval outdegree - popularity  -0.0828  ( 0.0543   )    0.0378   
##   7. eval socDistCovar alter       0.0452  ( 0.0972   )   -0.0607   
##   8. eval socDistCovar ego         0.0166  ( 0.1027   )   -0.0889   
##   9. eval socDistCovar similarity  0.3261  ( 0.3554   )    0.0523   
## 
## Overall maximum convergence ratio:    0.1346 
## 
## 
## Total of 2483 iteration steps.

Now let’s interpret the effects:

For eval socDistCovar alter we see a positive coefficient, indicating that individuals with higher values on the social distance scale were more likely to receive ties.
For eval socDistCovar ego we see a positive coefficient, indicating that individuals with higher values on their the social distance scale were more likely to send ties.
For eval socDistCovar similarity we see a positive coefficient, indicating that ego prefers sending ties to alters who are more similar to ego on the social distance scale.

For all these effects, however, they are not significantly different from zero.

Step 6: Assessing Goodness of Fit

As we did in the Simulation and Goodness of Fit with Exponential Random Graph Models (ERGMs) lab, we discussed how to simulate networks from our model and use these simulations to assess how well our model is fitting the data. We can do the same thing with SABMs!

To do this, we simply use the sienaGOF() function. First, we have to use the returnDeps=TRUE argument in the siena07() function. This tells the function to keep the simulated networks.

# estimate the model
GA.SD.results <- siena07( 
  GA.SD.model, 
  data=GA.SD.data, 
  effects= GA.SD.effects, 
  returnDeps = TRUE       # here is the additional option we have specified
  )

Now, we can use the sienaGOF() function. Let’s simulate the indegree and outdegree distributions, and then the triad census.

# indegree
indegGOF <- sienaGOF( 
  GA.SD.results,        # the model we want GOF stats for
  IndegreeDistribution, # the distribution we want info on
  varName="Network"     # which variable we are using for GOF (the model can handle multiple DVs)
  )

# outdegree
outdegGOF <- sienaGOF(  GA.SD.results, OutdegreeDistribution, varName="Network" )

# triad census
TriadGOF <- sienaGOF(  GA.SD.results, TriadCensus, varName="Network" )

Now, let’s take these GOF objects and plot them.

plot( indegGOF )

plot( outdegGOF )

plot( TriadGOF )

How well is our model fitting the data?

Wrapping up…

In this lab, we worked with SABMs in R using the RSiena package. For WAY MORE details about RSiena, please review the RSiena manual.

Questions?

Please report any needed corrections to the Issues page. Thanks!

Back to SAND main page

Lab 12 - Introduction to Stochastic Actor-Based Models (SABMs) in R

CRJ 605 Statistical Analysis of Networks

**SABM for the PINS Get Along With Networks**

The `RSiena` package

Step 1: Building the object to analyze using the `sienaDependent()` and `sienaDataCreate()` functions

Step 2: Defining the set of effects to estimate using the `getEffects()` function

Step 3: Create the model using the `sienaAlgorithmCreate` and `sienaModelCreate()` functions

Taking Stock of the Steps (so far) and Checking Network Stability

Step 4: Estimate the model using the `siena07` function

Interpreting the Output

Step 5: Adding Terms to the Model

Triadic Structures

Degree Effects

Working with Covariates

Identifying the covariate

Rebuilding the objects

Identifying terms for covariates

Step 6: Assessing Goodness of Fit

Wrapping up…

Questions?

Please report any needed corrections to the Issues page. Thanks!

Lab 12 - Introduction to Stochastic Actor-Based Models (SABMs) in R

CRJ 605 Statistical Analysis of Networks

SABM for the PINS Get Along With Networks

The RSiena package

Step 1: Building the object to analyze using the sienaDependent() and sienaDataCreate() functions

Step 2: Defining the set of effects to estimate using the getEffects() function

Step 3: Create the model using the sienaAlgorithmCreate and sienaModelCreate() functions

Taking Stock of the Steps (so far) and Checking Network Stability

Step 4: Estimate the model using the siena07 function

Interpreting the Output

Step 5: Adding Terms to the Model

Triadic Structures

Degree Effects

Working with Covariates

Identifying the covariate

Rebuilding the objects

Identifying terms for covariates

Step 6: Assessing Goodness of Fit

Wrapping up…

Questions?

Please report any needed corrections to the Issues page. Thanks!

**SABM for the PINS Get Along With Networks**

The `RSiena` package

Step 1: Building the object to analyze using the `sienaDependent()` and `sienaDataCreate()` functions

Step 2: Defining the set of effects to estimate using the `getEffects()` function

Step 3: Create the model using the `sienaAlgorithmCreate` and `sienaModelCreate()` functions

Step 4: Estimate the model using the `siena07` function