Coevolution for the PINS Get Along With
Networks
Recall that with SABMs, we are trying to build a model that
accurately represents the preferences among actors that generated the
observed network between discrete time points. In addition, we would
like to incorporate the changes that occur in behavior as a consequence
of network position. As a result, we need to examine a
coevolution model of the network and behavior. In other words,
we want to examine the simultaneity of network dynamics
and behavior dynamics.
The Networks
Let’s again use two waves of data from the PINS study to examine how
get along with nominations change over two waves. There are two
network objects available on the SNA Textbook website:
# clear the workspace
rm( list = ls() )
# libraries needed
library( sna ) # for sna functions
library( network ) # for working with network objects
library( RSiena ) # for working with SABMs
# define the path location for the file
loc1 <- "https://github.com/jacobtnyoung/sna-textbook/raw/main/data/data-pins-ga-panel-w1-net.rds"
gaNetT1 <- readRDS( url( loc1 ) )
loc2 <- "https://github.com/jacobtnyoung/sna-textbook/raw/main/data/data-pins-ga-panel-w2-net.rds"
gaNetT2 <- readRDS( url( loc2 ) )
# look at the network for wave 1
gaNetT1
## Network attributes:
## vertices = 73
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 213
## missing edges= 0
## non-missing edges= 213
##
## Vertex attribute names:
## depression.w1 depression.w2 race smoker.w1 smoker.w2 socdist.w1 socdist.w2 vertex.names
##
## No edge attributes
# look at the network for wave 2
gaNetT2
## Network attributes:
## vertices = 73
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 186
## missing edges= 0
## non-missing edges= 186
##
## Vertex attribute names:
## depression.w1 depression.w2 race smoker.w1 smoker.w2 socdist.w1 socdist.w2 vertex.names
##
## No edge attributes
The Attributes
Now, let’s pull in the attributes. We will use the social
distance scale from the PINS survey. The variable is calculated
by taking the average of four items asking participants to state their
level of agreement with statements about people of another race: 1) “I
would share a table with them”; 2) “I would cooperate with them if
needed”; 3) “I would accept them as personal friends”; and 4) “I would
avoid them if I could” (reverse-coded). Responses to each statement were
on a 5-point Likert scale ranging from “strongly agree” to “strongly
disagree”. Therefore, higher values indicate greater feelings of social
distance from people of another race.
In the prior lab, we only used one wave of the data, treating this as
a “constant covariate”. Here, we are going to use both waves since it is
a dependent variable.
# pull off the social distance scale for each wave
socDist1 <- gaNetT1 %v% "socdist.w1"
socDist2 <- gaNetT2 %v% "socdist.w2"
# build a plot of these
par( mfrow = c( 2,2 ) )
# plot the first wave with a line for the mean
hist( socDist1,
col = "#fc7544",
main = "Social Distance at time 1",
ylim = c( 0, 50 ), # set the limits for the y-axis
xlim = c( min( socDist1 ), max( socDist1 ) ), # set the limits for the x-axis
xlab = "" # suppress the label for the x-axis
)
abline( v = mean( socDist1 ),
lwd = 5,
col = "black"
)
# plot the second wave with a line for the mean
hist( socDist2,
col = "#7285f2",
main = "Social Distance at time 2",
ylim = c( 0, 50 ), # set the limits for the y-axis
xlim = c( min( socDist2 ), max( socDist2 ) ), # set the limits for the x-axis
xlab = "" # suppress the label for the x-axis
)
abline( v = mean( socDist2 ),
lwd = 5,
col = "black"
)
# plot a scatterplot
plot( socDist1, socDist2, # plot the two waves of data
col = "#28734a",
pch = 19, # make the character solid circles
main = "Scatterplot of Social Distance"
)
abline( lm( socDist2 ~ socDist1 ), # add the OLS regression line
col = "grey40",
lwd = 3
)
What do the plots tell us about changes in social distance over
time?
This is a bit tough to see. Note that the wave 1 measure
explains 65.95 percent of the variance in the wave 2 measure (so 34.05
percent of the variance is not explained, meaning it reflects change
across the waves). We can see this visually by creating within-person
changes and plotting those.
# reset the window partition
par( mfrow = c( 1, 1 ) )
# create the deviations
socDistD <- socDist2 - socDist1
# plot the within person deviations
hist( socDistD,
col = "#d7fa28",
main = "Within Person Change in Social Distance",
ylim = c( 0, 40 ), # set the limits for the y-axis
xlim = c( min( socDistD ), max( socDistD ) ), # set the limits for the x-axis
xlab = "" # suppress the label for the x-axis
)
abline( v = mean( socDistD ),
lwd = 5,
col = "black"
)
What’s the take-away?
Mainly, that some people go up, some go down, and a lot stay the
same. So, let’s think about why some go up and others go down.
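As a rough illustration of the two summaries above, the share of variance explained is the R-squared from regressing wave 2 on wave 1, and the split between people who go up, go down, or stay the same can be tabulated from the sign of the within-person change. The sketch below uses small made-up vectors (w1 and w2 are hypothetical stand-ins for socDist1 and socDist2), so the numbers it produces will not match the PINS values.

```r
# made-up scores for 8 people at two waves (stand-ins for socDist1 and socDist2)
w1 <- c( 1, 2, 2, 3, 3, 4, 4, 5 )
w2 <- c( 1, 2, 3, 3, 2, 4, 5, 5 )

# share of wave 2 variance explained by wave 1 (R-squared from OLS)
r2 <- summary( lm( w2 ~ w1 ) )$r.squared

# share of variance not explained by wave 1
unexplained <- 1 - r2

# within-person change, then count who went down, stayed the same, or went up
change <- w2 - w1
direction <- table(
  factor( sign( change ), levels = c( -1, 0, 1 ), labels = c( "down", "same", "up" ) )
)
direction
```

With the real data, replacing w1 and w2 with socDist1 and socDist2 should give the same kind of summary for the PINS sample.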
Coevolution of the Get Along With Network and the
Social Distance Scale
As we have seen, for an RSiena analysis, we have to build several
objects. The objects will serve as variables:
- A data object to examine (using sienaDependent() and sienaDataCreate()).
- A set of effects to estimate (using getEffects() and includeEffects()).
- A model object that holds the settings for the estimation algorithm (using sienaAlgorithmCreate() or sienaModelCreate()).
- Then, we estimate the model using siena07().
Step 1: Building the object to analyze using the sienaDependent() and sienaDataCreate() functions
Since we are examining coevolution, we will have two dependent
variables:
- The get along with network
- The social distance scale
We will need to create two dependent variables using the
sienaDependent() function. Note that this is a different approach from
the prior lab, where we defined social distance as a constant covariate
using the coCovar() function.
Let’s go ahead and create our dependent variables. We will create our
get along with dependent variable the same way as before. But, for the
social distance dependent variable, we need to use the type= argument in
the sienaDependent() function. Specifically, we need to state
type="behavior" as the default is type="oneMode".
After we have defined each of these dependent variables, we will bind
them together using the sienaDataCreate() function.
# look at the type= argument
?sienaDependent
# build the network object to examine
getalong <- sienaDependent(
array( # define that it is an array
c( as.matrix( gaNetT1 ),
as.matrix( gaNetT2 ) ), # define the two networks (matrices)
dim=c( 73, 73, 2 ) # n x n x t are the dimensions
)
)
# build the social distance variable
socDist <- sienaDependent(
cbind( socDist1, socDist2 ), # sienaDependent requires a matrix, so we use cbind
type = "behavior"
)
# bind data together for Siena analysis
CoEvolutionData <- sienaDataCreate( getalong, socDist )
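To make the array step less opaque, here is a small sketch of what array() does with two adjacency matrices. The 3-actor matrices m1 and m2 are made up for illustration; the point is only that the third dimension indexes the wave, and that the number of actors can be read from the data rather than typed in by hand (the 73 above).

```r
# two made-up 3 x 3 adjacency matrices, one per wave
m1 <- matrix( c( 0, 1, 0,
                 0, 0, 1,
                 1, 0, 0 ), nrow = 3, byrow = TRUE )
m2 <- matrix( c( 0, 1, 1,
                 0, 0, 0,
                 1, 0, 0 ), nrow = 3, byrow = TRUE )

# read the number of actors from the data instead of hard-coding it
n <- nrow( m1 )

# stack the waves: n x n x t
stacked <- array( c( m1, m2 ), dim = c( n, n, 2 ) )

# each slice along the third dimension is one wave
identical( stacked[ , , 1 ], m1 )
identical( stacked[ , , 2 ], m2 )

# ties added and dropped between the two waves
added   <- sum( m1 == 0 & m2 == 1 )
dropped <- sum( m1 == 1 & m2 == 0 )
```

Counting added and dropped ties this way is also a quick check that there is enough change between waves for a model to estimate.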
Step 2: Defining the set of effects to estimate using the getEffects() function
Now, we create an effects object for model specification using the
getEffects() function. Basically, we are going to create an
object with some effects, then as we continue to build our model, we can
add or remove effects from that object. To see how this function works,
look at the help with ?getEffects.
By default, the getEffects() function includes the
rate of change in each period (i.e. the rate function).
In this example, we will see one rate estimated for each dependent
variable, the rate of change from t1:t2 (period 1). If we had more waves
(i.e. time points), then we would see more rates. Also, the model
automatically adds the outdegree and reciprocity terms for the network,
and the linear and quadratic shape terms for the behavior, as these are
necessary for estimation.
# create the effects object
CoEvolutionEffects <- getEffects( CoEvolutionData )
Step 3: Create the model using the sienaAlgorithmCreate() or sienaModelCreate() function
Now that our dependent variables are defined
(i.e. CoEvolutionData) and the effects we want to estimate on
that object are defined (i.e. CoEvolutionEffects), we can
create a model object using the sienaAlgorithmCreate() or
sienaModelCreate() functions. To see how these functions
work, look at the help with ?sienaAlgorithmCreate and
?sienaModelCreate. These functions allow us to specify
various properties of the estimation algorithm and the model.
# create a model that has particulars about estimation, then we estimate the model
CoEvolutionModel <- sienaModelCreate(
projname = "Get Along & Social Distance", # the output on model fit and convergence will be stored in a text file with this name
seed=605 # set the seed to reproduce model results
)
## If you use this algorithm object, siena07 will create/use an output file Get Along & Social Distance.txt .
Step 4: Estimate the model using the siena07() function
Now we are ready to estimate the model! To do this, we pass to the
siena07() function the model information, the data, and the
effects. Recall that we have included the basic effects using the
getEffects() function. Let’s take a look at this very
simple model to see how the network and the behavior are changing over
the periods.
# estimate the model
CoEvolutionResults <- siena07(
CoEvolutionModel, # the model estimation information
data=CoEvolutionData, # the data object we created above
effects=CoEvolutionEffects # the effects object we created above
)
# look at the results
CoEvolutionResults
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## Network Dynamics
## 1. rate basic rate parameter getalong 6.4098 ( 0.6023 ) 0.0354
## 2. eval outdegree (density) -2.1271 ( 0.0817 ) -0.0423
## 3. eval reciprocity 1.6905 ( 0.1800 ) -0.0242
##
## Behavior Dynamics
## 4. rate rate socDist period 1 1.3912 ( 0.3580 ) 0.0116
## 5. eval socDist linear shape -0.2200 ( 0.2499 ) 0.0686
## 6. eval socDist quadratic shape -0.5121 ( 0.3893 ) 0.0415
##
## Overall maximum convergence ratio: 0.0950
##
##
## Total of 2669 iteration steps.
Interpreting the Output
Because we estimated a coevolution model, we get two sets of
estimates:
- Network Dynamics
- Behavior Dynamics
Network Dynamics
The rate estimates correspond to the estimated
number of opportunities for change per actor for each period. The
estimate is 6.4098, meaning that each actor had just over 6
opportunities to change a tie or maintain their current tie
configuration between t1 and t2 (i.e. wave 1 and wave 2). Put
differently, to get from the first network to the second network, there
are about 6 x 73 = 438 “decisions” made over the actors.
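The back-of-the-envelope count above rounds the rate down to 6. A minimal sketch of the same arithmetic with the unrounded estimate, where rate and n are simply values copied from the output and the network summary:

```r
rate <- 6.4098   # basic rate parameter for getalong, copied from the output
n    <- 73       # number of actors in the network

# expected number of opportunities (micro-steps) summed over all actors
rate * n

# the rounded version used in the text
6 * n
```

With the unrounded rate, the expected total is closer to 468 opportunities.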
The eval outdegree term is negative, suggesting that
individuals prefer not to send ties. Put differently, the
agent, when given the opportunity in a micro-step, says “huh, what can I
do? I can send a tie or not. I prefer not to”.
The eval reciprocity term is positive, indicating that
individuals prefer to reciprocate ties. Note that this preference is for
maintaining an existing reciprocated relationship, for
creating a reciprocated relationship from an asymmetric
relationship where alter has nominated ego (i.e. ego wants to
reciprocate), and for dissolving an asymmetric relationship
where alter did not nominate ego after ego nominated alter (i.e. alter
didn’t reciprocate).
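One hedged way to get a feel for the size of these two estimates is to exponentiate them. In a model with only outdegree and reciprocity, the ratio of the probability that ego adds a given tie to the probability that ego makes no change is exp( outdegree ) when alter has not nominated ego, and exp( outdegree + reciprocity ) when alter has. This is a sketch of that arithmetic using the estimates above, not output from RSiena.

```r
# estimates copied from the Network Dynamics output
outdegree   <- -2.1271
reciprocity <-  1.6905

# probability ratio: add a tie to someone who has NOT nominated ego vs. no change
odds_new_tie <- exp( outdegree )

# probability ratio: add a tie to someone who HAS nominated ego vs. no change
odds_reciprocal_tie <- exp( outdegree + reciprocity )

round( c( odds_new_tie, odds_reciprocal_tie ), 3 )
```

Both ratios are below 1, so making no change is still the more likely choice, but reciprocating an incoming tie is far less unattractive than sending an unreciprocated one.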
Behavior Dynamics
The output shows the estimates for the rate function
and the estimates for the objective function. For each
coefficient, we see an estimate, a standard error, and a convergence
t-ratio.
The rate estimates correspond to the estimated
number of opportunities for each actor to change their behavior
for each period. Keep in mind that these are not units of change
(e.g. standard deviation change), but opportunities in the simulation to
make a change. Recall that the model is simulating micro-steps,
and in each micro-step an individual is provided the opportunity to make
a decision. The rate parameter estimate gives a sense
of how many opportunities are provided to each individual. For example,
the estimate for period 1 is 1.3912, meaning that each
actor had slightly more than 1 opportunity to change his/her social
distance score or maintain their current social distance score. The
shape parameters show no significant overall trend in social distance
between waves (the terms are not significantly different from zero).
As before, the significance of the effects can be evaluated in a
manner similar to a regression coefficient by looking at the ratio of
the estimate to the standard error (where an absolute ratio of 1.96
corresponds to a two-sided p-value of 0.05).
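The ratio described above can be computed by hand from the printed table. The sketch below copies the behavior shape estimates and standard errors from the output into plain vectors (est and se are names chosen here for illustration) and uses a normal approximation for the two-sided p-value.

```r
# shape estimates and standard errors copied from the Behavior Dynamics output
est <- c( linear = -0.2200, quadratic = -0.5121 )
se  <- c( linear =  0.2499, quadratic =  0.3893 )

# ratio of the estimate to its standard error
ratio <- est / se

# two-sided p-value under a normal approximation
p <- 2 * pnorm( -abs( ratio ) )

# flag effects whose absolute ratio exceeds 1.96
significant <- abs( ratio ) > 1.96

round( cbind( est, se, ratio, p ), 3 )
```

Neither shape term clears the 1.96 threshold here.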
Are these estimates of no average change in the social distance
variable consistent with what we observed in our plots above?
Step 5: Adding Terms to the Model
Specifying Coevolution Terms
So far, our model has not specified any dependence terms between the
get along with network and the social distance
variable. In other words, we are allowing these variables to evolve
independently. Let’s take a look at including terms
where:
- The behavior influences the network (e.g. egoX, altX, and sameX or simX)
- The network influences the behavior (e.g. indeg, outdeg, and avSim)
Behavior Effects on the Network
Now we want to include the effects of socDist on tie
behavior. Recall that for actor covariates, we call these
interaction terms because the outdegree depends on a covariate.
We use the same syntax as we did in the prior lab for specifying these
terms.
Let’s use the terms we used from the prior lab:
- Individuals with a particular attribute are more likely to send ties, a sender effect (this effect is called egoX).
- Individuals with a particular attribute are more likely to receive ties, a receiver effect (this effect is called altX).
- Individuals with a particular attribute are more likely to nominate others with the same or similar attribute, a homophily effect (this effect is called sameX or simX).
CoEvolutionEffects <- includeEffects(
CoEvolutionEffects, # the existing effects object.
# now the terms:
egoX, # social distance influences tie sending
altX, # social distance influences tie receiving
simX, # homophily for social distance
interaction1 = "socDist" # define the variable of interest
)
## effectName include fix test initialValue parm
## 1 socDist alter TRUE FALSE FALSE 0 0
## 2 socDist ego TRUE FALSE FALSE 0 0
## 3 socDist similarity TRUE FALSE FALSE 0 0
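For intuition about what a similarity term measures, the sketch below computes a pairwise similarity score of the kind the RSiena documentation describes for simX: one minus the absolute difference between two actors' values divided by the observed range, then centered by subtracting the mean similarity. The vector v is made up, and this is a hand calculation under that assumed definition rather than RSiena's internal code.

```r
# made-up covariate values for 4 actors
v <- c( 1, 2, 2, 5 )

# observed range of the covariate
delta <- max( v ) - min( v )

# raw similarity: 1 when two actors are identical, 0 when they are a full range apart
sim <- 1 - abs( outer( v, v, "-" ) ) / delta

# an actor's similarity to itself is not used
diag( sim ) <- NA

# center by subtracting the mean similarity over all pairs
sim_centered <- sim - mean( sim, na.rm = TRUE )

round( sim_centered, 2 )
```

A positive similarity estimate then means ego prefers ties to alters whose centered similarity is above zero, that is, alters who are closer to ego than the typical pair.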
Network Effects on Behavior
Now we want to include the effects of the get along with
network on the measure of social distance. The notion of
interaction is the same here, but we want to see how behavior
depends on a tie (rather than vice versa).
Here are some common terms:
- Tie sending influences behavior (this effect is called outdeg).
- Tie receiving influences behavior (this effect is called indeg).
- The behavior of those whom you are tied to influences your behavior (this effect is called avSim).
Let’s add these terms to our existing effects object. The difference
is that we have to use the name= argument to tell it what
the dependent variable is (i.e. that it is not the network).
CoEvolutionEffects <- includeEffects(
CoEvolutionEffects, # the existing effects object
name = "socDist", # define the behavior as what is being influenced
# now the terms:
indeg, # tie receiving influences behavior
outdeg, # tie sending influences behavior
avSim, # influence from those in network
interaction1="getalong" # define the network as what is driving behavior
)
## effectName include fix test initialValue parm
## 1 socDist average similarity TRUE FALSE FALSE 0 0
## 2 socDist indegree TRUE FALSE FALSE 0 0
## 3 socDist outdegree TRUE FALSE FALSE 0 0
Now we are ready to estimate the model (again)!
CoEvolutionResults <- siena07(
CoEvolutionModel, # the model estimation information
data = CoEvolutionData, # the data object we created above
effects = CoEvolutionEffects # the effects object we created above
)
CoEvolutionResults
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## Network Dynamics
## 1. rate basic rate parameter getalong 6.4389 ( 0.6772 ) 0.0661
## 2. eval outdegree (density) -2.1483 ( 0.0834 ) -0.0243
## 3. eval reciprocity 1.6986 ( 0.1905 ) -0.0338
## 4. eval socDist alter 0.0154 ( 0.1603 ) -0.0521
## 5. eval socDist ego -0.1503 ( 0.1822 ) -0.0417
## 6. eval socDist similarity 0.4819 ( 0.6187 ) 0.0646
##
## Behavior Dynamics
## 7. rate rate socDist period 1 1.3967 ( 0.3572 ) 0.0903
## 8. eval socDist linear shape -0.2572 ( 0.4994 ) 0.0202
## 9. eval socDist quadratic shape -0.5720 ( 0.6472 ) -0.0428
## 10. eval socDist average similarity 0.2405 ( 4.2183 ) 0.0632
## 11. eval socDist indegree 0.1605 ( 0.2207 ) 0.0265
## 12. eval socDist outdegree -0.1454 ( 0.2121 ) 0.0298
##
## Overall maximum convergence ratio: 0.1682
##
##
## Total of 3060 iteration steps.
Interpreting the Output (again)
For the Network Dynamics, we added effects for
social distance influencing tie receiving
(eval socDist alter), tie sending (eval socDist ego), and homophily
(eval socDist similarity). The results indicate that:
- For eval socDist alter we see a positive coefficient, indicating that individuals with higher values on the social distance scale were more likely to receive ties.
- For eval socDist ego we see a negative coefficient, indicating that individuals with higher values on the social distance scale were less likely to send ties.
- For eval socDist similarity we see a positive coefficient, indicating that ego prefers sending ties to alters who are more similar to ego on the social distance scale.
None of these effects, however, is significantly different from zero.
What about the Behavior Dynamics? Recall that we
added effects for social distance to be influenced by the
get along with ties in ego’s network
(eval socDist average similarity), ego’s indegree
(eval socDist indegree), and ego’s outdegree
(eval socDist outdegree). The results indicate that:
- For eval socDist average similarity, the coefficient is positive, indicating that individuals change their social distance score to align with the average of their get along with network. (What would a negative coefficient mean?)
- For eval socDist indegree, a positive coefficient indicates that those who receive more ties increase their social distance score.
- For eval socDist outdegree, a negative coefficient indicates that those who send more ties decrease their social distance score.
None of these effects, however, is significantly different from zero.
More Attributes
Ok, that was not very exciting. Let’s work through this again, but
look at some different attributes. Specifically, let’s add two:
- Depression measures whether the respondent reports feeling
depressed (a scale of several items)
- Smoker a binary variable indicating whether the respondent
is a smoker
Now, let’s build these into our data object. But first, we should
take a look at the change in the variables.
# pull off the attributes for each wave
dep1 <- gaNetT1 %v% "depression.w1"
dep2 <- gaNetT2 %v% "depression.w2"
smk1 <- gaNetT1 %v% "smoker.w1"
smk2 <- gaNetT2 %v% "smoker.w2"
# create the deviations
depD <- dep2 - dep1
smkD <- smk2 - smk1
# set the partition
par( mfrow = c( 1, 2 ) )
# plot the within person deviations
hist( depD,
col = "#e62090",
main = "Within Person Change\n in Depression",
ylim = c( 0, 40 ),
xlim = c( min( depD )-1, max( depD )+1 ),
xlab = ""
)
abline( v = mean( depD ),
lwd = 5,
col = "black"
)
# plot the within person deviations
hist( smkD,
col = "#26bcde",
main = "Within Person Change\n in Smoking",
ylim = c( 0, 70 ),
xlim = c( min( smkD ), max( smkD ) ),
xlab = ""
)
abline( v = mean( smkD ),
lwd = 5,
col = "black"
)
What do the plots tell us about changes to depression and
smoking?
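For a binary variable like smoking, a cross-tabulation of the two waves can be easier to read than a histogram of differences. A small sketch with made-up data (s1 and s2 are hypothetical stand-ins for smk1 and smk2):

```r
# made-up smoking status for 8 people at two waves (0 = non-smoker, 1 = smoker)
s1 <- c( 0, 0, 0, 1, 1, 1, 0, 1 )
s2 <- c( 0, 0, 1, 1, 1, 0, 0, 1 )

# rows are wave 1 status, columns are wave 2 status
transitions <- table( wave1 = s1, wave2 = s2 )
transitions

# people on the diagonal did not change; everyone else did
stable  <- sum( diag( transitions ) )
changed <- sum( transitions ) - stable
```

The off-diagonal cells count the people who started or stopped. When those cells are nearly empty, there is very little change for a model of that behavior to learn from.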
Steps 1-5
Ok, now that we have our attributes, let’s build our model:
# build the dependent variables
depression <- sienaDependent( cbind( dep1, dep2 ), type = "behavior" )
smoking <- sienaDependent( cbind( smk1, smk2 ), type = "behavior" )
# bind data together for Siena analysis
CoEvolutionData2 <- sienaDataCreate(
getalong, # our network
depression, # depression
smoking # smoking
)
# create the effects object
CoEvolutionEffects2 <- getEffects( CoEvolutionData2 )
# create a model that has particulars about estimation, then we estimate the model
CoEvolutionModel2 <- sienaModelCreate( projname = "Get Along & Depression & Smoking", seed=605 )
## If you use this algorithm object, siena07 will create/use an output file Get Along & Depression & Smoking.txt .
# estimate the model
CoEvolutionResults2 <- siena07(
CoEvolutionModel2,
data=CoEvolutionData2,
effects=CoEvolutionEffects2
)
# look at the results
CoEvolutionResults2
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## Network Dynamics
## 1. rate basic rate parameter getalong 6.4107 ( 0.5707 ) 0.0752
## 2. eval outdegree (density) -2.1242 ( 0.0842 ) -0.0483
## 3. eval reciprocity 1.6857 ( 0.1879 ) -0.0748
##
## Behavior Dynamics
## 4. rate <1> rate depression period 1 1.2991 ( 0.3703 ) 0.0028
## 5. eval <1> depression linear shape -0.3841 ( 0.3048 ) 0.0343
## 6. eval <1> depression quadratic shape -1.0757 ( 0.8004 ) 0.0440
## 7. rate <2> rate smoking period 1 0.2047 ( 0.0918 ) 0.0028
## 8. eval <2> smoking linear shape 0.2597 ( 0.7906 ) -0.0898
##
## Overall maximum convergence ratio: 0.1563
##
##
## Total of 2803 iteration steps.
What is the interpretation of the rate and shape terms for
depression and smoking?
Adding Terms
Behavior Effects on the Network
Now we want to include the effects of depression and
smoking on tie behavior. We will use the same
egoX, altX, and sameX or simX terms.
CoEvolutionEffects2 <- includeEffects( CoEvolutionEffects2, egoX, altX, simX, interaction1 = "depression" )
## effectName include fix test initialValue parm
## 1 depression alter TRUE FALSE FALSE 0 0
## 2 depression ego TRUE FALSE FALSE 0 0
## 3 depression similarity TRUE FALSE FALSE 0 0
CoEvolutionEffects2 <- includeEffects( CoEvolutionEffects2, egoX, altX, sameX, interaction1 = "smoking" )
## effectName include fix test initialValue parm
## 1 smoking alter TRUE FALSE FALSE 0 0
## 2 smoking ego TRUE FALSE FALSE 0 0
## 3 same smoking TRUE FALSE FALSE 0 0
In the last line above, why did we use sameX and not simX?
Network Effects on Behavior
Now we want to include the effects of the get along with
network on the measures of depression and smoking. We
will again use outdeg, indeg, and avSim.
CoEvolutionEffects2 <- includeEffects( CoEvolutionEffects2, name = "depression", indeg, outdeg, avSim, interaction1="getalong" )
## effectName include fix test initialValue parm
## 1 depression average similarity TRUE FALSE FALSE 0 0
## 2 depression indegree TRUE FALSE FALSE 0 0
## 3 depression outdegree TRUE FALSE FALSE 0 0
CoEvolutionEffects2 <- includeEffects( CoEvolutionEffects2, name = "smoking", indeg, outdeg, avSim, interaction1="getalong" )
## effectName include fix test initialValue parm
## 1 smoking average similarity TRUE FALSE FALSE 0 0
## 2 smoking indegree TRUE FALSE FALSE 0 0
## 3 smoking outdegree TRUE FALSE FALSE 0 0
Now we are ready to estimate the model (again)!
CoEvolutionResults2 <- siena07( CoEvolutionModel2, data = CoEvolutionData2, effects = CoEvolutionEffects2 )
CoEvolutionResults2
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## Network Dynamics
## 1. rate basic rate parameter getalong 5.9039 ( 0.7052 ) -0.6571
## 2. eval outdegree (density) -2.1238 ( 0.3819 ) -0.2049
## 3. eval reciprocity 1.7252 ( 1.0327 ) 0.0698
## 4. eval depression alter -0.2013 ( 0.5205 ) -0.4096
## 5. eval depression ego 0.0217 ( 0.9264 ) -0.0124
## 6. eval depression similarity 1.2172 ( 4.3319 ) 0.3179
## 7. eval smoking alter -0.3665 ( 0.6584 ) 0.4133
## 8. eval smoking ego -0.5656 ( 0.3618 ) 0.3094
## 9. eval same smoking -0.1900 ( 0.5350 ) -0.3831
##
## Behavior Dynamics
## 10. rate <1> rate depression period 1 1.5777 ( 2.0742 ) 0.3801
## 11. eval <1> depression linear shape 0.2478 ( 2.9723 ) 0.3566
## 12. eval <1> depression quadratic shape -0.8268 ( 4.5132 ) 0.1996
## 13. eval <1> depression average similarity 1.8083 ( 11.0245 ) 0.7328
## 14. eval <1> depression indegree -0.0373 ( 0.9911 ) 0.5794
## 15. eval <1> depression outdegree -0.1123 ( 0.6647 ) 0.6793
## 16. rate <2> rate smoking period 1 0.1528 ( 0.2235 ) -0.1967
## 17. eval <2> smoking linear shape 25.6622 ( 21251.8960 ) 0.0512
## 18. eval <2> smoking average similarity 123.7333 ( 102386.9365 ) -0.3415
## 19. eval <2> smoking indegree 55.0565 ( 45360.6141 ) -0.0907
## 20. eval <2> smoking outdegree -21.9846 ( 18348.9309 ) 0.2756
##
## Overall maximum convergence ratio: 1.9345
##
##
## Total of 3619 iteration steps.
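What are the interpretations of the coefficients? Before answering,
look at the overall maximum convergence ratio and the standard errors
for the smoking terms: can these estimates be trusted?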
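Before reading any coefficient, it is worth checking convergence. A common guideline in the RSiena documentation is an overall maximum convergence ratio below 0.25 and absolute convergence t-ratios below 0.1 for every parameter. The sketch below applies those thresholds to values copied by hand from the output above (only a few of the t-ratios are included, to keep it short; the vector names are chosen here for illustration).

```r
# overall maximum convergence ratio, copied from the output
overall_max <- 1.9345

# a handful of convergence t-ratios, copied from the output
t_ratios <- c(
  rate_getalong  = -0.6571,
  outdegree      = -0.2049,
  reciprocity    =  0.0698,
  smoking_linear =  0.0512,
  smoking_avSim  = -0.3415
)

# apply the two guidelines
overall_ok <- overall_max < 0.25
t_ok       <- abs( t_ratios ) < 0.10

overall_ok
t_ok

# the model counts as converged only if both checks pass
converged <- overall_ok && all( t_ok )
converged

# if not converged, one common next step is to re-run siena07() starting
# from the current estimates by passing the results object to prevAns
```

When the checks fail, as they do here, the estimates and standard errors should not be interpreted. The very large standard errors on the smoking terms point in the same direction.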