ERGMs for Rank-Order Relational Data


Coverage

This vignette covers modeling of rank-order relational data in the ergm framework. Readers are strongly encouraged to first work through the vignette on valued ERGMs in the ergm.count package and to read the article by Krivitsky and Butts (2017).

Modeling ordinal relational data using ergm.rank

library(ergm.rank)

Note that the implementations so far are very slow, so we will only do a short example.

Reference measures

Suppose that we represent the ranking (or ordinal rating) of $j$ by $i$ by the value of $\boldsymbol{y}_{i,j}$. What reference can we use for ranks?

help("ergm-references", "ergm.rank")

Terms

For details, see Krivitsky and Butts (2017). It’s not meaningful to

  • compare ranks across different egos.
  • take rank difference within an ego.

The only thing we are allowed to do is to ask if i has ranked j over k.

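Therefore, ordinal relational data call for their own sufficient statistics. These will depend on $$ \begin{equation*} \boldsymbol{y}_{i:\,j\succ k}\equiv\begin{cases} 1 & \text{if $j\stackrel{i}{\succ}k$ i.e., $i$ ranks $j$ above $k$;} \\ 0 & \text{otherwise.} \end{cases} \end{equation*} $$ We may interpret them using the promotion statistic $$ \Delta_{i,j}^\nearrow\boldsymbol{g}(\boldsymbol{y})\equiv\boldsymbol{g}(\boldsymbol{y}^{i:\,j\rightleftarrows j^+})-\boldsymbol{g}(\boldsymbol{y}), $$ the change in $\boldsymbol{g}$ when $i$ swaps $j$ with $j^+$, the alter that $i$ ranks immediately above $j$.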
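
As a minimal base-R sketch of these two definitions (not ergm.rank code), the snippet below evaluates the indicator and performs one promotion on a made-up 4-actor ratings matrix. The helpers `ranks_above()` and `promote()` are hypothetical names introduced here, and the sketch assumes that a larger value in row i means "ranked higher" and that each ego gives a complete ordering without ties.

```r
# Hypothetical helper: TRUE if ego i ranks j above k.
# Assumption: a larger value in row i of `r` means "ranked higher".
ranks_above <- function(r, i, j, k) r[i, j] > r[i, k]

# Hypothetical helper: swap j with j+, the alter that i ranks immediately above j.
promote <- function(r, i, j) {
  others <- setdiff(seq_len(ncol(r)), i)        # ego does not rank itself
  above <- others[r[i, others] > r[i, j]]       # alters ranked above j by i
  if (length(above) == 0) return(r)             # j is already at the top
  jplus <- above[which.min(r[i, above])]        # the closest one above j
  r[i, c(j, jplus)] <- r[i, c(jplus, j)]        # exchange their values
  r
}

# Promotion statistic for an arbitrary statistic g (a function of the matrix).
promotion_stat <- function(g, r, i, j) g(promote(r, i, j)) - g(r)

# Toy ratings (made-up values; the diagonal is unused).
r <- matrix(c(0, 3, 1, 2,
              2, 0, 3, 1,
              1, 2, 0, 3,
              3, 1, 2, 0), nrow = 4, byrow = TRUE)

ranks_above(r, 1, 2, 4)   # does ego 1 rank alter 2 above alter 4?
r2 <- promote(r, 1, 3)    # ego 1 promotes alter 3 past the alter just above it
r2[1, ]
```

Only the within-row comparison `r[i, j] > r[i, k]` is ever used, which reflects the restriction above: values are never compared across egos and never subtracted.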

Let $N^{k\ne}$ be the set of possible $k$-tuples of actor indices where no actors are repeated. Then,

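  • rank.deference: Deference (aversion): Measures the amount of “deference” in the network: configurations where an ego $i$ ranks an alter $l$ over another alter $j$, but $l$, in turn, ranks $j$ over $i$: $$ \boldsymbol{g}_{\text{D}}(\boldsymbol{y}) = \sum_{(i,j,l)\in N^{3\ne}}\boldsymbol{y}_{l:\,j\succ i}\boldsymbol{y}_{i:\,l\succ j}, $$ $$ \Delta_{i,j}^\nearrow\boldsymbol{g}_{\text{D}}(\boldsymbol{y}) = 2(\boldsymbol{y}_{j^+:\,i\succ j} + \boldsymbol{y}_{j:\,j^+\succ i} - 1). $$ A lower-than-chance value of this statistic and/or a negative coefficient implies a form of mutuality in the network.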

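  • rank.edgecov(x, attrname): Dyadic covariates: Models the effect of a dyadic covariate on the propensity of an ego $i$ to rank alter $j$ highly: $$ \boldsymbol{g}_{\text{A}}(\boldsymbol{y};\boldsymbol{x}) = \sum_{(i,j,k)\in N^{3\ne}}\boldsymbol{y}_{i:\,j\succ k}(\boldsymbol{x}_{i,j}-\boldsymbol{x}_{i,k}), $$ $$ \Delta_{i,j}^\nearrow\boldsymbol{g}_{\text{A}}(\boldsymbol{y};\boldsymbol{x}) = 2(\boldsymbol{x}_{i,j}-\boldsymbol{x}_{i,j^+}). $$ See the ?rank.edgecov ERGM term documentation for arguments.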

  • rank.inconsistency(x, attrname, weights, wtname, wtcenter): (Weighted) Inconsistency: Measures the amount of disagreement between rankings of the focus network and a fixed covariate network x (written $\boldsymbol{y}'$ in the formulas that follow), by counting the number of pairwise comparisons for which the two networks disagree. x can be a network with an edge attribute attrname containing the ranks or a matrix of appropriate dimension containing the ranks. If x is not given, it defaults to the LHS network, and if attrname is not given, it defaults to the response edge attribute. $$ \boldsymbol{g}_{\text{I}}(\boldsymbol{y};\boldsymbol{y}') = \sum_{(i,j,k)\in N^{3\ne}}\left[\boldsymbol{y}_{i:\,j\succ k}(1-\boldsymbol{y}'_{i:\,j\succ k}) + (1-\boldsymbol{y}_{i:\,j\succ k})\boldsymbol{y}'_{i:\,j\succ k}\right], $$ with promotion statistic being simply $$ \Delta_{i,j}^\nearrow\boldsymbol{g}_{\text{I}}(\boldsymbol{y};\boldsymbol{y}') = 2(\boldsymbol{y}'_{i:\,j^+\succ j}-\boldsymbol{y}'_{i:\,j\succ j^+}). $$ Optionally, the count can be weighted by the weights argument, which can be either a 3D $n\times n\times n$ array whose $(i,j,k)$th element gives the weight for the comparison by $i$ of $j$ and $k$, or a function taking three arguments, i, j, and k, and returning the weight of this comparison. If wtcenter=TRUE, the calculated weights will be centered around their mean. wtname can be used to label this term.

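  • rank.nodeicov(attrname, transform, transformname): Attractiveness/Popularity covariates: Models the effect of a nodal covariate on the propensity of an actor to be ranked highly by the others: $$ \boldsymbol{g}_{\text{A}}(\boldsymbol{y};\boldsymbol{x}) = \sum_{(i,j,k)\in N^{3\ne}}\boldsymbol{y}_{i:\,j\succ k}(\boldsymbol{x}_{j}-\boldsymbol{x}_{k}), $$ $$ \Delta_{i,j}^\nearrow\boldsymbol{g}_{\text{A}}(\boldsymbol{y};\boldsymbol{x}) = 2(\boldsymbol{x}_{j}-\boldsymbol{x}_{j^+}). $$ See the ?nodeicov ERGM term documentation for arguments.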

  • rank.nonconformity(to, par): Nonconformity: Measures the amount of “nonconformity” in the network: configurations where an ego $i$ ranks an alter $j$ over another alter $k$, but ego $l$ ranks $k$ over $j$.

    This statistic has an argument to, which controls to whom an ego may conform:

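    • "all" (the default) Nonconformity to all egos is counted: $$ \boldsymbol{g}_{\text{GNC}}(\boldsymbol{y}) = \sum_{(i,j,k,l)\in N^{4\ne}}\boldsymbol{y}_{l:\,j\succ k}(1-\boldsymbol{y}_{i:\,j\succ k}), $$ $$ \Delta_{i,j}^\nearrow\boldsymbol{g}_{\text{GNC}}(\boldsymbol{y}) = 2\sum_{l\in N\backslash\left\{i,j,j^+\right\}}(\boldsymbol{y}_{l:\,j^+\succ j}-\boldsymbol{y}_{l:\,j\succ j^+}). $$ A lower-than-chance value of this statistic and/or a negative coefficient implies a degree of consensus in the network.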

    • "localAND" (Local nonconformity) Nonconformity of $i$ to ego $l$ regarding the relative ranking of $j$ and $k$ is only counted if $i$ ranks $l$ over both $j$ and $k$: $$ \boldsymbol{g}_{\text{LNC}}(\boldsymbol{y}) = \sum_{(i,j,k,l)\in N^{4\ne}}\boldsymbol{y}_{i:\,l\succ j}\boldsymbol{y}_{i:\,l\succ k}\boldsymbol{y}_{l:\,j\succ k}(1-\boldsymbol{y}_{i:\,j\succ k}), $$ $$ \begin{align*} \Delta_{i,j}^\nearrow\boldsymbol{g}_{\text{LNC}}(\boldsymbol{y})=\sum_{k\in {N}\backslash\left\{i,j,j^+\right\}}(& \boldsymbol{y}_{i:\,k\succ j^+}\boldsymbol{y}_{k:\,j^+\succ j}-\boldsymbol{y}_{i:\,k\succ j^+}\boldsymbol{y}_{k:\,j\succ j^+}\\ \vphantom{\sum_{k\in {N}\backslash\left\{i,j,j^+\right\}}}&+\boldsymbol{y}_{k:\,i\succ j^+}\boldsymbol{y}_{k:\,j^+\succ j}-\boldsymbol{y}_{k:\,i\succ j}\boldsymbol{y}_{k:\,j\succ j^+}\\ \vphantom{\sum_{k\in {N}\backslash\left\{i,j,j^+\right\}}}&+\boldsymbol{y}_{j:\,k\succ j^+}\boldsymbol{y}_{i:\,j^+\succ k}-\boldsymbol{y}_{j^+:\,k\succ j}\boldsymbol{y}_{i:\,j\succ k}). \end{align*} $$ A lower-than-chance value of this statistic and/or a negative coefficient implies a form of hierarchical transitivity in the network.
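
To make the sums over distinct tuples concrete, here is a brute-force base-R sketch of the deference and global ("all") nonconformity statistics on a small simulated ratings matrix. It is an illustration only and does not reflect how ergm.rank computes these terms. All helper names (`above`, `sum_distinct`, `g_deference`, `g_nonconformity_all`) are hypothetical, and the sketch assumes that larger values mean "ranked higher" and that every ego supplies a complete ordering without ties. The last part compares the brute-force promotion statistic with the closed forms given for these two terms.

```r
# y_{i: j > k} as 0/1. Assumption: a larger value in row i means "ranked higher".
above <- function(r, i, j, k) as.numeric(r[i, j] > r[i, k])

# Sum f over all k-tuples of distinct actor indices (the set N^{k != }).
sum_distinct <- function(n, k, f) {
  idx <- as.matrix(expand.grid(rep(list(seq_len(n)), k)))
  keep <- apply(idx, 1, function(t) !anyDuplicated(t))
  tuples <- idx[keep, , drop = FALSE]
  sum(apply(tuples, 1, function(t) do.call(f, as.list(unname(t)))))
}

# Deference: l ranks j over i, while i ranks l over j.
g_deference <- function(r) {
  sum_distinct(nrow(r), 3, function(i, j, l) above(r, l, j, i) * above(r, i, l, j))
}

# Global nonconformity: l ranks j over k, but i does not.
g_nonconformity_all <- function(r) {
  sum_distinct(nrow(r), 4, function(i, j, k, l) above(r, l, j, k) * (1 - above(r, i, j, k)))
}

# Simulated complete rankings for 5 actors (made-up data; diagonal unused).
set.seed(1)
n <- 5
r <- t(sapply(seq_len(n), function(i) { v <- numeric(n); v[-i] <- sample(n - 1); v }))

# Compare brute-force promotion statistics with the closed forms, for every
# ego i and every alter j that is not already at the top of i's ranking.
ok_D <- ok_GNC <- logical(0)
for (i in seq_len(n)) for (j in setdiff(seq_len(n), i)) {
  others <- setdiff(seq_len(n), i)
  ab <- others[r[i, others] > r[i, j]]
  if (length(ab) == 0) next
  jp <- ab[which.min(r[i, ab])]                  # j+: alter just above j
  r2 <- r; r2[i, c(j, jp)] <- r[i, c(jp, j)]     # i promotes j over j+
  ls <- setdiff(seq_len(n), c(i, j, jp))
  d_D   <- 2 * (above(r, jp, i, j) + above(r, j, jp, i) - 1)
  d_GNC <- 2 * sum(sapply(ls, function(l) above(r, l, jp, j) - above(r, l, j, jp)))
  ok_D   <- c(ok_D,   g_deference(r2) - g_deference(r) == d_D)
  ok_GNC <- c(ok_GNC, g_nonconformity_all(r2) - g_nonconformity_all(r) == d_GNC)
}
c(deference = all(ok_D), nonconformity = all(ok_GNC))
```

Enumerating tuples with `expand.grid()` costs on the order of n^4 evaluations, so this is only sensible for toy networks; it is meant for reading the definitions, not for fitting.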

Example

Consider Newcomb’s fraternity data:

data(newcomb)
as.matrix(newcomb[[1]], attrname="rank")
##     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
## 1   0  7 12 11 10  4 13 14 15 16  3  9  1  5  8  6  2
## 2   8  0 16  1 11 12  2 14 10 13 15  6  7  9  5  3  4
## 3  13 10  0  7  8 11  9 15  6  5  2  1 16 12  4 14  3
## 4  13  1 15  0 14  4  3 16 12  7  6  9  8 11 10  5  2
## 5  14 10 11  7  0 16 12  4  5  6  2  3 13 15  8  9  1
## 6   7 13 11  3 15  0 10  2  4 16 14  5  1 12  9  8  6
## 7  15  4 11  3 16  8  0  6  9 10  5  2 14 12 13  7  1
## 8   9  8 16  7 10  1 14  0 11  3  2  5  4 15 12 13  6
## 9   6 16  8 14 13 11  4 15  0  7  1  2  9  5 12 10  3
## 10  2 16  9 14 11  4  3 10  7  0 15  8 12 13  1  6  5
## 11 12  7  4  8  6 14  9 16  3 13  0  2 10 15 11  5  1
## 12 15 11  2  6  5 14  7 13 10  4  3  0 16  8  9 12  1
## 13  1 15 16  7  4  2 12 14 13  8  6 11  0 10  3  9  5
## 14 14  5  8  6 13  9  2 16  1  3 12  7 15  0  4 11 10
## 15 16  9  4  8  1 13 11 12  6  2  3  5 10 15  0 14  7
## 16  8 11 15  3 13 16 14 12  1  9  2  6 10  7  5  0  4
## 17  9 15 10  2  4 11  5 12  3  7  8  1  6 16 14 13  0
as.matrix(newcomb[[1]], attrname="descrank")
##     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
## 1   0 10  5  6  7 13  4  3  2  1 14  8 16 12  9 11 15
## 2   9  0  1 16  6  5 15  3  7  4  2 11 10  8 12 14 13
## 3   4  7  0 10  9  6  8  2 11 12 15 16  1  5 13  3 14
## 4   4 16  2  0  3 13 14  1  5 10 11  8  9  6  7 12 15
## 5   3  7  6 10  0  1  5 13 12 11 15 14  4  2  9  8 16
## 6  10  4  6 14  2  0  7 15 13  1  3 12 16  5  8  9 11
## 7   2 13  6 14  1  9  0 11  8  7 12 15  3  5  4 10 16
## 8   8  9  1 10  7 16  3  0  6 14 15 12 13  2  5  4 11
## 9  11  1  9  3  4  6 13  2  0 10 16 15  8 12  5  7 14
## 10 15  1  8  3  6 13 14  7 10  0  2  9  5  4 16 11 12
## 11  5 10 13  9 11  3  8  1 14  4  0 15  7  2  6 12 16
## 12  2  6 15 11 12  3 10  4  7 13 14  0  1  9  8  5 16
## 13 16  2  1 10 13 15  5  3  4  9 11  6  0  7 14  8 12
## 14  3 12  9 11  4  8 15  1 16 14  5 10  2  0 13  6  7
## 15  1  8 13  9 16  4  6  5 11 15 14 12  7  2  0  3 10
## 16  9  6  2 14  4  1  3  5 16  8 15 11  7 10 12  0 13
## 17  8  2  7 15 13  6 12  5 14 10  9 16 11  1  3  4  0
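
Comparing the two printouts, "descrank" looks like "rank" with the scale reversed. A quick base-R check on the first row, with values copied from the output above, is sketched here; it inspects only that one row and does not need the package.

```r
# First rows of the "rank" and "descrank" matrices, copied from the printout.
rank_row1     <- c(0, 7, 12, 11, 10, 4, 13, 14, 15, 16, 3, 9, 1, 5, 8, 6, 2)
descrank_row1 <- c(0, 10, 5, 6, 7, 13, 4, 3, 2, 1, 14, 8, 16, 12, 9, 11, 15)

n <- length(rank_row1)   # 17 actors
alters <- -1             # negative index: drop ego 1's own (diagonal) entry

# For every alter, the two codings add up to n ...
sums_to_n <- all(rank_row1[alters] + descrank_row1[alters] == n)

# ... so sorting by one gives the reverse of sorting by the other.
reversed <- identical(order(rank_row1[alters]),
                      order(descrank_row1[alters], decreasing = TRUE))

c(sums_to_n = sums_to_n, reversed = reversed)
```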

Let’s fit a model for the two types of nonconformity and deference at the first time point:

newc.fit1 <- ergm(
  newcomb[[1]] ~ rank.nonconformity + rank.nonconformity("localAND") + rank.deference,
  response = "descrank",
  reference = ~CompleteOrder,
  control = control.ergm(MCMC.burnin = 4096, MCMC.interval = 32, CD.conv.min.pval = 0.05),
  eval.loglik = FALSE
)
## Warning: 'glpk' selected as the solver, but package 'Rglpk' is not available;
## falling back to 'lpSolveAPI'. This should be fine unless the sample size and/or
## the number of parameters is very big.
summary(newc.fit1)
## Call:
## ergm(formula = newcomb[[1]] ~ rank.nonconformity + rank.nonconformity("localAND") + 
##     rank.deference, response = "descrank", reference = ~CompleteOrder, 
##     eval.loglik = FALSE, control = control.ergm(MCMC.burnin = 4096, 
##         MCMC.interval = 32, CD.conv.min.pval = 0.05))
## 
## Monte Carlo Maximum Likelihood Results:
## 
##                         Estimate Std. Error MCMC % z value Pr(>|z|)    
## nonconformity          -0.003910   0.003159      0  -1.238    0.216    
## nonconformity.localAND -0.009680   0.009471      0  -1.022    0.307    
## deference              -0.152165   0.035963      0  -4.231   <1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
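
The z value and Pr(>|z|) columns can be reproduced from the first two columns, assuming they are the usual Wald statistic (estimate divided by standard error) with a two-sided normal p-value. The sketch below uses the rounded numbers printed above, so it can only agree with the table up to rounding.

```r
# Estimates and standard errors copied from the printed summary (rounded).
est <- c(nonconformity = -0.003910,
         nonconformity.localAND = -0.009680,
         deference = -0.152165)
se <- c(0.003159, 0.009471, 0.035963)

z <- est / se              # Wald z statistic (assumed definition)
p <- 2 * pnorm(-abs(z))    # two-sided p-value under a standard normal

round(cbind(z = z, p = p), 4)
```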

Check diagnostics:

mcmc.diagnostics(newc.fit1)

Now fit the same model to the network at time point 15:

newc.fit15 <- ergm(
  newcomb[[15]] ~ rank.nonconformity + rank.nonconformity("localAND") + rank.deference,
  response = "descrank",
  reference = ~CompleteOrder,
  control = control.ergm(MCMC.burnin = 4096, MCMC.interval = 32, CD.conv.min.pval = 0.05),
  eval.loglik = FALSE
)
summary(newc.fit15)
## Call:
## ergm(formula = newcomb[[15]] ~ rank.nonconformity + rank.nonconformity("localAND") + 
##     rank.deference, response = "descrank", reference = ~CompleteOrder, 
##     eval.loglik = FALSE, control = control.ergm(MCMC.burnin = 4096, 
##         MCMC.interval = 32, CD.conv.min.pval = 0.05))
## 
## Monte Carlo Maximum Likelihood Results:
## 
##                          Estimate Std. Error MCMC % z value Pr(>|z|)    
## nonconformity           0.0007324  0.0025546      0   0.287    0.774    
## nonconformity.localAND -0.0430711  0.0074902      0  -5.750   <1e-04 ***
## deference              -0.3275382  0.0756276      0  -4.331   <1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Check diagnostics:

mcmc.diagnostics(newc.fit15)

References

Krivitsky, Pavel N., and Carter T. Butts. 2017. “Exponential-Family Random Graph Models for Rank-Order Relational Data.” Sociological Methodology 47 (1): 68–112. https://doi.org/10.1177/0081175017692623.