Title: | Fit, Simulate and Diagnose Exponential-Family Models for Networks |
---|---|
Description: | An integrated set of tools to analyze and simulate networks based on exponential-family random graph models (ERGMs). 'ergm' is a part of the Statnet suite of packages for network analysis. See Hunter, Handcock, Butts, Goodreau, and Morris (2008) <doi:10.18637/jss.v024.i03> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>. |
Authors: | Mark S. Handcock [aut], David R. Hunter [aut], Carter T. Butts [aut], Steven M. Goodreau [aut], Pavel N. Krivitsky [aut, cre], Martina Morris [aut], Li Wang [ctb], Kirk Li [ctb], Skye Bender-deMoll [ctb], Chad Klumb [ctb], Michał Bojanowski [ctb], Ben Bolker [ctb], Christian Schmid [ctb], Joyce Cheng [ctb], Arya Karami [ctb], Adrien Le Guillou [ctb] |
Maintainer: | Pavel N. Krivitsky <[email protected]> |
License: | GPL-3 + file LICENSE |
Version: | 4.8.1-7560 |
Built: | 2025-01-21 03:24:24 UTC |
Source: | https://github.com/statnet/ergm |
This is a flag in the proposal table indicating that the proposal can enforce arbitrary combinations of dyadic constraints. It cannot be invoked directly by the user.
ergmConstraint
for index of constraints and hints currently visible to the package.
None
This term adds one network statistic to the model equaling the sum of abs(attr[i]-attr[j])^pow for all edges (i,j) in the network.
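The statistic described above can be checked by hand. The following is a minimal base-R sketch, not the package's implementation; the attribute vector `age`, the edge list `edges`, and the helper `absdiff_stat()` are hypothetical names introduced only for illustration.

```r
# Hypothetical data: 4 nodes with a numeric attribute and 3 undirected edges.
age <- c(20, 25, 31, 40)
edges <- rbind(c(1, 2), c(2, 3), c(1, 4))  # each row is one edge (i, j)

# Sum of abs(attr[i] - attr[j])^pow over all edges (i, j).
absdiff_stat <- function(attr, edges, pow = 1) {
  sum(abs(attr[edges[, 1]] - attr[edges[, 2]])^pow)
}

stat_pow1 <- absdiff_stat(age, edges)           # 5 + 6 + 20
stat_pow2 <- absdiff_stat(age, edges, pow = 2)  # 25 + 36 + 400
```

Raising `pow` penalizes large attribute differences more heavily than small ones.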
# binary: absdiff(attr, pow=1)
# valued: absdiff(attr, pow=1, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
pow |
power to which to take the absolute difference |
form |
character how to aggregate tie values in a valued ERGM |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, quantitative nodal attribute, undirected, binary, valued
This term adds one statistic for every possible nonzero distinct
value of abs(attr[i]-attr[j])
in the network. The value of each such
statistic is the number of edges in the network with the corresponding
absolute difference.
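As a rough illustration of the description above, the sketch below tabulates edge counts by absolute difference in base R. It is not the package's implementation; `grade` and `edges` are hypothetical data. Note that a difference that is possible among the nodes but absent among the edges still gets a statistic, with value zero.

```r
# Hypothetical data: 4 nodes with a numeric attribute and 3 undirected edges.
grade <- c(7, 8, 8, 10)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4))

# Every nonzero distinct absolute difference possible among the nodes.
possible <- sort(unique(as.vector(abs(outer(grade, grade, "-")))))
possible <- possible[possible != 0]

# Absolute difference on each edge, then one count per possible value.
d <- abs(grade[edges[, 1]] - grade[edges[, 2]])
absdiffcat_stats <- table(factor(d, levels = possible))
```

The edge between nodes 2 and 3 has difference zero and therefore contributes to none of the statistics.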
# binary: absdiffcat(attr, base=NULL, levels=NULL)
# valued: absdiffcat(attr, base=NULL, levels=NULL, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
levels |
specifies which nonzero differences to include in or exclude from the model. (See Specifying Vertex
attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, dyad-independent, undirected, binary, valued
$k$-star
Add one network statistic to the model equal to a weighted alternating sequence of $k$-star statistics with weight parameter lambda.
# binary: altkstar(lambda, fixed=FALSE)
lambda |
weight parameter to model |
fixed |
indicates whether the |
This is the version given in Snijders et al. (2006). The gwdegree
and
altkstar
produce mathematically equivalent models, as long as they are used
together with the edges
(or kstar(1)
) term, yet the interpretation of the
gwdegree
parameters is slightly more straightforward than the interpretation
of the altkstar
parameters. For this reason, we recommend the use of the
gwdegree
instead of altkstar
. See Section 3 and especially equation (13)
of Hunter (2007) for details.
This term can only be used with undirected networks.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, curved, undirected, binary
Compute an analysis of variance table for one or more ERGM fits.
## S3 method for class 'ergm'
anova(object, ..., eval.loglik = FALSE)
## S3 method for class 'ergmlist'
anova(object, ..., eval.loglik = FALSE)
object , ...
|
|
eval.loglik |
a logical specifying whether the log-likelihood will be evaluated if missing. |
Specifying a single object gives a sequential analysis of variance table for that fit. That is, the reductions in the residual sum of squares as each term of the formula is added in turn are given in the rows of a table, plus the residual sum of squares.
The table will contain F statistics (and P values) comparing the mean square for the row to the residual mean square.
If more than one object is specified, the table has a row for the residual degrees of freedom and sum of squares for each model. For all but the first model, the change in degrees of freedom and sum of squares is also given. (This only makes statistical sense if the models are nested.) It is conventional to list the models from smallest to largest, but this is up to the user.
If any of the objects do not have estimated log-likelihoods, produces an
error, unless eval.loglik=TRUE
.
An object of class "anova"
inheriting from class
"data.frame"
.
The comparison between two or more models will only be
valid if they are fitted to the same dataset. This may be a problem if there
are missing values and R's default of na.action = na.omit
is used, and
anova.ergmlist()
will detect this with an error.
The model fitting function ergm()
, anova()
,
logLik.ergm()
for adding the log-likelihood to an existing
ergm
object.
library(ergm)
data(molecule)
molecule %v% "atomic type" <- c(1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3)
fit0 <- ergm(molecule ~ edges)
anova(fit0)
fit1 <- ergm(molecule ~ edges + nodefactor("atomic type"))
anova(fit1)
fit2 <- ergm(molecule ~ edges + nodefactor("atomic type") + gwesp(0.5, fixed=TRUE), eval.loglik=TRUE) # Note the eval.loglik argument.
anova(fit0, fit1)
anova(fit0, fit1, fit2)
A multivariate hypothesis test for a single population mean or a difference between them. This version attempts to adjust for multivariate autocorrelation in the samples.
approx.hotelling.diff.test(x, y = NULL, mu0 = 0, assume.indep = FALSE, var.equal = FALSE, ...)
x |
a numeric matrix of data values with cases in rows and variables in columns. |
y |
an optional matrix of data values with cases in rows and variables in columns for a 2-sample test. |
mu0 |
an optional numeric vector: for a 1-sample test, the population mean under the null hypothesis; and for a 2-sample test, the difference between population means under the null hypothesis; defaults to a vector of 0s. |
assume.indep |
if |
var.equal |
for a 2-sample test, perform the pooled test: assume population variance-covariance matrices of the two variables are equal. |
... |
additional arguments, passed on to |
An object of class htest
with the following information:
statistic |
The |
parameter |
Degrees of freedom. |
p.value |
P-value. |
method |
Method specifics. |
null.value |
Null hypothesis mean or mean difference. |
alternative |
Always |
estimate |
Sample difference. |
covariance |
Estimated variance-covariance matrix of the estimate of the difference. |
covariance.x |
Estimated variance-covariance matrix of the estimate of the mean of |
covariance.y |
Estimated variance-covariance matrix of the estimate of the mean of |
It has a print method print.htest()
.
For mcmc.list
input, the variance for this test is
estimated with unpooled means. This is not strictly correct.
Hotelling, H. (1947). Multivariate Quality Control. In C. Eisenhart, M. W. Hastay, and W. A. Wallis, eds. Techniques of Statistical Analysis. New York: McGraw-Hill.
as.network.numeric()
creates a random Bernoulli network of the
given size as an object of class network
.
## S3 method for class 'numeric'
as.network(x, directed = TRUE, hyper = FALSE, loops = FALSE, multiple = FALSE,
  bipartite = FALSE, ignore.eval = TRUE, names.eval = NULL, edge.check = FALSE,
  density = NULL, init = NULL, numedges = NULL, ...)
x |
count; the number of nodes in the network |
directed |
logical; should edges be interpreted as directed? |
hyper |
logical; are hyperedges allowed? Currently ignored. |
loops |
logical; should loops be allowed? Currently ignored. |
multiple |
logical; are multiplex edges allowed? Currently ignored. |
bipartite |
count; should the network be interpreted as bipartite? If present (i.e., non-NULL) it is the count of the number of actors in the bipartite network. In this case, the number of nodes is equal to the number of actors plus the number of events (with all actors preceding all events). The edges are then interpreted as nondirected. |
ignore.eval |
logical; ignore edge values? Currently ignored. |
names.eval |
optionally, the name of the attribute in which edge values should be stored. Currently ignored. |
edge.check |
logical; perform consistency checks on new edges? |
density |
numeric; the probability of a tie for Bernoulli networks. If
neither density nor |
init |
numeric; the log-odds of a tie for Bernoulli networks. It is only used if density is not specified. |
numedges |
count; if present, sample the Bernoulli network conditional on this number of edges (rather than independently with the specified probability). |
... |
additional arguments |
The network will not have vertex, edge or network attributes. These
can be added with operators such as %v%
, %n%
, %e%
.
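The roles of the density, init and numedges arguments described above can be illustrated without the package. The base-R sketch below is an assumption-laden illustration, not how as.network.numeric() is implemented: it draws an undirected Bernoulli adjacency matrix, first with independent ties and then conditional on a fixed edge count. The variable names are hypothetical.

```r
set.seed(1)
n <- 25

# density is a probability and init is the same quantity on the log-odds
# scale, so the inverse logit converts one into the other.
init <- -2
density <- plogis(init)

# Independent ties: one Bernoulli draw per unordered dyad, no loops.
adj <- matrix(0L, n, n)
dyads <- which(upper.tri(adj))
adj[dyads] <- rbinom(length(dyads), 1, density)
adj <- adj + t(adj)

# Conditional on the number of edges: independent, equally likely ties
# conditioned on their total reduce to a uniform draw of numedges dyads.
numedges <- 50
adj_fixed <- matrix(0L, n, n)
adj_fixed[sample(dyads, numedges)] <- 1L
adj_fixed <- adj_fixed + t(adj_fixed)
```

With numedges fixed, the tie probability no longer affects the result, because every set of 50 dyads is equally likely.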
An object of class network
Butts, C.T. 2002. “Memory Structures for Relational Data in R: Classes and Interfaces” Working Paper.
library(ergm)
# Draw a random directed network with 25 nodes
g <- network(25)
# Draw a random undirected network with density 0.1
g <- network(25, directed=FALSE, density=0.1)
# Draw a random bipartite network with 4 actors and 6 events and density 0.1
g <- network(10, bipartite=4, directed=FALSE, density=0.1)
# Draw a random directed network with 25 nodes and 50 edges
g <- network(25, numedges=50)
This term adds one network statistic to the model equal to the number of pairs of actors for which exactly one of $(i \rightarrow j)$ or $(j \rightarrow i)$ exists.
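A minimal base-R sketch of this count, using a hypothetical directed adjacency matrix `adj` (this illustrates the definition only and ignores the optional attribute arguments):

```r
# Hypothetical directed network on 4 nodes.
adj <- matrix(0L, 4, 4)
adj[1, 2] <- 1L                # 1 -> 2 only: asymmetric
adj[2, 3] <- adj[3, 2] <- 1L   # 2 <-> 3: mutual, not counted
adj[4, 1] <- 1L                # 4 -> 1 only: asymmetric

# A pair is asymmetric when adj[i, j] and adj[j, i] differ;
# looking only at the upper triangle counts each pair once.
asym <- sum((adj != t(adj))[upper.tri(adj)])
```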
# binary: asymmetric(attr=NULL, diff=FALSE, keep=NULL, levels=NULL)
attr |
quantitative attribute (see Specifying Vertex attributes and Levels ( |
diff |
Used in the same way as for the |
keep |
deprecated |
levels |
Used in the same way as for the |
This term can only be used with directed networks.
The argument keep
is retained for backwards compatibility and may be
removed in a future version. When both keep
and levels
are passed,
levels
overrides keep
.
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, triad-related, binary
Adds one statistic for each element of threshold, each equal to the number of dyads whose values equal or exceed the corresponding element of threshold.
# valued: atleast(threshold=0)
threshold |
vector of numerical values |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, valued
Adds one statistic for each element of threshold, each equal to the number of dyads whose values equal or are exceeded by the corresponding element of threshold.
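The threshold counts for this term, and for the companion term atleast documented earlier, can be sketched in base R. The valued network `w` below is hypothetical and undirected, so only the upper triangle of the matrix is inspected.

```r
# Hypothetical valued, undirected network on 4 nodes (6 dyads).
w <- matrix(0, 4, 4)
w[1, 2] <- 3
w[2, 3] <- 1
w[3, 4] <- 5
vals <- w[upper.tri(w)]   # one value per dyad, zeros included

threshold <- c(1, 2, 4)
# atleast: dyads whose value equals or exceeds each threshold.
atleast_stats <- sapply(threshold, function(th) sum(vals >= th))
# atmost: dyads whose value equals or is exceeded by each threshold.
atmost_stats <- sapply(threshold, function(th) sum(vals <= th))
```

Dyads with value zero count toward atmost for any nonnegative threshold, which is why those statistics are large in sparse networks.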
# valued: atmost(threshold=0)
threshold |
a vector of numerical values |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, valued
This term adds one statistic to the model, equal to the sum of the covariate values
for each edge appearing in the network, where the covariate value for a given edge is determined by its mixing type on
attr
. Undirected networks are regarded as having undirected mixing, and it is assumed that mat
is symmetric
in that case.
This term can be useful for simulating large networks with many mixing types, where nodemix
would be slow due to
the large number of statistics, and edgecov
cannot be used because an adjacency matrix would be too big.
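To make the lookup concrete, here is a minimal base-R sketch of the statistic for an undirected network. The data are hypothetical, and labelling the rows and columns of `mat` with the attribute levels is an assumption made for readability, not a requirement stated above.

```r
# Hypothetical node attribute and symmetric covariate matrix by mixing type.
grp <- c("a", "b", "a", "b")
mat <- matrix(c(1.0, 0.5,
                0.5, 2.0), 2, 2, byrow = TRUE,
              dimnames = list(c("a", "b"), c("a", "b")))
edges <- rbind(c(1, 2), c(1, 3), c(2, 4))

# Each edge contributes mat[type of i, type of j]; a two-column
# character matrix indexes mat by its dimnames, one row per edge.
attrcov_stat <- sum(mat[cbind(grp[edges[, 1]], grp[edges[, 2]])])
```

Only the small levels-by-levels matrix is stored, which is the point of the comparison with edgecov made above.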
# binary: attrcov(attr, mat)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
mat |
a matrix of covariates with the same dimensions as a mixing matrix for |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, binary
Wraps binary ergm
terms for use in valued models, with formula
specifying which terms
are to be wrapped and form
specifying how they are to be
used and how the binary network they are evaluated on is to be constructed.
# valued: B(formula, form)
formula |
a one-sided |
form |
One of three values:
|
For example, B(~nodecov("a"), form="sum")
is equivalent to
nodecov("a", form="sum")
and similarly with
form="nonzero"
.
When a valued implementation is available, it should be preferred, as it is likely to be faster.
ergmTerm
for index of model terms currently visible to the package.
operator, valued
This term adds one network statistic to the model, equal to the number of nodes in the first mode of the network with degree 2 or higher. The first mode of a bipartite network object is sometimes known as the "actor" mode. This term can only be used with undirected bipartite networks.
# binary: b1concurrent(by=NULL, levels=NULL)
by |
optional argument specifying a vertex attribute (see Specifying
Vertex attributes and Levels ( |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, undirected, binary
This term adds a single network statistic for each quantitative attribute or matrix column to the model equaling the total
value of attr(i)
for all edges in the network. This
term may only be used with bipartite networks. For categorical attributes,
see
b1factor
.
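Because each edge contributes the attribute value of its first-mode endpoint, the statistic is a degree-weighted sum of the attribute. A minimal base-R sketch, assuming a hypothetical incidence matrix `B` (rows are first-mode nodes, columns are second-mode nodes) and a hypothetical attribute `x1`:

```r
# Hypothetical bipartite network: 2 actors (rows) by 3 events (columns).
B <- rbind(c(1, 1, 0),
           c(0, 1, 1))
x1 <- c(10, 3)            # quantitative attribute of the actors

# Actor i appears in rowSums(B)[i] edges, each contributing x1[i].
b1cov_stat <- sum(rowSums(B) * x1)
```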
# binary: b1cov(attr)
# valued: b1cov(attr, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
ergmTerm
for index of model terms currently visible to the package.
bipartite, dyad-independent, frequently-used, quantitative nodal attribute, undirected, binary, valued
This term adds a single network statistic equalling the sum, over the nodes, of the range of each node's neighbors' attribute values.
# binary: nodecovrange(attr)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
bipartite, quantitative nodal attribute, binary
This term adds one
network statistic to the model for each element of from
(or to
); the $i$th
such statistic equals the number of nodes of the first mode
("actors") in the network of degree greater than or equal to
from[i]
but strictly less than to[i]
, i.e. with edge count
in semiopen interval [from,to)
.
This term can only be used with bipartite networks; for directed networks
see idegrange
and odegrange
. For undirected networks,
see degrange
, and see b2degrange
for degrees of the second mode ("events").
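The half-open interval can be illustrated with a short base-R sketch. The incidence matrix `B` is hypothetical (rows are first-mode nodes), and the code ignores the optional attribute arguments.

```r
# Hypothetical bipartite network: 4 actors (rows) by 3 events (columns).
B <- rbind(c(1, 1, 1),
           c(1, 0, 0),
           c(0, 0, 0),
           c(0, 1, 1))
deg1 <- rowSums(B)        # actor degrees: 3, 1, 0, 2

from <- c(0, 2)
to <- c(2, Inf)
# One statistic per (from[i], to[i]) pair: from[i] <= degree < to[i].
degrange_stats <- mapply(function(lo, hi) sum(deg1 >= lo & deg1 < hi), from, to)
```

An actor of degree exactly 2 falls in the second interval, not the first, because the upper bound is exclusive.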
# binary: b1degrange(from, to=+Inf, by=NULL, homophily=FALSE, levels=NULL)
from , to
|
vectors of distinct integers. If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, both must have the same length. |
by , levels , homophily
|
the optional argument |
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term adds one network statistic to the model for
each element in d
; the $i$th such statistic equals the number of
nodes of degree
d[i]
in the first mode of a bipartite network, i.e.
with exactly d[i]
edges. The first mode of a bipartite network object
is sometimes known as the "actor" mode.
# binary: b1degree(d, by=NULL, levels=NULL)
d |
a vector of distinct integers. |
by , levels , homophily
|
the optional argument |
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, frequently-used, undirected, binary
For bipartite networks, preserve the degree for the first mode of each vertex of the given network, while allowing the degree for the second mode to vary.
# b1degrees
ergmConstraint
for index of constraints and hints currently visible to the package.
bipartite
This term adds one
network statistic to the model for each element in d
; the $i$th
such statistic equals the number of dyads in the first bipartition with exactly
d[i]
shared partners. (Those shared partners, of course, must be members
of the second bipartition.) This term can only be used with bipartite networks.
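Shared-partner counts for first-mode dyads follow from the incidence matrix. The base-R sketch below uses a hypothetical matrix `B` (rows are first-mode nodes) and is an illustration of the definition, not the cached implementation the package uses.

```r
# Hypothetical bipartite network: 3 actors (rows) by 3 events (columns).
B <- rbind(c(1, 1, 0),
           c(1, 1, 1),
           c(0, 0, 1))

# sp[i, k] = number of events tied to both actor i and actor k.
sp <- B %*% t(B)
shared <- sp[upper.tri(sp)]   # one entry per unordered actor pair

d <- 0:2
# One statistic per element of d: dyads with exactly d[i] shared partners.
b1dsp_stats <- sapply(d, function(k) sum(shared == k))
```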
# binary: b1dsp(d)
d |
a vector of distinct integers. |
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term adds multiple network statistics to the model, one for each of (a subset of) the
unique values of the attr
attribute. Each of these statistics
gives the number of times a node with that attribute in the first mode of
the network appears in an edge. The first mode of a bipartite network object
is sometimes known as the "actor" mode.
# binary: b1factor(attr, base=1, levels=-1)
# valued: b1factor(attr, base=1, levels=-1, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
To include all attribute values is usually not a good idea, because
the sum of all such statistics equals the number of edges and hence a linear
dependency would arise in any model also including edges
. The default,
levels=-1
, is therefore to omit the first (in lexicographic order)
attribute level. To include all levels, pass either levels=TRUE
(i.e., keep all levels) or levels=NULL
(i.e., do not filter levels).
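The linear dependency mentioned above is easy to see numerically. In this base-R sketch, `B` and `type` are hypothetical; dropping the first element of the result mimics the effect of the default levels=-1 and is not the package's own code.

```r
# Hypothetical bipartite network: 3 actors (rows) by 3 events (columns).
B <- rbind(c(1, 1, 0),
           c(1, 1, 1),
           c(0, 0, 1))
type <- c("x", "y", "x")     # categorical attribute of the actors
deg1 <- rowSums(B)

# One statistic per level: edge endpoints in the first mode with that level.
all_levels <- tapply(deg1, type, sum)

# Summing over every level reproduces the edge count, hence the dependency.
n_edges <- sum(B)

# Omitting the first level (in lexicographic order) removes it.
default_stats <- all_levels[-1]
```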
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, dyad-independent, frequently-used, undirected, binary, valued
This term adds a single network statistic to the model, counting, for each node, the number of distinct values of the attribute found among its neighbors.
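A minimal base-R sketch of this count, assuming the attribute is read from the second-mode neighbors of each first-mode node; `B` and `kind2` are hypothetical data, and nodes without neighbors are taken to contribute zero.

```r
# Hypothetical bipartite network: 3 actors (rows) by 3 events (columns).
B <- rbind(c(1, 1, 0),
           c(1, 1, 1),
           c(0, 0, 0))
kind2 <- c("p", "q", "q")    # categorical attribute of the events

# Number of distinct attribute values among each actor's neighbors.
distinct_per_node <- apply(B, 1, function(row) length(unique(kind2[row == 1])))
factordistinct_stat <- sum(distinct_per_node)
```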
# binary: b1factordistinct(attr, levels=TRUE)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, binary
This term adds one network statistic to the model for
each element in d
; the $i$th such statistic equals the number of
nodes in the first mode of a bipartite network with degree at least
d[i]
.
The first mode of a bipartite network object is sometimes known as the "actor" mode.
# binary: b1mindegree(d)
d |
a vector of distinct integers. |
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term is introduced
in Bomiriya et al. (2014). With the default alpha
and beta
values, this term will
simply be a homophily based two-star statistic. This term adds one statistic to the model
unless diff
is set to TRUE
, in which case the term adds multiple network
statistics to the model, one for each of (a subset of) the unique values of the attr
attribute.
# binary: b1nodematch(attr, diff=FALSE, keep=NULL, alpha=1, beta=1, byb2attr=NULL, levels=NULL)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
diff |
by default, one statistic will be added to the model. If |
keep |
deprecated |
alpha , beta
|
optional discount parameters both of which take values from |
byb2attr |
specifies a
second mode categorical attribute. Setting this argument
will separate the original statistics based on the values of the set second mode attribute;
i.e. for example, if |
levels |
select a subset of |
If an alpha
discount parameter is used, each of these statistics gives the sum of
the number of common second-mode nodes raised to the power alpha
for each pair of
first-mode nodes with that attribute. If a beta
discount parameter is used, each
of these statistics gives half the sum of the number of two-paths with two first-mode nodes
with that attribute as the two ends of the two path raised to the power beta
for each
edge in the network.
This term can only be used with undirected bipartite networks.
The argument keep
is retained for backwards compatibility and may be
removed in a future version. When both keep
and levels
are passed,
levels
overrides keep
.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, dyad-independent, frequently-used, undirected, binary
This term adds one network statistic for each node in the first bipartition, equal to the number of
ties of that node. This term can only be used with bipartite networks. For directed networks, see sender
and
receiver
. For unipartite networks, see sociality
.
# binary: b1sociality(nodes=-1)
# valued: b1sociality(nodes=-1, form="sum")
nodes |
By default, |
form |
character how to aggregate tie values in a valued ERGM |
ergmTerm
for index of model terms currently visible to the package.
bipartite, dyad-independent, undirected, binary, valued
$k$-stars for the first mode in a bipartite network
This term adds one network statistic to the model for each element in k. The $i$th such statistic counts the number of distinct k[i]-stars whose center node is in the first mode of the network. The first mode of a bipartite network object is sometimes known as the "actor" mode. A $k$-star is defined to be a center node $N$ and a set of $k$ different nodes $\{O_1, \dots, O_k\}$ such that the ties $\{N, O_i\}$ exist for $i = 1, \dots, k$.
This term can only be used for undirected bipartite networks.
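Since a node of degree m is the center of choose(m, k) distinct stars with k ties, the counts follow directly from the first-mode degrees. A base-R sketch with a hypothetical incidence matrix `B` (rows are first-mode nodes), ignoring the optional attribute arguments:

```r
# Hypothetical bipartite network: 3 actors (rows) by 3 events (columns).
B <- rbind(c(1, 1, 0),
           c(1, 1, 1),
           c(0, 0, 1))
deg1 <- rowSums(B)           # actor degrees: 2, 3, 1

k <- 1:3
# One statistic per element of k: stars of that size centered on an actor.
b1star_stats <- sapply(k, function(kk) sum(choose(deg1, kk)))

# A star with a single tie is just an edge.
n_edges <- sum(B)
```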
# binary: b1star(k, attr=NULL, levels=NULL)
k |
a vector of distinct integers |
attr , levels
|
a vertex attribute specification; if |
b1star(1)
is equal to b2star(1)
and to edges
.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, undirected, binary
$k$-stars centered on the first mode of a bipartite network
This term counts all $k$-stars in which the b2 nodes (called events in some contexts) are homophilous in the sense that they all share the same value of attr. However, the b1 node (in some contexts, the actor) at the center of the $k$-star does NOT have to have the same value as the b2 nodes; indeed, the values taken by the b1 nodes may be completely distinct from those of the b2 nodes, which allows for the use of this term in cases where there are two separate nodal attributes, one for the b1 nodes and another for the b2 nodes (in this case, however, these two attributes should be combined to form a single nodal attribute, attr). A different statistic is created for each value of attr seen in a b1 node, even if no $k$-stars are observed with this value.
# binary: b1starmix(k, attr, base=NULL, diff=TRUE)
k |
only a single value of |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
diff |
whether a different statistic is created for each value seen in a b2 node. When |
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, undirected, binary
This term takes two nodal attributes. Assuming that there are $n_1$ values of b1attr among the b1 nodes and $n_2$ values of b2attr among the b2 nodes, then the total number of distinct categories of two stars according to these two attributes is $n_1 n_2 (n_2 + 1)/2$. By default, this model term creates a distinct statistic counting each of these categories.
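The number of categories can be checked by enumeration: a two-star category is a value for the b1 center combined with an unordered pair of values (possibly equal) for the two b2 ends, which gives n1 * n2 * (n2 + 1) / 2 categories. The sketch below uses hypothetical level counts `n1` and `n2`.

```r
n1 <- 2   # hypothetical number of distinct b1attr values
n2 <- 3   # hypothetical number of distinct b2attr values

# All (center, end, end) combinations, keeping one ordering per pair of ends.
cats <- expand.grid(center = seq_len(n1),
                    end1 = seq_len(n2),
                    end2 = seq_len(n2))
cats <- cats[cats$end1 <= cats$end2, ]

n_categories <- nrow(cats)
closed_form <- n1 * n2 * (n2 + 1) / 2
```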
# binary: b1twostar(b1attr, b2attr, base=NULL, b1levels=NULL, b2levels=NULL, levels2=NULL)
b1attr |
b1 nodes (actors in some contexts) (see Specifying Vertex attributes and Levels ( |
b2attr |
b2 nodes (events in some contexts). If |
b1levels , b2levels , base , levels2
|
used to leave some of the categories out (see Specifying Vertex attributes and Levels ( |
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels2
are passed,
levels2
overrides base
.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, undirected, binary
This term adds one
network statistic to the model, equal to the number of nodes in the second
mode of the network with degree 2 or higher. The second mode of a bipartite
network object is sometimes known as the "event" mode.
Without the optional argument, this statistic is equivalent to b2mindegree(2)
.
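The stated equivalence can be seen from the second-mode degrees. In this base-R sketch, `B` is a hypothetical incidence matrix (columns are second-mode nodes) and `mindegree()` is a hypothetical helper that mirrors the minimum-degree definition.

```r
# Hypothetical bipartite network: 3 actors (rows) by 3 events (columns).
B <- rbind(c(1, 1, 0),
           c(1, 0, 0),
           c(0, 0, 1))
deg2 <- colSums(B)           # event degrees: 2, 1, 1

# Events with degree 2 or higher.
concurrent_stat <- sum(deg2 >= 2)

# Hypothetical helper: events with degree at least d[i], one count per d[i].
mindegree <- function(d) sapply(d, function(k) sum(deg2 >= k))
mindegree_stats <- mindegree(1:2)
```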
# binary: b2concurrent(by=NULL)
by |
This optional argument specifies a vertex attribute (see Specifying Vertex attributes and Levels ( |
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, frequently-used, undirected, binary
This term adds a single network statistic for each quantitative attribute or matrix column to the model equaling the total
value of attr(j)
for all edges in the network. This
term may only be used with bipartite networks. For categorical attributes, see
b2factor
.
# binary: b2cov(attr)
# valued: b2cov(attr, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
ergmTerm
for index of model terms currently visible to the package.
bipartite, dyad-independent, frequently-used, quantitative nodal attribute, undirected, binary, valued
This term adds a single network statistic equalling the sum, over the nodes, of the range of each node's neighbors' attribute values.
# binary: nodecovrange(attr)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
bipartite, quantitative nodal attribute, binary
This term adds one
network statistic to the model for each element of from
(or to
); the $i$th
such statistic equals the number of nodes of the second mode
("events") in the network of degree greater than or equal to
from[i]
but strictly less than to[i]
, i.e. with edge count
in semiopen interval [from,to)
.
This term can only be used with bipartite networks; for directed networks
see idegrange
and odegrange
. For undirected networks,
see degrange
, and see b1degrange
for degrees of the first mode ("actors").
# binary: b2degrange(from, to=+Inf, by=NULL, homophily=FALSE, levels=NULL)
from , to
|
vectors of distinct integers. If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, both must have the same length. |
by , levels , homophily
|
the optional argument |
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term adds one network statistic to the model for
each element in d
; the $i$th such statistic equals the number of
nodes of degree
d[i]
in the second mode of a bipartite network, i.e.
with exactly d[i]
edges. The second mode of a bipartite network
object is sometimes known as the "event" mode.
# binary: b2degree(d, by=NULL)
d |
a vector of distinct integers |
by |
this optional argument specifies
a vertex attribute (see Specifying Vertex attributes and Levels ( |
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, frequently-used, undirected, binary
For bipartite networks, preserve the degree for the second mode of each vertex of the given network, while allowing the degree for the first mode to vary.
# b2degrees
ergmConstraint
for index of constraints and hints currently visible to the package.
bipartite
This term adds one network statistic to the model for each element in d
; the $i$th
such statistic equals the number of dyads in the second bipartition with exactly
d[i]
shared partners. (Those shared partners, of course, must be members
of the first bipartition.) This term can only be used with bipartite networks.
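A shared-partner count for second-mode dyads can be read off the cross-product of the biadjacency matrix. The sketch below assumes a made-up matrix `B` and simply follows the definition above; it says nothing about the package's internal caching or implementation.

```r
# Hypothetical biadjacency matrix: rows = first mode, columns = second mode.
B <- matrix(c(1, 1, 0,
              1, 1, 0,
              1, 0, 1,
              1, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("a", 1:4), paste0("e", 1:3)))

# S[j, k] = number of first-mode nodes tied to both second-mode nodes j and k
S <- crossprod(B)

# One shared-partner count per unordered pair of second-mode nodes
sp <- S[upper.tri(S)]

d <- 0:2
# i-th statistic: number of second-mode dyads with exactly d[i] shared partners
b2dsp_stat <- vapply(d, function(k) sum(sp == k), integer(1))
print(b2dsp_stat)
```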
# binary: b2dsp(d)
d |
a vector of distinct integers |
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term adds multiple network statistics to the model, one for each of (a subset of) the
unique values of the attr
attribute. Each of these statistics
gives the number of times a node with that attribute in the second mode of
the network appears in an edge. The second mode of a bipartite network
object is sometimes known as the "event" mode.
# binary: b2factor(attr, base=1, levels=-1) # valued: b2factor(attr, base=1, levels=-1, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
To include all attribute values is usually not a good idea, because
the sum of all such statistics equals the number of edges and hence a linear
dependency would arise in any model also including edges
. The default,
levels=-1
, is therefore to omit the first (in lexicographic order)
attribute level. To include all levels, pass either levels=TRUE
(i.e., keep all levels) or levels=NULL
(i.e., do not filter levels).
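The linear dependency described above can be checked numerically. This base-R sketch uses a hypothetical biadjacency matrix `B` and a made-up second-mode attribute `type`; it reproduces the definition by hand rather than calling the package.

```r
# Hypothetical biadjacency matrix: rows = first mode, columns = second mode.
B <- matrix(c(1, 1, 0,
              1, 1, 0,
              1, 0, 1,
              1, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("a", 1:4), paste0("e", 1:3)))

# Made-up categorical attribute on the second-mode nodes
type <- c(e1 = "x", e2 = "y", e3 = "y")

# Per-level statistic: number of edge endpoints in the second mode
# whose node has that attribute level
all_levels <- vapply(split(colSums(B), type), sum, numeric(1))
print(all_levels)

# Summing over every level recovers the edge count, hence the dependency
print(sum(all_levels) == sum(B))

# Dropping the first level (in sorted order) mimics the levels=-1 default
default_stat <- all_levels[-1]
print(default_stat)
```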
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, dyad-independent, frequently-used, undirected, binary, valued
This term adds a single network statistic to the model, counting, for each node, the number of distinct values of the attribute found among its neighbors.
# binary: b2factordistinct(attr, levels=TRUE)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, binary
This term adds one network statistic to the model for
each element in d
; the $i$th such statistic equals the number of
nodes in the second mode of a bipartite network with degree at least
d[i]
.
The second mode of a bipartite network object is sometimes known as the "event" mode.
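A short base-R sketch of the "at least" convention, using a hypothetical biadjacency matrix `B`; it follows the definition above and is not the package's code.

```r
# Hypothetical biadjacency matrix: rows = first mode, columns = second mode.
B <- matrix(c(1, 1, 0,
              1, 1, 0,
              1, 0, 1,
              1, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("a", 1:4), paste0("e", 1:3)))

b2deg <- colSums(B)  # second-mode degrees
d <- 1:3

# i-th statistic: number of second-mode nodes with degree >= d[i]
b2mindegree_stat <- vapply(d, function(k) sum(b2deg >= k), integer(1))
print(b2mindegree_stat)
```

Unlike the exact-degree counts of `b2degree`, these counts are cumulative, so they can never increase as `d[i]` grows.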
# binary: b2mindegree(d)
d |
a vector of distinct integers |
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term was introduced in Bomiriya et al. (2014).
With the default alpha
and beta
values, this term will
simply be a homophily-based two-star statistic. This term adds one statistic to the model
unless diff
is set to TRUE
, in which case the term adds multiple network
statistics to the model, one for each of (a subset of) the unique values of the attr
attribute.
# binary: b2nodematch(attr, diff=FALSE, keep=NULL, alpha=1, beta=1, byb1attr=NULL, levels=NULL)
diff |
by default, one statistic will be added to the model. If |
keep |
deprecated |
alpha , beta
|
optional discount parameters both of which take values from |
byb1attr |
specifies a
first mode categorical attribute. Setting this argument
will separate the original statistics based on the values of that attribute;
for example, if |
levels |
select a subset of |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
If an alpha
discount parameter is used, each of these statistics gives the sum of
the number of common first-mode nodes raised to the power alpha
for each pair of
second-mode nodes with that attribute. If a beta
discount parameter is used, each
of these statistics gives half the sum of the number of two-paths with two second-mode nodes
with that attribute as the two ends of the two-path raised to the power beta
for each
edge in the network.
This term can only be used with undirected bipartite networks.
The argument keep
is retained for backwards compatibility and may be
removed in a future version. When both keep
and levels
are passed,
levels
overrides keep
.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, dyad-independent, frequently-used, undirected, binary
This term adds one network statistic for each node in the second bipartition, equal to the number of
ties of that node. For directed networks, see sender
and
receiver
. For unipartite networks, see sociality
.
# binary: b2sociality(nodes=-1) # valued: b2sociality(nodes=-1, form="sum")
nodes |
By default, |
form |
character how to aggregate tie values in a valued ERGM |
This term can only be used with undirected bipartite networks.
ergmTerm
for index of model terms currently visible to the package.
bipartite, dyad-independent, undirected, binary, valued
$k$-stars for the second mode in a bipartite network
This term adds one network statistic to the model for
each element in k
. The $i$th such statistic counts the number of
distinct
k[i]
-stars whose center node is in the second mode of the
network. The second mode of a bipartite network object is sometimes known as
the "event" mode. A $k$-star is defined to be a center node $N$
and a set of $k$ different nodes $\{O_1, \ldots, O_k\}$
such that the ties $\{N, O_i\}$ exist for $i = 1, \ldots, k$.
This term can only be used for
undirected bipartite networks.
# binary: b2star(k, attr=NULL, levels=NULL)
k |
a vector of distinct integers |
attr , levels
|
a vertex attribute specification; if |
b2star(1)
is equal to b1star(1)
and to edges
.
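Since a star is a center plus a set of distinct neighbors, the number of k-stars centered on a node of degree deg is choose(deg, k). The base-R sketch below uses that counting argument on a hypothetical biadjacency matrix `B` to check the identity between one-stars and edges; it is an illustration, not the package's implementation.

```r
# Hypothetical biadjacency matrix: rows = first mode, columns = second mode.
B <- matrix(c(1, 1, 0,
              1, 1, 0,
              1, 0, 1,
              1, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("a", 1:4), paste0("e", 1:3)))

b1deg <- rowSums(B)  # first-mode degrees
b2deg <- colSums(B)  # second-mode degrees

k <- 1:3
# Number of k-stars centered on second-mode nodes: sum over centers of choose(deg, k)
b2star_stat <- vapply(k, function(kk) sum(choose(b2deg, kk)), numeric(1))
print(b2star_stat)

# One-stars centered on either mode are just edges
b1star1 <- sum(choose(b1deg, 1))
print(c(b2star1 = b2star_stat[1], b1star1 = b1star1, edges = sum(B)))
```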
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, undirected, binary
$k$-stars centered on the second mode of a bipartite network
This term is exactly the same as b1starmix
except that the roles of
b1 and b2 are reversed.
# binary: b2starmix(k, attr, base=NULL, diff=TRUE)
k |
only a single value of |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
diff |
whether a different statistic is created for each value seen in a b1 node. When |
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, undirected, binary
This term is exactly the same as b1twostar
except that the
roles of b1 and b2 are reversed.
# binary: b2twostar(b1attr, b2attr, base=NULL, b1levels=NULL, b2levels=NULL, levels2=NULL)
b1attr |
b1 nodes (actors in some contexts) (see Specifying Vertex attributes and Levels ( |
b2attr |
b2 nodes (events in some contexts). If |
b1levels , b2levels , base , levels2
|
used to leave some of the categories out (see Specifying Vertex attributes and Levels ( |
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels2
are passed,
levels2
overrides base
.
ergmTerm
for index of model terms currently visible to the package.
bipartite, categorical nodal attribute, undirected, binary
This term adds one network statistic to the model equal to the number of
triads in the network that are balanced. The balanced triads are those of
type 102
or 300
in the categorization of Davis and Leinhardt (1972). For details on the 16 possible triad types, see
?triad.classify
in the {sna}
package. For an undirected
network, the balanced triads are those with an odd number of ties (i.e., 1
and 3).
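For the undirected case, the rule "an odd number of ties" is easy to verify by brute force. The sketch below uses a made-up four-node network and enumerates all triads in base R; it illustrates the definition only and does not cover the directed triad types.

```r
# Hypothetical undirected network on 4 nodes with ties 1-2, 2-3, 1-3, 3-4
el <- rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4))
A <- matrix(0, 4, 4)
A[el] <- 1
A <- A + t(A)  # symmetric adjacency matrix

# Enumerate every triad (unordered triple of nodes)
triples <- combn(nrow(A), 3)

# Number of ties inside each triad (each tie appears twice in A[v, v])
ties <- apply(triples, 2, function(v) sum(A[v, v]) / 2)

# Undirected balance: triads with 1 or 3 ties
balance_stat <- sum(ties %% 2 == 1)
print(ties)
print(balance_stat)
```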
# binary: balance
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, undirected, binary
Condition on the number of in-edges or out-edges possessed by a node.
See the Placing Bounds on Degrees section of ?ergmConstraint
for more information.
# bd(attribs, maxout, maxin, minout, minin)
attribs |
a matrix of logicals with dimension |
maxout , maxin , minout , minin
|
matrices of alter attributes with the same dimension as |
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, undirected
Specifies each
dyad's baseline distribution to be Bernoulli with probability of
the tie being 0.5. This is the only reference measure used
in binary mode.
# Bernoulli
ergmReference
for index of reference distributions currently visible to the package.
binary, discrete, finite, nonnegative
Force a block-diagonal structure (and its bipartite analogue) on
the network. Only dyads for which
attr(i)==attr(j)
can have edges.
Note that the current implementation requires that blocks be contiguous for unipartite graphs, and for bipartite graphs, they must be contiguous within a partition and must have the same ordering in both partitions. (They do not, however, require that all blocks be represented in both partitions, but those that overlap must have the same order.)
If multiple block-diagonal constraints are given, or if
attr
is a vector with multiple attribute names, blocks
will be constructed on all attributes matching.
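To see which dyads such a constraint leaves free, the following base-R sketch builds the mask attr(i)==attr(j) for a unipartite, undirected example and checks the contiguity requirement. The attribute vector `block` is hypothetical, and the code only illustrates the rule stated above.

```r
# Hypothetical block attribute for 5 nodes, in node order
block <- c("A", "A", "B", "B", "B")

# Contiguity check: each block label must form a single run
runs <- rle(block)
contiguous <- anyDuplicated(runs$values) == 0
print(contiguous)

# Dyads that may carry an edge: both endpoints share the attribute value
allowed <- outer(block, block, "==")
diag(allowed) <- FALSE  # no self-loops
print(allowed * 1)

# Number of free undirected dyads (upper triangle only)
n_free <- sum(allowed[upper.tri(allowed)])
print(n_free)
```

Reordering the labels to, say, `c("A", "B", "A", "B", "B")` would make `contiguous` false, which is the situation the note above says the current implementation does not support.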
# blockdiag(attr)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, undirected
Any dyad whose toggle would produce a nonzero change statistic
for a nodemix
term with the same arguments will be fixed. Note
that the levels2
argument has a different default value for
blocks
than it does for nodemix
.
# blocks(attr=NULL, levels=NULL, levels2=FALSE, b1levels=NULL, b2levels=NULL)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
b1levels , b2levels , levels , levels2
|
control what mixing types are fixed.
|
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, undirected
Helper functions for implementing ergm()
terms, to check whether the term can be used with the specified
network. For information on ergm terms, see
ergmTerm
. ergm.checkargs
,
ergm.checkbipartite
, and ergm.checkdirected
are
helper functions for an old API and are deprecated. Use
check.ErgmTerm
.
check.ErgmTerm( nw, arglist, directed = NULL, bipartite = NULL, nonnegative = FALSE, varnames = NULL, vartypes = NULL, defaultvalues = list(), required = NULL, dep.inform = rep(FALSE, length(required)), dep.warn = rep(FALSE, length(required)), argexpr = NULL )
nw |
the network that term X is being checked against |
arglist |
the list of arguments for term X |
directed |
logical, whether term X requires a directed network; default=NULL |
bipartite |
whether term X requires a bipartite network (T or F); default=NULL |
nonnegative |
whether term X requires a network with only nonnegative weights; default=FALSE |
varnames |
the vector of names of the possible arguments for term X; default=NULL |
vartypes |
the vector of types of the possible arguments for
term X, separated by commas; an empty string ( |
defaultvalues |
the list of default values for the possible arguments of term X; default=list() |
required |
the logical vector of whether each possible argument is required; default=NULL |
dep.inform , dep.warn
|
a list of length equal to the number of
arguments the term can take; if the corresponding element of the
list is not |
argexpr |
optional call typically obtained by calling
|
The check.ErgmTerm
function ensures for the
InitErgmTerm.X
function that the term X:
is applicable given the 'directed' and 'bipartite' attributes of the given network
is not applied to a directed bipartite network
has an appropriate number of arguments
has correct argument types if arguments were provided
has default values assigned if defaults are available
by halting execution if any of the first 3 criteria are not met.
As a convenience, if an argument is optional and its
default is NULL
, then NULL
is assumed to be an acceptable
argument type as well.
A list of the values for each possible argument of term X;
user provided values are used when given, default values
otherwise. The list also has an attr(,"missing")
attribute
containing a named logical vector indicating whether a particular
argument had been set to its default. If the argexpr=
argument is
provided, an attr(,"exprs")
attribute is also returned, containing
expressions.
This dataset consists of three objects, each based on data from King County, Washington, USA (where Seattle is located) derived from the National Survey of Family Growth (NSFG) (https://www.cdc.gov/nchs/nsfg/index.htm). The full dataset cannot be released publicly, so some aspects of these objects are simulated based on the real data. These objects may be used to illustrate that network modeling may be performed using data that are collected on egos only, i.e., without directly observing information about alters in a network except for information reported from egos. The hypothetical population represented by this dataset consists of only a subset of individuals, as categorized by their age, race / ethnicity / immigration status, and gender and sexual identity.
data(cohab)
The three objects are
Mixing matrix on 'race'. Based on ego reports of the race / ethnicity / immigration status of their cohabiting partners, this matrix gives counts of ego-alter ties by the race of each individual for a hypothetical population. These counts are based on the NSFG mixing matrix. Only five categories of the 'race' variable are included here: Black, Black immigrant, Hispanic, Hispanic immigrant, and White.
Data frame of demographic characteristics together with relative counts (weights) in a hypothetical population. Individuals are classified according to five variables: age in years, race (same five categories of race / ethnicity / immigration status as above), sex (Male or Female), sexual identity (Female, Male who has sex with Females, or Male who has sex with Males or Females), and number of model-predicted persistent partnerships with non-cohabiting partners (0 or 1, where 1 means any nonzero value; the number is capped at 3), and number of partners (0 or 1).
Vector of target (expected) statistics for a 15-term ERGM applied
to a network of 50,000 nodes in which a tie represents a cohabitation relationship between
two nodes. It is assumed for the purposes of these statistics that only male-female
cohabitation relationships are allowed and that no individual may have such a relationship
with more than one person. That is, each node must have degree zero or one. The ergm formula
is: ~ edges + nodefactor("sex.ident", levels = 3) + nodecov("age") + nodecov("agesq") + nodefactor("race", levels = -5) + nodefactor("othr.net.deg", levels = -1) + nodematch("race", diff = TRUE) + absdiff("sqrt.age.adj")
Krivitsky, P.N., Hunter, D.R., Morris, M., and Klumb, C. (2021). ergm 4.0: New Features and Improvements. arXiv
National Center for Health Statistics (NCHS). (2020). 2006-2015 National Survey of Family Growth Public-Use Data and Documentation. Hyattsville, MD: CDC National Center for Health Statistics. Retrieved from https://www.cdc.gov/nchs/nsfg/index.htm
ergm
By default this term adds one network statistic to the model for each pair of nodes of mode two. It is equal to the number of (first mode) mutual partners of that pair. The first mode of a bipartite network object is sometimes known as the "actor" mode and the second as the "event" mode. So this is the number of actors going to both events in the pair. This term can only be used with undirected bipartite networks.
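A small base-R sketch of the per-pair counts, assuming a hypothetical biadjacency matrix `B` (rows are actors, columns are events). The pair labels are invented for readability, and the code follows the definition rather than the package's implementation.

```r
# Hypothetical biadjacency matrix: rows = actors, columns = events.
B <- matrix(c(1, 1, 0,
              1, 1, 0,
              1, 0, 1,
              1, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("a", 1:4), paste0("e", 1:3)))

# S[j, k] = number of actors attending both event j and event k
S <- crossprod(B)

# One statistic per unordered pair of events
pairs <- which(upper.tri(S), arr.ind = TRUE)
coincidence_stat <- S[pairs]
names(coincidence_stat) <- paste(colnames(B)[pairs[, "row"]],
                                 colnames(B)[pairs[, "col"]],
                                 sep = ".")  # illustrative labels only
print(coincidence_stat)
```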
# binary: coincidence(levels=NULL,active=0)
levels |
specifies which pairs of nodes in mode two to include. (See Specifying Vertex
attributes and Levels ( |
active |
selects pairs for which the observed count is at least |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term adds one network statistic to the model, equal to the number of nodes in the network with degree 2 or higher. This term can only be used with undirected networks.
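As a quick illustration of the definition, this base-R sketch counts nodes of degree 2 or higher in a made-up undirected network; it is not the package's code.

```r
# Hypothetical undirected network on 4 nodes with ties 1-2, 2-3, 1-3, 3-4
el <- rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4))
A <- matrix(0, 4, 4)
A[el] <- 1
A <- A + t(A)  # symmetric adjacency matrix

deg <- rowSums(A)  # node degrees

# Number of nodes with two or more ties
concurrent_stat <- sum(deg >= 2)
print(deg)
print(concurrent_stat)
```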
# binary: concurrent(by=NULL, levels=NULL)
by |
this optional argument specifies a vertex attribute (see Specifying Vertex attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, undirected, binary
This term adds one network statistic to the model, equal to the number of ties incident on each actor beyond the first. This term can only be used with undirected networks.
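Read as a sum over nodes of max(degree - 1, 0), the statistic can be computed by hand as in the base-R sketch below. The network is hypothetical and the code only mirrors the verbal definition.

```r
# Hypothetical undirected network on 4 nodes with ties 1-2, 2-3, 1-3, 3-4
el <- rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4))
A <- matrix(0, 4, 4)
A[el] <- 1
A <- A + t(A)  # symmetric adjacency matrix

deg <- rowSums(A)  # node degrees

# Ties beyond the first, summed over all nodes
concurrentties_stat <- sum(pmax(deg - 1, 0))
print(concurrentties_stat)
```

In this example three nodes are concurrent, but one of them holds two extra ties, so the statistic is larger than the count of concurrent nodes.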
# binary: concurrentties(by=NULL, levels=NULL)
by |
a vertex attribute (see Specifying Vertex attributes and Levels ( |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, undirected, binary
This function is only used within a call to the ergm()
function.
See the Usage section in ergm()
for details. Also see the
Details section about some of the interactions between its
arguments.
control.ergm( drop = TRUE, init = NULL, init.method = NULL, main.method = c("MCMLE", "Stochastic-Approximation"), force.main = FALSE, main.hessian = TRUE, checkpoint = NULL, resume = NULL, MPLE.samplesize = .Machine$integer.max, init.MPLE.samplesize = function(d, e) max(sqrt(d), e, 40) * 8, MPLE.type = c("glm", "penalized", "logitreg"), MPLE.maxit = 10000, MPLE.nonvar = c("warning", "message", "error"), MPLE.nonident = c("warning", "message", "error"), MPLE.nonident.tol = 1e-10, MPLE.covariance.samplesize = 500, MPLE.covariance.method = "invHess", MPLE.covariance.sim.burnin = 1024, MPLE.covariance.sim.interval = 1024, MPLE.check = TRUE, MPLE.constraints.ignore = FALSE, MCMC.prop = trim_env(~sparse + .triadic), MCMC.prop.weights = "default", MCMC.prop.args = list(), MCMC.interval = NULL, MCMC.burnin = EVL(MCMC.interval * 16), MCMC.samplesize = NULL, MCMC.effectiveSize = NULL, MCMC.effectiveSize.damp = 10, MCMC.effectiveSize.maxruns = 16, MCMC.effectiveSize.burnin.pval = 0.2, MCMC.effectiveSize.burnin.min = 0.05, MCMC.effectiveSize.burnin.max = 0.5, MCMC.effectiveSize.burnin.nmin = 16, MCMC.effectiveSize.burnin.nmax = 128, MCMC.effectiveSize.burnin.PC = FALSE, MCMC.effectiveSize.burnin.scl = 32, MCMC.effectiveSize.order.max = NULL, MCMC.return.stats = 2^12, MCMC.runtime.traceplot = FALSE, MCMC.maxedges = Inf, MCMC.addto.se = TRUE, MCMC.packagenames = c(), SAN.maxit = 4, SAN.nsteps.times = 8, SAN = control.san(term.options = term.options, SAN.maxit = SAN.maxit, SAN.prop = MCMC.prop, SAN.prop.weights = MCMC.prop.weights, SAN.prop.args = MCMC.prop.args, SAN.nsteps = EVL(MCMC.burnin, 16384) * SAN.nsteps.times, SAN.samplesize = EVL(MCMC.samplesize, 1024), SAN.packagenames = MCMC.packagenames, parallel = parallel, parallel.type = parallel.type, parallel.version.check = parallel.version.check), MCMLE.termination = c("confidence", "Hummel", "Hotelling", "precision", "none"), MCMLE.maxit = 60, MCMLE.conv.min.pval = 0.5, MCMLE.confidence = 0.99, MCMLE.confidence.boost = 2, 
MCMLE.confidence.boost.threshold = 1, MCMLE.confidence.boost.lag = 4, MCMLE.NR.maxit = 100, MCMLE.NR.reltol = sqrt(.Machine$double.eps), obs.MCMC.mul = 1/4, obs.MCMC.samplesize.mul = sqrt(obs.MCMC.mul), obs.MCMC.samplesize = EVL(round(MCMC.samplesize * obs.MCMC.samplesize.mul)), obs.MCMC.effectiveSize = NVL3(MCMC.effectiveSize, . * obs.MCMC.mul), obs.MCMC.interval.mul = sqrt(obs.MCMC.mul), obs.MCMC.interval = EVL(round(MCMC.interval * obs.MCMC.interval.mul)), obs.MCMC.burnin.mul = sqrt(obs.MCMC.mul), obs.MCMC.burnin = EVL(round(MCMC.burnin * obs.MCMC.burnin.mul)), obs.MCMC.prop = MCMC.prop, obs.MCMC.prop.weights = MCMC.prop.weights, obs.MCMC.prop.args = MCMC.prop.args, obs.MCMC.impute.min_informative = function(nw) network.size(nw)/4, obs.MCMC.impute.default_density = function(nw) 2/network.size(nw), MCMLE.min.depfac = 2, MCMLE.sampsize.boost.pow = 0.5, MCMLE.MCMC.precision = if (startsWith("confidence", MCMLE.termination[1])) 0.1 else 0.005, MCMLE.MCMC.max.ESS.frac = 0.1, MCMLE.metric = c("lognormal", "logtaylor", "Median.Likelihood", "EF.Likelihood", "naive"), MCMLE.method = c("BFGS", "Nelder-Mead"), MCMLE.dampening = FALSE, MCMLE.dampening.min.ess = 20, MCMLE.dampening.level = 0.1, MCMLE.steplength.margin = 0.05, MCMLE.steplength = NVL2(MCMLE.steplength.margin, 1, 0.5), MCMLE.steplength.parallel = c("observational", "never"), MCMLE.sequential = TRUE, MCMLE.density.guard.min = 10000, MCMLE.density.guard = exp(3), MCMLE.effectiveSize = 64, obs.MCMLE.effectiveSize = NULL, MCMLE.interval = 1024, MCMLE.burnin = MCMLE.interval * 16, MCMLE.samplesize.per_theta = 32, MCMLE.samplesize.min = 256, MCMLE.samplesize = NULL, obs.MCMLE.samplesize.per_theta = round(MCMLE.samplesize.per_theta * obs.MCMC.samplesize.mul), obs.MCMLE.samplesize.min = 256, obs.MCMLE.samplesize = NULL, obs.MCMLE.interval = round(MCMLE.interval * obs.MCMC.interval.mul), obs.MCMLE.burnin = round(MCMLE.burnin * obs.MCMC.burnin.mul), MCMLE.steplength.solver = c("glpk", "lpsolve"), MCMLE.last.boost = 4, 
MCMLE.steplength.esteq = TRUE, MCMLE.steplength.miss.sample = function(x1) c(max(ncol(rbind(x1)) * 2, 30), 10), MCMLE.steplength.min = 1e-04, MCMLE.effectiveSize.interval_drop = 2, MCMLE.save_intermediates = NULL, MCMLE.nonvar = c("message", "warning", "error"), MCMLE.nonident = c("warning", "message", "error"), MCMLE.nonident.tol = 1e-10, SA.phase1_n = function(q, ...) max(200, 7 + 3 * q), SA.initial_gain = 0.1, SA.nsubphases = 4, SA.min_iterations = function(q, ...) (7 + q), SA.max_iterations = function(q, ...) (207 + q), SA.phase3_n = 1000, SA.interval = 1024, SA.burnin = SA.interval * 16, SA.samplesize = 1024, CD.samplesize.per_theta = 128, obs.CD.samplesize.per_theta = 128, CD.nsteps = 8, CD.multiplicity = 1, CD.nsteps.obs = 128, CD.multiplicity.obs = 1, CD.maxit = 60, CD.conv.min.pval = 0.5, CD.NR.maxit = 100, CD.NR.reltol = sqrt(.Machine$double.eps), CD.metric = c("naive", "lognormal", "logtaylor", "Median.Likelihood", "EF.Likelihood"), CD.method = c("BFGS", "Nelder-Mead"), CD.dampening = FALSE, CD.dampening.min.ess = 20, CD.dampening.level = 0.1, CD.steplength.margin = 0.5, CD.steplength = 1, CD.adaptive.epsilon = 0.01, CD.steplength.esteq = TRUE, CD.steplength.miss.sample = function(x1) ceiling(sqrt(ncol(rbind(x1)))), CD.steplength.min = 1e-04, CD.steplength.parallel = c("observational", "always", "never"), CD.steplength.solver = c("glpk", "lpsolve"), loglik = control.logLik.ergm(), term.options = NULL, seed = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE, ... )
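Several defaults in the usage above are written as expressions of other arguments. The base-R sketch below just redoes that arithmetic with the literal values from the signature, so the resulting numbers can be seen at a glance; it does not call control.ergm(), and any processing the function performs beyond these default expressions is not modeled here.

```r
# Literal defaults copied from the usage signature above
obs.MCMC.mul               <- 1/4
MCMLE.interval             <- 1024
MCMLE.samplesize.per_theta <- 32
SA.interval                <- 1024

# Defaults that the signature expresses in terms of the values above
obs.MCMC.samplesize.mul <- sqrt(obs.MCMC.mul)
obs.MCMC.interval.mul   <- sqrt(obs.MCMC.mul)
obs.MCMC.burnin.mul     <- sqrt(obs.MCMC.mul)
MCMLE.burnin            <- MCMLE.interval * 16
obs.MCMLE.interval      <- round(MCMLE.interval * obs.MCMC.interval.mul)
obs.MCMLE.burnin        <- round(MCMLE.burnin * obs.MCMC.burnin.mul)
obs.MCMLE.samplesize.per_theta <-
  round(MCMLE.samplesize.per_theta * obs.MCMC.samplesize.mul)
SA.burnin               <- SA.interval * 16

derived <- c(obs.MCMC.samplesize.mul        = obs.MCMC.samplesize.mul,
             MCMLE.burnin                   = MCMLE.burnin,
             obs.MCMLE.interval             = obs.MCMLE.interval,
             obs.MCMLE.burnin               = obs.MCMLE.burnin,
             obs.MCMLE.samplesize.per_theta = obs.MCMLE.samplesize.per_theta,
             SA.burnin                      = SA.burnin)
print(derived)
```

Because R evaluates default arguments lazily, one would expect that overriding, for example, `MCMLE.interval` also changes any default computed from it unless that argument is set explicitly; treat this as an assumption about standard R semantics rather than a documented guarantee of the function.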
drop |
Logical: If TRUE, terms whose observed statistic values are at the extremes of their possible ranges are dropped from the fit and their corresponding parameter estimates are set to plus or minus infinity, as appropriate. This is done because maximum likelihood estimates cannot exist when the vector of observed statistic lies on the boundary of the convex hull of possible statistic values. |
init |
numeric or
Passing |
init.method |
A character vector or Valid initial methods for a given reference are set by the
|
main.method |
One of "MCMLE" (default) or
"Stochastic-Approximation". Chooses the estimation method used
to find the MLE. |
force.main |
Logical: If TRUE, then force MCMC-based estimation method, even if the exact MLE can be computed via maximum pseudolikelihood estimation. |
main.hessian |
Logical: If TRUE, then an approximate Hessian matrix is used in the MCMC-based estimation method. |
checkpoint |
At the start of every iteration, save the state
of the optimizer in a way that will allow it to be resumed. The
name is passed through |
resume |
If given a file name of an |
MPLE.samplesize , init.MPLE.samplesize
|
These parameters control the maximum number of dyads (potential ties) that will be used by the MPLE to construct the predictor matrix for its logistic regression. In general, the algorithm visits dyads in a systematic sample that, if it does not hit one of these limits, will visit every informative dyad. If a limit is exceeded, a case-control approximation to the likelihood, comprising all edges and those non-edges that have been visited by the algorithm before the limit was exceeded, will be used.
|
MPLE.type |
One of |
MPLE.maxit |
Maximum number of iterations for |
MPLE.nonident , MPLE.nonident.tol , MPLE.nonvar , MCMLE.nonident , MCMLE.nonident.tol , MCMLE.nonvar
|
A rudimentary nonidentifiability/multicollinearity diagnostic. If
|
MPLE.covariance.method , MPLE.covariance.samplesize , MPLE.covariance.sim.burnin , MPLE.covariance.sim.interval
|
Controls for estimating the MPLE covariance
matrix. |
MPLE.check |
If |
MPLE.constraints.ignore |
If |
MCMC.prop |
Specifies the proposal (directly) and/or
a series of "hints" about the structure of the model being
sampled. The specification is in the form of a one-sided formula
with hints separated by A common and default "hint" is |
MCMC.prop.weights |
Specifies the proposal
distribution used in the MCMC Metropolis-Hastings algorithm. Possible
choices depending on selected |
MCMC.prop.args |
An alternative, direct way of specifying additional arguments to proposal. |
MCMC.interval |
Number of proposals between sampled statistics. Increasing the interval reduces the autocorrelation in the sample and may increase the precision of estimates by reducing MCMC error, at the expense of time. Set the interval higher for larger networks. |
MCMC.burnin |
Number of proposals before any MCMC sampling is done. It typically is set to a fairly large number. |
MCMC.samplesize |
Number of network statistics, randomly drawn from a given distribution on the set of all networks, returned by the Metropolis-Hastings algorithm. Increasing sample size may increase the precision in the estimates by reducing MCMC error, at the expense of time. Set it higher for larger networks, or when using parallel functionality. |
MCMC.effectiveSize , MCMC.effectiveSize.damp , MCMC.effectiveSize.maxruns , MCMC.effectiveSize.burnin.pval , MCMC.effectiveSize.burnin.min , MCMC.effectiveSize.burnin.max , MCMC.effectiveSize.burnin.nmin , MCMC.effectiveSize.burnin.nmax , MCMC.effectiveSize.burnin.PC , MCMC.effectiveSize.burnin.scl , MCMC.effectiveSize.order.max
|
Set After each run, the returned statistics are mapped to the
estimating function scale, then an exponential decay model is fit
to the scaled statistics to find that burn-in which would reduce
the difference between the initial values of statistics and their
equilibrium values by a factor of A Geweke diagnostic is then run, after thinning the sample to
If The effective size of the post-burn-in sample is computed via
Vats et al. (2019), and compared to the target
effective size. If it is not matched, the MCMC run is resumed,
with the additional draws needed linearly extrapolated but
weighted in favor of the baseline Lastly, if |
MCMC.return.stats |
Numeric: If positive, include an
|
MCMC.runtime.traceplot |
Logical: If |
MCMC.maxedges |
The maximum number of edges that may occur during the MCMC sampling. If this number is exceeded at any time, sampling is stopped immediately. |
MCMC.addto.se |
Whether to add the standard errors induced by the MCMC algorithm to the estimates' standard errors. |
MCMC.packagenames |
Names of packages in which to look for change statistic functions in addition to those autodetected. This argument should not be needed outside of very strange setups. |
SAN.maxit |
When |
SAN.nsteps.times |
Multiplier for |
SAN |
Control arguments to |
MCMLE.termination |
The criterion used for terminating MCMLE estimation:
Note that this criterion is incompatible with
Note that this criterion is incompatible with
|
MCMLE.maxit |
Maximum number of times the parameter for the MCMC should be updated by maximizing the MCMC likelihood. At each step the parameter is changed to the value that maximizes the MCMC likelihood based on the current sample. |
MCMLE.conv.min.pval |
The P-value used in the Hotelling test for early termination. |
MCMLE.confidence |
The confidence level for declaring
convergence for |
MCMLE.confidence.boost |
The maximum increase factor in sample
size (or target effective size, if enabled) when the
|
MCMLE.confidence.boost.threshold , MCMLE.confidence.boost.lag
|
Sample size or target effective size will be increased if the distance from the tolerance region fails to decrease more than MCMLE.confidence.boost.threshold in this many successive iterations. |
MCMLE.NR.maxit , MCMLE.NR.reltol
|
The method, maximum number of
iterations and relative tolerance to use within the |
obs.MCMC.prop , obs.MCMC.prop.weights , obs.MCMC.prop.args , obs.MCMLE.effectiveSize , obs.MCMC.samplesize , obs.MCMC.burnin , obs.MCMC.interval , obs.MCMC.mul , obs.MCMC.samplesize.mul , obs.MCMC.burnin.mul , obs.MCMC.interval.mul , obs.MCMC.effectiveSize , obs.MCMLE.burnin , obs.MCMLE.interval , obs.MCMLE.samplesize , obs.MCMLE.samplesize.per_theta , obs.MCMLE.samplesize.min
|
Corresponding MCMC parameters and settings used for the constrained sample when
unobserved data are present in the estimation routine. By default, they are controlled by the These can, in turn, be controlled by Lastly, if |
obs.MCMC.impute.min_informative , obs.MCMC.impute.default_density
|
Controls for imputation of missing dyads for initializing MCMC
sampling. If numeric, |
MCMLE.min.depfac , MCMLE.sampsize.boost.pow
|
When using adaptive MCMC effective size, and methods that increase the MCMC sample size, use |
MCMLE.MCMC.precision , MCMLE.MCMC.max.ESS.frac
|
If effective sample size is used (see |
MCMLE.metric |
Method to calculate the loglikelihood approximation. See Hummel et al. (2012) for an explanation of "lognormal" and "naive". |
MCMLE.method |
Deprecated. By default, ergm uses |
MCMLE.dampening |
(logical) Should likelihood dampening be used? |
MCMLE.dampening.min.ess |
The effective sample size below which dampening is used. |
MCMLE.dampening.level |
The proportional distance from the boundary of the convex hull to move. |
MCMLE.steplength.margin |
The extra margin required for a Hummel step
to count as being inside the convex hull of the sample. Set this to 0 if
the step length gets stuck at the same value over several iterations. Set it
to |
MCMLE.steplength |
Multiplier for step length (on the mean-value parameter scale), which may (for values less than one) make fitting more stable at the cost of computational efficiency. If |
MCMLE.steplength.parallel |
Whether parallel multisection
search (as opposed to a bisection search) for the Hummel step
length should be used if running in multiple threads. Possible
values (partially matched) are |
MCMLE.sequential |
Logical: If TRUE, the next iteration of the fit uses
the last network sampled as the starting network. If FALSE, always use the
initially passed network. The results should be similar (stochastically),
but the TRUE option may help if the |
MCMLE.density.guard.min , MCMLE.density.guard
|
A simple heuristic to
stop optimization if it finds itself in an overly dense region, which
usually indicates ERGM degeneracy: if the sampler encounters a network
configuration that has more than |
MCMLE.effectiveSize , MCMLE.effectiveSize.interval_drop , MCMLE.burnin , MCMLE.interval , MCMLE.samplesize , MCMLE.samplesize.per_theta , MCMLE.samplesize.min
|
Sets the corresponding |
MCMLE.steplength.solver |
The linear program solver to use for
MCMLE step length calculation. Can be either |
MCMLE.last.boost |
For the Hummel termination criterion, increase the MCMC sample size of the last iteration by this factor. |
MCMLE.steplength.esteq |
For curved ERGMs, should the estimating function values be used to compute the Hummel step length? This allows the Hummel stepping algorithm to converge when some sufficient statistics are at 0. |
MCMLE.steplength.miss.sample |
In fitting the missing data MLE, the rules for step length become more complicated. In short, it is necessary for all points in the constrained sample to be in the convex hull of the unconstrained (though they may be on the border); and it is necessary for their centroid to be in its interior. This requires checking a large number of points against whether they are in the convex hull, so to speed up the procedure, a sample is taken of the points most likely to be outside it. This parameter specifies the sample size or a function of the unconstrained sample matrix to determine the sample size. If the parameter or the return value of the function has a length of 2, the first element is used as the sample size, and the second element is used in an early-termination heuristic, only continuing the tests until this many test points in a row did not yield a change in the step length. |
MCMLE.steplength.min |
Stops MCMLE estimation when the step length gets stuck below this minimum value. |
MCMLE.save_intermediates |
Every iteration, after MCMC
sampling, save the MCMC sample and some miscellaneous information
to a file with this name. This is mainly useful for diagnostics
and debugging. The name is passed through |
SA.phase1_n |
A constant or a function of number of free
parameters |
SA.initial_gain |
Initial gain to Phase 2 of the stochastic approximation algorithm. Defaults to 0.1. See Snijders (2002) for details. |
SA.nsubphases |
Number of sub-phases in Phase 2 of the
stochastic approximation algorithm. Defaults to
|
SA.min_iterations , SA.max_iterations
|
A constant or a function
of number of free parameters |
SA.phase3_n |
Sample size for the MCMC sample in Phase 3 of the stochastic approximation algorithm. See Snijders (2002) for details. |
SA.burnin , SA.interval , SA.samplesize
|
Sets the corresponding
|
CD.samplesize.per_theta , obs.CD.samplesize.per_theta , CD.maxit , CD.conv.min.pval , CD.NR.maxit , CD.NR.reltol , CD.metric , CD.method , CD.dampening , CD.dampening.min.ess , CD.dampening.level , CD.steplength.margin , CD.steplength , CD.steplength.parallel , CD.adaptive.epsilon , CD.steplength.esteq , CD.steplength.miss.sample , CD.steplength.min , CD.steplength.solver
|
Miscellaneous tuning parameters of the CD sampler and
optimizer. These have the same meaning as their Note that only the Hotelling's stopping criterion is implemented for CD. |
CD.nsteps , CD.multiplicity
|
Main settings for contrastive
divergence to obtain initial values for the estimation:
respectively, the number of Metropolis–Hastings steps to take
before reverting to the starting value and the number of
tentative proposals per step. Computational experiments indicate
that increasing In practice, MPLE, when available, usually outperforms CD for
even a very high The default values have been set experimentally, providing reasonably stable, if not great, starting values. |
CD.nsteps.obs , CD.multiplicity.obs
|
When there are missing dyads,
|
loglik |
|
term.options |
A list of additional arguments to be passed to term initializers. See |
seed |
Seed value (integer) for the random number generator. See
|
parallel |
Number of threads in which to run the sampling. Defaults to
0 (no parallelism). See |
parallel.type |
API to use for parallel processing. Defaults
to using the parallel package with PSOCK clusters. See
|
parallel.version.check |
Logical: If TRUE, check that the version of ergm running on the slave nodes is the same as that running on the master node. |
parallel.inherit.MT |
Logical: If TRUE, slave nodes and
processes inherit the |
... |
A dummy argument to catch deprecated or mistyped control parameters. |
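Several defaults in the usage above are functions of the model rather than constants. The following base-R sketch simply copies four of those default expressions from the signature and evaluates them for a hypothetical number of free parameters `q` and a hypothetical unconstrained sample matrix `x1`. It does not call ergm itself, and the variable names on the left-hand side of the results are illustrative only.

```r
# Default expressions copied from the control.ergm() usage above.
SA.phase1_n <- function(q, ...) max(200, 7 + 3 * q)
SA.min_iterations <- function(q, ...) (7 + q)
SA.max_iterations <- function(q, ...) (207 + q)
MCMLE.steplength.miss.sample <- function(x1) c(max(ncol(rbind(x1)) * 2, 30), 10)

q <- 5  # hypothetical number of free parameters
phase1 <- SA.phase1_n(q)        # max(200, 7 + 3 * 5)
min_it <- SA.min_iterations(q)
max_it <- SA.max_iterations(q)

# Hypothetical unconstrained sample: 100 draws of 20 statistics.
x1 <- matrix(0, nrow = 100, ncol = 20)
miss <- MCMLE.steplength.miss.sample(x1)
# miss[1]: number of points to test against the convex hull
# miss[2]: early-termination count (tests in a row without a change)

print(c(phase1 = phase1, min_it = min_it, max_it = max_it))
print(miss)
```

Supplying a constant instead of a function for any of these arguments fixes the value regardless of model size.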
Different estimation methods or components of estimation have different efficient tuning parameters, and we generally want to use the estimation controls to inform the simulation controls in control.simulate.ergm(). To accomplish this, control.ergm() uses method-specific controls, with the method identified by the prefix:
CD
Contrastive Divergence estimation (Krivitsky 2017)
MPLE
Maximum Pseudo-Likelihood Estimation (Strauss and Ikeda 1990)
MCMLE
Monte-Carlo MLE (Hunter and Handcock 2006; Hummel et al. 2012)
SA
Stochastic Approximation via Robbins–Monro (Robbins and Monro 1951; Snijders 2002)
SAN
Simulated Annealing used when target.stats are specified for ergm()
obs
Missing data MLE (Handcock and Gile 2010)
init
Affecting how initial parameter guesses are obtained
parallel
Affecting parallel processing
MCMC
Low-level MCMC simulation controls
Corresponding MCMC
controls will usually be overwritten by the
method-specific ones. After the estimation finishes, they will
contain the last MCMC parameters used.
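As a sketch of how the prefixes combine in practice, the call below sets a few MCMLE-prefixed controls together with general ones. The numeric values are arbitrary illustrations rather than recommendations, `flomarriage` is the example network shipped with ergm, and the model formula is chosen only for demonstration.

```r
library(ergm)
data(florentine)  # provides the flomarriage network

ctrl <- control.ergm(
  main.method      = "MCMLE",
  MCMLE.maxit      = 30,    # cap on the number of MCMLE iterations
  MCMLE.samplesize = 2048,  # method-specific; usually overrides MCMC.samplesize
  MCMLE.interval   = 2048,  # method-specific; usually overrides MCMC.interval
  seed             = 123    # make the MCMC draws reproducible
)

# The control list is passed to ergm() through its control argument.
fit <- ergm(flomarriage ~ edges + triangle, control = ctrl)
summary(fit)
```

Because the result of `control.ergm()` is a list with the arguments as components, individual settings can be inspected with `ctrl$MCMLE.maxit` and similar before fitting.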
A list with arguments as components.
Handcock MS, Gile KJ (2010).
“Modeling Social Networks from Sampled Data.”
Annals of Applied Statistics, 4(1), 5–25.
ISSN 1932-6157, doi:10.1214/08-AOAS221.
Hummel RM, Hunter DR, Handcock MS (2012).
“Improving Simulation-based Algorithms for Fitting ERGMs.”
Journal of Computational and Graphical Statistics, 21(4), 920–939.
doi:10.1080/10618600.2012.679224.
Hunter DR, Handcock MS (2006).
“Inference in Curved Exponential Family Models for Networks.”
Journal of Computational and Graphical Statistics, 15(3), 565–583.
ISSN 1061-8600, doi:10.1198/106186006X133069.
Krivitsky PN (2017).
“Using Contrastive Divergence to Seed Monte Carlo MLE for Exponential-family Random Graph Models.”
Computational Statistics & Data Analysis, 107, 149–161.
doi:10.1016/j.csda.2016.10.015.
Robbins H, Monro S (1951).
“A Stochastic Approximation Method.”
The Annals of Mathematical Statistics, 22(3), 400–407.
ISSN 00034851.
Schmid CS, Desmarais BA (2017).
“Exponential random graph models with big networks: Maximum pseudolikelihood estimation and the parametric bootstrap.”
In 2017 IEEE International Conference on Big Data (Big Data), 116–121.
doi:10.1109/bigdata.2017.8257919.
Schmid CS, Hunter DR (2023).
“Computing Pseudolikelihood Estimators for Exponential-Family Random Graph Models.”
Journal of Data Science, 21(2), 295–309.
doi:10.6339/23-JDS1094.
Snijders TAB (2002).
“Markov chain Monte Carlo Estimation of Exponential Random Graph Models.”
Journal of Social Structure, 3(2).
Strauss D, Ikeda M (1990).
“Pseudolikelihood Estimation for Social Networks.”
Journal of the American Statistical Association, 85(409), 204–212.
ISSN 0162-1459, doi:10.1080/01621459.1990.10475327.
Vats D, Flegal JM, Jones GL (2019).
“Multivariate output analysis for Markov chain Monte Carlo.”
Biometrika, 106(2), 321–337.
doi:10.1093/biomet/asz002.
Firth D (1993).
“Bias Reduction in Maximum Likelihood Estimates.”
Biometrika, 80, 27–38.
Sahlin K (2011).
“Estimating convergence of Markov chain Monte Carlo simulations.”
Master's thesis, Stockholm University.
https://www2.math.su.se/matstat/reports/master/2011/rep2/report.pdf
ergm(). The control.simulate() function performs a similar function for simulate.ergm(); control.gof() performs a similar function for gof().
ergm.bridge.llr()
and logLik.ergm()
Auxiliary functions as user interfaces for fine-tuning the ergm.bridge.llr() algorithm, which approximates log likelihood ratios using bridge sampling.
By default, the bridge sampler inherits its control parameters from the ergm() fit; control.logLik.ergm() allows the user to selectively override them.
control.ergm.bridge( bridge.nsteps = 16, bridge.target.se = NULL, bridge.bidirectional = TRUE, drop = TRUE, MCMC.burnin = MCMC.interval * 128, MCMC.burnin.between = max(ceiling(MCMC.burnin/sqrt(bridge.nsteps)), MCMC.interval * 16), MCMC.interval = 128, MCMC.samplesize = 16384, obs.MCMC.burnin = obs.MCMC.interval * 128, obs.MCMC.burnin.between = max(ceiling(obs.MCMC.burnin/sqrt(bridge.nsteps)), obs.MCMC.interval * 16), obs.MCMC.interval = MCMC.interval, obs.MCMC.samplesize = MCMC.samplesize, MCMC.prop = trim_env(~sparse + .triadic), MCMC.prop.weights = "default", MCMC.prop.args = list(), obs.MCMC.prop = MCMC.prop, obs.MCMC.prop.weights = MCMC.prop.weights, obs.MCMC.prop.args = MCMC.prop.args, MCMC.maxedges = Inf, MCMC.packagenames = c(), term.options = list(), seed = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE, ... ) control.logLik.ergm( bridge.nsteps = 16, bridge.target.se = NULL, bridge.bidirectional = TRUE, drop = NULL, MCMC.burnin = NULL, MCMC.interval = NULL, MCMC.samplesize = NULL, obs.MCMC.samplesize = MCMC.samplesize, obs.MCMC.interval = MCMC.interval, obs.MCMC.burnin = MCMC.burnin, MCMC.prop = NULL, MCMC.prop.weights = NULL, MCMC.prop.args = NULL, obs.MCMC.prop = MCMC.prop, obs.MCMC.prop.weights = MCMC.prop.weights, obs.MCMC.prop.args = MCMC.prop.args, MCMC.maxedges = Inf, MCMC.packagenames = NULL, term.options = NULL, seed = NULL, parallel = NULL, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE, ... )
bridge.nsteps |
Number of geometric bridges to use. |
bridge.target.se |
If not |
bridge.bidirectional |
Whether the bridge sampler first bridges from |
drop |
See |
MCMC.burnin |
Number of proposals before any MCMC sampling is done. It typically is set to a fairly large number. |
MCMC.burnin.between |
Number of proposals between the bridges; typically, less and less is needed as the number of steps increases. |
MCMC.interval |
Number of proposals between sampled statistics. |
MCMC.samplesize |
Number of network statistics, randomly drawn from a given distribution on the set of all networks, returned by the Metropolis-Hastings algorithm. |
obs.MCMC.burnin , obs.MCMC.burnin.between , obs.MCMC.interval , obs.MCMC.samplesize
|
The |
MCMC.prop |
Specifies the proposal (directly) and/or
a series of "hints" about the structure of the model being
sampled. The specification is in the form of a one-sided formula
with hints separated by A common and default "hint" is |
MCMC.prop.weights |
Specifies the proposal
distribution used in the MCMC Metropolis-Hastings algorithm. Possible
choices depending on selected |
MCMC.prop.args |
An alternative, direct way of specifying additional arguments to proposal. |
obs.MCMC.prop , obs.MCMC.prop.weights , obs.MCMC.prop.args
|
The |
MCMC.maxedges |
The maximum number of edges that may occur during the MCMC sampling. If this number is exceeded at any time, sampling is stopped immediately. |
MCMC.packagenames |
Names of packages in which to look for change statistic functions in addition to those autodetected. This argument should not be needed outside of very strange setups. |
term.options |
A list of additional arguments to be passed to term initializers. See |
seed |
Seed value (integer) for the random number generator. See
|
parallel |
Number of threads in which to run the sampling. Defaults to
0 (no parallelism). See |
parallel.type |
API to use for parallel processing. Defaults
to using the parallel package with PSOCK clusters. See
|
parallel.version.check |
Logical: If TRUE, check that the version of ergm running on the slave nodes is the same as that running on the master node. |
parallel.inherit.MT |
Logical: If TRUE, slave nodes and
processes inherit the |
... |
A dummy argument to catch deprecated or mistyped control parameters. |
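The default for MCMC.burnin.between in control.ergm.bridge() is an expression in the other arguments. The base-R sketch below re-evaluates that expression, copied from the usage above, for the default settings and for a larger number of bridges. `burnin_between` is a hypothetical helper name introduced here and is not part of ergm.

```r
# Default expression for MCMC.burnin.between, copied from the usage above.
# burnin_between() is a hypothetical helper, not an ergm function.
burnin_between <- function(bridge.nsteps = 16, MCMC.interval = 128,
                           MCMC.burnin = MCMC.interval * 128) {
  max(ceiling(MCMC.burnin / sqrt(bridge.nsteps)), MCMC.interval * 16)
}

b16 <- burnin_between()                    # default of 16 bridges
b64 <- burnin_between(bridge.nsteps = 64)  # more bridges, shorter burn-in between them
print(c(b16 = b16, b64 = b64))
```

Under this default, the between-bridge burn-in shrinks with the square root of the number of bridges until it reaches the floor of 16 sampling intervals.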
control.ergm.bridge() is only used within a call to the ergm.bridge.llr(), ergm.bridge.dindstart.llk(), or ergm.bridge.0.llk() functions.
control.logLik.ergm() is only used within a call to logLik.ergm().
A list with arguments as components.
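One way to override bridge-sampling settings for the log-likelihood, suggested by the loglik = control.logLik.ergm() argument in the control.ergm() usage, is to nest one control list inside the other. The values below are arbitrary illustrations, and the object names `ll_ctrl` and `ctrl` are placeholders.

```r
library(ergm)

# Override only the bridge settings of interest. Arguments left at NULL
# are meant to be inherited from the ergm() fit, as described above.
ll_ctrl <- control.logLik.ergm(bridge.nsteps = 32, MCMC.samplesize = 4096)

# Nest the log-likelihood controls inside the estimation controls.
ctrl <- control.ergm(loglik = ll_ctrl, seed = 1)

print(ctrl$loglik$bridge.nsteps)
print(ctrl$loglik$MCMC.samplesize)
print(is.null(ctrl$loglik$MCMC.burnin))  # not overridden here
```

The resulting `ctrl` can then be passed to ergm() through its control argument in the usual way.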
Auxiliary function as user interface for fine-tuning ERGM Goodness-of-Fit Evaluation.
The control.gof.ergm version is intended to be used with gof.ergm() specifically and will "inherit" as many control parameters from the ergm() fit as possible.
control.gof.formula( nsim = 100, MCMC.burnin = 10000, MCMC.interval = 1000, MCMC.batch = 0, MCMC.prop = trim_env(~sparse + .triadic), MCMC.prop.weights = "default", MCMC.prop.args = list(), MCMC.maxedges = Inf, MCMC.packagenames = c(), MCMC.runtime.traceplot = FALSE, network.output = "network", seed = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE ) control.gof.ergm( nsim = 100, MCMC.burnin = NULL, MCMC.interval = NULL, MCMC.batch = NULL, MCMC.prop = NULL, MCMC.prop.weights = NULL, MCMC.prop.args = NULL, MCMC.maxedges = NULL, MCMC.packagenames = NULL, MCMC.runtime.traceplot = FALSE, network.output = "network", seed = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE )
nsim |
Number of networks to be randomly drawn using Markov chain Monte Carlo. This sample of networks provides the basis for comparing the model to the observed network. |
MCMC.burnin |
Number of proposals before any MCMC sampling is done. It typically is set to a fairly large number. |
MCMC.interval |
Number of proposals between sampled statistics. |
MCMC.batch |
if not 0 or |
MCMC.prop |
Specifies the proposal (directly) and/or
a series of "hints" about the structure of the model being
sampled. The specification is in the form of a one-sided formula
with hints separated by A common and default "hint" is |
MCMC.prop.weights |
Specifies the proposal
distribution used in the MCMC Metropolis-Hastings algorithm. Possible
choices depending on selected |
MCMC.prop.args |
An alternative, direct way of specifying additional arguments to proposal. |
MCMC.maxedges |
The maximum number of edges that may occur during the MCMC sampling. If this number is exceeded at any time, sampling is stopped immediately. |
MCMC.packagenames |
Names of packages in which to look for change statistic functions in addition to those autodetected. This argument should not be needed outside of very strange setups. |
MCMC.runtime.traceplot |
Logical: If |
network.output |
R class with which to output networks. The options are "network" (default) and "edgelist.compressed" (which saves space but only supports networks without vertex attributes) |
seed |
Seed value (integer) for the random number generator. See
|
parallel |
Number of threads in which to run the sampling. Defaults to
0 (no parallelism). See |
parallel.type |
API to use for parallel processing. Defaults
to using the parallel package with PSOCK clusters. See
|
parallel.version.check |
Logical: If TRUE, check that the version of ergm running on the slave nodes is the same as that running on the master node. |
parallel.inherit.MT |
Logical: If TRUE, slave nodes and
processes inherit the |
This function is only used within a call to the gof() function.
See the Usage section in gof() for details.
A list with arguments as components.
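The difference between the two constructors can be seen by comparing their defaults: control.gof.formula() supplies concrete values, while control.gof.ergm() leaves most settings NULL so that they can be inherited from the fit. The sketch below assumes the example network `flomarriage` from ergm, uses an arbitrary `nsim`, and fits a deliberately simple model; the object names are placeholders.

```r
library(ergm)
data(florentine)  # provides the flomarriage network

full <- control.gof.formula()                     # full set of settings
inherit <- control.gof.ergm(nsim = 50, seed = 2)  # mostly inherited from the fit

print(full$MCMC.interval)              # concrete default from the usage above
print(is.null(inherit$MCMC.interval))  # left NULL, to be taken from the fit

# An edges-only model is dyad-independent and fits quickly.
fit <- ergm(flomarriage ~ edges)
gof_res <- gof(fit, control = inherit)
print(gof_res)
```

Setting `seed` in the control list makes the simulated comparison networks reproducible across runs.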
gof(). The control.simulate() function performs a similar function for simulate.ergm(); control.ergm() performs a similar function for ergm().
Auxiliary function as user interface for fine-tuning the simulated annealing algorithm.
control.san( SAN.maxit = 4, SAN.tau = 1, SAN.invcov = NULL, SAN.invcov.diag = FALSE, SAN.nsteps.alloc = function(nsim) 2^seq_len(nsim), SAN.nsteps = 2^19, SAN.samplesize = 2^12, SAN.prop = trim_env(~sparse + .triadic), SAN.prop.weights = "default", SAN.prop.args = list(), SAN.packagenames = c(), SAN.ignore.finite.offsets = TRUE, term.options = list(), seed = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE )
SAN.maxit |
Number of temperature levels to use. |
SAN.tau |
Tuning parameter, specifying the temperature of the
process during the penultimate iteration. (During the last
iteration, the temperature is set to 0, resulting in a greedy
search, and during the previous iterations, the temperature is
set to |
SAN.invcov |
Initial inverse covariance matrix used to
calculate Mahalanobis distance in determining how far a proposed
MCMC move is from the |
SAN.invcov.diag |
Whether to only use the diagonal of the covariance matrix. It seems to work better in practice. |
SAN.nsteps.alloc |
Either a numeric vector or a function of the number of runs giving a sequence of relative lengths of simulated annealing runs. |
SAN.nsteps |
Number of MCMC proposals for all the annealing runs combined. |
SAN.samplesize |
Number of realisations' statistics to obtain for tuning purposes. |
SAN.prop |
Specifies the proposal (directly) and/or
a series of "hints" about the structure of the model being
sampled. The specification is in the form of a one-sided formula
with hints separated by A common and default "hint" is |
SAN.prop.weights |
Specifies the proposal
distribution used in the SAN Metropolis-Hastings algorithm. Possible
choices depending on selected |
SAN.prop.args |
An alternative, direct way of specifying additional arguments to proposal. |
SAN.packagenames |
Names of packages in which to look for change statistic functions in addition to those autodetected. This argument should not be needed outside of very strange setups. |
SAN.ignore.finite.offsets |
Whether SAN should ignore (treat as 0) finite offsets. |
term.options |
A list of additional arguments to be passed to term initializers. See |
seed |
Seed value (integer) for the random number generator. See
|
parallel |
Number of threads in which to run the sampling. Defaults to
0 (no parallelism). See |
parallel.type |
API to use for parallel processing. Defaults
to using the parallel package with PSOCK clusters. See
|
parallel.version.check |
Logical: If TRUE, check that the version of ergm running on the slave nodes is the same as that running on the master node. |
parallel.inherit.MT |
Logical: If TRUE, slave nodes and
processes inherit the |
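To make the interaction of SAN.maxit, SAN.nsteps and SAN.nsteps.alloc concrete, the base-R sketch below evaluates the default allocation function for the default number of runs. It assumes that the function is called with SAN.maxit as the number of runs and that the total proposal budget is split in proportion to the relative lengths; the exact rounding used inside san() may differ, and `approx_steps` is a name introduced here.

```r
# Defaults copied from the control.san() usage above.
SAN.maxit <- 4
SAN.nsteps <- 2^19
SAN.nsteps.alloc <- function(nsim) 2^seq_len(nsim)

rel <- SAN.nsteps.alloc(SAN.maxit)  # relative run lengths, doubling each run
share <- rel / sum(rel)             # fraction of the proposal budget per run

# Assumption: the budget is divided proportionally; rounding is illustrative.
approx_steps <- round(SAN.nsteps * share)

print(data.frame(run = seq_along(rel), rel = rel, approx_steps = approx_steps))
```

Under the default, each run is allotted twice the proposals of the one before it, so the final greedy run receives the largest share.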
This function is only used within a call to the san() function.
See the Usage section in san() for details.
A list with arguments as components.
Auxiliary function as user interface for fine-tuning ERGM simulation. control.simulate, control.simulate.formula, and control.simulate.formula.ergm are all aliases for the same function.
While the others supply a full set of simulation settings, control.simulate.ergm, when passed as a control parameter to simulate.ergm(), allows some settings to be inherited from the ERGM estimation while overriding others.
control.simulate.formula.ergm( MCMC.burnin = MCMC.interval * 16, MCMC.interval = 1024, MCMC.prop = trim_env(~sparse + .triadic), MCMC.prop.weights = "default", MCMC.prop.args = list(), MCMC.batch = NULL, MCMC.effectiveSize = NULL, MCMC.effectiveSize.damp = 10, MCMC.effectiveSize.maxruns = 1000, MCMC.effectiveSize.burnin.pval = 0.2, MCMC.effectiveSize.burnin.min = 0.05, MCMC.effectiveSize.burnin.max = 0.5, MCMC.effectiveSize.burnin.nmin = 16, MCMC.effectiveSize.burnin.nmax = 128, MCMC.effectiveSize.burnin.PC = FALSE, MCMC.effectiveSize.burnin.scl = 1024, MCMC.effectiveSize.order.max = NULL, MCMC.maxedges = Inf, MCMC.packagenames = c(), MCMC.runtime.traceplot = FALSE, network.output = "network", term.options = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE, ... ) control.simulate( MCMC.burnin = MCMC.interval * 16, MCMC.interval = 1024, MCMC.prop = trim_env(~sparse + .triadic), MCMC.prop.weights = "default", MCMC.prop.args = list(), MCMC.batch = NULL, MCMC.effectiveSize = NULL, MCMC.effectiveSize.damp = 10, MCMC.effectiveSize.maxruns = 1000, MCMC.effectiveSize.burnin.pval = 0.2, MCMC.effectiveSize.burnin.min = 0.05, MCMC.effectiveSize.burnin.max = 0.5, MCMC.effectiveSize.burnin.nmin = 16, MCMC.effectiveSize.burnin.nmax = 128, MCMC.effectiveSize.burnin.PC = FALSE, MCMC.effectiveSize.burnin.scl = 1024, MCMC.effectiveSize.order.max = NULL, MCMC.maxedges = Inf, MCMC.packagenames = c(), MCMC.runtime.traceplot = FALSE, network.output = "network", term.options = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE, ... 
) control.simulate.formula( MCMC.burnin = MCMC.interval * 16, MCMC.interval = 1024, MCMC.prop = trim_env(~sparse + .triadic), MCMC.prop.weights = "default", MCMC.prop.args = list(), MCMC.batch = NULL, MCMC.effectiveSize = NULL, MCMC.effectiveSize.damp = 10, MCMC.effectiveSize.maxruns = 1000, MCMC.effectiveSize.burnin.pval = 0.2, MCMC.effectiveSize.burnin.min = 0.05, MCMC.effectiveSize.burnin.max = 0.5, MCMC.effectiveSize.burnin.nmin = 16, MCMC.effectiveSize.burnin.nmax = 128, MCMC.effectiveSize.burnin.PC = FALSE, MCMC.effectiveSize.burnin.scl = 1024, MCMC.effectiveSize.order.max = NULL, MCMC.maxedges = Inf, MCMC.packagenames = c(), MCMC.runtime.traceplot = FALSE, network.output = "network", term.options = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE, ... ) control.simulate.ergm( MCMC.burnin = NULL, MCMC.interval = NULL, MCMC.scale = 1, MCMC.prop = NULL, MCMC.prop.weights = NULL, MCMC.prop.args = NULL, MCMC.batch = NULL, MCMC.effectiveSize = NULL, MCMC.effectiveSize.damp = 10, MCMC.effectiveSize.maxruns = 1000, MCMC.effectiveSize.burnin.pval = 0.2, MCMC.effectiveSize.burnin.min = 0.05, MCMC.effectiveSize.burnin.max = 0.5, MCMC.effectiveSize.burnin.nmin = 16, MCMC.effectiveSize.burnin.nmax = 128, MCMC.effectiveSize.burnin.PC = FALSE, MCMC.effectiveSize.burnin.scl = 1024, MCMC.effectiveSize.order.max = NULL, MCMC.maxedges = Inf, MCMC.packagenames = NULL, MCMC.runtime.traceplot = FALSE, network.output = "network", term.options = NULL, parallel = 0, parallel.type = NULL, parallel.version.check = TRUE, parallel.inherit.MT = FALSE, ... )
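The sketch below contrasts the two kinds of constructor shown in the usage above. It relies on the defaults as printed there (MCMC.burnin = MCMC.interval * 16 for the formula version, NULL for the ergm version), uses the example network `flomarriage` from ergm, and picks arbitrary values for the interval, the seed and the number of simulations.

```r
library(ergm)
data(florentine)  # provides the flomarriage network

# Full set of settings: MCMC.burnin is derived from MCMC.interval.
full <- control.simulate.formula(MCMC.interval = 2000)
print(full$MCMC.burnin)

# For a fitted model: entries left NULL are inherited from the fit.
inherit <- control.simulate.ergm(MCMC.interval = 2000)
print(is.null(inherit$MCMC.burnin))

# Simulate a few networks from a simple fitted model.
fit <- ergm(flomarriage ~ edges)
sims <- simulate(fit, nsim = 5, seed = 3, control = inherit)
print(length(sims))
```

Overriding only MCMC.interval in control.simulate.ergm() leaves the remaining settings to be taken from the estimation controls stored with the fit.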
MCMC.burnin |
Number of proposals before any MCMC sampling is done. It typically is set to a fairly large number. |
MCMC.interval |
Number of proposals between sampled statistics. |
MCMC.prop |
Specifies the proposal (directly) and/or
a series of "hints" about the structure of the model being
sampled. The specification is in the form of a one-sided formula
with hints separated by + operations. A common and default "hint" is ~sparse. |
MCMC.prop.weights |
Specifies the proposal
distribution used in the MCMC Metropolis-Hastings algorithm. Possible
choices depending on selected |
MCMC.prop.args |
An alternative, direct way of specifying additional arguments to the proposal. |
MCMC.batch |
if not 0 or |
MCMC.effectiveSize , MCMC.effectiveSize.damp , MCMC.effectiveSize.maxruns , MCMC.effectiveSize.burnin.pval , MCMC.effectiveSize.burnin.min , MCMC.effectiveSize.burnin.max , MCMC.effectiveSize.burnin.nmin , MCMC.effectiveSize.burnin.nmax , MCMC.effectiveSize.burnin.PC , MCMC.effectiveSize.burnin.scl , MCMC.effectiveSize.order.max
|
Set
After each run, the returned statistics are mapped to the estimating function scale, then an exponential decay model is fit to the scaled statistics to find that burn-in which would reduce the difference between the initial values of statistics and their equilibrium values by a factor of
A Geweke diagnostic is then run, after thinning the sample to
If
The effective size of the post-burn-in sample is computed via Vats et al. (2019), and compared to the target effective size. If it is not matched, the MCMC run is resumed, with the additional draws needed linearly extrapolated but weighted in favor of the baseline
Lastly, if |
MCMC.maxedges |
The maximum number of edges that may occur during the MCMC sampling. If this number is exceeded at any time, sampling is stopped immediately. |
MCMC.packagenames |
Names of packages in which to look for change statistic functions in addition to those autodetected. This argument should not be needed outside of very strange setups. |
MCMC.runtime.traceplot |
Logical: If |
network.output |
R class with which to output networks. The options are "network" (default) and "edgelist.compressed" (which saves space but only supports networks without vertex attributes) |
term.options |
A list of additional arguments to be passed to term initializers. See |
parallel |
Number of threads in which to run the sampling. Defaults to
0 (no parallelism). See |
parallel.type |
API to use for parallel processing. Defaults
to using the parallel package with PSOCK clusters. See
|
parallel.version.check |
Logical: If TRUE, check that the version of ergm running on the slave nodes is the same as that running on the master node. |
parallel.inherit.MT |
Logical: If TRUE, slave nodes and
processes inherit the |
... |
A dummy argument to catch deprecated or mistyped control parameters. |
MCMC.scale |
For |
This function is only used within a call to the ERGM simulate() function. See the Usage section in simulate.ergm() for details.
A list with arguments as components.
simulate.ergm(), simulate.formula(). control.ergm() performs a similar function for ergm(); control.gof() performs a similar function for gof().
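To make the roles of MCMC.burnin and MCMC.interval concrete, the following base-R sketch runs a toy Metropolis-Hastings sampler for an edges-only model on an undirected graph. It is not ergm's sampler: the helper toy_mh_edges(), its uniform dyad-toggle proposal, and its argument names are assumptions made purely for illustration. The burn-in is set to 16 times the interval to mirror the default relationship shown in the usage above.

```r
# Toy Metropolis-Hastings sampler for an undirected edges-only model.
# Hypothetical helper for illustration only; it does not reproduce ergm's
# sampler or its proposal distributions.
toy_mh_edges <- function(n, theta, nsim, burnin, interval) {
  y <- matrix(0L, n, n)            # start from the empty graph
  edges <- 0L
  stats <- integer(nsim)
  total <- burnin + nsim * interval
  for (step in seq_len(total)) {
    # Propose toggling one dyad chosen uniformly at random.
    ij <- sample.int(n, 2)
    i <- ij[1]; j <- ij[2]
    delta <- if (y[i, j] == 1L) -1L else 1L   # change in the edge count
    # Symmetric proposal: accept with probability min(1, exp(theta * delta)).
    if (log(runif(1)) < theta * delta) {
      y[i, j] <- y[j, i] <- 1L - y[i, j]
      edges <- edges + delta
    }
    # After the burn-in, record the statistic once every `interval` proposals.
    if (step > burnin && (step - burnin) %% interval == 0L) {
      stats[(step - burnin) %/% interval] <- edges
    }
  }
  stats
}

set.seed(1)
draws <- toy_mh_edges(n = 10, theta = -1, nsim = 50,
                      burnin = 16 * 64, interval = 64)
length(draws)                 # one recorded statistic per requested draw
mean(draws) / choose(10, 2)   # roughly plogis(-1) in expectation for this model
```

In this sketch the first burnin proposals are discarded, and the edge count is then recorded once every interval proposals, which is the behaviour the two arguments describe.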
By default, this term adds one statistic to the model, equal to the number of cyclic triples in the network, defined as a set of edges of the form $\{(i \to j), (j \to k), (k \to i)\}$.
# binary: ctriple(attr=NULL, diff=FALSE, levels=NULL)
# binary: ctriad
attr , diff
|
quantitative attribute (see Specifying Vertex attributes and Levels ( |
levels |
specifies the value of |
This term can only be used with directed networks.
For all directed networks, triangle is equal to ttriple+ctriple, so at most two of these three terms can be in a model.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, triad-related, binary
Arguments may have the same forms as in the API, but for convenience, alternative forms are accepted.
If the model in formula is curved, then the outputs of this operator term's map argument will be used as inputs to the curved terms of the formula model.
Curve is an obsolete alias and may be deprecated and removed in a future release.
# binary: Curve(formula, params, map, gradient=NULL, minpar=-Inf, maxpar=+Inf, cov=NULL)
# binary: Parametrise(formula, params, map, gradient=NULL, minpar=-Inf, maxpar=+Inf, cov=NULL)
# binary: Parametrize(formula, params, map, gradient=NULL, minpar=-Inf, maxpar=+Inf, cov=NULL)
# valued: Curve(formula, params, map, gradient=NULL, minpar=-Inf, maxpar=+Inf, cov=NULL)
# valued: Parametrise(formula, params, map, gradient=NULL, minpar=-Inf, maxpar=+Inf, cov=NULL)
# valued: Parametrize(formula, params, map, gradient=NULL, minpar=-Inf, maxpar=+Inf, cov=NULL)
formula |
a one-sided |
params |
a named list whose names are the curved parameter names, may also be a character vector with names. |
map |
the mapping from curved to canonical. May have the following forms:
|
gradient |
its gradient function. It is optional if
|
minpar , maxpar
|
the minimum and maximum allowed curved parameter values. The parameters will be recycled to the appropriate length. |
cov |
optional |
ergmTerm
for index of model terms currently visible to the package.
operator, binary, valued
This term adds one network statistic to the model for each value of k, corresponding to the number of k-cycles (or, alternately, semicycles) in the graph.
This term can be used with either directed or undirected networks.
# binary: cycle(k, semi=FALSE)
k |
a vector of integers giving the cycle lengths to count.
Directed cycle lengths may range from |
semi |
an optional logical indicating whether semicycles (rather than directed cycles) should be counted; this is ignored in the undirected case. |
directed |
2-cycles are equivalent to mutual dyads. |
ergmTerm
for index of model terms currently visible to the package.
directed, undirected, binary
This term adds one statistic, equal to the number of ties $i \to j$ such that there exists a two-path from $j$ to $i$. (Related to the ttriple term.)
# binary: cyclicalties(attr=NULL, levels=NULL)
# valued: cyclicalties(threshold=0)
attr |
quantitative attribute (see Specifying Vertex attributes and Levels ( |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
directed, undirected, binary, valued
This statistic implements the cyclical weights statistic, like that defined by Krivitsky (2012), Equation 13, but with the focus dyad being $y_{j,i}$ rather than $y_{i,j}$. For each option, the first (and the default) is more stable but also more conservative, while the second is more sensitive but more likely to induce a multimodal distribution of networks.
# valued: cyclicalweights(twopath="min", combine="max", affect="min")
twopath |
the minimum of the constituent dyads ( |
combine |
the maximum of the
2-path strengths ( |
affect |
the minimum of the focus dyad and the
combined strength of the two paths ( |
ergmTerm
for index of model terms currently visible to the package.
directed, nonnegative, undirected, valued
This term adds one network statistic equal to the correlation of the degrees of all pairs of nodes in the network which are tied. Only coded for undirected networks.
# binary: degcor
ergmTerm
for index of model terms currently visible to the package.
undirected, binary
This term adds one network statistic equal to the mean of the cross-products of the degrees of all pairs of nodes in the network which are tied. Only coded for undirected networks.
# binary: degcrossprod
ergmTerm
for index of model terms currently visible to the package.
undirected, binary
This term adds one network statistic to the model for each element of from (or to); the $i$th such statistic equals the number of nodes in the network of degree greater than or equal to from[i] but strictly less than to[i], i.e., with edge count in the semiopen interval [from[i], to[i]).
# binary: degrange(from, to=+Inf, by=NULL, homophily=FALSE, levels=NULL)
from , to
|
vectors of distinct integers. If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, they must have the same length. |
by , levels , homophily
|
the optional argument |
This term can only be used with undirected networks; for directed networks see idegrange and odegrange. This term can be used with bipartite networks, and will count nodes of both first and second mode in the specified degree range. To count only nodes of the first mode ("actors"), use b1degrange and to count only those of the second mode ("events"), use b2degrange.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, undirected, binary
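As a small worked example of the half-open interval rule, the base-R sketch below counts nodes whose degree falls in each interval. The degree sequence and the from and to vectors are hypothetical values chosen for illustration; the computation does not call ergm.

```r
# Hypothetical degree sequence of a 6-node undirected network.
deg  <- c(0, 1, 1, 2, 2, 4)
from <- c(0, 2)
to   <- c(2, Inf)

# i-th statistic: number of nodes with from[i] <= degree < to[i].
stat <- vapply(seq_along(from),
               function(i) sum(deg >= from[i] & deg < to[i]),
               integer(1))
stat
```

Here the first statistic counts nodes of degree 0 or 1 and the second counts nodes of degree 2 or more, so a node of degree exactly 2 belongs only to the second interval.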
This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of nodes in the network of degree d[i], i.e. with exactly d[i] edges.
This term can only be used with undirected networks; for directed networks see idegree and odegree.
# binary: degree(d, by=NULL, homophily=FALSE, levels=NULL)
d |
vector of distinct integers |
by , levels , homophily
|
the optional argument |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, frequently-used, undirected, binary
This term adds one network statistic to the model equaling the sum over the actors of each actor's degree taken to the 3/2 power (or, equivalently, multiplied by its square root). This term is an undirected analog to the terms of Snijders et al. (2010), equations (11) and (12). This term can only be used with undirected networks.
# binary: degree1.5
ergmTerm
for index of model terms currently visible to the package.
undirected, binary
The degreedist generic computes and returns the degree distribution (number of vertices in the network with each degree value) for a given network. This help page documents the function. For help about the ERGM sample space constraint with that name, try help("degreedist-constraint").
degreedist(object, ...)

## S3 method for class 'network'
degreedist(object, print = TRUE, ...)
object |
a |
... |
Additional arguments to functions. |
print |
logical, whether to print the degree distribution. |
If directed, a matrix of the distributions of in and out degrees; this is row bound and only contains degrees for which one of the in or out distributions has a positive count. If bipartite, a list containing the degree distributions of b1 and b2. Otherwise, a vector of the positive values in the degree distribution.
degreedist(network): Method for network objects.
data(faux.mesa.high)
degreedist(faux.mesa.high)
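For readers who want to see what the undirected case computes, here is a base-R sketch that tabulates a degree distribution from an adjacency matrix. The toy graph and the "degree0"-style names are assumptions for illustration and are not taken from the package output.

```r
# Undirected toy graph on 5 vertices, stored as a symmetric adjacency matrix.
A <- matrix(0L, 5, 5)
el <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4))   # vertex 5 is an isolate
A[el] <- 1L
A[el[, 2:1]] <- 1L                                # mirror each edge

deg <- rowSums(A)            # degree of each vertex
dist <- c(table(deg))        # counts for the degree values that occur
names(dist) <- paste0("degree", names(dist))   # illustrative naming only
dist
```

Only degree values that actually occur appear in the result, matching the "positive values" wording in the Value description above.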
Only networks whose degree distributions are the same as those in the network passed in the model formula have non-zero probability.
# degreedist
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, undirected
Only networks whose vertex degrees are the same as those in the network passed in the model formula have non-zero probability. If the network is directed, both indegree and outdegree are preserved.
# degrees
# nodedegrees
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, undirected
This term adds one network statistic equal to the density of the network.
For undirected networks, density equals kstar(1) or edges divided by $n(n-1)/2$; for directed networks, density equals edges or istar(1) or ostar(1) divided by $n(n-1)$, where $n$ is the number of nodes.
# binary: density
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, binary
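A short base-R check of the two denominators may help. The toy adjacency matrix below is hypothetical, and the calculation is done by hand rather than through ergm.

```r
n <- 5
A <- matrix(0L, n, n)
arcs <- rbind(c(1, 2), c(2, 1), c(2, 3), c(4, 5))
A[arcs] <- 1L

# Directed: every nonzero cell is an arc, out of n(n-1) possible arcs.
dens_directed <- sum(A) / (n * (n - 1))

# Undirected version of the same graph: an edge exists if either arc does.
U <- (A + t(A)) > 0
dens_undirected <- (sum(U) / 2) / (n * (n - 1) / 2)

c(directed = dens_directed, undirected = dens_undirected)
```

The undirected count divides sum(U) by 2 because each edge appears twice in a symmetric adjacency matrix.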
For values of pow other than 0, this term adds one network statistic to the model, equaling the sum, over directed edges $(i,j)$, of sign.action(attr[i]-attr[j])^pow if dir is "t-h" and of sign.action(attr[j]-attr[i])^pow if "h-t". That is, the argument dir determines which vertex's attribute is subtracted from which, with tail being the origin of a directed edge and head being its destination, and bipartite networks' edges being treated as going from the first part (b1) to the second (b2).
If pow==0, the exponentiation is replaced by the signum function: +1 if the difference is positive, 0 if there is no difference, and -1 if the difference is negative. Note that this function is applied after the sign.action. The comparison is exact, so when using calculated values of attr, ensure that values that you want to be considered equal are, in fact, equal.
# binary: diff(attr, pow=1, dir="t-h", sign.action="identity")
# valued: diff(attr, pow=1, dir="t-h", sign.action="identity", form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
pow |
exponent for the node difference |
dir |
determines which vertex's attribute is subtracted from which. Accepts: |
sign.action |
one of
|
form |
character how to aggregate tie values in a valued ERGM |
This term may not be meaningful for unipartite undirected networks unless sign.action=="abs". When used on such a network, it behaves as if all edges were directed, going from the lower-indexed vertex to the higher-indexed vertex.
ergmTerm
for index of model terms currently visible to the package.
bipartite, directed, dyad-independent, frequently-used, quantitative nodal attribute, undirected, binary, valued
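The rule described above can be sketched in base R as follows. The helper diff_stat() is hypothetical and covers only the "identity" and "abs" sign actions mentioned in this entry; it is meant to illustrate the arithmetic, not to reproduce the package implementation.

```r
# Hypothetical helper: sum over arcs (i, j) of the signed attribute difference.
diff_stat <- function(A, attr, pow = 1, dir = c("t-h", "h-t"),
                      sign.action = c("identity", "abs")) {
  dir <- match.arg(dir)
  sign.action <- match.arg(sign.action)
  idx <- which(A != 0, arr.ind = TRUE)   # each row is (tail i, head j)
  d <- attr[idx[, 1]] - attr[idx[, 2]]   # "t-h": tail minus head
  if (dir == "h-t") d <- -d
  if (sign.action == "abs") d <- abs(d)
  # pow == 0 switches from exponentiation to the signum function.
  if (pow == 0) sum(sign(d)) else sum(d^pow)
}

A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(3, 2), c(4, 1))] <- 1   # arcs 1->2, 3->2, 4->1
age <- c(10, 12, 12, 15)                   # illustrative vertex attribute

diff_stat(A, age)                          # (-2) + 0 + 5
diff_stat(A, age, dir = "h-t")             # signs flipped
diff_stat(A, age, pow = 0)                 # -1 + 0 + 1
diff_stat(A, age, sign.action = "abs")     # 2 + 0 + 5
```

Note how the arc 3->2 contributes 0 under pow = 0 only because the two attribute values are exactly equal, which is the exact-comparison caveat stated above.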
Specifies each dyad's baseline distribution to be discrete uniform between a and b (both inclusive): $h(y)=1$, with the support being a, a+1, ..., b-1, b.
# DiscUnif(a,b)
a , b
|
minimum and maximum to the baseline discrete uniform distribution, both inclusive. Both values must be finite. |
ergmReference
for index of reference distributions currently visible to the package.
discrete, finite
This term adds one network statistic to the model for each element in d where the $i$th such statistic equals the number of dyads in the network with exactly d[i] shared partners.
# binary: ddsp(d, type="OTP")
# binary: dsp(d, type="OTP")
d |
a vector of distinct integers |
type |
A string indicating the type of shared partner or path to be considered for directed networks: |
While there is only one shared partner configuration in the undirected case, nine distinct configurations are possible for directed graphs, selected using the type argument. Currently, terms may be defined with respect to five of these configurations; they are defined here as follows (using terminology from Butts (2008) and the relevent package):
Outgoing Two-path ("OTP"): vertex $k$ is an OTP shared partner of ordered pair $(i,j)$ iff $i \to k \to j$. Also known as "transitive shared partner".
Incoming Two-path ("ITP"): vertex $k$ is an ITP shared partner of ordered pair $(i,j)$ iff $j \to k \to i$. Also known as "cyclical shared partner".
Reciprocated Two-path ("RTP"): vertex $k$ is an RTP shared partner of ordered pair $(i,j)$ iff $i \leftrightarrow k \leftrightarrow j$.
Outgoing Shared Partner ("OSP"): vertex $k$ is an OSP shared partner of ordered pair $(i,j)$ iff $i \to k$, $j \to k$.
Incoming Shared Partner ("ISP"): vertex $k$ is an ISP shared partner of ordered pair $(i,j)$ iff $k \to i$, $k \to j$.
By default, outgoing two-paths ("OTP") are calculated. Note that Robins et al. (2009) define closely related statistics to several of the above, using slightly different terminology.
This term takes an additional term option (see options?ergm), cache.sp, controlling whether the implementation will cache the number of shared partners for each dyad in the network; this is usually enabled by default.
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, binary
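For the default "OTP" type, a shared partner of the ordered pair (i, j) is a vertex k with i -> k -> j, so the counts can be read off the square of the adjacency matrix. The base-R sketch below illustrates this on a hypothetical toy graph; counting over all ordered pairs with i != j is an assumption about the convention, and the d0/d1/d2 names are illustrative rather than the package's coefficient names.

```r
# Directed toy graph with arcs 1->2, 2->3, 1->4, 4->3, 3->1.
A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(2, 3), c(1, 4), c(4, 3), c(3, 1))] <- 1

# (A %*% A)[i, j] is the number of vertices k with i -> k -> j.
otp <- A %*% A
offdiag <- row(otp) != col(otp)      # ordered pairs (i, j) with i != j

d <- 0:2
stat <- vapply(d, function(k) sum(otp[offdiag] == k), integer(1))
names(stat) <- paste0("d", d)        # illustrative names only
stat
```

In this graph the pair (1, 3) has two outgoing two-paths (through 2 and through 4), four ordered pairs have exactly one, and the remaining seven have none.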
For directed networks, this term adds three statistics to the model, each equal to the sum of the covariate values for all dyads occupying one of the three possible non-empty dyad states (mutual, upper-triangular asymmetric, and lower-triangular asymmetric dyads, respectively), with the empty or null state serving as a reference category. If the network is undirected, x is either a matrix of edgewise covariates, or a network; if the latter, optional argument attrname provides the name of the edge attribute to use for edge values. In the undirected case, this term adds one statistic to the model, equal to the sum of the covariate values for each edge appearing in the network. The edgecov and dyadcov terms are equivalent for undirected networks.
# binary: dyadcov(x, attrname=NULL)
x , attrname
|
a specification for the dyadic covariate: either one of the following, or the name of a network attribute containing one of the following:
|
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, quantitative dyadic attribute, undirected, binary
It is assumed that the observed LHS network is a noisy observation of some unobserved true network, with p01 giving the dyadwise probability of erroneously observing a tie where the true network had a non-tie and p10 giving the dyadwise probability of erroneously observing a non-tie where the true network had a tie.
# dyadnoise(p01, p10)
p01 , p10
|
can both be scalars or both be adjacency matrices of the same dimension as that of the LHS network giving these probabilities. |
See Karwa et al. (2016) for an application.
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, soft, undirected
This is an "operator" constraint that takes one or two ergmTerm dyad-independent formulas. For the terms in the vary= formula, only those dyads that change at least one of the terms will be allowed to vary, and all others will be fixed. If both formulas are given, the dyads that vary either for one or for the other will be allowed to vary. Note that a formula passed to Dyads without an argument name will default to fix=.
# Dyads(fix=NULL, vary=NULL)
fix , vary
|
formula with only dyad-independent terms |
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, operator, undirected
This network data set comprises two versions of a biological network in which the nodes are operons in Escherichia coli and a directed edge from one node to another indicates that the first encodes the transcription factor that regulates the second.
data(ecoli)
The network object ecoli1 is directed, with 423 nodes and 519 arcs. The object ecoli2 is an undirected version of the same network, in which all arcs are treated as edges and the five isolated nodes (which exhibit only self-regulation in ecoli1) are removed, leaving 418 nodes.
When publishing results obtained using this data set, the original authors (Salgado et al, 2001; Shen-Orr et al, 2002) should be cited, along with this R package.
The data set is based on the RegulonDB network (Salgado et al, 2001) and was modified by Shen-Orr et al (2002).
Salgado et al (2001), RegulonDB (version 3.2): Transcriptional Regulation and Operon Organization in Escherichia coli K-12, Nucleic Acids Research, 29(1): 72-74.
Shen-Orr et al (2002), Network Motifs in the Transcriptional Regulation Network of Escherichia coli, Nature Genetics, 31(1): 64-68.
This term adds one statistic to the model, equal to the sum of the covariate values for each edge appearing in the network. The edgecov term applies to both directed and undirected networks. For undirected networks the covariates are also assumed to be undirected. The edgecov and dyadcov terms are equivalent for undirected networks.
# binary: edgecov(x, attrname=NULL)
# valued: edgecov(x, attrname=NULL, form="sum")
x , attrname
|
a specification for the dyadic covariate: either one of the following, or the name of a network attribute containing one of the following:
|
form |
character how to aggregate tie values in a valued ERGM |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, frequently-used, quantitative dyadic attribute, undirected, binary, valued
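The statistic described above is a sum of covariate values over the edges present, which the following base-R sketch computes by hand. The adjacency matrices and the covariate matrix X are hypothetical; nothing here calls ergm.

```r
# Undirected path graph 1-2-3-4 as a symmetric adjacency matrix.
A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(2, 3), c(3, 4))] <- 1
A <- A + t(A)

# Hypothetical symmetric dyadic covariate, e.g. a distance between vertices.
X <- abs(outer(1:4, 1:4, "-")) * 1.5

# Each undirected edge appears twice in A, so sum over the upper triangle only.
stat_undirected <- sum((A * X)[upper.tri(A)])

# For a directed network, every arc (i, j) contributes X[i, j] once.
D <- matrix(0, 4, 4)
D[rbind(c(1, 2), c(4, 1))] <- 1
stat_directed <- sum(D * X)

c(undirected = stat_undirected, directed = stat_directed)
```

The upper-triangle restriction reflects the assumption stated above that covariates for undirected networks are themselves undirected.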
Only networks having the same number of edges as the network passed in the model formula have non-zero probability.
# edges
ergmConstraint
for index of constraints and hints currently visible to the package.
None
This term adds one network statistic equal to the number of edges (i.e. nonzero values) in the network. For undirected networks, edges is equal to kstar(1); for directed networks, edges is equal to both ostar(1) and istar(1).
# binary: edges
# valued: nonzero
# valued: edges
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, binary, valued
Preserve values of dyads incident on vertices with attribute attr being TRUE or, if attr is NULL, the vertex attribute "na" being FALSE.
# egocentric(attr=NULL, direction="both")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
direction |
one of |
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, undirected
The generic enformulate.curved converts an ergm object or formula of a model with curved terms to the variant in which the curved parameters are embedded into the formula and are removed from the parameter vector. This is the form that used to be required by ergm() calls.
enformulate.curved(object, ...)

## S3 method for class 'ergm'
enformulate.curved(object, ...)

## S3 method for class 'formula'
enformulate.curved(object, theta, ...)
object |
An |
... |
Unused at this time. |
theta |
Curved model parameter configuration. |
Because of a current kludge in ergm(), output from one run cannot be directly passed as initial values (control.ergm(init=)) for the next run if any of the terms are curved. One workaround is to embed the curved parameters into the formula (while keeping fixed=FALSE) and remove them from control.ergm(init=).
This function automates this process for curved ERGM terms included with the ergm package. It does not work with curved terms not included in ergm.
A list with the following components:
formula |
The formula with curved parameter estimates incorporated. |
theta |
The coefficient vector with curved parameter estimates removed. |
Adds one statistic equal to the number of dyads whose values are within tolerance of value, i.e., between value-tolerance and value+tolerance, inclusive.
# valued: equalto(value=0, tolerance=0)
value |
numerical threshold |
tolerance |
numerical threshold |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, valued
ergm() is used to fit exponential-family random graph models (ERGMs), in which the probability of a given network, $y$, on a set of nodes is $h(y) \exp\{\eta(\theta) \cdot g(y)\} / c(\theta)$, where $h(y)$ is the reference measure (usually $h(y)=1$), $g(y)$ is a vector of network statistics for $y$, $\eta(\theta)$ is a natural parameter vector of the same length (with $\eta(\theta)=\theta$ for most terms), and $c(\theta)$ is the normalizing constant for the distribution.
ergm() can return a maximum pseudo-likelihood estimate, an approximate maximum likelihood estimate based on a Monte Carlo scheme, or an approximate contrastive divergence estimate based on a similar scheme.
(For an overview of the package (Hunter et al. 2008; Krivitsky et al. 2023), see ergm.)
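On a very small network the probability model can be evaluated by brute force, which makes the role of the normalizing constant concrete. The base-R sketch below enumerates all undirected graphs on 3 nodes for an edges-only model; the parameter value and the helper names g and h are illustrative assumptions, and the enumeration is not how ergm itself works.

```r
# All 2^3 = 8 undirected graphs on 3 nodes, coded by their dyads 1-2, 1-3, 2-3.
graphs <- as.matrix(expand.grid(y12 = 0:1, y13 = 0:1, y23 = 0:1))

g <- function(y) sum(y)   # single network statistic: the edge count
h <- function(y) 1        # reference measure h(y) = 1
theta <- -0.5             # arbitrary illustrative value; eta(theta) = theta

# Unnormalized weight h(y) * exp(theta * g(y)) for every graph.
weights <- apply(graphs, 1, function(y) h(y) * exp(theta * g(y)))
c_theta <- sum(weights)   # normalizing constant: sum over the whole sample space
probs <- weights / c_theta

c_theta       # equals (1 + exp(theta))^3 for this dyad-independent model
sum(probs)    # the probabilities sum to 1
```

For realistic network sizes the sample space is far too large to enumerate, which is why the estimation methods mentioned above rely on pseudo-likelihood or Monte Carlo approximations.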
ergm(
  formula,
  response = NULL,
  reference = ~Bernoulli,
  constraints = ~.,
  obs.constraints = ~. - observed,
  offset.coef = NULL,
  target.stats = NULL,
  eval.loglik = getOption("ergm.eval.loglik"),
  estimate = c("MLE", "MPLE", "CD"),
  control = control.ergm(),
  verbose = FALSE,
  ...,
  basis = ergm.getnetwork(formula),
  newnetwork = c("one", "all", "none")
)

is.ergm(object)

## S3 method for class 'ergm'
is.na(x)

## S3 method for class 'ergm'
anyNA(x, ...)

## S3 method for class 'ergm'
nobs(object, ...)

## S3 method for class 'ergm'
print(x, digits = max(3, getOption("digits") - 3), ...)

## S3 method for class 'ergm'
vcov(object, sources = c("all", "model", "estimation"), ...)
formula |
An R |
response |
Either a character string, a formula, or
|
reference |
A one-sided formula specifying
the reference measure ( |
constraints |
A formula specifying one or more constraints
on the support of the distribution of the networks being modeled. Multiple constraints
may be given, separated by “+” and “-” operators. See
The default is to have no constraints except those provided through
the Together with the model terms in the formula and the reference measure, the constraints define the distribution of networks being modeled. It is also possible to specify a proposal function directly either
by passing a string with the function's name (in which case,
arguments to the proposal should be specified through the
Note that not all possible combinations of constraints and reference measures are supported. However, for relatively simple constraints (i.e., those that simply permit or forbid specific dyads or sets of dyads from changing), arbitrary combinations should be possible. |
obs.constraints |
A one-sided formula specifying one or more
constraints or other modification in addition to those
specified by This allows the domain of the integral in the numerator of the partially observed network face-value likelihoods of Handcock and Gile (2010) and Karwa et al. (2017) to be specified explicitly. The default is to constrain the integral to only integrate over
the missing dyads (if present), after incorporating constraints
provided through the It is also possible to specify a proposal function directly by
passing a string with the function's name of the |
offset.coef |
A vector of coefficients for the offset terms. |
target.stats |
vector of "observed network statistics,"
if these statistics are for some reason different than the
actual statistics of the network on the left-hand side of
|
eval.loglik |
Logical: For dyad-dependent models, if TRUE, use bridge
sampling to evaluate the log-likelihood associated with the
fit. Has no effect for dyad-independent models.
Since bridge sampling takes additional time, setting to FALSE may
speed performance if likelihood values (and likelihood-based
values like AIC and BIC) are not needed. Can be set globally via |
estimate |
If "MPLE," then the maximum pseudolikelihood estimator
is returned. If "MLE" (the default), then an approximate maximum likelihood
estimator is returned. For certain models, the MPLE and MLE are equivalent,
in which case this argument is ignored. (To force MCMC-based approximate
likelihood calculation even when the MLE and MPLE are the same, see the
|
control |
A list of control parameters for algorithm tuning,
typically constructed with |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
... |
Additional arguments, to be passed to lower-level functions. |
basis |
a value (usually a |
newnetwork |
One of |
object |
an |
x , digits
|
See |
sources |
For the |
ergm() returns an object of class ergm that is a list consisting of the following elements:
coef |
The Monte Carlo maximum likelihood estimate
of |
sample |
The |
sample.obs |
As |
iterations |
The number of Newton-Raphson iterations required before convergence. |
MCMCtheta |
The value of |
loglikelihood |
The approximate change in log-likelihood in the last iteration. The value is only approximate because it is estimated based on the MCMC random sample. |
gradient |
The value of the gradient vector of the approximated loglikelihood function, evaluated at the maximizer. This vector should be very close to zero. |
covar |
Approximate covariance matrix for the MLE, based on the inverse Hessian of the approximated loglikelihood evaluated at the maximizer. |
failure |
Logical: Did the MCMC estimation fail? |
network |
Network passed on the left-hand side of |
newnetworks |
If argument |
newnetwork |
If argument |
coef.init |
The initial value of |
est.cov |
The covariance matrix of the model statistics in the final MCMC sample. |
coef.hist , steplen.hist , stats.hist , stats.obs.hist
|
For the MCMLE method, the history of coefficients, Hummel step lengths, and average model statistics for each iteration. |
control |
The control list passed to the call. |
etamap |
The set of functions mapping the true parameter theta to the canonical parameter eta (irrelevant except in a curved exponential family model) |
formula |
|
target.stats |
The target.stats used during estimation (passed through from the Arguments) |
target.esteq |
Used for curved models to preserve the target mean values of the curved terms. It is identical to target.stats for non-curved models. |
constraints |
Constraints used during estimation (passed through from the Arguments) |
reference |
The reference measure used during estimation (passed through from the Arguments) |
estimate |
The estimation method used (passed through from the Arguments). |
offset |
A logical vector indicating which model parameters are to be set at a fixed value (i.e., not estimated). |
drop |
If
|
estimable |
A logical vector indicating which terms could not be
estimated due to a |
info |
A list with miscellaneous information that would typically be accessed by the user via methods; in general, it should not be accessed directly. Current elements include:
|
null.lik |
Log-likelihood of the null model. Valid only for unconstrained models. |
mle.lik |
The approximate log-likelihood for the MLE. The value is only approximate because it is estimated based on the MCMC random sample. |
is.na(ergm)
: Return TRUE
if the ERGM was fit to a partially observed network and/or an observational process, such as missing (NA
) dyads.
anyNA(ergm)
: Alias to the is.na()
method.
nobs(ergm)
: Return the number of informative dyads of a model fit.
print(ergm)
: Print the call, the estimate, and the method used to obtain it.
vcov(ergm)
: extracts the variance-covariance matrix of
parameter estimates.
Although each of the statistics in a given model is a summary statistic for the entire network, it is rarely necessary to calculate statistics for an entire network in a proposed Metropolis-Hastings step. Thus, for example, if the triangle term is included in the model, a census of all triangles in the observed network is never taken; instead, only the change in the number of triangles is recorded for each edge toggle.
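The idea of recording only the change in a statistic can be illustrated with a minimal base-R sketch. The helper names `count_triangles()` and `triangle_change()` are hypothetical and are not part of ergm; the sketch assumes an undirected network stored as a symmetric 0/1 adjacency matrix with a zero diagonal.

```r
# Full census: number of triangles in an undirected 0/1 adjacency matrix.
count_triangles <- function(A) sum(diag(A %*% A %*% A)) / 6

# Change in the triangle count if dyad (i, j) were toggled: the number of
# partners shared by i and j, with the sign set by the toggle direction.
triangle_change <- function(A, i, j) {
  shared <- sum(A[i, ] * A[j, ])
  if (A[i, j] == 0) shared else -shared
}

set.seed(1)
n <- 8
A <- matrix(0, n, n)
A[upper.tri(A)] <- rbinom(n * (n - 1) / 2, 1, 0.4)
A <- A + t(A)

i <- 2; j <- 5
delta <- triangle_change(A, i, j)

# Check against a full recount after actually toggling the dyad.
B <- A
B[i, j] <- B[j, i] <- 1 - A[i, j]
full_diff <- count_triangles(B) - count_triangles(A)
```

The local computation touches only two rows of the matrix, while the full recount requires matrix products over the whole network, which is why a sampler that toggles one dyad at a time benefits from working with changes.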
In the implementation of ergm(), the model is initialized in R, then all the model information is passed to a C program that generates the sample of network statistics using MCMC. This sample is then returned to R, which uses one of several algorithms, selected by the main.method= parameter of control.ergm(), to update the estimate.
The mechanism for proposing new networks for the MCMC sampling
scheme, which is a Metropolis-Hastings algorithm, depends on
two things: The constraints
, which define the set of possible
networks that could be proposed in a particular Markov chain step,
and the weights placed on these possible steps by the
proposal distribution. The former may be controlled using the
constraints
argument described above. The latter may
be controlled using the prop.weights
argument to the
control.ergm()
function.
The package is designed so that the user could conceivably add additional proposal types.
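As a rough illustration of the sampling scheme described above, the following base-R sketch runs a toy Metropolis-Hastings sampler for an edges-only model on an undirected network. It is not the ergm implementation: the function `mh_edges()` is hypothetical, the sample space is unconstrained, and every dyad is proposed with equal weight.

```r
# Toy Metropolis-Hastings sampler for an edges-only ERGM (illustrative only).
mh_edges <- function(n, theta, nsteps, burnin = 0) {
  A <- matrix(0L, n, n)
  dyads <- which(upper.tri(A), arr.ind = TRUE)   # unconstrained sample space
  edges <- 0L
  edge_trace <- integer(nsteps)
  for (s in seq_len(burnin + nsteps)) {
    d <- dyads[sample.int(nrow(dyads), 1), ]     # uniform proposal weights
    delta <- if (A[d[1], d[2]] == 0L) 1L else -1L  # change statistic
    if (log(runif(1)) < theta * delta) {         # acceptance step
      A[d[1], d[2]] <- A[d[2], d[1]] <- 1L - A[d[1], d[2]]
      edges <- edges + delta
    }
    if (s > burnin) edge_trace[s - burnin] <- edges
  }
  edge_trace
}

set.seed(42)
n <- 10
theta <- -1
edge_trace <- mh_edges(n, theta, nsteps = 20000, burnin = 2000)

# For this model the dyads are independent, so the long-run density
# should be close to plogis(theta).
dens <- mean(edge_trace) / choose(n, 2)
```

Restricting `dyads` to a subset would play the role of a constraint, and sampling dyads with unequal probabilities (with the matching correction in the acceptance ratio) would play the role of proposal weights.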
Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008).
“ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks.”
Journal of Statistical Software, 24(3), 1–29.
doi:10.18637/jss.v024.i03.
Krivitsky PN, Hunter DR, Morris M, Klumb C (2023).
“ergm 4: New Features for Analyzing Exponential-Family Random Graph Models.”
Journal of Statistical Software, 105(6), 1–44.
doi:10.18637/jss.v105.i06.
Admiraal R, Handcock MS (2007). networksis: Simulate bipartite graphs with fixed marginals through sequential importance sampling. Statnet Project, Seattle, WA. Version 1. https://statnet.org.
Bender-deMoll S, Morris M, Moody J (2008). Prototype Packages for Managing and Animating Longitudinal Network Data: dynamicnetwork and rSoNIA. Journal of Statistical Software, 24(7). doi:10.18637/jss.v024.i07
Butts CT (2007). sna: Tools for Social Network Analysis. R package version 2.3-2. https://cran.r-project.org/package=sna.
Butts CT (2008). network: A Package for Managing Relational Data in R. Journal of Statistical Software, 24(2). doi:10.18637/jss.v024.i02
Butts C (2015). network: The Statnet Project (https://statnet.org). R package version 1.12.0, https://cran.r-project.org/package=network.
Goodreau SM, Handcock MS, Hunter DR, Butts CT, Morris M (2008a). A statnet Tutorial. Journal of Statistical Software, 24(8). doi:10.18637/jss.v024.i08
Goodreau SM, Kitts J, Morris M (2008b). Birds of a Feather, or Friend of a Friend? Using Exponential Random Graph Models to Investigate Adolescent Social Networks. Demography, 45, in press.
Handcock, M. S. (2003) Assessing Degeneracy in Statistical Models of Social Networks, Working Paper #39, Center for Statistics and the Social Sciences, University of Washington. https://csss.uw.edu/research/working-papers/assessing-degeneracy-statistical-models-social-networks
Handcock MS (2003b). degreenet: Models for Skewed Count Distributions Relevant to Networks. Statnet Project, Seattle, WA. Version 1.0, https://statnet.org.
Handcock MS and Gile KJ (2010). Modeling Social Networks from Sampled Data. Annals of Applied Statistics, 4(1), 5-25. doi:10.1214/08-AOAS221
Handcock MS, Hunter DR, Butts CT, Goodreau SM, Morris M (2003a). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Statnet Project, Seattle, WA. Version 2, https://statnet.org.
Handcock MS, Hunter DR, Butts CT, Goodreau SM, Morris M (2003b). statnet: Software Tools for the Statistical Modeling of Network Data. Statnet Project, Seattle, WA. Version 2, https://statnet.org.
Hunter, D. R. and Handcock, M. S. (2006) Inference in curved exponential family models for networks, Journal of Computational and Graphical Statistics.
Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008b). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of Statistical Software, 24(3). doi:10.18637/jss.v024.i03
Karwa V, Krivitsky PN, and Slavković AB (2017). Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models. Journal of the Royal Statistical Society, Series C, 66(3):481–500. doi:10.1111/rssc.12185
Krivitsky PN (2012). Exponential-Family Random Graph Models for Valued Networks. Electronic Journal of Statistics, 2012, 6, 1100-1128. doi:10.1214/12-EJS696
Morris M, Handcock MS, Hunter DR (2008). Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects. Journal of Statistical Software, 24(4). doi:10.18637/jss.v024.i04
Snijders, T.A.B. (2002), Markov Chain Monte Carlo Estimation of Exponential Random Graph Models. Journal of Social Structure. Available from https://www.cmu.edu/joss/content/articles/volume3/Snijders.pdf.
network
, %v%
, %n%
, ergmTerm
, ergmMPLE
,
summary.ergm()
#
# load the Florentine marriage data matrix
#
data(flo)
#
# attach the sociomatrix for the Florentine marriage data
# This is not yet a network object.
#
flo
#
# Create a network object out of the adjacency matrix
#
flomarriage <- network(flo,directed=FALSE)
flomarriage
#
# print out the sociomatrix for the Florentine marriage data
#
flomarriage[,]
#
# create a vector indicating the wealth of each family (in thousands of lira)
# and add it as a covariate to the network object
#
flomarriage %v% "wealth" <- c(10,36,27,146,55,44,20,8,42,103,48,49,10,48,32,3)
flomarriage
#
# create a plot of the social network
#
plot(flomarriage)
#
# now make the vertex size proportional to their wealth
#
plot(flomarriage, vertex.cex=flomarriage %v% "wealth" / 20, main="Marriage Ties")
#
# Use 'data(package = "ergm")' to list the data sets in a
#
data(package="ergm")
#
# Load a network object of the Florentine data
#
data(florentine)
#
# Fit a model where the propensity to form ties between
# families depends on the absolute difference in wealth
#
gest <- ergm(flomarriage ~ edges + absdiff("wealth"))
summary(gest)
#
# add terms for the propensity to form 2-stars and triangles
# of families
#
gest <- ergm(flomarriage ~ kstar(1:2) + absdiff("wealth") + triangle)
summary(gest)

# import synthetic network that looks like a molecule
data(molecule)
# Add an attribute to it to mimic the atomic type
molecule %v% "atomic type" <- c(1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3)
#
# create a plot of the social network
# colored by atomic type
#
plot(molecule, vertex.col="atomic type",vertex.cex=3)

# measure tendency to match within each atomic type
gest <- ergm(molecule ~ edges + kstar(2) + triangle + nodematch("atomic type"))
summary(gest)

# compare it to differential homophily by atomic type
gest <- ergm(molecule ~ edges + kstar(2) + triangle
             + nodematch("atomic type",diff=TRUE))
summary(gest)

# Extract parameter estimates as a numeric vector:
coef(gest)
# Sources of variation in parameter estimates:
vcov(gest, sources="model")
vcov(gest, sources="estimation")
vcov(gest, sources="all") # the default
This is an internal function, not normally called directly by the
user. The ergm_MCMC_sample
function samples networks and
network statistics using an MCMC algorithm via MCMC_wrapper
and is capable of running in multiple threads using
ergm_MCMC_slave
.
The ergm_MCMC_slave
function calls the actual C
routine and does minimal preprocessing.
ergm_MCMC_sample(
  state,
  control,
  theta = NULL,
  verbose = FALSE,
  ...,
  eta = ergm.eta(theta, (if (is.ergm_state(state)) as.ergm_model(state) else
    as.ergm_model(state[[1]]))$etamap)
)

ergm_MCMC_slave(
  state,
  eta,
  control,
  verbose,
  ...,
  burnin = NULL,
  samplesize = NULL,
  interval = NULL
)
state |
an |
control |
A list of control parameters for algorithm tuning,
typically constructed with |
theta |
the (possibly curved) parameters of the model. |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
... |
additional arguments. |
eta |
the natural parameters of the model; by default constructed from |
burnin , samplesize , interval
|
MCMC parameters that can be used
to temporarily override those in the |
ergm_MCMC_sample
returns a list
containing:
stats |
an |
networks |
a list of final sampled networks, one for each thread. |
status |
status code, propagated from |
final.interval |
adaptively determined MCMC interval. |
final.effectiveSize |
adaptively determined target ESS (non-trivial if |
sampnetworks |
If |
ergm_MCMC_slave
returns the MCMC sample as a list of
the following:
s |
the matrix of statistics. |
state |
an |
status |
success or failure code: |
ergm_MCMC_sample
and ergm_MCMC_slave
replace
ergm.getMCMCsample
and ergm.mcmcslave
respectively. They
differ slightly in their argument names and in their return
formats. For example, ergm_MCMC_sample
expects ergm_state
rather than network/model/proposal, and theta
or eta
rather than eta0
;
and it does not return statsmatrix
or newnetwork
elements. Rather, if parallel processing is not in effect,
stats
is an mcmc.list
with one chain and networks
is a
list with one element.
Note that unless stats
is a part of the ergm_state
, the
returned stats will be relative to the original network, i.e.,
the calling function must shift the statistics if required.
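A minimal sketch of such a shift, assuming the sampled statistics are stored as deviations from the starting network's statistics. All numbers and object names here are hypothetical and are not taken from an actual ergm run.

```r
# Hypothetical statistics of the starting network ...
obs_stats <- c(edges = 20, triangle = 3)

# ... and hypothetical sampled statistics, expressed relative to them.
rel_stats <- rbind(c(1, 0), c(-2, -1), c(0, 1))
colnames(rel_stats) <- names(obs_stats)

# Shift each column by the corresponding starting value.
abs_stats <- sweep(rel_stats, 2, obs_stats, "+")
```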
At this time, repeated calls to ergm_MCMC_sample
will not
produce the same sequence of networks as a single long call, even
with the same starting seeds. This is because the network
sampling algorithms rely on the internal state of the network
representation in C, which may not be reconstructed exactly the
same way when "resuming". This behaviour may change in the
future.
# This example illustrates constructing "ingredients" for calling
# ergm_MCMC_sample() from calls to simulate.ergm(). One can also
# construct an ergm_state object directly from ergm_model(),
# ergm_proposal(), etc., but the approach shown here is likely to
# be the least error-prone and the most robust to future API
# changes.
#
# The regular simulate() call hierarchy is
#
# simulate_formula.network(formula) ->
#   simulate.ergm_model(ergm_model) ->
#     simulate.ergm_state_full(ergm_state)
#
# They take an argument, return.args=, that will interrupt the call
# and have it return its arguments. We can use it to obtain
# low-level inputs robustly.

data(florentine)
control <- control.simulate(MCMC.burnin = 2, MCMC.interval = 1)

# FYI: Obtain input for simulate.ergm_model():
sim.mod <- simulate(flomarriage~absdiff("wealth"), constraints=~edges,
                    coef = NULL, nsim=3, control=control,
                    return.args="ergm_model")
names(sim.mod)
str(sim.mod$object,1) # ergm_model

# Obtain input for simulate.ergm_state_full():
sim.state <- simulate(flomarriage~absdiff("wealth"), constraints=~edges,
                      coef = NULL, nsim=3, control=control,
                      return.args="ergm_state")
names(sim.state)
str(sim.state$object, 1) # ergm_state

# This control parameter would be set by nsim in the regular
# simulate() call:
control$MCMC.samplesize <- 3

# Capture intermediate networks; can also be left NULL for just the
# statistics:
control$MCMC.save_networks <- TRUE

# Simulate starting from this state:
out <- ergm_MCMC_sample(sim.state$object, control, theta = -1, verbose=6)
names(out)
out$stats # Sampled statistics
str(out$networks, 1) # Updated ergm_state (one per thread)
# List (an element per thread) of lists of captured ergm_states,
# one for each sampled network:
str(out$sampnetworks, 2)
lapply(out$sampnetworks[[1]], as.network) # Converted to networks.

# One more, picking up where the previous sampler left off, but see Note:
control$MCMC.samplesize <- 1
str(ergm_MCMC_sample(out$networks, control, theta = -1, verbose=6), 2)
Plot MCMC list using lattice package graphics
ergm_plot.mcmc.list(x, main = NULL, vars.per.page = 3, ...)
x |
an |
main |
character, main plot heading title. |
vars.per.page |
Number of rows (one variable per row) per
plotting page. Ignored if |
... |
additional arguments, currently unused. |
This is not a method at this time.
This cache is intended to store large, infrequently changing data
structures such as ergm_model
s and ergm_proposal
s on worker
nodes.
ergm_state_cache( comm = c("pass", "all", "clear", "insert", "get", "check", "list"), key, object )
comm |
a character string giving the desired function; see the default argument above for permitted values and Details for meanings; partial matching is supported. |
key |
a character string, typically a |
object |
the object to be stored. |
Supported tasks are, respectively, to do nothing (the default), return all entries (mainly useful for testing), clear the cache, insert into cache, retrieve an object by key, check if a key is present, or list keys defined. Deleting an entry can be accomplished by inserting a
Cache is limited to a hard-coded size (currently 4). This should
accommodate an |
If called via, say, clusterMap(cl, ergm_state_cache, ...)
the function will not accomplish anything. This is because the
parallel
package will serialise the ergm_state_cache()
function object, send it to the remote node, evaluate it there,
and fetch the return value. This will leave the environment of
the worker's ergm_state_cache()
unchanged. To actually
evaluate it on the worker nodes, it is recommended to wrap it in
an empty function whose environment is set to globalenv()
. See
Examples below.
## Not run:
# Wrap ergm_state_cache() and call it explicitly from ergm:
call_ergm_state_cache <- function(...) ergm::ergm_state_cache(...)

# Reset the function's environment so that it does not get sent to
# worker nodes (who have their own instance of ergm namespace
# loaded).
environment(call_ergm_state_cache) <- globalenv()

# Now, call the wrapper function, with ... below replaced by
# lists of desired arguments.
clusterMap(cl, call_ergm_state_cache, ...)

## End(Not run)
Return a symmetrized version of a binary network
ergm_symmetrize(x, rule = c("weak", "strong", "upper", "lower"), ...)

## Default S3 method:
ergm_symmetrize(x, rule = c("weak", "strong", "upper", "lower"), ...)

## S3 method for class 'network'
ergm_symmetrize(x, rule = c("weak", "strong", "upper", "lower"), ...)
x |
an object representing a network. |
rule |
a string specifying how the network is to be
symmetrized; see |
... |
additional arguments to |
The network
method requires more flexibility, in order
to specify how the edge attributes are handled. Therefore, rule
can be one of the following types:
The string is interpreted as in
sna::symmetrize()
. For edge attributes, "weak"
takes the
maximum value and "strong"
takes the minimum
value for ordered attributes, and drops the unordered.
The function is evaluated on a data.frame
constructed by joining (via merge()
) the edge tibble
with all
attributes and NA
indicators with itself reversing tail and head
columns, and appending original columns with ".th"
and the
reversed columns with ".ht"
. It is then evaluated for each
attribute in turn, given two arguments: the data frame and the name
of the attribute.
The list must have exactly one unnamed element, and the remaining elements must be named with the names of edge attributes. The elements of the list are interpreted as above, allowing each edge attribute to be handled differently. Unnamed arguments are dropped.
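The four string rules can be sketched on a plain adjacency matrix in base R. The helper `symmetrize_matrix()` is hypothetical and is not the package's implementation; it ignores missing values and attributes, and only mirrors the max/min behaviour described above for "weak" and "strong".

```r
# Hypothetical helper: symmetrize a square numeric matrix by rule.
symmetrize_matrix <- function(m, rule = c("weak", "strong", "upper", "lower")) {
  rule <- match.arg(rule)
  switch(rule,
    weak   = pmax(m, t(m)),   # tie if either direction has one (maximum)
    strong = pmin(m, t(m)),   # tie only if both directions have one (minimum)
    upper  = { m[lower.tri(m)] <- t(m)[lower.tri(m)]; m },  # copy upper triangle down
    lower  = { m[upper.tri(m)] <- t(m)[upper.tri(m)]; m })  # copy lower triangle up
}

# A directed 3-cycle: 1 -> 2, 2 -> 3, 3 -> 1.
m <- matrix(c(0, 1, 0,
              0, 0, 1,
              1, 0, 0), 3, byrow = TRUE)

weak   <- symmetrize_matrix(m, "weak")
strong <- symmetrize_matrix(m, "strong")
upper  <- symmetrize_matrix(m, "upper")
lower  <- symmetrize_matrix(m, "lower")
```

For this matrix, "weak" keeps all three ties as undirected edges, "strong" keeps none because no tie is reciprocated, and "upper" and "lower" keep whichever ties happen to sit in the respective triangle.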
ergm_symmetrize(default)
: The default method, passing the input on to sna::symmetrize()
.
ergm_symmetrize(network)
: A method for network
objects, which preserves network and vertex attributes, and handles edge attributes.
This was originally exported as a generic to overwrite
sna::symmetrize()
. By developer's request, it has been renamed;
eventually, sna or network
packages will export the generic
instead.
data(sampson)
samplike[1,2] <- NA
samplike[4,1] <- NA
sm <- as.matrix(samplike)

tst <- function(x,y){
  mapply(identical, x, y)
}

stopifnot(all(tst(as.logical(as.matrix(ergm_symmetrize(samplike, "weak"))), sm | t(sm))),
          all(tst(as.logical(as.matrix(ergm_symmetrize(samplike, "strong"))), sm & t(sm))),
          all(tst(c(as.matrix(ergm_symmetrize(samplike, "upper"))),
                  sm[cbind(c(pmin(row(sm),col(sm))),c(pmax(row(sm),col(sm))))])),
          all(tst(c(as.matrix(ergm_symmetrize(samplike, "lower"))),
                  sm[cbind(c(pmax(row(sm),col(sm))),c(pmin(row(sm),col(sm))))])))
ergm package
Options set via the built-in options() function that affect ergm estimation, and options that control the behavior of some terms.
Whether ergm()
and similar functions will evaluate the likelihood of the fitted model. Can be overridden for a specific call by passing the eval.loglik
argument directly.
ergm.loglik.warn_dyads = TRUE
Whether log-likelihood evaluation should issue a warning when the effective number of dyads that can vary in the sample space is poorly defined, such as if the degree sequence is constrained.
ergm.cluster.retries = 5
ergm's parallel routines implement rudimentary fault-tolerance. This option controls the number of retries for a cluster call before giving up.
ergm.term = list()
The default term options below.
ergm.ABI.action = "stop"
What to do when ergm detects that one of its extension packages had been compiled with a different version of ergm from the current one that makes changes at the C level that can cause problems. Other choices include
"stop"
, "abort"
stop with an error
"warning"
warn and proceed
"message"
, "inform"
print a message and proceed
"silent"
return the value without side-effects
"disable"
skip the check, always returning TRUE
Partial matching is supported.
Term options can be set in four places, in the order of precedence from high to low:
As a term argument (not always). For example, gw.cutoff
below can be set in a gwesp
term by gwesp(..., cutoff=X)
.
For functions such as summary
that take ergm
formulas but do not take a control list, the named arguments passed in as ...
. E.g., summary(nw~gwesp(.5,fix=TRUE), gw.cutoff=60)
will evaluate the GWESP statistic with its cutoff set to 60.
As an element in a term.options=
list passed via a control function such as control.ergm()
or, for functions that do not, in a list with that argument name. E.g., summary(nw~gwesp(.5,fix=TRUE), term.options=list(gw.cutoff=60))
has the same effect.
As an element in a global option list ergm.term
above.
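The lowest-precedence route in the list above, the global option list, might be set as in the following minimal sketch. It uses only base R's options() and getOption(); the value 60 is borrowed from the examples above, and the sketch makes no claim about how any particular term consumes the setting.

```r
# Set a global default for term options, keeping the previous value.
old <- options(ergm.term = list(gw.cutoff = 60))

# Inspect what a term would find in the global list.
cutoff <- getOption("ergm.term")$gw.cutoff
cutoff

# Restore the previous setting when done.
options(old)
```

Because this is the lowest-precedence place, a `term.options=` list or a named argument in a specific call would still take priority over it.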
The following options are in use by terms in the ergm
package:
version
A string that can be interpreted as an R package version. If set, the term will attempt to emulate its behavior as it was in that version of ergm
. Not all past version behaviors are available.
gw.cutoff
In geometrically weighted terms (gwesp
, gwdegree
, etc.) the highest number of shared partners, degrees, etc. for which to compute the statistic. This usually defaults to 30.
cache.sp
Whether the gwesp
, dgwesp
, and similar terms should use a cache for the dyadwise number of shared partners. This usually improves performance significantly at a modest memory cost, and therefore defaults to TRUE
, but it can be disabled.
interact.dependent
Whether to allow and how to handle the user attempting to interact dyad-dependent terms (e.g., absdiff("age"):triangles
or absdiff("age")*triangles
as opposed to absdiff("age"):nodefactor("sex")
). Possible values are "error"
(the default), "message"
, and "warning"
, for their respective actions, and "silent"
for simply processing the term.
ergm Package
Using clusters, multiple CPUs, or CPU cores to speed up ERGM estimation and simulation.
The ergm.getCluster
function is usually called
internally by the ergm process (in
ergm_MCMC_sample()
) and will attempt to start the
appropriate type of cluster indicated by the
control.ergm()
settings. It will also check that the
same version of ergm
is installed on each node.
The ergm.stopCluster
shuts down a
cluster, but only if ergm.getCluster
was responsible for
starting it.
The ergm.restartCluster
restarts and returns a cluster,
but only if ergm.getCluster
was responsible for starting it.
nthreads
is a simple generic to obtain the number of
parallel processes represented by its argument, keeping in mind
that having no cluster (e.g., NULL
) represents one thread.
ergm.getCluster(control = NULL, verbose = FALSE, stop_on_exit = parent.frame())

ergm.stopCluster(..., verbose = FALSE)

ergm.restartCluster(control = NULL, verbose = FALSE)

set.MT_terms(n)

get.MT_terms()

nthreads(clinfo = NULL, ...)

## S3 method for class 'cluster'
nthreads(clinfo = NULL, ...)

## S3 method for class 'NULL'
nthreads(clinfo = NULL, ...)

## S3 method for class 'control.list'
nthreads(clinfo = NULL, ...)
control |
a |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
stop_on_exit |
An |
... |
not currently used |
n |
an integer specifying the number of threads to use; 0 (the
starting value) disables multithreading, and |
clinfo |
a |
For estimation that requires MCMC, ergm can take advantage of multiple CPUs or CPU cores on the system on which it runs, as well as computing clusters, through one of two mechanisms:
Packages
parallel
and snow
are used to facilitate this; all cluster
types that they support are supported.
The number of nodes used and the parallel API are controlled using
the parallel
and parallel.type
arguments passed to the control
functions, such as control.ergm()
.
The ergm.getCluster()
function is usually called internally by
the ergm process (in ergm_MCMC_sample()
) and will attempt to
start the appropriate type of cluster indicated by the
control.ergm()
settings. The ergm.stopCluster()
is helpful if
the user has directly created a cluster.
Further details on the various cluster types are included below.
Rather than running multiple MCMC chains, it is possible to attempt to accelerate sampling by evaluating qualified terms' change statistics in multiple threads run in parallel. This is done using the OpenMP API.
However, this introduces a nontrivial amount of computational overhead. See below for a list of the major factors affecting whether it is worthwhile.
Generally, the two approaches should not be used at the same time
without caution. In particular, by default, cluster slave nodes
will not “inherit” the multithreading setting; but
parallel.inherit.MT=
control parameter can override that. Their
relative advantages and disadvantages are as follows:
Multithreading terms cannot take advantage of clusters but only of CPUs and cores.
Parallel MCMC chains produce several independent chains; multithreading still only produces one.
Multithreading terms actually accelerates sampling, including the burn-in phase; parallel MCMC's multiple burn-in runs are effectively “wasted”.
set.MT_terms()
returns the previous setting, invisibly.
get.MT_terms()
returns the current setting.
The parallel
package is used with PSOCK clusters
by default, to utilize multiple cores on a system. The number of
cores on a system can be determined with the detectCores()
function.
This method works with the base installation of R on all platforms, and does not require additional software.
For more advanced applications, such as clusters that span multiple
machines on a network, the clusters can be initialized manually,
and passed into ergm()
and others using the parallel
control
argument. See the second example below.
To use MPI to accelerate ERGM sampling,
pass the control parameter parallel.type="MPI"
.
ergm requires the snow and Rmpi packages to
communicate with an MPI cluster.
Using MPI clusters requires the system to have an existing MPI installation. See the MPI documentation for your particular platform for instructions.
To use ergm()
across multiple machines in a high performance
computing environment, see the section "User initiated clusters"
below.
A cluster can be passed into ergm()
with the parallel
control parameter. ergm()
will detect the
number of nodes in the cluster, and use all of them for MCMC
sampling. This method is flexible: it will accept any cluster type
that is compatible with snow
or parallel
packages.
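As a minimal sketch of the user-initiated approach described above, the following creates a PSOCK cluster with the parallel package and hands it to ergm() through the parallel control parameter. The worker count of 2 and the model terms are arbitrary choices for illustration, and shutting the cluster down by hand afterwards is an assumption about good practice rather than a documented requirement.

```r
library(ergm)
library(parallel)

data(faux.mesa.high)

# Create a 2-worker PSOCK cluster by hand (the worker count is arbitrary).
cl <- makeCluster(2, type = "PSOCK")

# Pass the cluster object itself, rather than a number of nodes.
fit <- ergm(faux.mesa.high ~ edges + isolates + gwesp(0.2, fixed = TRUE),
            control = control.ergm(parallel = cl))

# Assumption: a cluster created by the caller is the caller's to stop.
stopCluster(cl)

summary(fit)
```

For a cluster spanning several machines, the same pattern should apply with a different makeCluster() call; the host specification depends on the local setup.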
The more terms with statistics the model has, the more benefit from parallel execution.
The more expensive the terms in the model are, the more benefit
from parallel execution. For example, models with terms like
gwdsp
will generally get more benefit than models where all
terms are dyad-independent.
Sampling more dense networks will generally get more benefit than sparse networks. Network size has little, if any, effect.
More CPUs/cores usually give greater speed-up, but only up to a point, because the amount of overhead grows with the number of threads; it is often better to “batch” the terms into a smaller number of threads than possible.
Any other workload on the system will have a more severe effect on multithreaded execution. In particular, do not run more threads than CPUs/cores that you want to allocate to the tasks.
Under Windows, even compiling with OpenMP appears to introduce
unacceptable amounts of overhead, so it is disabled for Windows
at compile time. To enable, delete src/Makevars.win
and
recompile from scratch.
This is a setting global to the ergm
package and all of
its C functions, including when called from other packages via
the Linking-To
mechanism.
# Uses a 2-node PSOCK cluster for MCMLE estimation
data(faux.mesa.high)
nw <- faux.mesa.high
fauxmodel.01 <- ergm(nw ~ edges + isolates + gwesp(0.2, fixed=TRUE),
                     control=control.ergm(parallel=2, parallel.type="PSOCK"))
summary(fauxmodel.01)
ergm.allstats
calculates the sufficient statistics of an
ERGM over the network's sample space.
ergm.exact()
uses ergm.allstats()
to calculate the exact loglikelihood, evaluated at
eta
.
ergm.allstats(formula, constraints = ~., zeroobs = TRUE, force = FALSE, ...)

ergm.exact(eta, formula, constraints = ~., statmat = NULL, weights = NULL, ...)
formula , constraints
|
An ERGM formula and
(optionally) a constraint specification formula. See
|
zeroobs |
Logical: Should the vectors be centered so that the network
passed in the |
force |
Logical: Should the algorithm be run even if it is determined that the problem may be very large, thus bypassing the warning message that normally terminates the function in such cases? |
... |
further arguments, passed to |
eta |
vector of canonical parameter values at which the loglikelihood should be evaluated. |
statmat , weights
|
outputs from |
The mechanism for doing this is a recursive algorithm, where the number of
levels of recursion is equal to the number of possible dyads that can be
changed from 0 to 1 and back again. The algorithm starts with the network
passed in formula
, then recursively toggles each edge twice so that
every possible network is visited.
ergm.allstats()
and ergm.exact()
should only be used for small
networks, since the number of possible networks grows extremely
fast with the number of nodes. An error results if it is used on a
network with more than 31 free dyads, which corresponds to a
directed network of more than 6 nodes or an undirected network of
more than 8 nodes; use force=TRUE
to override this error.
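The node counts quoted above follow from the number of dyads in an unconstrained network. The sketch below, which uses only base R and a hypothetical helper named n_dyads(), tabulates the dyad counts and recovers the largest network sizes that stay within the 31-dyad threshold.

```r
# Number of dyads that can vary in an unconstrained network with n nodes.
# n_dyads() is a hypothetical helper, not part of ergm.
n_dyads <- function(n, directed) {
  if (directed) n * (n - 1) else n * (n - 1) / 2
}

limit <- 31  # threshold quoted in the text above

n <- 2:10
data.frame(
  nodes      = n,
  directed   = n_dyads(n, TRUE),
  undirected = n_dyads(n, FALSE)
)

# Largest network sizes that stay within the limit
max_directed   <- max(n[n_dyads(n, TRUE)  <= limit])
max_undirected <- max(n[n_dyads(n, FALSE) <= limit])
c(directed = max_directed, undirected = max_undirected)

# Number of distinct networks when exactly `limit` dyads are free
2^limit
```

Constraints that fix some dyads reduce the number of free dyads, so a larger network may still fall under the threshold.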
In case ergm.exact()
is to be called repeatedly, for instance by an
optimization routine, it is preferable to call ergm.allstats()
first, then pass statmat
and weights
explicitly to avoid
repeatedly calculating these objects.
ergm.allstats()
returns a list object with these two elements:
weights |
integer of counts, one for each row of |
statmat |
matrix in which each row is a unique vector of statistics. |
ergm.exact()
returns the exact value of the loglikelihood, evaluated at
eta
.
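To illustrate how a log-likelihood can be computed from a table of statistics and counts, here is a base R sketch for an edges-only model on an empty 7-node undirected network. The statmat and weights objects are built by hand as stand-ins for what ergm.allstats() would return with the statistics centered at the observed network, and exact_llk() is a hypothetical helper, not the package's implementation.

```r
# Hand-built stand-ins for ergm.allstats() output (an assumption):
# an edges-only model on an empty 7-node undirected network.
n_dyads <- choose(7, 2)                  # 21 dyads
statmat <- matrix(0:n_dyads, ncol = 1)   # possible edge counts; observed count is 0
weights <- choose(n_dyads, 0:n_dyads)    # number of networks with each edge count

# With statistics centered at the observed network, the log-likelihood is
# minus the log of the weighted sum of exp(eta . g) over the sample space.
exact_llk <- function(eta, statmat, weights) {
  -log(sum(weights * exp(statmat %*% eta)))
}

eta <- 0.1234
llk_enumerated <- exact_llk(eta, statmat, weights)
llk_closed     <- -n_dyads * log(1 + exp(eta))  # closed form for edges-only

c(enumerated = llk_enumerated, closed_form = llk_closed)
```

The two values agree because the weighted sum is the binomial expansion of (1 + exp(eta))^21.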
# Count by brute force all the edge statistics possible for a 7-node
# undirected network
mynw <- network.initialize(7, dir = FALSE)
system.time(a <- ergm.allstats(mynw~edges))

# Summarize results
rbind(t(a$statmat), .freq. = a$weights)

# Each value of a$weights is equal to 21-choose-k,
# where k is the corresponding statistic (and 21 is
# the number of dyads in a 7-node undirected network).
# Here's a check of that fact:
as.vector(a$weights - choose(21, t(a$statmat)))

# Dyad-independent constraints are also supported
# (stored separately so that the unconstrained results in `a` are kept):
system.time(a.con <- ergm.allstats(mynw~edges, constraints = ~fixallbut(cbind(1:2,2:3))))
rbind(t(a.con$statmat), .freq. = a.con$weights)

# Simple ergm.exact output for this network.
# We know that the loglikelihood for my empty 7-node network
# should simply be -21*log(1+exp(eta)), so we may check that
# the following two values agree:
-21*log(1+exp(.1234))
ergm.exact(.1234, mynw~edges, statmat=a$statmat, weights=a$weights)
ergm.bridge.llr
uses bridge sampling with geometric spacing to
estimate the difference between the log-likelihoods of two parameter vectors
for an ERGM via repeated calls to simulate.formula.ergm()
.
ergm.bridge.0.llk
is a convenience wrapper that
returns the log-likelihood of configuration $\theta$
relative to the reference measure. That is, the
configuration with $\theta = 0$
is defined as having log-likelihood of
0.
ergm.bridge.dindstart.llk
is a wrapper that uses a
dyad-independent ERGM as a starting point for bridge sampling to
estimate the log-likelihood for a given dyad-dependent model and
parameter configuration. Note that it only handles binary ERGMs
(response=NULL
) and with constraints (constraints=
) that
do not induce dyadic dependence.
ergm.bridge.llr(
  object,
  response = NULL,
  reference = ~Bernoulli,
  constraints = ~.,
  from,
  to,
  obs.constraints = ~. - observed,
  target.stats = NULL,
  basis = ergm.getnetwork(object),
  verbose = FALSE,
  ...,
  llronly = FALSE,
  control = control.ergm.bridge()
)

ergm.bridge.0.llk(
  object,
  response = NULL,
  reference = ~Bernoulli,
  coef,
  ...,
  llkonly = TRUE,
  control = control.ergm.bridge(),
  basis = ergm.getnetwork(object)
)

ergm.bridge.dindstart.llk(
  object,
  response = NULL,
  constraints = ~.,
  coef,
  obs.constraints = ~. - observed,
  target.stats = NULL,
  dind = NULL,
  coef.dind = NULL,
  basis = ergm.getnetwork(object),
  ...,
  llkonly = TRUE,
  control = control.ergm.bridge(),
  verbose = FALSE
)
object |
A model formula. See |
response |
Either a character string, a formula, or
|
reference |
A one-sided formula specifying the reference
measure ( |
constraints , obs.constraints
|
One-sided formulas specifying
one or more constraints on the support of the distribution of the
networks being simulated and on the observation process
respectively. See the documentation for similar arguments for
|
from , to
|
The initial and final parameter vectors. |
target.stats |
A vector of sufficient statistics to be used in place of those of the network in the formula. |
basis |
An optional |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
... |
Further arguments to |
llronly |
Logical: If TRUE, only the estimated log-ratio will
be returned by |
control |
A list of control parameters for algorithm tuning,
typically constructed with |
coef |
A vector of coefficients for the configuration of interest. |
llkonly |
Whether only the estimated log-likelihood should be
returned by the |
dind |
A one-sided formula with the dyad-independent model to use as a
starting point. Defaults to the dyad-independent terms found in the formula
|
coef.dind |
Parameter configuration for the dyad-independent starting
point. Defaults to the MLE of |
If llronly=TRUE
or llkonly=TRUE
, these functions return
the scalar log-likelihood-ratio or the log-likelihood.
Otherwise, they return a list with the following components:
llr |
The estimated log-ratio. |
llr.vcov |
The estimated variance of the log-ratio due to MCMC approximation. |
llrs |
A list of lists (1 per attempt) of the estimated
log-ratios for each of the |
llrs.vcov |
A list of lists (1 per attempt) of the estimated
variances of the estimated log-ratios for each of the
|
paths |
A list of lists (1 per attempt) with two elements:
|
Dtheta.Du |
The gradient vector of the parameter values with respect to position of the bridge. |
ergm.bridge.0.llk
result list also includes an llk
element, with the log-likelihood itself (with the reference
distribution assumed to have log-likelihood 0).
ergm.bridge.dindstart.llk
result list also includes
an llk
element, with the log-likelihood itself and an
llk.dind
element, with the log-likelihood of the nearest
dyad-independent model.
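The idea behind the estimated log-ratio can be sketched in base R for an edges-only (Bernoulli) model, where the expected edge count is known in closed form and no simulation is needed. This is an illustration under stated assumptions, not the package's implementation: the number of dyads, the observed edge count, the parameter values, and the midpoint placement of the bridges are all hypothetical choices. Spacing the parameter values linearly between from and to corresponds to a geometric path between the two distributions.

```r
# Edges-only model on N free dyads; all values below are hypothetical.
N      <- 21     # number of free dyads
g_obs  <- 5      # observed edge count
from   <- -1     # initial parameter value
to     <- 0.5    # final parameter value
nsteps <- 16     # number of bridges

# Expected edge count at a given parameter value (closed form for this model;
# ergm would estimate it from simulated networks instead).
expected_edges <- function(theta) N * plogis(theta)

# Bridge positions: midpoints of equal-width subintervals of [0, 1]
u      <- (seq_len(nsteps) - 0.5) / nsteps
thetas <- from + u * (to - from)

# Path-sampling estimate of l(to) - l(from): the average over bridges of
# (to - from) * (observed statistic - expected statistic at the bridge)
llr_bridge <- sum((to - from) * (g_obs - expected_edges(thetas))) / nsteps

# Exact log-likelihood ratio for comparison
llk <- function(theta) theta * g_obs - N * log1p(exp(theta))
llr_exact <- llk(to) - llk(from)

c(bridge = llr_bridge, exact = llr_exact)
```

With simulated rather than exact expectations, each bridge also contributes Monte Carlo error, which is what the llr.vcov element described above quantifies.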
Hunter, D. R. and Handcock, M. S. (2006) Inference in curved exponential family models for networks, Journal of Computational and Graphical Statistics.
Note that this function is not recommended for general use, since
it supports only one way of specifying observational
structure—through NA
edges. It is likely to be deprecated in
the future.
ergm.design(nw, ...)
nw |
a |
... |
term options. |
ergm.design
returns a rlebdm
of
informative (non-missing, non-fixed) dyads.
ergm
formula
and verify that it is a valid network. The function ensures that the network in a given formula is valid; if so, the network is returned; if not, execution is halted with warnings.
ergm.getnetwork(formula, loopswarning = TRUE)
formula |
a two-sided formula whose LHS is a |
loopswarning |
whether warnings about loops should be printed
( |
A network
object constructed by evaluating the LHS of
the model formula in the formula's environment.
Gives the network a series of proposals it can't refuse. Returns the statistics of the network, and, optionally, the final network.
ergm.godfather(
  object,
  changes = NULL,
  ...,
  end.network = FALSE,
  stats.start = FALSE,
  changes.only = FALSE,
  verbose = FALSE,
  basis = NULL,
  formula = NULL
)

## S3 method for class 'formula'
ergm.godfather(
  object,
  changes = NULL,
  response = NULL,
  ...,
  end.network = FALSE,
  stats.start = FALSE,
  changes.only = FALSE,
  verbose = FALSE,
  control = NULL,
  basis = ergm.getnetwork(object)
)

## S3 method for class 'ergm_model'
ergm.godfather(
  object,
  changes = NULL,
  ...,
  end.network = FALSE,
  stats.start = FALSE,
  changes.only = FALSE,
  verbose = FALSE,
  control = NULL,
  basis = NULL
)

## S3 method for class 'ergm_state'
ergm.godfather(
  object,
  changes = NULL,
  ...,
  end.network = FALSE,
  stats.start = FALSE,
  verbose = FALSE,
  control = NULL
)
object |
An |
changes |
Either a matrix with three columns: tail, head, and new value, describing the changes to be made; or a list of such matrices to apply these changes in a sequence. For binary network models, the third column may be omitted. In that case, the changes are treated as toggles. Note that if a list is passed, it must either be all of changes or all of toggles. |
... |
additional arguments to |
end.network |
Whether to return a network that
results. Defaults to |
stats.start |
Whether to return the network statistics at
|
changes.only |
Whether to return network statistics or only their changes relative to the initial network. |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
basis |
a value (usually a |
formula |
Deprecated; replaced with |
response |
Either a character string, a formula, or
|
control |
Deprecated; arguments such as |
If end.network==FALSE
(the default), an
mcmc
object with the requested network statistics
associated with the network series produced by applying the
specified changes. Its mcmc
attributes encode the
timing information: so start(out)
gives the time
point associated with the first row returned, and
end(out)
the last. The "thinning interval" is
always 1.
If end.network==TRUE
, return a network
object,
representing the final network, with a matrix of statistics
described in the previous paragraph attached to it as an
attr
-style attribute "stats"
.
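To make the semantics of a list of toggle matrices concrete, the base R sketch below applies toggles to a plain adjacency matrix and records the edge count after each step. It does not call ergm.godfather(); the toggle() helper and the empty 5-node starting network are hypothetical, and only the edge-count statistic is tracked.

```r
# Toggle the listed dyads in an undirected adjacency matrix.
# toggle() is a hypothetical helper, not part of ergm.
toggle <- function(adj, changes) {
  for (i in seq_len(nrow(changes))) {
    tl <- changes[i, 1]  # tail
    hd <- changes[i, 2]  # head
    adj[tl, hd] <- adj[hd, tl] <- 1 - adj[tl, hd]
  }
  adj
}

adj <- matrix(0, 5, 5)  # empty 5-node undirected network (hypothetical)

# Each list element is one step; each row of a matrix is one dyad to toggle.
changes <- list(cbind(1:2, 2:3), cbind(3, 5), cbind(3, 5), cbind(1:2, 2:3))

edge_counts <- sum(adj) / 2  # statistic at the start
for (ch in changes) {
  adj <- toggle(adj, ch)
  edge_counts <- c(edge_counts, sum(adj) / 2)
}
edge_counts
```

Because every dyad in this sequence is toggled exactly twice, the network ends where it started, and the recorded series has one entry for the start plus one per step.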
ergm.godfather.ergm_model()
is a lower-level interface, providing
an ergm.godfather()
method for the ergm_model
class. The basis
argument is required.
tergm.godfather()
in tergm, simulate.ergm()
,
simulate.formula()
data(florentine)
ergm.godfather(flomarriage~edges+absdiff("wealth")+triangles,
               changes=list(cbind(1:2,2:3), cbind(3,5), cbind(3,5), cbind(1:2,2:3)),
               stats.start=TRUE)
This page describes how to specify the constraints on the network sample space (the set of possible networks $Y$, the set of networks $y$
for which $h(y) > 0$) and sometimes the baseline weights $h(y)$
to functions in the
ergm
package. It also provides an indexed list of the constraints visible to ergm's API. Constraints can also be searched via search.ergmConstraints
, and help for an individual constraint can be obtained with ergmConstraint?<constraint>
or help("<constraint>-ergmConstraint")
.
In an exponential-family random graph model (ERGM), the probability or density of a given network, $y \in Y$, on a set of nodes is
$$h(y) \exp[\eta(\theta) \cdot g(y)] / \kappa(\theta),$$
where $h(y)$ is the reference distribution (particularly for valued network models),
$g(y)$ is a vector of network statistics for $y$,
$\eta(\theta)$ is a natural parameter vector of the same length (with
$\eta(\theta) \equiv \theta$ for most terms),
$\cdot$ is the dot product, and
$\kappa(\theta)$ is the normalizing constant for the distribution. A complete ERGM specification requires a list of network statistics
$g(y)$ and (if applicable) their
$\eta(\theta)$ mappings provided by a formula of
ergmTerm
s; and, optionally, sample space and reference distribution
information provided by
ergmConstraint
s and, for valued ERGMs, by ergmReference
s.
Constraints typically affect $Y$, or, equivalently, set
$h(y) = 0$ for some
$y$, but some (“soft” constraints) set
$h(y)$ to values other than 0 and 1.
A constraints formula is a one- or two-sided formula whose left-hand side is
an optional direct selection of the InitErgmProposal
function and
whose right-hand side is a series of one or more terms separated by
"+"
and "-"
operators, specifying the constraint.
The sample space (over and above the reference distribution) is determined by iterating over the constraints terms from left to right, each term updating it as follows:
If the constraint introduces complex
dependence structure (e.g., constrains degree or number of edges in the
network), then this constraint always restricts the sample space. It may
only have a "+"
sign.
If the constraint only restricts the set of dyads that may vary in the
sample space (e.g., block-diagonal structure or fixing specific dyads at
specific values) and has a "+"
sign, the set of dyads that may
vary is restricted to those that may vary according to this constraint
and all the constraints to date.
If the constraint only restricts the set of dyads that may vary in the
sample space but has a "-"
sign, the set of dyads that may
vary is expanded to those that may vary according to this constraint
or all the constraints to date.
For example, a constraints formula ~a-b+c-d
with all constraints
dyadic will allow dyads permitted by either a
or b
but only if they are
also permitted by c
; as well as all dyads permitted by d
. If A
, B
,
C
, and D
were logical matrices, the matrix of variable dyads would be
equal to ((A|B)&C)|D
.
Terms with a positive sign can be viewed as "adding" a constraint while those with a negative sign can be viewed as "relaxing" a constraint.
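The left-to-right rule for dyadic constraints can be traced with logical matrices in base R. In the sketch below, A, B, C, and D are randomly generated stand-ins for the dyads each of four hypothetical constraints a, b, c, and d would allow to vary; the step-by-step result is compared against the closed-form expression given above.

```r
set.seed(42)
n <- 4

# Hypothetical dyad masks: TRUE means the constraint allows the dyad to vary.
rand_mask <- function() matrix(runif(n * n) < 0.5, n, n)
A <- rand_mask()
B <- rand_mask()
C <- rand_mask()
D <- rand_mask()

# Process ~ a - b + c - d from left to right, starting with every dyad free.
free <- matrix(TRUE, n, n)
free <- free & A   # + a : restrict to dyads a allows
free <- free | B   # - b : expand by dyads b allows
free <- free & C   # + c : restrict to dyads c allows
free <- free | D   # - d : expand by dyads d allows

closed_form <- ((A | B) & C) | D
identical(free, closed_form)
```

The order of the terms matters: moving the -d term to the front would let the later +c term remove dyads that d permits.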
network
By default, %ergmlhs%
attributes constraints
or
constraints.obs
(depending on which constraint) attached to the
LHS of the model formula or the basis=
argument will be added in
front of the specified constraints formula. This is the desired
behaviour most of the time, since those constraints are usually
determined by how the network was constructed (e.g., structural
zeros in a block-diagonal network).
For those situations in which this is not the desired behavior, a
.
term (with a positive sign or no sign at all) can be used to
manually set the position of the inherited constraints in the
formula, and a -.
(minus-dot) term anywhere in the constraints
formula will suppress the inherited formula altogether.
Term | Package | Description | Concepts |
---|---|---|---|
b1degrees | ergm | Preserve the actor degree for bipartite networks | bipartite |
b2degrees | ergm | Preserve the receiver degree for bipartite networks | bipartite |
bd | ergm | Constrain maximum and minimum vertex degree | directed undirected |
blockdiag | ergm | Block-diagonal structure constraint | directed dyad-independent undirected |
blocks | ergm | Constrain blocks of dyads defined by mixing type on a vertex attribute. | directed dyad-independent undirected |
degreedist | ergm | Preserve the degree distribution of the given network | directed undirected |
degrees | ergm | Preserve the degree of each vertex of the given network | directed undirected |
dyadnoise | ergm | A soft constraint to adjust the sampled distribution for dyad-level noise with known perturbation probabilities | directed dyad-independent soft undirected |
Dyads | ergm | Constrain fixed or varying dyad-independent terms | directed dyad-independent operator undirected |
edges | ergm | Preserve the edge count of the given network | |
egocentric | ergm | Preserve values of dyads incident on vertices with given attribute | directed dyad-independent undirected |
fixallbut | ergm | Preserve the dyad status in all but the given edges | directed dyad-independent undirected |
fixedas | ergm | Fix specific dyads | directed dyad-independent undirected |
hamming | ergm | Preserve the hamming distance to the given network (BROKEN: Do NOT Use) | directed undirected |
idegreedist | ergm | Preserve the indegree distribution | directed |
idegrees | ergm | Preserve indegree for directed networks | directed |
observed | ergm | Preserve the observed dyads of the given network | directed dyad-independent undirected |
odegreedist | ergm | Preserve the outdegree distribution | directed |
odegrees | ergm | Preserve outdegree for directed networks | directed |
Term | bip | dir | undir | dyad-indep | soft | op |
---|---|---|---|---|---|---|
b1degrees | ✔ | | | | | |
b2degrees | ✔ | | | | | |
bd | | ✔ | ✔ | | | |
blockdiag | | ✔ | ✔ | ✔ | | |
blocks | | ✔ | ✔ | ✔ | | |
degreedist | | ✔ | ✔ | | | |
degrees | | ✔ | ✔ | | | |
dyadnoise | | ✔ | ✔ | ✔ | ✔ | |
Dyads | | ✔ | ✔ | ✔ | | ✔ |
edges | | | | | | |
egocentric | | ✔ | ✔ | ✔ | | |
fixallbut | | ✔ | ✔ | ✔ | | |
fixedas | | ✔ | ✔ | ✔ | | |
hamming | | ✔ | ✔ | | | |
idegreedist | | ✔ | | | | |
idegrees | | ✔ | | | | |
observed | | ✔ | ✔ | ✔ | | |
odegreedist | | ✔ | | | | |
odegrees | | ✔ | | | | |
Goodreau SM, Handcock MS, Hunter DR, Butts CT, Morris M (2008a). A statnet Tutorial. Journal of Statistical Software, 24(8). doi:10.18637/jss.v024.i08
Hunter, D. R. and Handcock, M. S. (2006) Inference in curved exponential family models for networks, Journal of Computational and Graphical Statistics.
Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008b). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of Statistical Software, 24(3). doi:10.18637/jss.v024.i03
Karwa V, Krivitsky PN, and Slavković AB (2016). Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models. Journal of the Royal Statistical Society, Series C, 66(3): 481-500. doi:10.1111/rssc.12185
Krivitsky PN (2012). Exponential-Family Random Graph Models for Valued Networks. Electronic Journal of Statistics, 6, 1100-1128. doi:10.1214/12-EJS696
Morris M, Handcock MS, Hunter DR (2008). Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects. Journal of Statistical Software, 24(4). doi:10.18637/jss.v024.i04
This page describes how to provide information about the sample space to
ergm's MCMC algorithms. Hints can also be searched via search.ergmHints
, and help for an individual hint can be obtained with ergmHint?<hint>
or help("<hint>-ergmHint")
.
In an exponential-family random graph model (ERGM), the probability or density of a given network, $y \in Y$, on a set of nodes is
$$h(y) \exp[\eta(\theta) \cdot g(y)] / \kappa(\theta),$$
where $h(y)$ is the reference distribution (particularly for valued network models),
$g(y)$ is a vector of network statistics for $y$,
$\eta(\theta)$ is a natural parameter vector of the same length (with
$\eta(\theta) \equiv \theta$ for most terms),
$\cdot$ is the dot product, and
$\kappa(\theta)$ is the normalizing constant for the distribution. A complete ERGM specification requires a list of network statistics
$g(y)$ and (if applicable) their
$\eta(\theta)$ mappings provided by a formula of
ergmTerm
s; and, optionally, sample space and reference distribution
information provided by
ergmConstraint
s and, for valued ERGMs, by ergmReference
s.
It is often the case that there is additional information available
about the distribution of networks being modelled. For example, you
may be aware that the network is sparse or that there are strata
among the dyads. “Hints”, typically passed on the right-hand side of MCMC.prop
and obs.MCMC.prop
arguments to control.ergm()
,
control.simulate.ergm()
, and others, allow this information to be
provided. By default, hint sparse
is in
effect.
Unlike constraints, model terms, and reference distributions, “hints” do not affect the specification of the model. That is, regardless of what “hints” may or may not be in effect, the sample space and the probabilities within it are the same. However, “hints” may affect the MCMC proposal distribution used by the samplers.
Note that not all proposals support all “hints”, and if the most suitable proposal available cannot incorporate a particular “hint”, a warning message will be printed.
“Hints” use the same underlying API as constraints, and, if present,
%ergmlhs%
attributes constraints
and constraints.obs
will
be substituted in its place.
The following hints are known to ergm at this time:
Term | Package | Description | Concepts |
---|---|---|---|
sparse | ergm | Sparse network | dyad-independent |
strat | ergm | Stratify Proposed Toggles by Mixing Type on a Vertex Attribute | dyad-independent |
triadic | ergm | Network with strong clustering (triad-closure) effects | |
Goodreau SM, Handcock MS, Hunter DR, Butts CT, Morris M (2008a). A statnet Tutorial. Journal of Statistical Software, 24(8). doi:10.18637/jss.v024.i08
Hunter, D. R. and Handcock, M. S. (2006) Inference in curved exponential family models for networks, Journal of Computational and Graphical Statistics.
Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008b). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of Statistical Software, 24(3). doi:10.18637/jss.v024.i03
Karwa V, Krivitsky PN, and Slavković AB (2016). Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models. Journal of the Royal Statistical Society, Series C, 66(3): 481-500. doi:10.1111/rssc.12185
Krivitsky PN (2012). Exponential-Family Random Graph Models for Valued Networks. Electronic Journal of Statistics, 6, 1100-1128. doi:10.1214/12-EJS696
Morris M, Handcock MS, Hunter DR (2008). Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects. Journal of Statistical Software, 24(4). doi:10.18637/jss.v024.i04
This collects all keywords defined for ergm and derived packages.
name | short | description | popular | package |
---|---|---|---|---|
binary | bin | suitable for binary ERGMs | TRUE | ergm |
bipartite | bip | suitable for bipartite networks | TRUE | ergm |
categorical nodal attribute | cat nodal attr | involves a categorical nodal attribute | FALSE | ergm |
categorical dyadic attribute | cat dyad attr | involves a categorical dyadic attribute | FALSE | ergm |
categorical triadic attribute | cat triad attr | involves a categorical triadic attribute | FALSE | ergm |
continuous | cont | a continuous distribution for edge values | FALSE | ergm |
curved | curved | is a curved term | FALSE | ergm |
directed | dir | suitable for directed networks | TRUE | ergm |
discrete | discrete | a discrete distribution for edge values | FALSE | ergm |
dyad-independent | dyad-indep | does not induce dyadic dependence | TRUE | ergm |
finite | fin | finite edge values only | FALSE | ergm |
frequently-used | freq | is frequently used | FALSE | ergm |
nonnegative | nneg | only meaningful for nonnegative edge values | FALSE | ergm |
operator | op | a term operator | TRUE | ergm |
positive | pos | only meaningful for positive edge values | FALSE | ergm |
quantitative nodal attribute | quant nodal attr | involves a quantitative nodal attribute | FALSE | ergm |
quantitative dyadic attribute | quant dyad attr | involves a quantitative dyadic attribute | FALSE | ergm |
quantitative triadic attribute | quant triad attr | involves a quantitative triadic attribute | FALSE | ergm |
soft | soft | a constraint that does not necessarily forbid specific networks outright but reweights their probabilities | FALSE | ergm |
triad-related | triad rel | involves triangles, two-paths, and other triadic structures | FALSE | ergm |
valued | val | suitable for valued ERGMs | TRUE | ergm |
undirected | undir | suitable for undirected networks | TRUE | ergm |
Return the predictor matrix, response vector, and vector of weights that can be used to calculate the MPLE for an ERGM.
ergmMPLE(
  formula,
  constraints = ~.,
  obs.constraints = ~-observed,
  output = c("matrix", "array", "dyadlist", "fit"),
  expand.bipartite = FALSE,
  control = control.ergm(),
  verbose = FALSE,
  ...,
  basis = ergm.getnetwork(formula)
)
formula , constraints , obs.constraints
|
An ERGM formula and
(optionally) a constraint specification formula. See
|
output |
Character, partially matched. See Value. |
expand.bipartite |
Logical. Specifies whether the output matrices (or array slices) representing dyads for bipartite networks are represented as rectangular matrices with first mode vertices in rows and second mode in columns, or as square matrices with dimension equalling the total number of vertices, containing with structural |
control |
A list of control parameters for algorithm tuning,
typically constructed with |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
... |
Additional arguments, to be passed to lower-level functions. |
basis |
a value (usually a |
The MPLE for an ERGM is calculated by first finding the matrix of change
statistics. Each row of this matrix is associated with a particular pair
(ordered or unordered, depending on whether the network is directed or
undirected) of nodes, and the row equals the change in the vector of network
statistics (as defined in formula
) when that pair is toggled from a 0
(no edge) to a 1 (edge), holding all the rest of the network fixed. The
MPLE results if we perform a logistic regression in which the predictor
matrix is the matrix of change statistics and the response vector is the
observed network (i.e., each entry is either 0 or 1, depending on whether
the corresponding edge exists or not).
Using output="matrix"
, note that the result of the fit may be
obtained from the glm()
function, as shown in the examples
below.
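The logistic-regression view of the MPLE can be checked by hand for the simplest case, a model with only an edges term. In the base R sketch below, the 6-node network is hypothetical and the predictor matrix is built directly rather than obtained from ergmMPLE(): toggling any dyad from 0 to 1 raises the edge count by exactly 1, so every change statistic equals 1 and the fitted coefficient should equal the log-odds of the network density.

```r
# Hypothetical 6-node undirected network given as an adjacency matrix.
adj <- matrix(0, 6, 6)
adj[cbind(c(1, 1, 2, 3, 4), c(2, 3, 3, 4, 5))] <- 1
adj <- adj + t(adj)

# One entry per unordered pair of nodes: 1 if the edge exists, 0 otherwise.
response <- adj[upper.tri(adj)]

# Change statistic for `edges` is 1 for every dyad.
predictor <- cbind(edges = rep(1, length(response)))

# Logistic regression without an intercept, as in the description above.
fit <- glm(response ~ predictor - 1, family = binomial)
mple <- unname(coef(fit))

# For this one-term model the estimate is the log-odds of the density.
density <- mean(response)
log_odds <- log(density / (1 - density))

c(mple = mple, log_odds = log_odds)
```

For models with dyad-dependent terms the change statistics differ across dyads and depend on the rest of the network, which is why the predictor matrix is normally obtained from ergmMPLE() rather than built by hand.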
If output=="matrix"
(the default), then only the response, predictor,
and weights are returned; thus, the MPLE may be found by hand or the vector
of change statistics may be used in some other way. To save space, the
algorithm will automatically search for any duplicated rows in the predictor
matrix (and corresponding response values). The ergmMPLE
function will
return a list with three elements, response
, predictor
, and
weights
, respectively the response vector, the predictor matrix, and
a vector of weights, which are really counts that tell how many times each
corresponding response, predictor pair is repeated.
If output=="dyadlist"
, as "matrix"
, but rather than
coalescing the duplicated rows, every relation in the network that
is not fixed and is observed will have its own row in predictor
and element in response
and weights
, and predictor
matrix
will have two additional columns at the start, tail
and head
,
indicating to which dyad the row and the corresponding elements
pertain.
If output=="array"
, a list with similarly named three elements is
returned, but response
is formatted into a sociomatrix;
predictor
is a 3-dimensional array with cell
predictor[t,h,k]
containing the change score of term k
for
dyad (t
,h
); and weights
is also formatted into a
sociomatrix, with an element being 1 if it is to be added into the
pseudolikelihood and 0 if it is not.
In particular, for a unipartite network, cells corresponding to self-loops,
i.e., predictor[i,i,k]
will be NA
and weights[i,i]
will
be 0; and for a unipartite undirected network, lower triangle of each
predictor[,,k]
matrix will be set to NA
, with the lower
triangle of weights
being set to 0.
To all of the above output types, attr(., "etamap")
is attached
containing the mapping and offset information.
If output=="fit"
, then ergmMPLE
simply calls the
ergm()
function with the estimate="MPLE"
option set,
returning an object of class ergm
that gives the fitted
pseudolikelihood model.
data(faux.mesa.high)
formula <- faux.mesa.high ~ edges + nodematch("Sex") + nodefactor("Grade")
mplesetup <- ergmMPLE(formula)

# Obtain MPLE coefficients "by hand":
coef(glm(mplesetup$response ~ . - 1, data = data.frame(mplesetup$predictor),
         weights = mplesetup$weights, family="binomial"))

# Check that the coefficients agree with the output of the ergm function:
coef(ergmMPLE(formula, output="fit"))

# We can also format the predictor matrix into an array:
mplearray <- ergmMPLE(formula, output="array")

# The resulting matrices are big, so only print the first 8 actors:
mplearray$response[1:8,1:8]
mplearray$predictor[1:8,1:8,]
mplearray$weights[1:8,1:8]

# Constraints are handled:
faux.mesa.high%v%"block" <- seq_len(network.size(faux.mesa.high)) %/% 4
mplearray <- ergmMPLE(faux.mesa.high~edges, constraints=~blockdiag("block"), output="array")
mplearray$response[1:8,1:8]
mplearray$predictor[1:8,1:8,]
mplearray$weights[1:8,1:8]

# Or, a dyad list:
faux.mesa.high%v%"block" <- seq_len(network.size(faux.mesa.high)) %/% 4
mplearray <- ergmMPLE(faux.mesa.high~edges, constraints=~blockdiag("block"), output="dyadlist")
mplearray$response[1:8]
mplearray$predictor[1:8,]
mplearray$weights[1:8]

# Curved terms produce predictors on the canonical scale:
formula2 <- faux.mesa.high ~ gwesp
mplearray <- ergmMPLE(formula2, output="array")

# The resulting matrices are big, so only print the first 5 actors:
mplearray$response[1:5,1:5]
mplearray$predictor[1:5,1:5,1:3]
mplearray$weights[1:5,1:5]
This page describes the low-level Metropolis–Hastings
(MH) proposal algorithms. They are rarely invoked directly by the
user but are rather selected based on the provided sample space constraints and hints about the network process. They can also be searched via
search.ergmProposals
, and help for an individual proposal can
be obtained with ergmProposal?<proposal>
or
help("<proposal>-ergmProposal")
.
ergm
uses a Metropolis-Hastings (MH) algorithm to
control the behavior of the Markov Chain Monte Carlo (MCMC) for
sampling networks. The MCMC chain is intended to step around the
sample space of possible networks, generating a network at
regular intervals to evaluate the statistics in the model. For
each MCMC step, one or more toggles are proposed to change the
dyads to the opposite value. The probability of accepting the
proposed change is determined by the MH acceptance ratio. The
role of the different MH methods implemented in
ergm()
is to vary how the sets of dyads are selected
for toggle proposals. This is used in some cases to improve the
performance (speed and mixing) of the algorithm, and in other
cases to constrain the sample space.
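The toggle-and-accept cycle described above can be sketched in base R for the simplest case: an undirected model whose only statistic is the edge count, with dyads proposed uniformly at random. This is a toy illustration under those assumptions, not the ergm implementation; `toy_mh_edges` and its arguments are hypothetical names.

```r
# Toy Metropolis-Hastings sampler for an undirected, edges-only ERGM.
# toy_mh_edges(), theta and n_steps are hypothetical names for illustration.
toy_mh_edges <- function(n, theta, n_steps) {
  y <- matrix(0L, n, n)
  for (step in seq_len(n_steps)) {
    # Propose toggling one dyad (i, j), i < j, chosen uniformly at random.
    ij <- sort(sample.int(n, 2))
    i <- ij[1]
    j <- ij[2]
    # Change in the edge count if the dyad is toggled: +1 or -1.
    delta <- if (y[i, j] == 0L) 1 else -1
    # The proposal is symmetric, so the log acceptance ratio is theta * delta.
    if (log(runif(1)) < theta * delta) {
      y[i, j] <- y[j, i] <- 1L - y[i, j]
    }
  }
  y
}

set.seed(1)
y <- toy_mh_edges(n = 10, theta = -1, n_steps = 5000)
sum(y) / 2  # number of edges in the sampled network
```

Other proposals differ mainly in how the dyad to toggle is chosen. A non-uniform choice makes the proposal asymmetric, so a correction term would have to be added to the log acceptance ratio.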
Proposal | Reference | Enforces | May_Enforce | Priority | Weight | Class |
---|---|---|---|---|---|---|
BDStratTNT | Bernoulli | sparse | bdmax blocks strat | -3 | BDStratTNT | cross-sectional |
BDStratTNT | Bernoulli | bdmax sparse | blocks strat | 5 | BDStratTNT | cross-sectional |
BDStratTNT | Bernoulli | blocks sparse | bdmax strat | 5 | BDStratTNT | cross-sectional |
BDStratTNT | Bernoulli | strat sparse | bdmax blocks | 5 | BDStratTNT | cross-sectional |
CondB1Degree | Bernoulli | b1degrees | 0 | random | cross-sectional | |
CondB2Degree | Bernoulli | b2degrees | 0 | random | cross-sectional | |
CondDegree | Bernoulli | degrees | 0 | random | cross-sectional | |
CondDegree | Bernoulli | idegrees odegrees | 0 | random | cross-sectional | |
CondDegree | Bernoulli | b1degrees b2degrees | 0 | random | cross-sectional | |
CondDegreeDist | Bernoulli | degreedist | 0 | random | cross-sectional | |
CondDegreeMix | Bernoulli | degreesmix | 0 | random | cross-sectional | |
CondInDegree | Bernoulli | idegrees | 0 | random | cross-sectional | |
CondInDegreeDist | Bernoulli | idegreedist | 0 | random | cross-sectional | |
CondOutDegree | Bernoulli | odegrees | 0 | random | cross-sectional | |
CondOutDegreeDist | Bernoulli | odegreedist | 0 | random | cross-sectional | |
ConstantEdges | Bernoulli | edges | .dyads bd | 0 | random | cross-sectional |
DiscUnif | DiscUnif | 0 | random | cross-sectional | ||
DiscUnif2 | DiscUnif | -1 | random2 | cross-sectional | ||
DiscUnifNonObserved | DiscUnif | observed | 0 | random | cross-sectional | |
DistRLE | StdNormal | .dyads | 0 | random | cross-sectional | |
DistRLE | Unif | .dyads | 0 | random | cross-sectional | |
DistRLE | Unif | .dyads | -3 | random | cross-sectional | |
DistRLE | DiscUnif | .dyads | -3 | random | cross-sectional | |
DistRLE | StdNormal | .dyads | -3 | random | cross-sectional | |
DistRLE | Poisson | .dyads | -3 | random | cross-sectional | |
DistRLE | Binomial | .dyads | -3 | random | cross-sectional | |
dyadnoise | Bernoulli | dyadnoise | 0 | random | cross-sectional | |
dyadnoiseTNT | Bernoulli | dyadnoise sparse | 1 | TNT | cross-sectional | |
HammingConstantEdges | Bernoulli | edges hamming | 0 | random | cross-sectional | |
HammingTNT | Bernoulli | hamming sparse | 0 | random | cross-sectional | |
randomtoggle | Bernoulli | .dyads bd | -2 | random | cross-sectional | |
SPDyad | Bernoulli | sparse triadic | .dyads bd | 0 | TNT | cross-sectional |
StdNormal | StdNormal | 0 | random | cross-sectional | ||
TNT | Bernoulli | sparse | .dyads bd | 0 | TNT | cross-sectional |
Unif | Unif | 0 | random | cross-sectional | ||
UnifNonObserved | Unif | observed | 0 | random | cross-sectional |
Note that .dyads
is a meta-constraint, indicating that the proposal supports an arbitrary dyad-level constraint combination.
Goodreau SM, Handcock MS, Hunter DR, Butts CT, Morris M (2008a). A statnet Tutorial. Journal of Statistical Software, 24(8). doi:10.18637/jss.v024.i08
Hunter DR, Handcock MS (2006). Inference in Curved Exponential Family Models for Networks. Journal of Computational and Graphical Statistics.
Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008b). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of Statistical Software, 24(3). doi:10.18637/jss.v024.i03
Krivitsky PN (2012). Exponential-Family Random Graph Models for Valued Networks. Electronic Journal of Statistics, 2012, 6, 1100-1128. doi:10.1214/12-EJS696
Morris M, Handcock MS, Hunter DR (2008). Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects. Journal of Statistical Software, 24(4). doi:10.18637/jss.v024.i04
ergm
package, ergm
, ergmConstraint
, ergmHint
, ergm_proposal
This page describes how to specify the reference measures (baseline distributions),
that is, the set of possible networks and the baseline weights,
to functions in the
ergm
package. It also provides an indexed list of the references visible to the ergm API. References can also be searched via search.ergmReferences()
, and help for an individual reference can be obtained with ergmReference?<reference>
or help("<reference>-ergmReference")
.
In an exponential-family random graph model (ERGM), the probability or density of a given network, $y \in \mathcal{Y}$, on a set of nodes is
$$h(y) \exp[\eta(\theta) \cdot g(y)] / \kappa(\theta),$$
where $h(y)$ is the reference distribution (particularly for valued network models),
$g(y)$ is a vector of network statistics for $y$,
$\eta(\theta)$ is a natural parameter vector of the same length (with $\eta(\theta) \equiv \theta$ for most terms),
$\cdot$ is the dot product, and $\kappa(\theta)$ is the normalizing constant for the distribution. A complete ERGM specification requires a list of network statistics $g(y)$
and (if applicable) their $\eta(\theta)$
mappings provided by a formula of
ergmTerm
s; and, optionally, sample space $\mathcal{Y}$ and reference distribution $h(y)$
information provided by
ergmConstraint
s and, for valued ERGMs, by ergmReference
s.
The reference measure is specified on the right-hand side of a one-sided formula passed
typically as the
reference
argument.
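To illustrate how the reference measure enters the model, the following base R sketch evaluates the probability of each value of a single dyad restricted to 0, ..., 3, which is proportional to the baseline weight h(y) times exp(theta * y). The helper `dyad_probs`, the value of `theta`, and the binomial-coefficient baseline are assumptions chosen for illustration; none of them is part of the ergm API.

```r
# Probability of each value of a single dyad, proportional to h(y) * exp(theta * y).
# dyad_probs() is a hypothetical helper, not an ergm function.
dyad_probs <- function(values, h, theta) {
  w <- h * exp(theta * values)
  w / sum(w)  # dividing by the sum plays the role of the normalizing constant
}

values <- 0:3
theta <- 0.5

# Constant baseline weights (a discrete-uniform-like reference).
p_unif <- dyad_probs(values, h = rep(1, length(values)), theta = theta)

# Binomial-coefficient baseline weights, h(y) = choose(3, y).
p_binom <- dyad_probs(values, h = choose(3, values), theta = theta)

round(rbind(uniform = p_unif, binomial = p_binom), 3)
```

With the same statistic and the same `theta`, the two baselines give different distributions over dyad values, which is why a valued ERGM is not fully specified until its reference is chosen.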
Term | Package | Description | Concepts |
---|---|---|---|
Bernoulli | ergm | Bernoulli reference | discrete finite nonnegative |
DiscUnif | ergm | Discrete Uniform reference | discrete finite |
StdNormal | ergm | Standard Normal reference | continuous |
Unif | ergm | Continuous Uniform reference | continuous |
Term | bin | discrete | fin | nneg | cont |
---|---|---|---|---|---|
Bernoulli | ✔ | ✔ | ✔ | ✔ | |
DiscUnif | | ✔ | ✔ | | |
StdNormal | | | | | ✔ |
Unif | | | | | ✔ |
Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008b). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of Statistical Software, 24(3). doi:10.18637/jss.v024.i03
Krivitsky PN (2012). Exponential-Family Random Graph Models for Valued Networks. Electronic Journal of Statistics, 2012, 6, 1100-1128. doi:10.1214/12-EJS696
ergm
, network
, sna, summary.ergm
, print.ergm
, %v%
, %n%
This page explains how to specify the network statistics to functions in the
ergm
package and packages that extend it. It also provides an indexed list of the possible terms (and hence network statistics) visible to the ergm API. Terms can also be searched via search.ergmTerms
, and help for an individual term can be obtained with ergmTerm?<term>
or help("<term>-ergmTerm")
.
In an exponential-family random graph model (ERGM), the probability or density of a given network, $y \in \mathcal{Y}$, on a set of nodes is
$$h(y) \exp[\eta(\theta) \cdot g(y)] / \kappa(\theta),$$
where $h(y)$ is the reference distribution (particularly for valued network models),
$g(y)$ is a vector of network statistics for $y$,
$\eta(\theta)$ is a natural parameter vector of the same length (with $\eta(\theta) \equiv \theta$ for most terms),
$\cdot$ is the dot product, and $\kappa(\theta)$ is the normalizing constant for the distribution. A complete ERGM specification requires a list of network statistics $g(y)$
and (if applicable) their $\eta(\theta)$
mappings provided by a formula of
ergmTerm
s; and, optionally, sample space $\mathcal{Y}$ and reference distribution $h(y)$
information provided by
ergmConstraint
s and, for valued ERGMs, by ergmReference
s.
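For a very small network, the probability above can be evaluated by brute force. The base R sketch below enumerates all 8 undirected networks on 3 nodes, takes the edge count as the only statistic, and assumes constant baseline weights; the value of `theta` is arbitrary and the variable names are illustrative only.

```r
# Enumerate all undirected networks on 3 nodes (3 dyads, 8 networks) and
# compute exp(theta * g(y)) / kappa(theta), assuming baseline weights h(y) = 1.
dyads <- expand.grid(y12 = 0:1, y13 = 0:1, y23 = 0:1)
g <- rowSums(dyads)             # network statistic: number of edges
theta <- -0.8                   # arbitrary illustrative coefficient
kappa <- sum(exp(theta * g))    # normalizing constant
prob <- exp(theta * g) / kappa  # probability of each network
cbind(dyads, edges = g, prob = round(prob, 4))
```

The sum defining the normalizing constant has one term per possible network, so it grows exponentially with the number of dyads. That is why realistic models are fitted and simulated by MCMC rather than by enumeration.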
Network statistics and mappings
are specified by a formula object, of the form
y ~ <term 1> + <term 2> ...
, where
y
is a network object or a matrix that can be coerced to a network
object, and <term 1>
, <term 2>
, etc., are each terms chosen
from the list given below. To create a network object in R, use the
network
function, then add nodal attributes to it
using the %v%
operator if necessary.
Operator terms like B()
and F()
take
formulas with other ergm
terms as their arguments and transform them
by modifying their inputs (e.g., the network they evaluate) and/or their
outputs.
By convention, their names are capitalized and CamelCased.
For binary ERGMs, interactions between ergm
terms can be
specified in a manner similar to lm
and others, using the
:
and *
operators. However, they must be interpreted
carefully, especially for dyad-dependent terms. (Interactions involving
curved terms are not supported at this time.)
Generally, if term a
has $p_a$ statistics and
b
has $p_b$,
a:b
will add $p_a \times p_b$
statistics to the model, corresponding to each element of $g_a(y)$
interacted with each element of $g_b(y)$.
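The interaction is defined as follows. Dyad-independent terms can be
expressed in the general form $g(y; x) = \sum_{i,j} x_{i,j} y_{i,j}$ for some edge
covariate matrix $x$; the interaction statistic is then
$$g_{a:b}(y) = \sum_{i,j} x_{a,i,j} x_{b,i,j} y_{i,j}.$$
In other words, rather than being a product of their sufficient statistics
($g_a(y) g_b(y)$), it is a dyadwise product of their
dyad-level effects.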
This means that an interaction between two dyad-independent terms can be
interpreted the same way as it would be in the corresponding logistic
regression for each potential edge. However, for undirected networks in
particular, this may lead to somewhat counterintuitive results. For example,
given two nodal covariates "a"
and "b"
(whose values for node $i$
are denoted $a_i$
and $b_i$, respectively),
nodecov("a")
adds one statistic of the form $\sum_{i,j} (a_i + a_j) y_{i,j}$ and analogously for
nodecov("b")
, so nodecov("a"):nodecov("b")
produces $\sum_{i,j} (a_i + a_j)(b_i + b_j) y_{i,j}$.
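The difference between a dyadwise product of effects and a product of sufficient statistics can be checked by hand. The base R sketch below uses a made-up 4-node undirected network and made-up covariate values, and computes each statistic directly from the adjacency matrix rather than through ergm.

```r
# Undirected toy network on 4 nodes and two nodal covariates (illustrative values).
y <- matrix(0, 4, 4)
y[1, 2] <- y[2, 1] <- 1
y[2, 3] <- y[3, 2] <- 1
y[3, 4] <- y[4, 3] <- 1
a <- c(1, 2, 3, 4)
b <- c(0, 1, 0, 1)

# Dyad-level effects of the two main-effect terms: x[i, j] = attr[i] + attr[j].
xa <- outer(a, a, "+")
xb <- outer(b, b, "+")
ut <- upper.tri(y)  # count each undirected dyad once

g_a  <- sum(xa[ut] * y[ut])           # main-effect statistic for "a"
g_b  <- sum(xb[ut] * y[ut])           # main-effect statistic for "b"
g_ab <- sum(xa[ut] * xb[ut] * y[ut])  # dyadwise-product interaction statistic

c(g_a = g_a, g_b = g_b, g_ab = g_ab, product = g_a * g_b)
```

Here the interaction statistic is 15, whereas the product of the two main-effect statistics is 45. This illustrates that the interaction is not simply the product of the individual statistics.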
ergm
functions such as ergm
and
simulate
(for ERGMs) may operate in two
modes: binary and weighted/valued, with the latter activated by passing a
non-NULL value as the response
argument, giving the edge attribute
name to be modeled/simulated.
Binary ERGM statistics cannot be
used directly in valued mode and vice versa. However, a substantial number
of binary ERGM statistics — particularly the ones with dyadic independence
— have simple generalizations to valued ERGMs, and have been adapted in
ergm
. They have the same form as their binary
ERGM counterparts, with an additional argument: form
, which, at this
time, has two possible values: "sum"
(the default) and
"nonzero"
. The former creates a statistic of the form $\sum_{i,j} x_{i,j} y_{i,j}$, where
$y_{i,j}$ is the
value of dyad $(i,j)$
and $x_{i,j}$
is the term's covariate
associated with it. The latter computes the binary version, with the edge
considered to be present if its value is not 0. Valued versions of some
binary ERGM terms have an argument
threshold
, which sets the value
above which a dyad is considered to have a tie. (A value less than or equal to
threshold
is considered a nontie.)
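The two forms, and the effect of a threshold, can be illustrated with a small hand computation in base R. The dyad values and the edge covariate below are made up, and the statistics are computed directly from the matrices rather than through ergm.

```r
# Valued toy network: y[i, j] is the value of dyad (i, j); x is an edge covariate.
y <- matrix(c(0, 2, 0,
              0, 0, 3,
              1, 0, 0), nrow = 3, byrow = TRUE)
x <- matrix(c(0, 1, 2,
              1, 0, 1,
              2, 1, 0), nrow = 3, byrow = TRUE)

# form = "sum": sum over dyads of x[i, j] * y[i, j].
stat_sum <- sum(x * y)

# form = "nonzero": the binary statistic, treating any nonzero value as a tie.
stat_nonzero <- sum(x * (y != 0))

# Thresholded tie: values less than or equal to `threshold` count as non-ties.
threshold <- 1
stat_threshold <- sum(x * (y > threshold))

c(sum = stat_sum, nonzero = stat_nonzero, threshold = stat_threshold)
```

For these values the three statistics are 7, 4 and 2, respectively. The dyad with value 1 counts as a tie in the nonzero form but not once the threshold is set to 1.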
The B()
operator term documented below can be used to pass other
binary terms to valued models, and is more flexible, at the cost of being
somewhat slower.
Terms taking a categorical nodal covariate also take the levels
argument. (There are analogous b1levels
and b2levels
arguments for some terms that apply to bipartite networks, and the
levels2
argument for mixing terms.) The levels
argument can
be used to control the set and the ordering of attribute levels.
Terms that allow the selection of nodes do so with the nodes
argument, which is interpreted in the same way as the levels
argument, where the categories are the relevant nodal indices themselves.
Both levels
and nodes
use the new level selection UI. (See
Specifying Vertex attributes and Levels (?
nodal_attributes
) for details.)
The legacy base
and keep
arguments are deprecated as of
version 3.10, and replaced by the levels
UI. The levels
argument provides consistent and flexible mechanisms for specifying which
attribute levels to exclude (previously handled by base
) and include
(previously handled by keep
). If the levels
or nodes
argument is given, then base
and keep
arguments are ignored.
The legacy arguments will most likely be removed in a future version.
Note that this exact behavior is new in version 3.10, and it differs
slightly from older versions: previously if both levels
and
base
/keep
were given, the levels
argument was applied first,
followed by the base
/keep
arguments. Since version 3.10,
base
/keep
are ignored, even if old term behavior is
invoked (as described in the next section).
When a term's behavior has changed from a prior version, it is often possible
to invoke the old behavior by setting and/or passing a version
term
option, giving the version (constructed by as.package_version
)
desired.
Custom ergm terms

Users and other packages may build custom terms, and package ergm.userterms (https://github.com/statnet/ergm.userterms) provides tools for implementing them.
The current recommendation for any package implementing additional terms is
to document the term with Roxygen comments and a name in the form
termName-ergmTerm
. This ensures that help("ergmTerm")
will list ERGM
terms available from all loaded packages.
Terms included in the ergm package

As noted above, a cross-referenced HTML version of the term documentation is
also available via vignette('ergm-term-crossRef')
and terms
can also be searched via search.ergmTerms
.
Term | Package | Description | Concepts |
---|---|---|---|
ergm | Absolute difference in nodal attribute | directed dyad-independent quantitative nodal attribute undirected | |
ergm | Categorical absolute difference in nodal attribute | categorical nodal attribute directed dyad-independent undirected | |
altkstar(lambda, fixed) (bin) |
ergm | Alternating k-star | categorical nodal attribute curved undirected |
ergm | Asymmetric dyads | directed dyad-independent triad-related | |
atleast(threshold) (val) |
ergm | Number of dyads with values greater than or equal to a threshold | directed dyad-independent undirected |
atmost(threshold) (val) |
ergm | Number of dyads with values less than or equal to a threshold | directed dyad-independent undirected |
attrcov(attr, mat) (bin) |
ergm | Edge covariate by attribute pairing | directed dyad-independent undirected |
b1concurrent(by, levels) (bin) |
ergm | Concurrent node count for the first mode in a bipartite network | bipartite categorical nodal attribute undirected |
ergm | Main effect of a covariate for the first mode in a bipartite network | bipartite dyad-independent frequently-used quantitative nodal attribute undirected | |
nodecovrange(attr) (bin) |
ergm | Range of covariate values for neighbors of a mode-1 node | bipartite quantitative nodal attribute |
ergm | Degree range for the first mode in a bipartite network | bipartite undirected | |
b1degree(d, by, levels) (bin) |
ergm | Degree for the first mode in a bipartite network | bipartite categorical nodal attribute frequently-used undirected |
b1dsp(d) (bin) |
ergm | Dyadwise shared partners for dyads in the first bipartition | bipartite undirected |
ergm | Factor attribute effect for the first mode in a bipartite network | bipartite categorical nodal attribute dyad-independent frequently-used undirected | |
ergm | Number of distinct neighbor types for the first node | bipartite categorical nodal attribute | |
b1mindegree(d) (bin) |
ergm | Minimum degree for the first mode in a bipartite network | bipartite undirected |
ergm | Nodal attribute-based homophily effect for the first mode in a bipartite network | bipartite categorical nodal attribute dyad-independent frequently-used undirected | |
ergm | Degree | bipartite dyad-independent undirected | |
b1star(k, attr, levels) (bin) |
ergm | k-stars for the first mode in a bipartite network | bipartite categorical nodal attribute undirected |
ergm | Mixing matrix for k-stars centered on the first mode of a bipartite network | bipartite categorical nodal attribute undirected | |
ergm | Two-star census for central nodes centered on the first mode of a bipartite network | bipartite categorical nodal attribute undirected | |
b2concurrent(by) (bin) |
ergm | Concurrent node count for the second mode in a bipartite network | bipartite frequently-used undirected |
ergm | Main effect of a covariate for the second mode in a bipartite network | bipartite dyad-independent frequently-used quantitative nodal attribute undirected | |
nodecovrange(attr) (bin) |
ergm | Range of covariate values for neighbors of a mode-2 node | bipartite quantitative nodal attribute |
ergm | Degree range for the second mode in a bipartite network | bipartite undirected | |
b2degree(d, by) (bin) |
ergm | Degree for the second mode in a bipartite network | bipartite categorical nodal attribute frequently-used undirected |
b2dsp(d) (bin) |
ergm | Dyadwise shared partners for dyads in the second bipartition | bipartite undirected |
ergm | Factor attribute effect for the second mode in a bipartite network | bipartite categorical nodal attribute dyad-independent frequently-used undirected | |
ergm | Number of distinct neighbor types for the second mode | bipartite categorical nodal attribute | |
b2mindegree(d) (bin) |
ergm | Minimum degree for the second mode in a bipartite network | bipartite undirected |
ergm | Nodal attribute-based homophily effect for the second mode in a bipartite network | bipartite categorical nodal attribute dyad-independent frequently-used undirected | |
ergm | Degree | bipartite dyad-independent undirected | |
b2star(k, attr, levels) (bin) |
ergm | k-stars for the second mode in a bipartite network | bipartite categorical nodal attribute undirected |
ergm | Mixing matrix for k-stars centered on the second mode of a bipartite network | bipartite categorical nodal attribute undirected | |
ergm | Two-star census for central nodes centered on the second mode of a bipartite network | bipartite categorical nodal attribute undirected | |
balance (bin) |
ergm | Balanced triads | directed triad-related undirected |
ergm | Coincident node count for the second mode in a bipartite (aka two-mode) network | bipartite undirected | |
concurrent(by, levels) (bin) |
ergm | Concurrent node count | categorical nodal attribute undirected |
ergm | Concurrent tie count | categorical nodal attribute undirected | |
ergm | Cyclic triples | categorical nodal attribute directed triad-related | |
cycle(k, semi) (bin) |
ergm | k-Cycle Census | directed undirected |
ergm | Cyclical ties | directed undirected | |
ergm | Cyclical weights | directed nonnegative undirected | |
degcor (bin) |
ergm | Degree Correlation | undirected |
degcrossprod (bin) |
ergm | Degree Cross-Product | undirected |
ergm | Degree range | categorical nodal attribute undirected | |
ergm | Degree | categorical nodal attribute frequently-used undirected | |
degree1.5 (bin) |
ergm | Degree to the 3/2 power | undirected |
density (bin) |
ergm | Density | directed dyad-independent undirected |
ergm | Difference | bipartite directed dyad-independent frequently-used quantitative nodal attribute undirected | |
ergm | Directed dyadwise shared partners | directed | |
dyadcov(x, attrname) (bin) |
ergm | Dyadic covariate | directed dyad-independent quantitative dyadic attribute undirected |
ergm | Edge covariate | directed dyad-independent frequently-used quantitative dyadic attribute undirected | |
ergm | Number of edges in the network | directed dyad-independent undirected | |
ergm | Number of dyads with values equal to a specific value (within tolerance) | directed dyad-independent undirected | |
ergm | Directed edgewise shared partners | directed | |
greaterthan(threshold) (val) |
ergm | Number of dyads with values strictly greater than a threshold | directed dyad-independent undirected |
ergm | Geometrically weighted degree distribution for the first mode in a bipartite network | bipartite curved undirected | |
ergm | Geometrically weighted dyadwise shared partner distribution for dyads in the first bipartition | bipartite curved undirected | |
ergm | Geometrically weighted degree distribution for the second mode in a bipartite network | bipartite curved undirected | |
ergm | Geometrically weighted dyadwise shared partner distribution for dyads in the second bipartition | bipartite curved undirected | |
ergm | Geometrically weighted degree distribution | curved frequently-used undirected | |
ergm | Geometrically weighted dyadwise shared partner distribution | directed | |
ergm | Geometrically weighted edgewise shared partner distribution | directed | |
ergm | Geometrically weighted in-degree distribution | curved directed | |
ergm | Geometrically weighted non-edgewise shared partner distribution | directed | |
ergm | Geometrically weighted out-degree distribution | curved directed | |
ergm | Hamming distance | directed dyad-independent undirected | |
ergm | In-degree range | categorical nodal attribute directed | |
ergm | In-degree | categorical nodal attribute directed frequently-used | |
idegree1.5 (bin) |
ergm | In-degree to the 3/2 power | directed |
ergm | Number of dyads whose values are in an interval | directed dyad-independent undirected | |
intransitive (bin) |
ergm | Intransitive triads | directed triad-related |
isolatededges (bin) |
ergm | Isolated edges | bipartite undirected |
isolates (bin) |
ergm | Isolates | directed frequently-used undirected |
istar(k, attr, levels) (bin) |
ergm | In-stars | categorical nodal attribute directed |
kstar(k, attr, levels) (bin) |
ergm | k-stars | categorical nodal attribute undirected |
localtriangle(x) (bin) |
ergm | Triangles within neighborhoods | categorical dyadic attribute directed triad-related undirected |
m2star (bin) |
ergm | Mixed 2-stars, a.k.a 2-paths | directed |
meandeg (bin) |
ergm | Mean vertex degree | directed dyad-independent undirected |
ergm | Mixing matrix cells and margins | categorical nodal attribute directed dyad-independent frequently-used undirected | |
ergm | Mutuality | directed frequently-used | |
nearsimmelian (bin) |
ergm | Near simmelian triads | directed triad-related |
ergm | Main effect of a covariate | directed dyad-independent frequently-used quantitative nodal attribute undirected | |
ergm | Covariance of undirected dyad values incident on each actor | directed | |
nodecovrange(attr) (bin) |
ergm | Range of covariate values for neighbors of a node | directed quantitative nodal attribute undirected |
ergm | Factor attribute effect | categorical nodal attribute directed dyad-independent frequently-used undirected | |
ergm | Number of distinct neighbor types | categorical nodal attribute directed undirected | |
ergm | Main effect of a covariate for in-edges | directed frequently-used quantitative nodal attribute | |
ergm | Covariance of in-dyad values incident on each actor | directed | |
nodeicovrange(attr) (bin) |
ergm | Range of covariate values for in-neighbors of a node | directed quantitative nodal attribute |
ergm | Factor attribute effect for in-edges | categorical nodal attribute directed dyad-independent frequently-used | |
ergm | Number of distinct in-neighbor types | categorical nodal attribute directed | |
ergm | Uniform homophily and differential homophily | categorical nodal attribute directed dyad-independent frequently-used undirected | |
ergm | Nodal attribute mixing | categorical nodal attribute directed dyad-independent frequently-used undirected | |
ergm | Main effect of a covariate for out-edges | directed dyad-independent quantitative nodal attribute | |
ergm | Covariance of out-dyad values incident on each actor | directed | |
nodeocovrange(attr) (bin) |
ergm | Range of covariate values for out-neighbors of a node | directed quantitative nodal attribute |
ergm | Factor attribute effect for out-edges | categorical nodal attribute directed dyad-independent | |
ergm | Number of distinct out-neighbor types | categorical nodal attribute directed | |
ergm | Directed non-edgewise shared partners | directed | |
ergm | Out-degree range | categorical nodal attribute directed | |
ergm | Out-degree | categorical nodal attribute directed frequently-used | |
odegree1.5 (bin) |
ergm | Out-degree to the 3/2 power | directed |
opentriad (bin) |
ergm | Open triads | triad-related undirected |
ostar(k, attr, levels) (bin) |
ergm | k-Outstars | categorical nodal attribute directed |
ergm | Receiver effect | directed dyad-independent | |
ergm | Sender effect | directed dyad-independent | |
simmelian (bin) |
ergm | Simmelian triads | directed triad-related |
simmelianties (bin) |
ergm | Ties in simmelian triads | directed triad-related |
smalldiff(attr, cutoff) (bin) |
ergm | Number of ties between actors with similar attribute values | directed dyad-independent quantitative nodal attribute undirected |
smallerthan(threshold) (val) |
ergm | Number of dyads with values strictly smaller than a threshold | directed dyad-independent undirected |
ergm | Undirected degree | categorical nodal attribute dyad-independent undirected | |
sum(pow) (val) |
ergm | Sum of dyad values (optionally taken to a power) | directed undirected |
ergm | Three-trails | directed triad-related undirected | |
transitive (bin) |
ergm | Transitive triads | directed triad-related |
ergm | Transitive ties | categorical nodal attribute directed triad-related undirected | |
ergm | Transitive weights | directed nonnegative triad-related undirected | |
triadcensus(levels) (bin) |
ergm | Triad census | directed triad-related undirected |
ergm | Triangles | categorical nodal attribute directed frequently-used triad-related undirected | |
ergm | Triangle percentage | categorical nodal attribute triad-related undirected | |
ergm | Transitive triples | categorical nodal attribute directed triad-related | |
twopath (bin) |
ergm | 2-Paths | directed undirected |
Term | Package | Description | Concepts |
---|---|---|---|
B(formula, form) (val) |
ergm | Wrap binary terms for use in valued models | operator |
Curve(formula, params, map, gradient, minpar, maxpar, cov) (bin) Parametrise(formula, params, map, gradient, minpar, maxpar, cov) (bin) Parametrize(formula, params, map, gradient, minpar, maxpar, cov) (bin) Curve(formula, params, map, gradient, minpar, maxpar, cov) (val) Parametrise(formula, params, map, gradient, minpar, maxpar, cov) (val) Parametrize(formula, params, map, gradient, minpar, maxpar, cov) (val) |
ergm | Impose a curved structure on term parameters | operator |
ergm | Exponentiate a network's statistic | operator | |
F(formula, filter) (bin) |
ergm | Filtering on arbitrary one-term model | operator |
For(...) (bin) |
ergm | A for operator for terms | operator |
ergm | Modify terms' coefficient names | operator | |
ergm | Take a natural logarithm of a network's statistic | operator | |
ergm | Filtering on nodematch | operator | |
ergm | Terms with fixed coefficients | operator | |
ergm | A product (or an arbitrary power combination) of one or more formulas | operator | |
ergm | Evaluation on a projection of a bipartite network | bipartite operator | |
S(formula, attrs) (bin) |
ergm | Evaluation on an induced subgraph | operator |
ergm | A sum (or an arbitrary linear combination) of one or more formulas | operator | |
ergm | Evaluation on symmetrized (undirected) network | directed operator |
Term | bin | bip | dir | dyad-indep | op | val | undir |
---|---|---|---|---|---|---|---|
b1cov | ✔ | ✔ | ✔ | ✔ | ✔ | ||
b1degree | ✔ | ✔ | ✔ | ||||
b1factor | ✔ | ✔ | ✔ | ✔ | ✔ | ||
b1nodematch | ✔ | ✔ | ✔ | ✔ | |||
b2concurrent | ✔ | ✔ | ✔ | ||||
b2cov | ✔ | ✔ | ✔ | ✔ | ✔ | ||
b2degree | ✔ | ✔ | ✔ | ||||
b2factor | ✔ | ✔ | ✔ | ✔ | ✔ | ||
b2nodematch | ✔ | ✔ | ✔ | ✔ | |||
degree | ✔ | ✔ | |||||
diff | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |
edgecov | ✔ | ✔ | ✔ | ✔ | ✔ | ||
gwdegree | ✔ | ✔ | |||||
idegree | ✔ | ✔ | |||||
isolates | ✔ | ✔ | ✔ | ||||
mm | ✔ | ✔ | ✔ | ✔ | ✔ | ||
mutual | ✔ | ✔ | ✔ | ||||
nodecov | ✔ | ✔ | ✔ | ✔ | ✔ | ||
nodefactor | ✔ | ✔ | ✔ | ✔ | ✔ | ||
nodeicov | ✔ | ✔ | ✔ | ||||
nodeifactor | ✔ | ✔ | ✔ | ✔ | |||
nodematch | ✔ | ✔ | ✔ | ✔ | ✔ | ||
nodemix | ✔ | ✔ | ✔ | ✔ | ✔ | ||
odegree | ✔ | ✔ | |||||
triangle | ✔ | ✔ | ✔ |
Term | bin | bip | dir | dyad-indep | val | undir |
---|---|---|---|---|---|---|
B | ✔ | |||||
Curve | ✔ | ✔ | ||||
Exp | ✔ | ✔ | ||||
F | ✔ | |||||
For | ✔ | |||||
Label | ✔ | ✔ | ||||
Log | ✔ | ✔ | ||||
NodematchFilter | ✔ | |||||
Offset | ✔ | |||||
Prod | ✔ | ✔ | ||||
Project | ✔ | ✔ | ||||
S | ✔ | |||||
Sum | ✔ | ✔ | ||||
Symmetrize | ✔ | ✔ |
Term | dir | dyad-indep | quant nodal attr | undir | bin | val | cat nodal attr | curved | triad rel | op | bip | freq | nneg | quant dyad attr | cat dyad attr |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
absdiff | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |||||||||
absdiffcat | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |||||||||
altkstar | ✔ | ✔ | ✔ | ✔ | |||||||||||
asymmetric | ✔ | ✔ | ✔ | ✔ | |||||||||||
atleast | ✔ | ✔ | ✔ | ✔ | |||||||||||
atmost | ✔ | ✔ | ✔ | ✔ | |||||||||||
attrcov | ✔ | ✔ | ✔ | ✔ | |||||||||||
B | ✔ | ✔ | |||||||||||||
b1concurrent | ✔ | ✔ | ✔ | ✔ | |||||||||||
b1cov | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
b1covrange | ✔ | ✔ | ✔ | ||||||||||||
b1degrange | ✔ | ✔ | ✔ | ||||||||||||
b1degree | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
b1dsp | ✔ | ✔ | ✔ | ||||||||||||
b1factor | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
b1factordistinct | ✔ | ✔ | ✔ | ||||||||||||
b1mindegree | ✔ | ✔ | ✔ | ||||||||||||
b1nodematch | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |||||||||
b1sociality | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
b1star | ✔ | ✔ | ✔ | ✔ | |||||||||||
b1starmix | ✔ | ✔ | ✔ | ✔ | |||||||||||
b1twostar | ✔ | ✔ | ✔ | ✔ | |||||||||||
b2concurrent | ✔ | ✔ | ✔ | ✔ | |||||||||||
b2cov | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
b2covrange | ✔ | ✔ | ✔ | ||||||||||||
b2degrange | ✔ | ✔ | ✔ | ||||||||||||
b2degree | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
b2dsp | ✔ | ✔ | ✔ | ||||||||||||
b2factor | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
b2factordistinct | ✔ | ✔ | ✔ | ||||||||||||
b2mindegree | ✔ | ✔ | ✔ | ||||||||||||
b2nodematch | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |||||||||
b2sociality | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
b2star | ✔ | ✔ | ✔ | ✔ | |||||||||||
b2starmix | ✔ | ✔ | ✔ | ✔ | |||||||||||
b2twostar | ✔ | ✔ | ✔ | ✔ | |||||||||||
balance | ✔ | ✔ | ✔ | ✔ | |||||||||||
coincidence | ✔ | ✔ | ✔ | ||||||||||||
concurrent | ✔ | ✔ | ✔ | ||||||||||||
concurrentties | ✔ | ✔ | ✔ | ||||||||||||
ctriple | ✔ | ✔ | ✔ | ✔ | |||||||||||
Curve | ✔ | ✔ | ✔ | ||||||||||||
cycle | ✔ | ✔ | ✔ | ||||||||||||
cyclicalties | ✔ | ✔ | ✔ | ✔ | |||||||||||
cyclicalweights | ✔ | ✔ | ✔ | ✔ | |||||||||||
degcor | ✔ | ✔ | |||||||||||||
degcrossprod | ✔ | ✔ | |||||||||||||
degrange | ✔ | ✔ | ✔ | ||||||||||||
degree | ✔ | ✔ | ✔ | ✔ | |||||||||||
degree1.5 | ✔ | ✔ | |||||||||||||
density | ✔ | ✔ | ✔ | ✔ | |||||||||||
diff | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |||||||
dsp | ✔ | ✔ | |||||||||||||
dyadcov | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
edgecov | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
edges | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
equalto | ✔ | ✔ | ✔ | ✔ | |||||||||||
esp | ✔ | ✔ | |||||||||||||
Exp | ✔ | ✔ | ✔ | ||||||||||||
F | ✔ | ✔ | |||||||||||||
For | ✔ | ✔ | |||||||||||||
greaterthan | ✔ | ✔ | ✔ | ✔ | |||||||||||
gwb1degree | ✔ | ✔ | ✔ | ✔ | |||||||||||
gwb1dsp | ✔ | ✔ | ✔ | ✔ | |||||||||||
gwb2degree | ✔ | ✔ | ✔ | ✔ | |||||||||||
gwb2dsp | ✔ | ✔ | ✔ | ✔ | |||||||||||
gwdegree | ✔ | ✔ | ✔ | ✔ | |||||||||||
gwdsp | ✔ | ✔ | |||||||||||||
gwesp | ✔ | ✔ | |||||||||||||
gwidegree | ✔ | ✔ | ✔ | ||||||||||||
gwnsp | ✔ | ✔ | |||||||||||||
gwodegree | ✔ | ✔ | ✔ | ||||||||||||
hamming | ✔ | ✔ | ✔ | ✔ | |||||||||||
idegrange | ✔ | ✔ | ✔ | ||||||||||||
idegree | ✔ | ✔ | ✔ | ✔ | |||||||||||
idegree1.5 | ✔ | ✔ | |||||||||||||
ininterval | ✔ | ✔ | ✔ | ✔ | |||||||||||
intransitive | ✔ | ✔ | ✔ | ||||||||||||
isolatededges | ✔ | ✔ | ✔ | ||||||||||||
isolates | ✔ | ✔ | ✔ | ✔ | |||||||||||
istar | ✔ | ✔ | ✔ | ||||||||||||
kstar | ✔ | ✔ | ✔ | ||||||||||||
Label | ✔ | ✔ | ✔ | ||||||||||||
localtriangle | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
Log | ✔ | ✔ | ✔ | ||||||||||||
m2star | ✔ | ✔ | |||||||||||||
meandeg | ✔ | ✔ | ✔ | ✔ | |||||||||||
mm | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
mutual | ✔ | ✔ | ✔ | ✔ | |||||||||||
nearsimmelian | ✔ | ✔ | ✔ | ||||||||||||
nodecov | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
nodecovar | ✔ | ✔ | |||||||||||||
nodecovrange | ✔ | ✔ | ✔ | ✔ | |||||||||||
nodefactor | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
nodefactordistinct | ✔ | ✔ | ✔ | ✔ | |||||||||||
nodeicov | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
nodeicovar | ✔ | ✔ | |||||||||||||
nodeicovrange | ✔ | ✔ | ✔ | ||||||||||||
nodeifactor | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |||||||||
nodeifactordistinct | ✔ | ✔ | ✔ | ||||||||||||
nodematch | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
NodematchFilter | ✔ | ✔ | |||||||||||||
nodemix | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||
nodeocov | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
nodeocovar | ✔ | ✔ | |||||||||||||
nodeocovrange | ✔ | ✔ | ✔ | ||||||||||||
nodeofactor | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
nodeofactordistinct | ✔ | ✔ | ✔ | ||||||||||||
nsp | ✔ | ✔ | |||||||||||||
odegrange | ✔ | ✔ | ✔ | ||||||||||||
odegree | ✔ | ✔ | ✔ | ✔ | |||||||||||
odegree1.5 | ✔ | ✔ | |||||||||||||
Offset | ✔ | ✔ | |||||||||||||
opentriad | ✔ | ✔ | ✔ | ||||||||||||
ostar | ✔ | ✔ | ✔ | ||||||||||||
Prod | ✔ | ✔ | ✔ | ||||||||||||
Project | ✔ | ✔ | ✔ | ||||||||||||
receiver | ✔ | ✔ | ✔ | ✔ | |||||||||||
S | ✔ | ✔ | |||||||||||||
sender | ✔ | ✔ | ✔ | ✔ | |||||||||||
simmelian | ✔ | ✔ | ✔ | ||||||||||||
simmelianties | ✔ | ✔ | ✔ | ||||||||||||
smalldiff | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
smallerthan | ✔ | ✔ | ✔ | ✔ | |||||||||||
sociality | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
sum | ✔ | ✔ | ✔ | ||||||||||||
Sum | ✔ | ✔ | ✔ | ||||||||||||
Symmetrize | ✔ | ✔ | ✔ | ||||||||||||
threetrail | ✔ | ✔ | ✔ | ✔ | |||||||||||
transitive | ✔ | ✔ | ✔ | ||||||||||||
transitiveties | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
transitiveweights | ✔ | ✔ | ✔ | ✔ | ✔ | ||||||||||
triadcensus | ✔ | ✔ | ✔ | ✔ | |||||||||||
triangle | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |||||||||
tripercent | ✔ | ✔ | ✔ | ✔ | |||||||||||
ttriple | ✔ | ✔ | ✔ | ✔ | |||||||||||
twopath | ✔ | ✔ | ✔ |
Krivitsky P. N., Hunter D. R., Morris M., Klumb C. (2021). "ergm 4.0: New features and improvements." arXiv:2106.04997. https://arxiv.org/abs/2106.04997
Bomiriya, R. P, Bansal, S., and Hunter, D. R. (2014). Modeling Homophily in ERGMs for Bipartite Networks. Submitted.
Butts, CT. (2008). "A Relational Event Framework for Social Action." Sociological Methodology, 38(1).
Davis, J.A. and Leinhardt, S. (1972). The Structure of Positive Interpersonal Relations in Small Groups. In J. Berger (Ed.), Sociological Theories in Progress, Volume 2, 218–251. Boston: Houghton Mifflin.
Holland, P. W. and S. Leinhardt (1981). An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association, 76: 33–50.
Hunter, D. R. and M. S. Handcock (2006). Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15: 565–583.
Hunter, D. R. (2007). Curved exponential family models for social networks. Social Networks, 29: 216–230.
Krackhardt, D. and Handcock, M. S. (2007). Heider versus Simmel: Emergent Features in Dynamic Structures. Lecture Notes in Computer Science, 4503, 14–27.
Krivitsky P. N. (2012). Exponential-Family Random Graph Models for Valued Networks. Electronic Journal of Statistics, 2012, 6, 1100-1128. doi:10.1214/12-EJS696
Robins, G; Pattison, P; and Wang, P. (2009). "Closure, Connectivity, and Degree Distributions: Exponential Random Graph (p*) Models for Directed Social Networks." Social Networks, 31:105-117.
Snijders T. A. B., G. G. van de Bunt, and C. E. G. Steglich. Introduction to Stochastic Actor-Based Models for Network Dynamics. Social Networks, 2010, 32(1), 44-60. doi:10.1016/j.socnet.2009.02.004
Morris M, Handcock MS, and Hunter DR. Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects. Journal of Statistical Software, 2008, 24(4), 1-24. doi:10.18637/jss.v024.i04
Snijders, T. A. B., P. E. Pattison, G. L. Robins, and M. S. Handcock (2006). New specifications for exponential random graph models, Sociological Methodology, 36(1): 99-153.
ergm package, search.ergmTerms, ergm, network, %v%, %n%
## Not run:
ergm(flomarriage ~ kstar(1:2) + absdiff("wealth") + triangle)
ergm(molecule ~ edges + kstar(2:3) + triangle +
       nodematch("atomic type",diff=TRUE) +
       triangle + absdiff("atomic type"))
## End(Not run)
This term adds one network statistic to the model for each element in d, where the $i$th such statistic equals the number of edges in the network with exactly d[i] shared partners.
# binary: desp(d, type="OTP")
# binary: esp(d, type="OTP")
d |
a vector of distinct integers |
type |
A string indicating the type of shared partner or path to be considered for directed networks: |
While there is only one shared partner configuration in the undirected case, nine distinct configurations are possible for directed graphs, selected using the type argument. Currently, terms may be defined with respect to five of these configurations; they are defined here as follows (using terminology from Butts (2008) and the relevent package):
Outgoing Two-path ("OTP"): vertex $k$ is an OTP shared partner of ordered pair $(i,j)$ iff $i \to k \to j$. Also known as "transitive shared partner".
Incoming Two-path ("ITP"): vertex $k$ is an ITP shared partner of ordered pair $(i,j)$ iff $j \to k \to i$. Also known as "cyclical shared partner".
Reciprocated Two-path ("RTP"): vertex $k$ is an RTP shared partner of ordered pair $(i,j)$ iff $i \leftrightarrow k \leftrightarrow j$.
Outgoing Shared Partner ("OSP"): vertex $k$ is an OSP shared partner of ordered pair $(i,j)$ iff $i \to k$ and $j \to k$.
Incoming Shared Partner ("ISP"): vertex $k$ is an ISP shared partner of ordered pair $(i,j)$ iff $k \to i$ and $k \to j$.
By default, outgoing two-paths ("OTP") are calculated. Note that Robins et al. (2009) define closely related statistics to several of the above, using slightly different terminology.
This term takes an additional term option (see options?ergm), cache.sp, controlling whether the implementation will cache the number of shared partners for each dyad in the network; this is usually enabled by default.
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, binary
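As a rough illustration of the five configurations, the following base-R sketch counts shared partners from a small adjacency matrix. The toy network and the helper name count_sp_stat are made up for this example, and the matrix products are one reading of the shared-partner definitions (OTP as $i \to k \to j$, and so on), not the package's implementation.

```r
# Toy directed network on 4 vertices (hypothetical data).
# A[i, j] == 1 means there is an edge i -> j.
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 1), c(1, 4), c(4, 3))
A[edges] <- 1

# Mutual (reciprocated) ties only.
M <- A * t(A)

# Entry [i, j] of each matrix is the number of shared partners k
# of the ordered pair (i, j) under the given type.
sp <- list(
  OTP = A %*% A,      # i -> k -> j
  ITP = t(A %*% A),   # j -> k -> i
  RTP = M %*% M,      # i <-> k <-> j
  OSP = A %*% t(A),   # i -> k and j -> k
  ISP = t(A) %*% A    # k -> i and k -> j
)

# Number of edges (i, j) with exactly d[1], d[2], ... shared partners.
count_sp_stat <- function(S, A, d) {
  on_edges <- S[A == 1]
  sapply(d, function(x) sum(on_edges == x))
}

desp_otp <- count_sp_stat(sp$OTP, A, 0:2)
desp_otp
```

With ergm loaded and the same edges stored in a directed network object, summary(net ~ desp(0:2, type="OTP")) would be expected to report the same three counts; net is a placeholder name here.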
Evaluates the terms specified in formula and exponentiates them with base $e$.
# binary: Exp(formula)
# valued: Exp(formula)
formula |
a one-sided |
ergmTerm
for index of model terms currently visible to the package.
operator, binary, valued
Evaluates the given formula on a network constructed by taking $y$ and removing any edges for which $f_{i,j}(y_{i,j}) = 0$.
# binary: F(formula, filter)
formula |
a one-sided |
filter |
must contain one binary
Formally, this means that it is expressable as
where for all |
ergmTerm
for index of model terms currently visible to the package.
operator, binary
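To make the filtering step concrete, here is a minimal base-R sketch of the idea, assuming the filter keeps only those edges whose dyadwise filter value is nonzero. The toy network, the group vector, and the same-group filter are hypothetical, and this is not how the package implements F().

```r
# Toy undirected network on 4 vertices (hypothetical data).
A <- matrix(0, 4, 4)
el <- rbind(c(1, 2), c(1, 3), c(2, 4), c(3, 4))
A[el] <- 1
A <- A + t(A)  # symmetrize

# Hypothetical vertex attribute used by the filter.
group <- c("a", "a", "b", "b")

# Dyadwise filter value: 1 if both endpoints share a group, else 0.
f <- outer(group, group, "==") * 1

# Filtered network: edges with a zero filter value are removed.
A_filtered <- A * (f != 0)

# Evaluate a statistic (here, the edge count) on both networks.
edges_all <- sum(A) / 2
edges_filtered <- sum(A_filtered) / 2
c(all = edges_all, filtered = edges_filtered)
```

In ergm itself, a comparable specification would presumably be written along the lines of F(~edges, ~nodematch("group")), with the filter formula containing a single binary dyad-independent term.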
This data set represents a simulation of a directed in-school friendship network. The network is named faux.desert.high.
data(faux.desert.high)
faux.desert.high is a network object with 107 vertices (students, in this case) and 439 directed edges (friendship nominations). To obtain additional summary information about it, type summary(faux.desert.high).
The vertex attributes are Grade, Sex, and Race. The Grade attribute has values 7 through 12, indicating each student's grade in school. The Race attribute is based on the answers to two questions, one on Hispanic identity and one on race, and takes six possible values: White (non-Hisp.), Black (non-Hisp.), Hispanic, Asian (non-Hisp.), Native American, and Other (non-Hisp.).
If the source of the data set does not specify otherwise, this data set is protected by the Creative Commons License https://creativecommons.org/licenses/by-nc-nd/2.5/.
When publishing results obtained using this data set, the original authors (Resnick et al., 1997) should be cited. In addition, this package should be cited as:
Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2003. statnet: Software tools for the Statistical Modeling of Network Data. https://statnet.org.
The data set is a simulation based upon an ergm model fit to data from one school community from the AddHealth Study, Wave I (Resnick et al., 1997). It was constructed as follows:
The school in question (a single school with 7th through 12th grades) was selected from the Add Health "structure files." Documentation on these files can be found here: https://addhealth.cpc.unc.edu/documentation/codebooks/.
The structure file contains directed out-ties representing each instance of a student who named another student as a friend. Students could nominate up to 5 male and 5 female friends. Note that registered students who did not take the AddHealth survey or who were not listed by name on the schools' student roster are not included in the structure files. In addition, we removed any students with missing values for race, grade or sex.
The following ergm() specification was fit to the original data (with code updated for modern syntax):
desert.fit <- ergm(original.net ~ edges + mutual + absdiff("grade") +
                     nodefactor("race", base=5) + nodefactor("grade", base=3) +
                     nodefactor("sex") + nodematch("race", diff = TRUE) +
                     nodematch("grade", diff = TRUE) + nodematch("sex", diff = FALSE) +
                     idegree(0:1) + odegree(0:1) + gwesp(0.1, fixed = TRUE),
                   constraints = ~bd(maxout=10),
                   control = control.ergm(MCMLE.steplength = .25, MCMC.burnin = 100000,
                                          MCMC.interval = 10000, MCMC.samplesize = 2500,
                                          MCMLE.maxit = 100),
                   verbose = TRUE)
Then the faux.desert.high dataset was created by simulating a single network from the above model fit:
faux.desert.high <- simulate(desert.fit, nsim=1, control=snctrl(MCMC.burnin=1e+8), constraints = ~edges)
Resnick M.D., Bearman, P.S., Blum R.W. et al. (1997). Protecting adolescents from harm. Findings from the National Longitudinal Study on Adolescent Health, Journal of the American Medical Association, 278: 823-32.
network, plot.network(), ergm(), faux.desert.high, faux.mesa.high, faux.magnolia.high
This data set represents a simulation of a directed in-school friendship network. The network is named faux.dixon.high.
data(faux.dixon.high)
faux.dixon.high is a network object with 248 vertices (students, in this case) and 1197 directed edges (friendship nominations). To obtain additional summary information about it, type summary(faux.dixon.high).
The vertex attributes are Grade, Sex, and Race. The Grade attribute has values 7 through 12, indicating each student's grade in school. The Race attribute is based on the answers to two questions, one on Hispanic identity and one on race, and takes six possible values: White (non-Hisp.), Black (non-Hisp.), Hispanic, Asian (non-Hisp.), Native American, and Other (non-Hisp.).
If the source of the data set does not specify otherwise, this data set is protected by the Creative Commons License https://creativecommons.org/licenses/by-nc-nd/2.5/.
When publishing results obtained using this data set, the original authors (Resnick et al., 1997) should be cited. In addition, this package should be cited as:
Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2003. statnet: Software tools for the Statistical Modeling of Network Data. https://statnet.org.
The data set is a simulation based upon an ergm model fit to data from one school community from the AddHealth Study, Wave I (Resnick et al., 1997). It was constructed as follows:
The school in question (a single school with 7th through 12th grades) was selected from the Add Health "structure files." Documentation on these files can be found here: https://addhealth.cpc.unc.edu/documentation/codebooks/.
The structure file contains directed out-ties representing each instance of a student who named another student as a friend. Students could nominate up to 5 male and 5 female friends. Note that registered students who did not take the AddHealth survey or who were not listed by name on the schools' student roster are not included in the structure files. In addition, we removed any students with missing values for race, grade or sex.
The following ergm() specification was fit to the original data (with code updated for modern syntax):
dixon.fit <- ergm(original.net ~ edges + mutual + absdiff("grade") +
                    nodefactor("race", base=5) + nodefactor("grade", base=3) +
                    nodefactor("sex") + nodematch("race", diff = TRUE) +
                    nodematch("grade", diff = TRUE) + nodematch("sex", diff = FALSE) +
                    idegree(0:1) + odegree(0:1) + gwesp(0.1, fixed = TRUE),
                  constraints = ~bd(maxout=10),
                  control = control.ergm(MCMLE.steplength = .25, MCMC.burnin = 100000,
                                         MCMC.interval = 10000, MCMC.samplesize = 2500,
                                         MCMLE.maxit = 100),
                  verbose = TRUE)
Then the faux.dixon.high dataset was created by simulating a single network from the above model fit:
faux.dixon.high <- simulate(dixon.fit, nsim=1, control=snctrl(MCMC.burnin=1e+8), constraints = ~edges)
Resnick M.D., Bearman, P.S., Blum R.W. et al. (1997). Protecting adolescents from harm. Findings from the National Longitudinal Study on Adolescent Health, Journal of the American Medical Association, 278: 823-32.
network, plot.network(), ergm(), faux.desert.high, faux.mesa.high, faux.magnolia.high
This data set represents a simulation of an in-school friendship network. The network is named faux.magnolia.high because the school communities on which it is based are large and located in the southern US.
data(faux.magnolia.high)
faux.magnolia.high is a network object with 1461 vertices (students, in this case) and 974 undirected edges (mutual friendships). To obtain additional summary information about it, type summary(faux.magnolia.high).
The vertex attributes are Grade, Sex, and Race. The Grade attribute has values 7 through 12, indicating each student's grade in school. The Race attribute is based on the answers to two questions, one on Hispanic identity and one on race, and takes six possible values: White (non-Hisp.), Black (non-Hisp.), Hispanic, Asian (non-Hisp.), Native American, and Other (non-Hisp.).
If the source of the data set does not specify otherwise, this data set is protected by the Creative Commons License https://creativecommons.org/licenses/by-nc-nd/2.5/.
When publishing results obtained using this data set, the original authors (Resnick et al., 1997) should be cited. In addition, this package should be cited as:
Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2003. statnet: Software tools for the Statistical Modeling of Network Data. https://statnet.org.
The data set is based upon a model fit to data from two school communities from the AddHealth Study, Wave I (Resnick et al., 1997). It was constructed as follows:
The two schools in question (a junior and senior high school in the same community) were combined into a single network dataset. Students who did not take the AddHealth survey or who were not listed on the schools' student rosters were eliminated, then an undirected link was established between any two individuals who both named each other as a friend. All missing race, grade, and sex values were replaced by a random draw with weights determined by the size of the attribute classes in the school.
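The rule for turning directed nominations into undirected friendships can be sketched in base R as follows. The three-student nomination matrix is hypothetical and only illustrates the "both named each other" rule; it is not the original preprocessing code.

```r
# Hypothetical directed nominations: A[i, j] == 1 if i named j as a friend.
A <- matrix(0, 3, 3)
A[rbind(c(1, 2), c(2, 1), c(1, 3))] <- 1

# Keep a tie only when both students named each other.
mutual <- A * t(A)

# Each undirected tie appears twice in the symmetric matrix.
n_mutual <- sum(mutual) / 2
n_mutual
```

Here students 1 and 2 nominate each other and so remain tied, while the one-way nomination from 1 to 3 is dropped.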
The following ergm() specification was fit to the original data:
magnolia.fit <- ergm(magnolia ~ edges + nodematch("Grade", diff = TRUE) +
                       nodematch("Race", diff = TRUE) + nodematch("Sex", diff = FALSE) +
                       absdiff("Grade") + gwesp(0.25, fixed = TRUE),
                     control = control.ergm(MCMC.burnin = 10000, MCMC.interval = 1000,
                                            MCMLE.maxit = 25, MCMC.samplesize = 2500,
                                            MCMLE.steplength = 0.25))
Then the faux.magnolia.high dataset was created by simulating a single network from the above model fit:
faux.magnolia.high <- simulate (magnolia.fit, nsim=1, control = snctrl(MCMC.burnin=100000000), constraints = ~edges)
Resnick M.D., Bearman, P.S., Blum R.W. et al. (1997). Protecting adolescents from harm. Findings from the National Longitudinal Study on Adolescent Health, Journal of the American Medical Association, 278: 823-32.
network, plot.network(), ergm(), faux.mesa.high
This data set (formerly called “fauxhigh”) represents a simulation of an in-school friendship network. The network is named faux.mesa.high because the school community on which it is based is in the rural western US, with a student body that is largely Hispanic and Native American.
data(faux.mesa.high)
faux.mesa.high is a network object with 205 vertices (students, in this case) and 203 undirected edges (mutual friendships). To obtain additional summary information about it, type summary(faux.mesa.high).
The vertex attributes are Grade, Sex, and Race. The Grade attribute has values 7 through 12, indicating each student's grade in school. The Race attribute is based on the answers to two questions, one on Hispanic identity and one on race, and takes six possible values: White (non-Hisp.), Black (non-Hisp.), Hispanic, Asian (non-Hisp.), Native American, and Other (non-Hisp.).
If the source of the data set does not specify otherwise, this data set is protected by the Creative Commons License https://creativecommons.org/licenses/by-nc-nd/2.5/.
When publishing results obtained using this data set, the original authors (Resnick et al., 1997) should be cited. In addition, this package should be cited as:
Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2003. statnet: Software tools for the Statistical Modeling of Network Data. https://statnet.org.
The data set is based upon a model fit to data from one school community from the AddHealth Study, Wave I (Resnick et al., 1997). It was constructed as follows:
A vector representing the sex of each student in the school was randomly re-ordered. The same was done with the students' response to questions on race and grade. These three attribute vectors were permuted independently. Missing values for each were randomly assigned with weights determined by the size of the attribute classes in the school.
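The independent permutation step can be illustrated with a short base-R sketch. The six-student attribute vectors and the seed are invented for the example; the point is only that each vector is shuffled separately, so the marginal distributions survive while the link between attributes and individuals is broken.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical attribute vectors for six students.
sex   <- c("F", "M", "F", "M", "F", "M")
grade <- c(7, 8, 9, 10, 11, 12)
race  <- c("White", "Hisp", "NatAm", "Hisp", "NatAm", "White")

# Each vector gets its own call to sample(), so the three
# permutations are independent of one another.
sex_perm   <- sample(sex)
grade_perm <- sample(grade)
race_perm  <- sample(race)

# The marginal distribution of each attribute is unchanged.
table(sex_perm)
```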
The following ergm() specification was used to fit a model to the original data:
~ edges + nodefactor("Grade") + nodefactor("Race") + nodefactor("Sex") +
  nodematch("Grade",diff=TRUE) + nodematch("Race",diff=TRUE) +
  nodematch("Sex",diff=FALSE) +
  gwdegree(1.0,fixed=TRUE) + gwesp(1.0,fixed=TRUE) + gwdsp(1.0,fixed=TRUE)
The resulting model fit was then applied to a network with actors possessing the permuted attributes and with the same number of edges as in the original data.
The processes for handling missing data and defining the race attribute are described in Hunter, Goodreau & Handcock (2008).
Hunter D.R., Goodreau S.M. and Handcock M.S. (2008). Goodness of Fit of Social Network Models, Journal of the American Statistical Association.
Resnick M.D., Bearman, P.S., Blum R.W. et al. (1997). Protecting adolescents from harm. Findings from the National Longitudinal Study on Adolescent Health, Journal of the American Medical Association, 278: 823-32.
network, plot.network(), ergm(), faux.magnolia.high
The generic fix.curved converts an ergm object or formula of a model with curved terms to the variant in which the curved parameters are fixed. Note that each term has to be treated as a special case.
fix.curved(object, ...)

## S3 method for class 'ergm'
fix.curved(object, ...)

## S3 method for class 'formula'
fix.curved(object, theta, ...)
object |
An |
... |
Unused at this time. |
theta |
Curved model parameter configuration. |
Some ERGM terms such as gwesp and gwdegree have two forms: a curved form, for which their decay or similar parameters are to be estimated, and whose canonical statistic is a vector of the term's components (esp(1), esp(2), ... and degree(1), degree(2), ..., respectively), and a "fixed" form where the decay or similar parameters are fixed, and whose canonical statistic is just the term itself. It is often desirable to fit a model estimating the curved parameters but simulate the "fixed" statistic.
This function thus takes in a fit or a formula and performs this mapping, returning a "fixed" model and parameter specification. It only works for curved ERGM terms included with the ergm package. It does not work with curved terms not included in ergm.
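The relationship between the two forms can be sketched numerically for gwesp. The example below assumes the geometrically weighted form described in Hunter (2007), in which the fixed statistic is a weighted sum of the esp(k) counts with weights $e^{\theta}\{1 - (1 - e^{-\theta})^k\}$ for decay $\theta$. The function name gw_weights and the counts are hypothetical, and this is not the package's internal mapping code.

```r
# Geometric weights applied to esp(1), esp(2), ... for a given decay.
gw_weights <- function(decay, k) {
  exp(decay) * (1 - (1 - exp(-decay))^k)
}

# Hypothetical counts of edges with 1, 2 and 3 shared partners.
esp_counts <- c(4, 2, 1)
decay <- 0.5

# Curved view: a vector of statistics, one per esp(k).
# Fixed view: a single statistic, the weighted sum of that vector.
gwesp_fixed <- sum(gw_weights(decay, seq_along(esp_counts)) * esp_counts)
gwesp_fixed
```

Under this form the weight on esp(1) is always 1, and larger decay values give more weight to edges with many shared partners.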
A list with the following components:
formula |
The "fixed" formula. |
theta |
The "fixed" parameter vector. |
data(sampson)
gest <- ergm(samplike ~ edges + gwesp(),
             control = control.ergm(MCMLE.maxit = 2))
summary(gest)
# A statistic for esp(1),...,esp(16)
simulate(gest, output = "stats")

tmp <- fix.curved(gest)
tmp
# A gwesp() statistic only
simulate(tmp$formula, coef = tmp$theta, output = "stats")
Preserve the dyad status in all but free.dyads.
# fixallbut(free.dyads)
free.dyads |
a two-column edge list, a |
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, undirected
Fix the dyads in fixed.dyads at their current value, preserve the edges in present, and preclude the edges in absent.
# fixedas(fixed.dyads, present, absent)
fixed.dyads , present , absent
|
a two-column edge list or a |
present and absent differ from fixed.dyads in that they check that the specified edges are in fact present and/or absent and stop with an error if not.
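The kind of check described for present and absent can be sketched in base R. The helper check_fixedas below is hypothetical and works on a plain adjacency matrix with two-column edge lists; it only mimics the described behaviour and is not the package's code.

```r
# Hypothetical helper mimicking the described checks.
check_fixedas <- function(A, present = NULL, absent = NULL) {
  if (!is.null(present) && any(A[present] != 1))
    stop("some edges listed in 'present' are not in the network")
  if (!is.null(absent) && any(A[absent] != 0))
    stop("some dyads listed in 'absent' are edges in the network")
  invisible(TRUE)
}

# Toy directed network with edges 1 -> 2 and 2 -> 3.
A <- matrix(0, 3, 3)
A[rbind(c(1, 2), c(2, 3))] <- 1

# Consistent specification: passes silently.
check_fixedas(A, present = rbind(c(1, 2)), absent = rbind(c(3, 1)))

# Listing a non-edge as present triggers the error path.
res <- tryCatch(check_fixedas(A, present = rbind(c(1, 3))),
                error = function(e) conditionMessage(e))
res
```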
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, undirected
This is a data set of marriage and business ties among Renaissance Florentine families. The data are originally from Padgett (1994) via UCINET and stored as a network object.
data(florentine)
Breiger & Pattison (1986), in their discussion of local role analysis, use a subset of data on the social relations among Renaissance Florentine families (person aggregates) collected by John Padgett from historical documents. The two relations are business ties (flobusiness - specifically, recorded financial ties such as loans, credits and joint partnerships) and marriage alliances (flomarriage).
As Breiger & Pattison point out, the original data are symmetrically coded. This is acceptable perhaps for marital ties, but is unfortunate for the financial ties (which are almost certainly directed). To remedy this, the financial ties can be recoded as directed relations using some external measure of power - for instance, a measure of wealth. Both graphs provide vertex information on (1) wealth: each family's net wealth in 1427 (in thousands of lira); (2) priorates: the number of priorates (seats on the civic council) held between 1282 and 1344; and (3) totalties: the total number of business or marriage ties in the total dataset of 116 families (see Breiger & Pattison (1986), p. 239).
Substantively, the data include families who were locked in a struggle for political control of the city of Florence around 1430. Two factions were dominant in this struggle: one revolved around the infamous Medicis (9), the other around the powerful Strozzis (15).
Padgett, John F. 1994. Marriage and Elite Structure in Renaissance Florence, 1282-1500. Paper delivered to the Social Science History Association.
Wasserman, S. and Faust, K. (1994) Social Network Analysis: Methods and Applications, Cambridge University Press, Cambridge, England.
Breiger R. and Pattison P. (1986). Cumulated social roles: The duality of persons and their algebras, Social Networks, 8, 215-256.
flo, network, plot.network, ergm
A for operator for terms
This operator evaluates the formula given to it, substituting the specified loop counter variable with each element in a sequence.
# binary: For(...)
... |
in any order,
|
Placeholders are specified in the style of foreach::foreach(), as VAR = SEQ. VAR can be any valid R variable name, and SEQ can be a vector, a list, a function of one argument, or a one-sided formula. The vector or list will be used directly, whereas a function will be called with the network as its argument to produce the list, and the formula will be used analogously to purrr::as_mapper(), its RHS evaluated in an environment in which the network itself will be accessible as . or .nw.
If more than one named expression is given, they will be expanded as one would expect in a nested for loop: earlier expressions will form the outer loops and later expressions the inner loops.
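The expansion order can be pictured with an ordinary nested loop in base R. The sketch below only builds the term labels as strings; the variable names pow and attr are chosen for illustration, and no ergm functions are called.

```r
# Expand two placeholders the way a nested for loop would:
# the first (pow) is the outer loop, the second (attr) the inner one.
terms <- character(0)
for (pow in 1:3) {
  for (attr in c("wealth", "priorates")) {
    terms <- c(terms, sprintf('absdiff("%s", pow=%d)', attr, pow))
  }
}
terms
```

The attribute changes fastest and the power slowest, which is the ordering one would expect from For(. = 1:3, a = c("wealth", "priorates"), ~absdiff(a, pow=.)).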
ergmTerm
for index of model terms currently visible to the package.
operator, binary
#
# The following are equivalent ways to compute differential
# homophily.
#

data(sampson)
(groups <- sort(unique(samplike%v%"group"))) # Sorted list of groups.

# The "normal" way:
summary(samplike ~ nodematch("group", diff=TRUE))

# One element at a time, specifying a list:
summary(samplike ~ For(~nodematch("group", levels=., diff=TRUE),
                       . = groups))

# One element at a time, specifying a function that returns a list:
summary(samplike ~ For(~nodematch("group", levels=., diff=TRUE),
                       . = function(nw) sort(unique(nw%v%"group"))))

# One element at a time, specifying a formula whose RHS expression
# returns a list:
summary(samplike ~ For(~nodematch("group", levels=., diff=TRUE),
                       . = ~sort(unique(.%v%"group"))))

#
# Multiple iterators are possible, in any order. Here, absdiff() is
# being computed for each combination of attribute and power.
#

data(florentine)

# The "normal" way:
summary(flomarriage ~ absdiff("wealth", pow=1) + absdiff("priorates", pow=1) +
          absdiff("wealth", pow=2) + absdiff("priorates", pow=2) +
          absdiff("wealth", pow=3) + absdiff("priorates", pow=3))

# With a loop; note that the attribute (a) is being iterated within
# power (.):
summary(flomarriage ~ For(. = 1:3,
                          a = c("wealth", "priorates"),
                          ~absdiff(a, pow=.)))
This is an example thought of by Steve Goodreau. It is a directed network of four nodes and five ties stored as a network object.
data(g4)
It is interesting because the maximum likelihood estimator of the model with out-degree 3 in it exists, but the maximum pseudolikelihood estimator does not.
Steve Goodreau
florentine, network, plot.network, ergm
data(g4)
summary(ergm(g4 ~ odegree(3), estimate="MPLE"))
summary(ergm(g4 ~ odegree(3), control=control.ergm(init=0)))
A multivariate version of coda's coda::geweke.diag(). Rather than comparing each mean independently, it compares them jointly. Note that it returns an htest object, not a geweke.diag object.
geweke.diag.mv(x, frac1 = 0.1, frac2 = 0.5, split.mcmc.list = FALSE, ...)
x |
an |
frac1 , frac2
|
the fraction at the start and, respectively, at the end of the sample to compare. |
split.mcmc.list |
when given an |
... |
additional arguments, passed on to
|
An object of class htest, inheriting from that returned by approx.hotelling.diff.test(), but with p-value considered to be 0 on insufficient sample size.
If approx.hotelling.diff.test() returns an error, then assume that burn-in is insufficient.
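The idea of a joint comparison between the start and the end of a sample can be sketched in base R. The example below is deliberately naive: it uses simulated independent draws as a stand-in for MCMC output, takes the windows from frac1 and frac2, and applies a plain chi-squared approximation to a Hotelling-type statistic. It ignores autocorrelation, which real MCMC output generally has and which the package's test is presumably designed to handle, so it should not be read as a reimplementation of geweke.diag.mv().

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for MCMC output: 1000 draws of 2 statistics.
x <- matrix(rnorm(2000), ncol = 2)
n <- nrow(x)
frac1 <- 0.1
frac2 <- 0.5

# Window at the start and window at the end of the sample.
first <- x[seq_len(floor(frac1 * n)), , drop = FALSE]
last  <- x[seq(n - floor(frac2 * n) + 1, n), , drop = FALSE]

# Naive joint comparison of the two mean vectors (independence assumed).
d  <- colMeans(first) - colMeans(last)
V  <- cov(first) / nrow(first) + cov(last) / nrow(last)
T2 <- drop(t(d) %*% solve(V) %*% d)
p_value <- pchisq(T2, df = ncol(x), lower.tail = FALSE)
p_value
```

A small p-value would suggest that the early and late parts of the chain have different means, which is the usual sign of insufficient burn-in.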
coda::geweke.diag()
, approx.hotelling.diff.test()
gof() calculates p-values for geodesic distance, degree, and reachability summaries to diagnose the goodness-of-fit of exponential family random graph models. See ergm() for more information on these models.
gof(object, ...)

## S3 method for class 'ergm'
gof(
  object,
  ...,
  coef = coefficients(object),
  GOF = NULL,
  constraints = object$constraints,
  control = control.gof.ergm(),
  verbose = FALSE
)

## S3 method for class 'formula'
gof(
  object,
  ...,
  coef = NULL,
  GOF = NULL,
  constraints = ~.,
  basis = eval_lhs.formula(object),
  control = NULL,
  unconditional = TRUE,
  verbose = FALSE
)

## S3 method for class 'gof'
print(x, ...)

## S3 method for class 'gof'
plot(
  x,
  ...,
  cex.axis = 0.7,
  plotlogodds = FALSE,
  main = "Goodness-of-fit diagnostics",
  normalize.reachability = FALSE,
  verbose = FALSE
)
object |
Either a formula or an |
... |
Additional arguments, to be passed to lower-level functions. |
coef |
When given either a formula or an object of class ergm,
|
GOF |
formula; a formula object, of the form |
constraints |
A one-sided formula specifying one or more constraints on
the support of the distribution of the networks being modeled. See the help
for similarly-named argument in |
control |
A list of control parameters for algorithm tuning,
typically constructed with |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
basis |
a value (usually a |
unconditional |
logical; if |
x |
an object of class |
cex.axis |
Character expansion of the axis labels relative to that for the plot. |
plotlogodds |
Plot the odds of a dyad having given characteristics (e.g., reachability, minimum geodesic distance, shared partners). This is an alternative to the probability of a dyad having the same property. |
main |
Title for the goodness-of-fit plots. |
normalize.reachability |
Should the reachability proportion be normalized to make it more comparable with the other geodesic distance proportions. |
A sample of graphs is randomly drawn from the specified model. The first
argument is typically the output of a call to ergm()
and the
model used for that call is the one fit.
For GOF = ~model
, the model's observed sufficient statistics are
plotted as quantiles of the simulated sample. In a good fit, the observed
statistics should be near the sample median (0.5).
By default, the sample consists of 100 simulated networks, but this sample
size (and many other settings) can be changed using the control
argument described above.
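The comparison described above, where an observed statistic is located within the simulated sample, can be sketched in base R. This is only an illustration under stated assumptions: `mc_summary()` and the simulated edge counts are hypothetical, and the two-sided Monte Carlo p-value shown here is one common convention that may differ in detail from what `gof()` reports.

```r
# Hypothetical helper (not part of ergm): locate an observed statistic
# within a sample of simulated values of the same statistic.
mc_summary <- function(obs, sim) {
  lower <- mean(sim <= obs)  # share of simulations at or below the observed value
  upper <- mean(sim >= obs)  # share of simulations at or above the observed value
  c(obs = obs, min = min(sim), mean = mean(sim), max = max(sim),
    quantile = lower,
    p.value = min(1, 2 * min(lower, upper)))  # two-sided Monte Carlo p-value
}

# Made-up simulated edge counts, for illustration only
sim_edges <- c(18, 20, 21, 19, 22, 20, 17, 23, 20, 21)
res <- mc_summary(20, sim_edges)
print(res)
```

An observed value near the middle of the simulated sample gives a quantile near 0.5 and a large p-value; a value outside the simulated range gives a p-value of 0.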
gof()
, gof.ergm()
, and
gof.formula()
return an object of class gof.ergm
, which inherits from class gof
. This
is a list of the tables of statistics and p-values. This is typically
plotted using
plot.gof()
.
gof(ergm)
: Perform simulation to evaluate goodness-of-fit for
a specific ergm()
fit.
gof(formula)
: Perform simulation to evaluate goodness-of-fit for
a model configuration specified by a formula
, coefficient,
constraints, and other settings.
print(gof)
: print.gof()
summarizes diagnostics such as the
degree distribution, geodesic distances, shared partner
distributions, and reachability for the goodness-of-fit of
exponential family random graph models. (summary.gof
is a deprecated
alias that may be repurposed in the future.)
plot(gof)
: plot.gof()
plots diagnostics such as the degree
distribution, geodesic distances, shared partner distributions, and
reachability for the goodness-of-fit of exponential family random graph
models.
For gof.ergm
and gof.formula
, default behavior depends on the
directedness of the network involved; if undirected then degree, espartners,
and distance are used as default properties to examine. If the network in
question is directed, “degree” in the above is replaced by idegree
and odegree.
ergm()
, network()
, simulate.ergm()
, summary.ergm()
data(florentine)
gest <- ergm(flomarriage ~ edges + kstar(2))
gest
summary(gest)

# test the gof.ergm function
gofflo <- gof(gest)
gofflo

# Plot all three on the same page
# with nice margins
par(mfrow=c(1,3))
par(oma=c(0.5,2,1,0.5))
plot(gofflo)

# And now the log-odds
plot(gofflo, plotlogodds=TRUE)

# Use the formula version of gof
gofflo2 <- gof(flomarriage ~ edges + kstar(2), coef=c(-1.6339, 0.0049))
plot(gofflo2)
Adds one statistic for each element of threshold
, each equal to the number of dyads whose values exceed the
corresponding element of threshold
.
# valued: greaterthan(threshold=0)
threshold |
a vector of numerical values |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, valued
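As a rough illustration of the statistic this term describes, the counts can be computed directly from a valued adjacency matrix in base R. The helper `greaterthan_stat()` and the example matrix are hypothetical; this is a sketch of the definition, not the package's implementation.

```r
# Hypothetical helper: for each threshold, count dyads whose value exceeds it.
greaterthan_stat <- function(m, threshold = 0, directed = TRUE) {
  # Directed: every ordered pair off the diagonal; undirected: upper triangle only
  vals <- if (directed) m[row(m) != col(m)] else m[upper.tri(m)]
  vapply(threshold, function(t) sum(vals > t), numeric(1))
}

# A small valued, directed network (rows send, columns receive)
m <- matrix(c(0, 2, 0,
              1, 0, 3,
              0, 5, 0), nrow = 3, byrow = TRUE)
gt <- greaterthan_stat(m, threshold = c(0, 2))
print(gt)
```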
This term adds one network statistic to the model equal to the weighted
degree distribution with decay controlled by the decay
parameter, which should be non-negative,
for nodes in the first mode of a bipartite network. The first mode of a bipartite network
object is sometimes known as the "actor" mode.
This term can only be used with undirected bipartite networks.
# binary: gwb1degree(decay, fixed=FALSE, attr=NULL, cutoff=30, levels=NULL)
decay |
nonnegative decay parameter for the first mode degree frequencies; required if |
fixed |
optional argument indicating
whether the |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
cutoff |
This optional argument sets the number of underlying degree terms
to use in computing the statistics when |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
bipartite, curved, undirected, binary
This term adds one network statistic to the model equal to the geometrically
weighted dyadwise shared partner distribution for dyads in the first bipartition with decay parameter
decay
, which should be non-negative. This term can only be used with bipartite networks.
# binary: gwb1dsp(decay=0, fixed=FALSE, cutoff=30)
decay |
nonnegative decay parameter for the shared partner counts; required if |
fixed |
optional argument indicating
whether the |
cutoff |
This optional argument sets the number of underlying b1dsp terms
to use in computing the statistics when |
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
ergmTerm
for index of model terms currently visible to the package.
bipartite, curved, undirected, binary
This term adds one network statistic to the model equal to the weighted degree distribution with decay controlled by the decay parameter, which should be non-negative, for nodes in the second mode of a bipartite network. The second mode of a bipartite network object is sometimes known as the "event" mode.
# binary: gwb2degree(decay, fixed=FALSE, attr=NULL, cutoff=30, levels=NULL)
decay |
nonnegative decay parameter for the second mode degree frequencies; required if |
fixed |
optional argument indicating
whether the |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
cutoff |
This optional argument sets the number of underlying degree terms
to use in computing the statistics when |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
bipartite, curved, undirected, binary
This term adds one network statistic to the model equal to the geometrically
weighted dyadwise shared partner distribution for dyads in the second bipartition with decay parameter
decay
, which should be non-negative. This term can only be used with bipartite networks.
# binary: gwb2dsp(decay=0, fixed=FALSE, cutoff=30)
decay |
nonnegative decay parameter for the shared partner counts; required if |
fixed |
optional argument indicating
whether the |
cutoff |
This optional argument sets the number of underlying b2dsp terms
to use in computing the statistics when |
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
ergmTerm
for index of model terms currently visible to the package.
bipartite, curved, undirected, binary
This term adds one network statistic to the model equal to the weighted
degree distribution with decay controlled by the decay
parameter, which should be non-negative.
# binary: gwdegree(decay, fixed=FALSE, attr=NULL, cutoff=30, levels=NULL)
decay |
nonnegative decay parameter for the degree frequencies; required if |
fixed |
optional argument indicating
whether the |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
cutoff |
This optional argument sets the number of underlying degree terms
to use in computing the statistics when |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
curved, frequently-used, undirected, binary
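To make the role of the decay parameter concrete, here is a base-R sketch of the geometrically weighted degree statistic as it is commonly written, exp(decay) * sum over nodes of [1 - (1 - exp(-decay))^degree]. The helper `gw_degree()` is hypothetical and the formula is stated as an assumption about the usual definition rather than taken from this page.

```r
# Hypothetical helper: geometrically weighted degree statistic for a
# vector of node degrees (assumed form, see lead-in).
gw_degree <- function(deg, decay) {
  k <- deg[deg > 0]  # isolates contribute nothing
  exp(decay) * sum(1 - (1 - exp(-decay))^k)
}

deg <- c(0, 1, 2, 3)
gw0  <- gw_degree(deg, decay = 0)   # counts nodes with at least one tie
gw10 <- gw_degree(deg, decay = 10)  # approaches the sum of degrees
print(c(gw0, gw10))
```

With decay 0 the statistic reduces to the number of non-isolated nodes; as decay grows it approaches the sum of the degrees, so the parameter controls how quickly additional ties on the same node are discounted.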
This term adds one network statistic to the model equal to the geometrically weighted dyadwise shared partner distribution with decay parameter decay
.
# binary: dgwdsp(decay, fixed=FALSE, cutoff=30, type="OTP")
# binary: gwdsp(decay, fixed=FALSE, cutoff=30, type="OTP")
decay |
nonnegative decay parameter for the shared partner or selected directed analogue count; required if |
fixed |
optional argument indicating
whether the |
cutoff |
This optional argument sets the number of underlying DSP terms
to use in computing the statistics when |
type |
A string indicating the type of shared partner or path to be considered for directed networks: |
While there is only one shared partner configuration in the undirected
case, nine distinct configurations are possible for directed graphs, selected
using the type
argument. Currently, terms may be defined with respect to
five of these configurations; they are defined here as follows (using
terminology from Butts (2008) and the relevent
package):
Outgoing Two-path ("OTP"): vertex $k$ is an OTP shared partner of ordered pair $(i,j)$ iff $i \to k \to j$. Also known as "transitive shared partner".
Incoming Two-path ("ITP"): vertex $k$ is an ITP shared partner of ordered pair $(i,j)$ iff $j \to k \to i$. Also known as "cyclical shared partner".
Reciprocated Two-path ("RTP"): vertex $k$ is an RTP shared partner of ordered pair $(i,j)$ iff $i \leftrightarrow k \leftrightarrow j$.
Outgoing Shared Partner ("OSP"): vertex $k$ is an OSP shared partner of ordered pair $(i,j)$ iff $i \to k$ and $j \to k$.
Incoming Shared Partner ("ISP"): vertex $k$ is an ISP shared partner of ordered pair $(i,j)$ iff $k \to i$ and $k \to j$.
By default, outgoing two-paths ("OTP"
) are calculated. Note that Robins et al. (2009)
define closely related statistics to several of the above, using slightly different terminology.
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
The GWDSP statistic is equal to the sum of GWNSP plus GWESP.
The decay
parameter was called alpha
prior to ergm
3.7.
ergmTerm
for index of model terms currently visible to the package.
directed, binary
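The relationship between the dyadwise, edgewise, and non-edgewise statistics (GWDSP is the sum of GWNSP and GWESP) can be checked on a small undirected graph in base R. This is a sketch only: `gw_sp()` is a hypothetical helper that applies the commonly used geometric weighting, exp(decay) * sum of [1 - (1 - exp(-decay))^sp], to a vector of shared partner counts.

```r
# Hypothetical helper: geometric weighting of shared partner counts
gw_sp <- function(sp, decay) exp(decay) * sum(1 - (1 - exp(-decay))^sp)

# Undirected example graph with edges 1-2, 2-3, 1-3, 3-4
A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4))] <- 1
A <- A + t(A)

ut  <- upper.tri(A)
sp  <- (A %*% A)[ut]  # shared partner count for every dyad i < j
tie <- A[ut] == 1     # which of those dyads are edges

decay <- 0.5
gwdsp <- gw_sp(sp, decay)        # all dyads
gwesp <- gw_sp(sp[tie], decay)   # dyads with an edge
gwnsp <- gw_sp(sp[!tie], decay)  # dyads without an edge
print(c(gwdsp = gwdsp, gwesp = gwesp, gwnsp = gwnsp))
```

Because every dyad either has an edge or does not, the dyadwise sum splits exactly into the edgewise and non-edgewise parts.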
This term adds a statistic equal to the geometrically weighted edgewise (not dyadwise) shared partner distribution with decay parameter decay
.
# binary: dgwesp(decay, fixed=FALSE, cutoff=30, type="OTP")
# binary: gwesp(decay, fixed=FALSE, cutoff=30, type="OTP")
decay |
nonnegative decay parameter for the shared partner or selected directed analogue count; required if |
fixed |
optional argument indicating
whether the |
cutoff |
This optional argument sets the number of underlying ESP terms
to use in computing the statistics when |
type |
A string indicating the type of shared partner or path to be considered for directed networks: |
While there is only one shared partner configuration in the undirected
case, nine distinct configurations are possible for directed graphs, selected
using the type
argument. Currently, terms may be defined with respect to
five of these configurations; they are defined here as follows (using
terminology from Butts (2008) and the relevent
package):
Outgoing Two-path ("OTP"): vertex $k$ is an OTP shared partner of ordered pair $(i,j)$ iff $i \to k \to j$. Also known as "transitive shared partner".
Incoming Two-path ("ITP"): vertex $k$ is an ITP shared partner of ordered pair $(i,j)$ iff $j \to k \to i$. Also known as "cyclical shared partner".
Reciprocated Two-path ("RTP"): vertex $k$ is an RTP shared partner of ordered pair $(i,j)$ iff $i \leftrightarrow k \leftrightarrow j$.
Outgoing Shared Partner ("OSP"): vertex $k$ is an OSP shared partner of ordered pair $(i,j)$ iff $i \to k$ and $j \to k$.
Incoming Shared Partner ("ISP"): vertex $k$ is an ISP shared partner of ordered pair $(i,j)$ iff $k \to i$ and $k \to j$.
By default, outgoing two-paths ("OTP"
) are calculated. Note that Robins et al. (2009)
define closely related statistics to several of the above, using slightly different terminology.
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
The decay
parameter was called alpha
prior to ergm
3.7.
ergmTerm
for index of model terms currently visible to the package.
directed, binary
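For a directed network, the five shared partner types selected by the type argument can be counted with matrix products in base R. The sketch below assumes the usual reading of the types (OTP: i -> k -> j; ITP: j -> k -> i; RTP: i <-> k <-> j; OSP: i -> k and j -> k; ISP: k -> i and k -> j) and an adjacency matrix `A` with `A[i, j] == 1` when i sends a tie to j. It illustrates the definitions and is not the package's implementation.

```r
# Directed example graph with ties 1 -> 2, 2 -> 3, 1 -> 3 (no self-loops)
A <- matrix(0, 3, 3)
A[rbind(c(1, 2), c(2, 3), c(1, 3))] <- 1

M <- A * t(A)  # mutual (reciprocated) ties only

# Entry [i, j] of each matrix is the number of shared partners k of the
# ordered pair (i, j); diagonal entries are not meaningful and are ignored.
sp <- list(
  OTP = A %*% A,      # i -> k -> j
  ITP = t(A %*% A),   # j -> k -> i
  RTP = M %*% M,      # i <-> k <-> j
  OSP = A %*% t(A),   # i -> k and j -> k
  ISP = t(A) %*% A    # k -> i and k -> j
)
print(sp$OTP)
```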
This term adds one network statistic to the model
equal to the weighted in-degree distribution with decay parameter
decay
, which should be non-negative. This
term can only be used with directed networks.
# binary: gwidegree(decay, fixed=FALSE, attr=NULL, cutoff=30, levels=NULL)
decay |
nonnegative decay parameter for the indegree frequencies; required if |
fixed |
optional argument indicating
whether the |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
cutoff |
This optional argument sets the number of underlying degree terms
to use in computing the statistics when |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
curved, directed, binary
This term is just like gwesp and gwdsp except it adds a statistic equal to the geometrically weighted nonedgewise (that is, over dyads that do not have an edge) shared partner distribution with decay parameter decay
.
# binary: dgwnsp(decay, fixed=FALSE, cutoff=30, type="OTP")
# binary: gwnsp(decay, fixed=FALSE, cutoff=30, type="OTP")
decay |
nonnegative decay parameter for the shared partner or selected directed analogue count; required if |
fixed |
optional argument indicating
whether the |
cutoff |
This optional argument sets the number of underlying NSP terms
to use in computing the statistics when |
type |
A string indicating the type of shared partner or path to be considered for directed networks: |
While there is only one shared partner configuration in the undirected
case, nine distinct configurations are possible for directed graphs, selected
using the type
argument. Currently, terms may be defined with respect to
five of these configurations; they are defined here as follows (using
terminology from Butts (2008) and the relevent
package):
Outgoing Two-path ("OTP"): vertex $k$ is an OTP shared partner of ordered pair $(i,j)$ iff $i \to k \to j$. Also known as "transitive shared partner".
Incoming Two-path ("ITP"): vertex $k$ is an ITP shared partner of ordered pair $(i,j)$ iff $j \to k \to i$. Also known as "cyclical shared partner".
Reciprocated Two-path ("RTP"): vertex $k$ is an RTP shared partner of ordered pair $(i,j)$ iff $i \leftrightarrow k \leftrightarrow j$.
Outgoing Shared Partner ("OSP"): vertex $k$ is an OSP shared partner of ordered pair $(i,j)$ iff $i \to k$ and $j \to k$.
Incoming Shared Partner ("ISP"): vertex $k$ is an ISP shared partner of ordered pair $(i,j)$ iff $k \to i$ and $k \to j$.
By default, outgoing two-paths ("OTP"
) are calculated. Note that Robins et al. (2009)
define closely related statistics to several of the above, using slightly different terminology.
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
The decay
parameter was called alpha
prior to ergm
3.7.
ergmTerm
for index of model terms currently visible to the package.
directed, binary
This term adds one network statistic to the model
equal to the weighted out-degree distribution with decay parameter
decay
, which should be non-negative. This
term can only be used with directed networks.
# binary: gwodegree(decay, fixed=FALSE, attr=NULL, cutoff=30, levels=NULL)
decay |
nonnegative decay parameter for the outdegree frequencies; required if |
fixed |
optional argument indicating
whether the |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
cutoff |
This optional argument sets the number of underlying degree terms
to use in computing the statistics when |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
curved, directed, binary
This constraint is currently broken. Do not use.
# hamming
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, undirected
This term adds one statistic to the model equal to the weighted or
unweighted Hamming distance of the network from the network specified by
x
. Unweighted Hamming distance is defined as the total
number of pairs (ordered or unordered, depending on whether the
network is directed or undirected) on which the two networks differ. If the
optional argument
cov
is specified, then the weighted Hamming
distance is computed instead, where each pair contributes a
pre-specified weight toward the distance when the two networks differ on
that pair.
# binary: hamming(x, cov, attrname=NULL)
x |
defaults to the observed
network, i.e., the network on the left side of the |
cov |
either a matrix of edgewise weights or a network |
attrname |
optional argument that provides the name of the edge attribute
to use for weight values when a network is specified in |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, binary
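The unweighted and weighted Hamming distances described for this term can be sketched in base R on adjacency matrices. The helper `hamming_stat()` and its arguments are hypothetical names for illustration; `w` plays the role of the pairwise weights.

```r
# Hypothetical helper: (weighted) Hamming distance between two networks
# given as adjacency matrices on the same node set.
hamming_stat <- function(a, b, w = NULL, directed = TRUE) {
  if (is.null(w)) w <- matrix(1, nrow(a), ncol(a))  # unweighted: every pair counts 1
  # Ordered pairs for directed networks, unordered pairs otherwise
  pairs <- if (directed) row(a) != col(a) else upper.tri(a)
  sum(w[pairs & (a != b)])
}

sym <- function(el, n) {  # build a symmetric adjacency matrix from an edge list
  m <- matrix(0, n, n); m[el] <- 1; m + t(m)
}
a <- sym(rbind(c(1, 2), c(2, 3)), 3)  # edges 1-2, 2-3
b <- sym(rbind(c(1, 2), c(1, 3)), 3)  # edges 1-2, 1-3

h_unw <- hamming_stat(a, b, directed = FALSE)
h_wt  <- hamming_stat(a, b, w = outer(1:3, 1:3, "+"), directed = FALSE)
print(c(h_unw, h_wt))
```

The two networks differ on the pairs {1,3} and {2,3}, so the unweighted distance is 2; with weights i + j those pairs contribute 4 and 5.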
This term adds one
network statistic to the model for each element of from
(or to
); the $i$th
such statistic equals the number of nodes in the network of in-degree
greater than or equal to
from[i]
but strictly less than to[i]
, i.e. with
in-edge count in semiopen interval [from,to)
.
This term can only be used with directed networks; for undirected
networks (bipartite and not)
see degrange
. For degrees of specific modes of bipartite
networks, see b1degrange
and b2degrange
. For
out-degrees, see odegrange
.
# binary: idegrange(from, to=+Inf, by=NULL, homophily=FALSE, levels=NULL)
from , to
|
vectors of distinct integers. If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, the two must have the same length. |
by , levels , homophily
|
the optional argument |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, binary
This term adds one network statistic to
the model for each element in d
; the $i$th such statistic equals
the number of nodes in the network of in-degree
d[i]
, i.e. the number
of nodes with exactly d[i]
in-edges.
This term can only be used with directed networks; for undirected networks
see degree
.
# binary: idegree(d, by=NULL, homophily=FALSE, levels=NULL)
d |
a vector of distinct integers |
by , levels , homophily
|
the optional argument |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, frequently-used, binary
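As an illustration of the in-degree counts described for this term, and of the half-open ranges [from, to) used by the related range term, here is a base-R sketch on an adjacency matrix. The helper names `idegree_stat()` and `idegrange_stat()` are hypothetical, and the by, homophily, and levels options are not modeled.

```r
# Directed example: A[i, j] == 1 when i sends a tie to j
A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(3, 2), c(2, 4), c(1, 4), c(3, 4))] <- 1
indeg <- colSums(A)  # in-degree of each node: 0, 2, 0, 3

# Hypothetical helper: number of nodes with in-degree exactly d[i]
idegree_stat <- function(indeg, d) {
  vapply(d, function(x) sum(indeg == x), numeric(1))
}

# Hypothetical helper: number of nodes with in-degree in [from[i], to[i])
idegrange_stat <- function(indeg, from, to = Inf) {
  as.numeric(mapply(function(f, t) sum(indeg >= f & indeg < t), from, to))
}

exact  <- idegree_stat(indeg, d = c(0, 2))
ranged <- idegrange_stat(indeg, from = c(0, 2), to = c(2, Inf))
print(list(exact = exact, ranged = ranged))
```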
This term adds one network statistic to the model equaling the sum over the actors of each actor's indegree taken to the 3/2 power (or, equivalently, multiplied by its square root). This term is analogous to the term of Snijders et al. (2010), equation (12). This term can only be used with directed networks.
# binary: idegree1.5
ergmTerm
for index of model terms currently visible to the package.
directed, binary
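The equivalence stated above, in-degree to the 3/2 power versus in-degree multiplied by its square root, is easy to confirm numerically. The in-degree vector below is made up for illustration.

```r
indeg <- c(0, 2, 0, 3)  # hypothetical in-degrees of four actors

stat_power <- sum(indeg^1.5)            # in-degree taken to the 3/2 power
stat_sqrt  <- sum(indeg * sqrt(indeg))  # in-degree times its square root
print(c(stat_power, stat_sqrt))
```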
Preserve the indegree distribution of the given network.
# idegreedist
ergmConstraint
for index of constraints and hints currently visible to the package.
directed
For directed networks, preserve the indegree of each vertex of the given network, while allowing outdegree to vary.
# idegrees
ergmConstraint
for index of constraints and hints currently visible to the package.
directed
Adds one statistic equal to the number of dyads whose values
are between lower
and upper
.
# valued: ininterval(lower=-Inf, upper=+Inf, open=c(TRUE,TRUE))
lower |
defaults to -Inf |
upper |
defaults to +Inf |
open |
a |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, valued
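The effect of open versus closed interval ends can be sketched in base R. This assumes that the first element of open governs the lower bound and the second the upper bound; the helper `ininterval_stat()` and the dyad values are hypothetical.

```r
# Hypothetical helper: count dyad values falling inside an interval
ininterval_stat <- function(vals, lower = -Inf, upper = Inf, open = c(TRUE, TRUE)) {
  above <- if (open[1]) vals > lower else vals >= lower  # open[1]: lower end (assumed)
  below <- if (open[2]) vals < upper else vals <= upper  # open[2]: upper end (assumed)
  sum(above & below)
}

vals <- c(0, 1, 2, 3, 5)  # made-up dyad values
n_open   <- ininterval_stat(vals, lower = 1, upper = 3)                          # (1, 3)
n_closed <- ininterval_stat(vals, lower = 1, upper = 3, open = c(FALSE, FALSE))  # [1, 3]
print(c(n_open, n_closed))
```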
This term adds one statistic to the model, equal to the number of triads in
the network that are intransitive. The intransitive triads are those of type
111D
, 201
, 111U
, 021C
, or 030C
in the
categorization of Davis and Leinhardt (1972). For details on the 16 possible
triad types, see triad.classify
in the
sna package. Note the distinction from the ctriple
term.
# binary: intransitive
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, binary
These functions test whether an ERGM fit, formula, or some other object represents a curved exponential family.
The method for NULL
always returns FALSE
by
convention.
is.curved(object, ...)

## S3 method for class 'NULL'
is.curved(object, ...)

## S3 method for class 'formula'
is.curved(object, response = NULL, basis = NULL, ...)

## S3 method for class 'ergm'
is.curved(object, ...)
object |
An |
... |
Arguments passed on to lower-level functions. |
response |
Either a character string, a formula, or
|
basis |
See |
Curvature is checked by testing if all model parameters are canonical.
TRUE
if the object represents a
curved exponential family; FALSE
otherwise.
These functions test whether an ERGM fit, a formula, or some other object represents a dyad-independent model.
The method for NULL
always returns TRUE
by
convention.
is.dyad.independent(object, ...)

## S3 method for class 'NULL'
is.dyad.independent(object, ...)

## S3 method for class 'formula'
is.dyad.independent(object, response = NULL, basis = NULL, ...)

## S3 method for class 'ergm_conlist'
is.dyad.independent(object, object.obs = NULL, ...)

## S3 method for class 'ergm'
is.dyad.independent(object, how = c("overall", "terms", "space"), ...)
object |
The object to be tested for dyadic independence. |
... |
Unused at this time. |
response |
Either a character string, a formula, or
|
basis |
See |
object.obs |
For the |
how |
one of |
Dyad independence is determined by checking if all of the constituent parts of the object (formula, ergm terms, constraints, etc.) are flagged as dyad-independent.
TRUE
if the model implied by the object is
dyad-independent; FALSE
otherwise.
Function to check whether an ERGM fit or some aspect of it is valued.
is.valued(object, ...)

## S3 method for class 'ergm_state'
is.valued(object, ...)

## S3 method for class 'edgelist'
is.valued(object, ...)

## S3 method for class 'ergm'
is.valued(object, ...)

## S3 method for class 'network'
is.valued(object, ...)
object |
the object to be tested. |
... |
additional arguments for methods, currently unused. |
is.valued(ergm_state)
: a method for ergm_state
objects.
is.valued(edgelist)
: a method for edgelist
objects.
is.valued(ergm)
: a method for ergm
objects.
is.valued(network)
: a method for network
objects that tests whether the network has been instrumented with a valued %ergmlhs%
"response"
specification, typically by ergm_preprocess_response()
. Note that it is not a test for whether a network has edge attributes. This method is primarily for internal use.
This term adds one statistic to the model equal to the number of isolated edges in the network, i.e., the number of edges each of whose endpoints has degree 1. This term can only be used with undirected networks.
# binary: isolatededges
ergmTerm
for index of model terms currently visible to the package.
bipartite, undirected, binary
This term adds one statistic to the model equal to the number of isolates in the network. For an undirected network, an isolate is defined to be any node with degree zero. For a directed network, an isolate is any node with both in-degree and out-degree equal to zero.
# binary: isolates
ergmTerm
for index of model terms currently visible to the package.
directed, frequently-used, undirected, binary
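Both the isolate count and the isolated-edge count (edges whose two endpoints each have degree 1) follow directly from node degrees. The base-R sketch below works on an undirected adjacency matrix; the graph is made up and the code only illustrates the definitions.

```r
# Undirected example: edges 1-2, 3-4, 4-5; node 6 has no ties
A <- matrix(0, 6, 6)
A[rbind(c(1, 2), c(3, 4), c(4, 5))] <- 1
A <- A + t(A)

deg <- rowSums(A)

# Isolates: nodes with degree zero
n_isolates <- sum(deg == 0)

# Isolated edges: edges whose endpoints both have degree 1
el <- which(A == 1 & upper.tri(A), arr.ind = TRUE)  # one row per edge (i < j)
n_isolated_edges <- sum(deg[el[, 1]] == 1 & deg[el[, 2]] == 1)

print(c(isolates = n_isolates, isolatededges = n_isolated_edges))
```

Only the edge 1-2 is isolated here: node 4 has degree 2, so neither 3-4 nor 4-5 qualifies.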
This term adds one network statistic to the
model for each element in k
. The $i$th such statistic counts the
number of distinct
k[i]
-instars in the network, where a
$k$-instar is defined to be a node $N$
and a set of $k$
different nodes $\{O_1, \dots, O_k\}$
such that the ties $(O_j \rightarrow N)$
exist for $j = 1, \dots, k$
.
This term can only be used for directed
networks; for undirected networks see
kstar
. Note that
istar(1)
is equal to both ostar(1)
and edges
.
# binary: istar(k, attr=NULL, levels=NULL)
k |
a vector of distinct integers |
attr , levels
|
a vertex attribute specification; if |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, binary
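Counting in-stars amounts to choosing k senders from each node's in-neighbors, so the statistic can be sketched in base R as a sum of binomial coefficients over in-degrees. The helper `istar_stat()` is hypothetical and ignores the attr and levels options; it also shows why the 1-instar count coincides with the edge count and with the 1-outstar count.

```r
# Directed example: A[i, j] == 1 when i sends a tie to j
A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(3, 2), c(2, 4), c(1, 4), c(3, 4))] <- 1

# Hypothetical helper: number of k-instars for each k
istar_stat <- function(A, k) {
  indeg <- colSums(A)
  vapply(k, function(kk) sum(choose(indeg, kk)), numeric(1))
}

stars  <- istar_stat(A, 1:3)
ostar1 <- sum(choose(rowSums(A), 1))  # 1-outstars, counted from out-degrees
print(c(stars, ostar1, sum(A)))
```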
This well-known social network dataset, collected by Bruce Kapferer in Zambia from June 1965 to August 1965, involves interactions among workers in a tailor shop as observed by Kapferer himself.
data(kapferer)
Two network
objects, kapferer
and kapferer2
.
The kapferer
dataset contains only the 39 individuals who were
present at both data-collection time periods. However, these data only
reflect data collected during the first period. The individuals' names are
included as a nodal covariate called names
.
An interaction is defined by Kapferer as "continuous uninterrupted social activity involving the participation of at least two persons"; only transactions that were relatively frequent are recorded. All of the interactions in this particular dataset are "sociational", as opposed to "instrumental". Kapferer explains the difference (p. 164) as follows:
"I have classed as transactions which were sociational in content those where the activity was markedly convivial such as general conversation, the sharing of gossip and the enjoyment of a drink together. Examples of instrumental transactions are the lending or giving of money, assistance at times of personal crisis and help at work."
Kapferer also observed and recorded instrumental transactions, many of which are unilateral (directed) rather than reciprocal (undirected), though those transactions are not recorded here. In addition, there was a second period of data collection, from September 1965 to January 1966, but these data are also not recorded here. All data are given in Kapferer's 1972 book on pp. 176-179.
During the first time period, there were 43 individuals working in this
particular tailor shop; however, the better-known dataset includes only
those 39 individuals who were present during both data-collection periods.
(Missing are the workers named Lenard, Peter, Lazarus, and Laurent.) Thus,
we give two separate network datasets here: kapferer
is the
well-known 39-individual dataset, whereas kapferer2
is the full
43-individual dataset.
Original source: Kapferer, Bruce (1972), Strategy and Transaction in an African Factory, Manchester University Press.
$k$-stars
This term adds one
network statistic to the model for each element in k
. The $i$th
such statistic counts the number of distinct
k[i]
-stars in the
network, where a $k$-star is defined to be a node $N$
and a set of $k$
different nodes $\{O_1, \dots, O_k\}$
such that the ties $\{N, O_i\}$
exist for $i = 1, \dots, k$
.
This term can only be used for undirected networks; for directed
networks, see
istar
, ostar
, twopath
and m2star
.
Note that kstar(1)
is equal to edges
.
# binary: kstar(k, attr=NULL, levels=NULL)
k |
a vector of distinct integers |
attr , levels
|
a vertex attribute specification; if |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, undirected, binary
This operator evaluates formula
without modification, but modifies its coefficient and/or parameter names based on label
and pos
.
# binary: Label(formula, label, pos)
# valued: Label(formula, label, pos)
formula |
a one-sided |
label |
a character vector specifying the label for the terms, a |
pos |
controls how |
If pos == "replace"
:
Elements for which is.na(label) == TRUE
are preserved.
If the model is curved, label=
can be either a function/mapper
or a list
with two elements, the first element giving the
curved (model) parameter names and second giving the canonical
parameter names. NULL
leaves the respective name unchanged.
ergmTerm
for index of model terms currently visible to the package.
operator, binary, valued
This term adds one statistic to the model equal to the number of triangles
in the network between nodes "close to" each other. For an undirected
network, a local triangle is defined to be any set of three edges between
nodal pairs that are in the same neighborhood.
For a directed network, a triangle is defined as any set of three edges
$(i \to j)$, $(j \to k)$ and either $(k \to i)$ or $(k \leftarrow i)$,
where again all nodes are
within the same neighborhood.
# binary: localtriangle(x)
x |
an undirected
network or a symmetric adjacency matrix that specifies whether the two nodes
are in the same neighborhood. Note that |
ergmTerm
for index of model terms currently visible to the package.
categorical dyadic attribute, directed, triad-related, undirected, binary
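For the undirected case, the local triangle count can be sketched in base R by keeping only the edges whose endpoints share a neighborhood and then counting triangles in what remains. Here `A` is the adjacency matrix and `X` a symmetric 0/1 neighborhood matrix; both are made up, and the trace-based count is a standard identity rather than the package's implementation.

```r
# Complete undirected graph on 4 nodes
A <- matrix(1, 4, 4); diag(A) <- 0

# Neighborhood indicator: nodes 1, 2, 3 are mutually "close"; node 4 is not
X <- matrix(0, 4, 4); X[1:3, 1:3] <- 1; diag(X) <- 0

count_triangles <- function(B) sum(diag(B %*% B %*% B)) / 6  # each triangle is counted 6 times

all_triangles   <- count_triangles(A)      # every triangle in the network
local_triangles <- count_triangles(A * X)  # only edges within a shared neighborhood
print(c(all_triangles, local_triangles))
```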
Evaluates the terms specified in formula
and takes the natural (base $e$) logarithm of them. Since an ERGM statistic must be finite,
log0
specifies the value to be substituted for log(0)
. The default value seems reasonable for most purposes.
# binary: Log(formula, log0=-1/sqrt(.Machine$double.eps))
# valued: Log(formula, log0=-1/sqrt(.Machine$double.eps))
formula |
a one-sided |
log0 |
the value to be substituted for |
ergmTerm
for index of model terms currently visible to the package.
operator, binary, valued
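The substitution for log(0) can be illustrated in base R. The helper `log_stat()` is a hypothetical stand-in that mimics the described behavior on a plain numeric vector of statistic values; it is not the operator itself.

```r
# Hypothetical helper: natural log of statistic values, with a finite
# substitute wherever the statistic is exactly zero.
log_stat <- function(x, log0 = -1 / sqrt(.Machine$double.eps)) {
  ifelse(x == 0, log0, log(x))
}

vals <- log_stat(c(0, 1, exp(2)))
print(vals)
```

The default substitute is a large negative but finite number, which keeps the statistic usable where a plain logarithm would return `-Inf`.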
logLik()
method for ergm
fits.
A function to return the log-likelihood associated with an
ergm
fit, evaluating it if
necessary. If the log-likelihood was not computed for
object
, produces an error unless eval.loglik=TRUE
.
## S3 method for class 'ergm'
logLik(
  object,
  add = FALSE,
  force.reeval = FALSE,
  eval.loglik = add || force.reeval,
  control = control.logLik.ergm(),
  ...,
  verbose = FALSE
)

## S3 method for class 'ergm'
deviance(object, ...)

## S3 method for class 'ergm'
AIC(object, ..., k = 2)

## S3 method for class 'ergm'
BIC(object, ...)
object |
|
add |
Logical: If |
force.reeval |
Logical: If |
eval.loglik |
Logical: If |
control |
A list of control parameters for algorithm tuning,
typically constructed with |
... |
Other arguments to the likelihood functions. |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
k |
see help for |
The form of the output of logLik.ergm
depends on
add
: if add=FALSE
(the default), a
logLik
object. If add=TRUE
, an
ergm
object with the log-likelihood
set.
As of version 3.1, all likelihoods for which logLikNull
is
not implemented are computed relative to the reference
measure. (I.e., a null model, with no terms, is defined to have
likelihood of 0, and all other models are defined relative to
that.)
deviance(ergm)
: A deviance()
method.
AIC(ergm)
: An AIC()
method.
BIC(ergm)
: A BIC()
method.
Hunter, D. R. and Handcock, M. S. (2006) Inference in curved exponential family models for networks, Journal of Computational and Graphical Statistics.
logLik()
, logLikNull()
, ergm.bridge.llr()
,
ergm.bridge.dindstart.llk()
# See help(ergm) for a description of this model. The likelihood will
# not be evaluated.
data(florentine)
## Not run:
# The default maximum number of iterations is currently 20. We'll only
# use 2 here for speed's sake.
gest <- ergm(flomarriage ~ kstar(1:2) + absdiff("wealth") + triangle,
             eval.loglik=FALSE, control=control.ergm(MCMLE.maxit=2))
# Log-likelihood is not evaluated, so no deviance, AIC, or BIC:
summary(gest)
# Evaluate the log-likelihood and attach it to the object.
# The default number of bridges is currently 20. We'll only use 3 here
# for speed's sake.
gest.logLik <- logLik(gest, add=TRUE,
                      control=control.logLik.ergm(bridge.nsteps=3))
# Deviances, AIC, and BIC are now shown:
summary(gest.logLik)
# Null model likelihood can also be evaluated, but not for all constraints:
logLikNull(gest) # == network.dyadcount(flomarriage)*log(1/2)
## End(Not run)
Calculate the null model likelihood
logLikNull(object, ...)

## S3 method for class 'ergm'
logLikNull(object, control = control.logLik.ergm(), ...)
object |
a fitted model. |
... |
further arguments to lower-level functions.
|
control |
A list of control parameters for algorithm tuning,
typically constructed with |
logLikNull
returns an object of type logLik
if it is
able to compute the null model probability, and NA
otherwise.
logLikNull(ergm)
: A method for ergm
fits; currently only
implemented for binary ERGMs with dyad-independent sample-space
constraints.
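The comment in the logLik() example states that the null likelihood for the Florentine marriage network equals the dyad count times log(1/2). A minimal base R check of that arithmetic, assuming an undirected network of 16 nodes with no constraints, could look like this:

```r
n <- 16                           # assumed number of nodes
dyads <- choose(n, 2)             # undirected dyads
null_loglik <- dyads * log(1 / 2) # each dyad is a tie with probability 1/2
null_loglik
```

Under these assumptions there are 120 dyads, and the null log-likelihood is roughly -83.18.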
This term adds one statistic to the model, equal to the number of mixed
2-stars in the network, where a mixed 2-star is a pair of distinct edges
(i→j), (j→k). A mixed 2-star is
sometimes called a 2-path because it is a directed path of length 2 from
i to k via j. However, in the case of a 2-path the focus
is usually on the endpoints i and k, whereas for a mixed 2-star
the focus is usually on the midpoint j. This term can only be used
with directed networks; for undirected networks see
kstar(2)
. See
also twopath
.
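The count can be sketched in base R from an adjacency matrix. The sketch assumes that a mixed 2-star is a pair of edges i→j and j→k with i and k distinct, so reciprocated pairs are subtracted; the example matrix is made up and the code is not the package's implementation.

```r
# Directed example: 1 -> 2, 2 -> 3, 2 -> 1 (A[i, j] = 1 means i -> j).
A <- matrix(0, 3, 3)
A[rbind(c(1, 2), c(2, 3), c(2, 1))] <- 1

indeg <- colSums(A)    # edges arriving at each midpoint
outdeg <- rowSums(A)   # edges leaving each midpoint

# Every (in-edge, out-edge) pair at a node is a candidate; pairs that
# start and end at the same node (i -> j -> i) are removed.
m2star <- sum(indeg * outdeg) - sum(A * t(A))
```

In this example the only mixed 2-star is 1→2→3.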
# binary: m2star
ergmTerm
for index of model terms currently visible to the package.
directed, binary
This function prints diagnostic information and creates simple diagnostic plots for MCMC sampled statistics produced from a fit.
mcmc.diagnostics(object, ...)

## S3 method for class 'ergm'
mcmc.diagnostics(
  object,
  center = TRUE,
  esteq = TRUE,
  vars.per.page = 3,
  which = c("plots", "texts", "summary", "autocorrelation", "crosscorrelation", "burnin"),
  compact = FALSE,
  ...
)
object |
A model fit object to be diagnosed. |
... |
Additional arguments, to be passed to plotting functions. |
center |
Logical: If |
esteq |
Logical: If |
vars.per.page |
Number of rows (one variable per row) per plotting page. Ignored if latticeExtra package is not installed. |
which |
A character vector specifying which diagnostics to plot and/or print. Defaults to all of the below if meaningful:
Partial matching is supported. (E.g., |
compact |
Numeric: For diagnostics that print variables in
columns (e.g. correlations, hypothesis test p-values), try to
abbreviate variable names to this many characters and round the
numbers to |
A pair of plots is produced for each statistic: a trace of the sampled output statistic values on the left and a density estimate for each variable in the MCMC chain on the right. Diagnostics printed to the console include correlations and convergence diagnostics.
For ergm()
specifically, recent changes in the
estimation algorithm mean that these plots can no longer be used
to ensure that the mean statistics from the model match the
observed network statistics. For that functionality, please use
the GOF command: gof(object, GOF=~model)
.
In fact, an ergm()
output object contains the sample of
statistics from the last MCMC run as element $sample
. If
missing data MLE is fit, the corresponding element is named
$sample.obs
. These are objects of class mcmc
and can be used
directly in the coda package to assess MCMC
convergence.
More information can be found by looking at the documentation of
ergm()
.
mcmc.diagnostics(ergm)
:
Raftery, A.E. and Lewis, S.M. (1995). The number of iterations, convergence diagnostics and generic Metropolis algorithms. In Practical Markov Chain Monte Carlo (W.R. Gilks, D.J. Spiegelhalter and S. Richardson, eds.). London, U.K.: Chapman and Hall.
ergm()
, network package, coda package,
summary.ergm()
## Not run:
#
data(florentine)
#
# test the mcmc.diagnostics function
#
gest <- ergm(flomarriage ~ edges + kstar(2))
summary(gest)
#
# Plot the probabilities first
#
mcmc.diagnostics(gest)
#
# Use coda directly
#
library(coda)
#
plot(gest$sample, ask=FALSE)
#
# A full range of diagnostics is available
# using codamenu()
#
## End(Not run)
This term adds one network statistic to the model equal to the
average degree of a node. Note that this term is a constant multiple of
both edges
and density
.
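The proportionality can be checked by hand. The sketch below assumes an undirected network, where the mean degree is twice the edge count divided by the number of nodes; the numbers are invented for illustration.

```r
n <- 5                             # nodes
edges <- 4                         # edge count (the edges statistic)

meandeg <- 2 * edges / n           # each edge adds 1 to two degrees
density <- edges / choose(n, 2)    # share of possible dyads that are tied

# For fixed n, meandeg is a constant multiple of both quantities:
c(meandeg = meandeg, from_edges = (2 / n) * edges,
  from_density = (n - 1) * density)
```

Because the multipliers depend only on the network size, including meandeg together with edges or density in one model would make the statistics linearly dependent.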
# binary: meandeg
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, binary
attrs
is a two-sided formula whose LHS gives the attribute for the rows of the mixing matrix and whose RHS
gives that for its columns (which may be different). A one-sided
formula (e.g., ~A
) is symmetrized (e.g., A~A
). A two-sided
formula with a dot on one side calculates the margins of the
mixing matrix, analogously to nodefactor
, with A~.
calculating the row/sender/b1 margins and .~A
calculating the
column/receiver/b2 margins. If row and column attributes are the
same and the network is undirected, only the cells at or above
the diagonal (where row <= column) will be calculated.
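To make the "at or above the diagonal" convention concrete, here is a base R sketch that tabulates a mixing matrix for an undirected edge list with the same attribute on both sides. The attribute vector and edge list are invented, and the code only illustrates the idea rather than the term's implementation.

```r
sex <- c("F", "F", "M", "M", "F")                       # made-up attribute
el <- rbind(c(1, 2), c(1, 3), c(3, 4), c(2, 4), c(4, 5)) # undirected edges

a <- sex[el[, 1]]
b <- sex[el[, 2]]

# In an undirected network the order of endpoints is arbitrary, so put
# the lexicographically smaller level in the row position.
row_attr <- ifelse(a <= b, a, b)
col_attr <- ifelse(a <= b, b, a)

lev <- sort(unique(sex))
mix <- table(factor(row_attr, levels = lev), factor(col_attr, levels = lev))
mix
```

All mixed-sex edges land in the [F, M] cell, and the below-diagonal [M, F] cell stays at zero, which is why only cells at or above the diagonal carry information in this case.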
# binary: mm(attrs, levels=NULL, levels2=-1)
# valued: mm(attrs, levels=NULL, levels2=-1, form="sum")
attrs |
a two-sided formula whose LHS gives the attribute or
attribute function (see Specifying Vertex attributes and Levels ( |
levels |
subset of rows and columns to be used. (See Specifying Vertex
attributes and Levels ( |
levels2 |
which specific cells of the matrix to include; |
form |
character how to aggregate tie values in a valued ERGM |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, dyad-independent, frequently-used, undirected, binary, valued
This is a synthetic network of 20 nodes that is used as an example within
the ergm()
documentation. It has an interesting elongated shape
reminiscent of a chemical molecule. It is stored as a
network
object.
data(molecule)
florentine, sampson, network, plot.network, ergm
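In binary ERGMs, equal to the number of
pairs of actors i and j
for which (i→j)
and (j→i)
both exist. For valued ERGMs, equal to
the sum over pairs i<j of m(y[i,j], y[j,i]),
where m
is determined by the
form
argument: "min"
for min(y[i,j], y[j,i]),
"nabsdiff"
for -abs(y[i,j]-y[j,i]),
"product"
for y[i,j]*y[j,i], and
"geometric"
for sqrt(y[i,j])*sqrt(y[j,i]). See Krivitsky (2012) for a
discussion of these statistics.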
form="threshold"
simply
computes the binary mutuality
after
thresholding at threshold
.
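The different forms can be compared on a tiny valued network. The sketch below assumes the natural reading of the form names (minimum, negative absolute difference, product, and geometric mean of the two reciprocal tie values, summed over pairs i<j); the matrix `Y` is made up and this is not the package's code.

```r
# Valued directed example on 3 nodes; Y[i, j] is the value of tie i -> j.
Y <- matrix(0, 3, 3)
Y[1, 2] <- 4; Y[2, 1] <- 1
Y[1, 3] <- 2; Y[3, 1] <- 0
Y[2, 3] <- 9; Y[3, 2] <- 4

up <- upper.tri(Y)
y_ij <- Y[up]        # values for pairs (1,2), (1,3), (2,3)
y_ji <- t(Y)[up]     # the reciprocal values

mutual_min       <- sum(pmin(y_ij, y_ji))
mutual_nabsdiff  <- sum(-abs(y_ij - y_ji))
mutual_product   <- sum(y_ij * y_ji)
mutual_geometric <- sum(sqrt(y_ij) * sqrt(y_ji))

# Binary mutuality after thresholding at 0: both directions present.
B <- Y > 0
mutual_binary <- sum(B * t(B)) / 2
```

For this example the five versions evaluate to 5, -10, 40, 8 and 2 respectively, showing how strongly the choice of form changes the scale of the statistic.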
This term can only be used with directed networks.
# binary: mutual(same=NULL, by=NULL, diff=FALSE, keep=NULL, levels=NULL)
# valued: mutual(form="min",threshold=0)
same |
if the optional argument is passed
(see Specifying Vertex attributes and Levels ( |
by |
if the optional argument is passed (see Specifying Vertex attributes and Levels ( |
keep |
deprecated |
levels |
which statistics should be kept whenever the |
form |
character how to aggregate tie values in a valued ERGM |
The argument keep
is retained for backwards compatibility and may be
removed in a future version. When both keep
and levels
are passed,
levels
overrides keep
.
ergmTerm
for index of model terms currently visible to the package.
directed, frequently-used, binary, valued
This term adds one statistic to the model equal to the number of near Simmelian triads, as defined by Krackhardt and Handcock (2007). This is a sub-graph of size three which is exactly one tie short of being complete.
# binary: nearsimmelian
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, binary
A convenience container for a list of network
objects, output
by simulate.ergm()
among others.
network.list(object, ...)

## S3 method for class 'network.list'
print(x, stats.print = FALSE, ...)

## S3 method for class 'network.list'
summary(
  object,
  stats.print = TRUE,
  net.print = FALSE,
  net.summary = FALSE,
  ...
)
object , x
|
a |
... |
for |
stats.print |
Logical: If TRUE, print network statistics. |
net.print |
Logical: If TRUE, print network overviews. |
net.summary |
Logical: If TRUE, print network summaries. |
print(network.list)
: A print()
method for network lists.
summary(network.list)
: A summary()
method for network lists.
# Draw from a Bernoulli model with 16 nodes
# and tie probability 0.1
#
g.use <- network(16, density=0.1, directed=FALSE)
#
# Starting from this network let's draw 3 realizations
# of a model with edges and 2-star terms
#
g.sim <- simulate(~edges+kstar(2), nsim=3, coef=c(-1.8, 0.03),
                  basis=g.use, control=control.simulate(
                    MCMC.burnin=100000,
                    MCMC.interval=1000))
print(g.sim)
summary(g.sim)
This document describes the ways to specify nodal
attributes or functions of nodal attributes and which levels for
categorical factors to include. For the helper functions to
facilitate this, see nodal_attributes-API
.
LARGEST(l, a)
SMALLEST(l, a)
COLLAPSE_SMALLEST(object, n, into)
object , l , a , n , into
|
|
Term nodal attribute arguments, typically called attr
, attrs
, by
, or
on
are interpreted as follows:
a character string
Extract the vertex attribute with this name.
a character vector of length > 1
Extract the vertex attributes and paste them together, separated by dots if the term expects categorical attributes and (typically) combine into a covariate matrix if it expects quantitative attributes.
a function
The function is called on the LHS network and
additional arguments to ergm_get_vattr()
, expected to return a
vector or matrix of appropriate dimension. (Shorter vectors and
matrix columns will be recycled as needed.)
a formula
The expression on the RHS of the formula is
evaluated in an environment of the vertex attributes of the
network, expected to return a vector or matrix of appropriate
dimension. (Shorter vectors and matrix columns will be recycled as
needed.) Within this expression, the network itself is accessible as
either .
or .nw
. For example,
nodecov(~abs(Grade-mean(Grade))/network.size(.))
would return the
absolute difference of each actor's "Grade" attribute from its
network-wide mean, divided by the network size.
AsIs
object created by I()
Use as is, checking only for correct length and type.
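The formula case can be mimicked in base R by evaluating the right-hand side of a one-sided formula with the vertex attributes as the data environment. This is only a sketch of the idea: `vertex_attrs` is a made-up stand-in for a network's attributes, and the real lookup is done by ergm_get_vattr().

```r
# Stand-in for the vertex attributes of a 4-node network.
vertex_attrs <- list(Grade = c(7, 8, 9, 10))

# A one-sided attribute specification, as it would appear inside a term.
spec <- ~abs(Grade - mean(Grade))

# For a one-sided formula, element 2 is the RHS expression; evaluate it
# with the attributes visible as ordinary variables.
values <- eval(spec[[2]], envir = vertex_attrs, enclos = environment(spec))
values
```

Each node receives its absolute distance from the mean grade, which is the kind of vector a quantitative term such as nodecov would then sum over edges.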
Any of these arguments may also be wrapped in or piped through
COLLAPSE_SMALLEST(attr, n, into)
or attr %>% COLLAPSE_SMALLEST(n, into)
, a convenience function that will
transform the attribute by collapsing the smallest n
categories
into one, naming it into
. Note that into
must be of the same
type (numeric, character, etc.) as the vertex attribute in
question. If there are ties for n
th smallest category, they will
be broken in lexicographic order, and a warning will be issued.
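The effect of collapsing can be sketched in base R. The function `collapse_smallest_sketch` below is a hypothetical re-implementation for illustration only; it breaks ties by category name but, unlike the behaviour described above, issues no warning.

```r
collapse_smallest_sketch <- function(x, n, into) {
  freq <- table(x)
  # Order categories by frequency, breaking ties by category name.
  ord <- order(as.integer(freq), names(freq))
  smallest <- names(freq)[ord][seq_len(n)]
  # Relabel every member of the n smallest categories.
  x[as.character(x) %in% smallest] <- into
  x
}

grade <- c(7, 7, 7, 8, 8, 8, 9, 9, 10, 10, 11, 12)   # made-up attribute
collapsed <- collapse_smallest_sketch(grade, 2, 0)
table(collapsed)
```

The two least frequent grades (11 and 12) are merged into a single category labelled 0, while the other grades keep their counts. Note that `into` is numeric here to match the numeric attribute.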
The name the nodal attribute receives in the statistic can be
overridden by setting an attr()
-style attribute "name"
.
For categorical attributes, to select which levels are of interest
and their ordering, use the argument levels
. Selection of nodes (from
the appropriate vector of nodal indices) is likewise handled as the
selection of levels, using the argument nodes
. These arguments are interpreted
as follows:
I()
Use the given list of levels as is.
a numeric or logical vector
Used for indexing of a list of
all possible levels (typically, unique values of the attribute) in
default order (typically lexicographic), i.e.,
sort(unique(attr))[levels]
. In particular, levels=TRUE
will
retain all levels. Negative values exclude. Another special value
is LARGEST
, which will refer to the most frequent category, so,
say, to set such a category as the baseline, pass
levels=-LARGEST
. In addition, LARGEST(n)
will refer to the n
largest categories. SMALLEST
works analogously. If there are ties
in frequencies, they will be broken in lexicographic order, and a
warning will be issued. To specify numeric or logical levels
literally, wrap in I()
.
NULL
Retain all possible levels; usually equivalent to
passing TRUE
.
a character vector
Use as is.
a function
The function is called on the list of unique values of the attribute, the values of the attribute themselves, and the network itself, depending on its arity. Its return value is interpreted as above.
a formula
The expression on the RHS of the formula is
evaluated in an environment in which the network itself is
accessible as .nw
, the list of unique values of the attribute as
.
or as .levels
, and the attribute vector itself as
.attr
. Its return value is interpreted as above.
a matrix
For mixing effects (i.e., levels2=
arguments), a
matrix can be used to select elements of the mixing matrix, either
by specifying a logical (TRUE
and FALSE
) matrix of the same
dimension as the mixing matrix to select the corresponding cells or
a two-column numeric matrix giving the coordinates of
cells to be used.
Note that levels
, nodes
, and others often have a default that is sensible for the
term in question.
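The numeric and logical cases follow ordinary R indexing of the sorted unique values, which a short base R sketch can show. The attribute vector is invented, and the last two lines only imitate what LARGEST refers to; they are not how the package resolves it.

```r
grade <- c(7, 7, 7, 8, 8, 9, 10, 10, 11, 12)   # made-up attribute

all_levels <- sort(unique(grade))   # the list that levels= indexes into

all_levels[TRUE]    # levels=TRUE keeps every level
all_levels[-1]      # levels=-1 drops the first level
all_levels[1:2]     # levels=1:2 keeps the first two levels

# Most frequent category, the one LARGEST would point at:
freq <- table(grade)
largest <- as.numeric(names(freq)[which.max(freq)])
setdiff(all_levels, largest)        # analogue of levels=-LARGEST
```

Because a bare number is treated as an index, a literal level value such as grade 7 has to be wrapped in I() to be read as a level rather than a position.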
library(magrittr) # for %>%

data(faux.mesa.high)

# Activity by grade with a baseline grade excluded:
summary(faux.mesa.high~nodefactor(~Grade))

# Name overrides:
summary(faux.mesa.high~nodefactor("Form"~Grade)) # Only for terms that don't use the LHS.
summary(faux.mesa.high~nodefactor(~structure(Grade,name="Form")))

# Retain all levels:
summary(faux.mesa.high~nodefactor(~Grade, levels=TRUE)) # or levels=NULL

# Use the largest grade as baseline (also Grade 7):
summary(faux.mesa.high~nodefactor(~Grade, levels=-LARGEST))

# Activity by grade with no baseline, with the smallest two grades
# (11 and 12) collapsed into a new category, labelled 0:
table(faux.mesa.high %v% "Grade")
summary(faux.mesa.high~nodefactor((~Grade) %>% COLLAPSE_SMALLEST(2, 0),
                                  levels=TRUE))

# Handling of tied frequencies
faux.mesa.high %v% "Plans" <-
  sample(rep(c("College", "Trade School", "Apprenticeship", "Undecided"),
             c(80,80,20,25)))
summary(faux.mesa.high ~ nodefactor("Plans", levels = -LARGEST))

# Mixing between lower and upper grades:
summary(faux.mesa.high~mm(~Grade>=10))

# Mixing between grades 7 and 8 only:
summary(faux.mesa.high~mm("Grade", levels=I(c(7,8))))
# or
summary(faux.mesa.high~mm("Grade", levels=1:2))
# or using levels2 (see ? mm) to filter the combinations of levels,
summary(faux.mesa.high~mm("Grade",
        levels2=~sapply(.levels,
                        function(l)
                          l[[1]]%in%c(7,8) && l[[2]]%in%c(7,8))))

# Here are some less complex ways to specify levels2. This is the
# full list of combinations of sexes in an undirected network:
summary(faux.mesa.high~mm("Sex", levels2=TRUE))
# Select only the second combination:
summary(faux.mesa.high~mm("Sex", levels2=2))
# Equivalently,
summary(faux.mesa.high~mm("Sex", levels2=-c(1,3)))
# or
summary(faux.mesa.high~mm("Sex", levels2=c(FALSE,TRUE,FALSE)))
# Select all *but* the second one:
summary(faux.mesa.high~mm("Sex", levels2=-2))
# Select via a mixing matrix: (Network is undirected and
# attributes are the same on both sides, so we can use either M or
# its transpose.)
(M <- matrix(c(FALSE,TRUE,FALSE,FALSE),2,2))
summary(faux.mesa.high~mm("Sex", levels2=M)+mm("Sex", levels2=t(M)))
# Select via an index of a cell:
idx <- cbind(1,2)
summary(faux.mesa.high~mm("Sex", levels2=idx))

# Or, select by specific attribute value combinations, though note
# the names 'row' and 'col' and the order for undirected networks:
summary(faux.mesa.high~mm("Sex",
                          levels2 = I(list(list(row="M",col="M"),
                                           list(row="M",col="F"),
                                           list(row="F",col="M")))))
# Note the warning: in an undirected network with identical row and
# column attributes, the mixing matrix is symmetric and only the
# upper triangle (where row < column) is valid, so the [M,F] cell
# will get a statistic of 0 with a warning.

# mm() term allows two-sided attribute formulas with different attributes:
summary(faux.mesa.high~mm(Grade~Race, levels2=TRUE))

# It is possible to have collapsing functions in the formula; note
# the parentheses around "~Race": this is because a formula
# operator (~) has lower precedence than the pipe (%>%):
summary(faux.mesa.high~mm(Grade~(~Race) %>% COLLAPSE_SMALLEST(3,"BWO"),
                          levels2=TRUE))

# Some terms, such as nodecov(), accept matrices of nodal
# covariates. A certain R quirk means that columns whose
# expressions are not typical variable names have their names
# dropped and need to be adjusted. Consider, for example, the
# linear and quadratic effects of grade:
Grade <- faux.mesa.high %v% "Grade"
colnames(cbind(Grade, Grade^2)) # Second column name missing.
colnames(cbind(Grade, Grade2=Grade^2)) # Can be set manually,
colnames(cbind(Grade, `Grade^2`=Grade^2)) # even to non-variable-names.
colnames(cbind(Grade, Grade^2, deparse.level=2)) # Alternatively, deparse.level=2 forces naming.
rm(Grade)

# Therefore, the nodal attribute names are set as follows:
summary(faux.mesa.high~nodecov(~cbind(Grade, Grade^2))) # column names dropped with a warning
summary(faux.mesa.high~nodecov(~cbind(Grade, Grade2=Grade^2))) # column names set manually
summary(faux.mesa.high~nodecov(~cbind(Grade, Grade^2, deparse.level=2))) # using deparse.level=2

# Activity by grade with a random covariate. Note that setting an
# attribute "name" gives it a name:
randomcov <- structure(I(rbinom(network.size(faux.mesa.high),1,0.5)),
                       name="random")
summary(faux.mesa.high~nodefactor(I(randomcov)))
This term adds a single network statistic for each quantitative attribute or matrix column to the model equaling the sum of
attr(i)
and attr(j)
for all edges (i,j) in the
network. For categorical attributes, see
nodefactor
. Note that for
directed networks, nodecov
equals nodeicov
plus
nodeocov
.
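For an undirected network the statistic can be computed by hand from an edge list. The following base R sketch uses an invented attribute vector and edge list and is meant only to illustrate the definition.

```r
x <- c(1, 2, 3, 4)                          # made-up quantitative attribute
el <- rbind(c(1, 2), c(2, 3), c(3, 4))      # undirected edges of a path

# Sum of attr(i) + attr(j) over all edges (i, j):
nodecov_stat <- sum(x[el[, 1]] + x[el[, 2]])

# Equivalent view: each node's value is counted once per incident edge.
deg <- tabulate(c(el), nbins = length(x))
nodecov_by_degree <- sum(x * deg)
```

Both computations give 15 here. The second form makes explicit that the statistic weights each node's attribute by its degree.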
# binary: nodecov(attr)
# binary: nodemain(attr)
# valued: nodecov(attr, form="sum")
# valued: nodemain(attr, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, frequently-used, quantitative nodal attribute, undirected, binary, valued
This term adds one statistic equal to
. This can be
viewed as a valued analog of the
star(2)
statistic.
# valued: nodecovar(center, transform)
center |
If |
transform |
If |
Note that this term replaces nodesqrtcovar
, which has been
deprecated in favor of nodecovar(transform="sqrt")
.
ergmTerm
for index of model terms currently visible to the package.
directed, valued
This term adds a single network statistic equal to the sum, over the nodes, of the range of the attribute values among each node's neighbors.
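A hand computation may help in reading this definition. The base R sketch below takes "range" to mean the maximum minus the minimum of the neighbors' values and assumes that a node without neighbors contributes 0; both the data and that treatment of isolates are assumptions made for illustration.

```r
x <- c(1, 4, 6, 10)                          # made-up quantitative attribute
A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(2, 3), c(3, 4))] <- 1     # path 1 - 2 - 3 - 4
A <- A + t(A)                                # undirected

# For every node, the spread of the attribute among its neighbors.
range_i <- sapply(seq_along(x), function(i) {
  nb <- x[A[i, ] == 1]
  if (length(nb) == 0) 0 else max(nb) - min(nb)
})
nodecovrange_stat <- sum(range_i)
```

The end nodes have a single neighbor and hence a range of 0, while the two middle nodes contribute 5 and 6, for a total of 11.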
# binary: nodecovrange(attr)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
directed, quantitative nodal attribute, undirected, binary
This term adds multiple network statistics to the
model, one for each of (a subset of) the unique values of the
attr
attribute (or each combination of the attributes
given). Each of these statistics gives the number of times a node
with that attribute or those attributes appears in an edge in the
network.
# binary: nodefactor(attr, base=1, levels=-1)
# valued: nodefactor(attr, base=1, levels=-1, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
To include all attribute values is usually not a good idea, because
the sum of all such statistics equals twice the number of edges and hence a linear
dependency would arise in any model also including edges
. The default,
levels=-1
, is therefore to omit the first (in lexicographic order)
attribute level. To include all levels, pass either levels=TRUE
(i.e., keep all levels) or levels=NULL
(i.e., do not filter levels).
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
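The linear dependency is easy to see in a hand count. This base R sketch tabulates, for an invented undirected edge list, how often each attribute level appears as an edge endpoint; it illustrates the definition and is not the term's implementation.

```r
sex <- c("F", "F", "M", "M", "F")                        # made-up attribute
el <- rbind(c(1, 2), c(1, 3), c(3, 4), c(2, 4), c(4, 5)) # undirected edges

# One count per level: appearances of that level among edge endpoints.
counts <- table(sex[c(el)])
counts

# Every edge has two endpoints, so the counts are tied to the edge count:
sum(counts) == 2 * nrow(el)

# The default levels=-1 corresponds to dropping the first level:
counts[-1]
```

With all levels kept, the statistics always add up to a fixed multiple of edges, which is why one level is omitted by default.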
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, dyad-independent, frequently-used, undirected, binary, valued
This term adds a single network statistic to the model, counting, for each node, the number of distinct values of the attribute found among its neighbors.
# binary: nodefactordistinct(attr, levels=TRUE)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, undirected, binary
This term adds a single network statistic for each quantitative attribute or matrix column to the model equaling the total
value of attr(j)
for all edges (i,j) in the network. This
term may only be used with directed networks. For categorical attributes,
see
nodeifactor
.
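A small directed example shows how the receiver-side sum relates to the sender-side and combined versions. The base R sketch below uses invented data, with each edge-list row read as (tail, head), i.e. i→j.

```r
x <- c(1, 2, 3)                           # made-up quantitative attribute
el <- rbind(c(1, 2), c(1, 3), c(2, 3))    # directed edges (tail, head)

nodeicov_stat <- sum(x[el[, 2]])          # attr(j): receivers
nodeocov_stat <- sum(x[el[, 1]])          # attr(i): senders
nodecov_stat  <- sum(x[el[, 1]] + x[el[, 2]])

c(nodeicov = nodeicov_stat, nodeocov = nodeocov_stat, nodecov = nodecov_stat)
```

The receiver sum is 8 and the sender sum is 4, and together they reproduce the combined nodecov value of 12.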
# binary: nodeicov(attr)
# valued: nodeicov(attr, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
ergmTerm
for index of model terms currently visible to the package.
directed, frequently-used, quantitative nodal attribute, binary, valued
This term adds one statistic equal to
. This can be
viewed as a valued analog of the
istar(2)
statistic.
# valued: nodeicovar(center, transform)
center |
If |
transform |
If |
Note that this term replaces nodeisqrtcovar
, which has been
deprecated in favor of nodeicovar(transform="sqrt")
.
ergmTerm
for index of model terms currently visible to the package.
directed, valued
This term adds a single network statistic equal to the sum, over the nodes, of the range of the attribute values among each node's neighbors.
# binary: nodeicovrange(attr)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
directed, quantitative nodal attribute, binary
This term adds multiple network
statistics to the model, one for each of (a subset of) the unique
values of the attr
attribute (or each combination of the
attributes given). Each of these statistics gives the number of
times a node with that attribute or those attributes appears as the
terminal node of a directed tie.
For an analogous term for quantitative vertex attributes, see nodeicov
.
# binary: nodeifactor(attr, base=1, levels=-1)
# valued: nodeifactor(attr, base=1, levels=-1, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
To include all attribute values is usually not a good idea, because the sum of all such statistics equals the number of edges and hence a linear dependency would arise in any model also including edges. The default, levels=-1, is therefore to omit the first (in lexicographic order) attribute level. To include all levels, pass either levels=TRUE (i.e., keep all levels) or levels=NULL (i.e., do not filter levels).
The argument base is retained for backwards compatibility and may be removed in a future version. When both base and levels are passed, levels overrides base.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, dyad-independent, frequently-used, binary, valued
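The linear dependency mentioned for nodeifactor can be checked by hand. The base-R sketch below uses a made-up adjacency matrix A and a made-up categorical attribute grp; it mirrors the definition (in-ties counted by the receiver's level) and is not the package's code.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[1, 3] <- 1; A[2, 3] <- 1; A[4, 1] <- 1
grp <- factor(c("a", "b", "b", "c"))     # hypothetical categorical attribute

indeg <- colSums(A)                      # in-ties received by each node
all_levels <- tapply(indeg, grp, sum)    # one count per attribute level
default_stats <- all_levels[-1]          # mimics levels=-1: drop the first level

# With every level kept, the counts add up to the number of edges,
# which is the collinearity with the edges term.
sum(all_levels) == sum(A)
```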
This term adds a single network statistic to the model, counting, for each node, the number of distinct values of the attribute found among its neighbors.
# binary: nodeifactordistinct(attr, levels=TRUE)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, binary
When diff=FALSE, this term adds one network statistic to the model, which counts the number of edges $(i,j)$ for which attr(i)==attr(j). This is also called “uniform homophily”, because each group is assumed to have the same propensity for within-group ties. When multiple attribute names are given, the statistic counts only ties for which all of the attributes match. When diff=TRUE, $p$ network statistics are added to the model, where $p$ is the number of unique values of the attr attribute. The $k$th such statistic counts the number of edges $(i,j)$ for which attr(i) == attr(j) == value(k), where value(k) is the $k$th smallest unique value of the attr attribute. This is also called “differential homophily”, because each group is allowed to have a unique propensity for within-group ties. Note that a statistical test of uniform vs. differential homophily should be conducted using the ANOVA function.
By default, matches on all levels are counted. This works for both diff=TRUE and diff=FALSE.
# binary: nodematch(attr, diff=FALSE, keep=NULL, levels=NULL)
# valued: nodematch(attr, diff=FALSE, keep=NULL, levels=NULL, form="sum")
# valued: match(attr, diff=FALSE, keep=NULL, levels=NULL, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
diff |
specify if the term has uniform or differential homophily |
keep |
deprecated |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
The argument keep is retained for backwards compatibility and may be removed in a future version. When both keep and levels are passed, levels overrides keep.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, dyad-independent, frequently-used, undirected, binary, valued
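To make the difference between uniform and differential homophily in nodematch concrete, here is a small base-R sketch on a toy directed network. The matrix A and the attribute grp are made-up values, and the computation follows the verbal definition only; it is not the package's implementation.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[2, 3] <- 1; A[3, 2] <- 1; A[4, 1] <- 1
grp <- c("a", "b", "b", "a")               # hypothetical categorical attribute

edges <- which(A == 1, arr.ind = TRUE)     # (row = tail, col = head)
tail_grp <- grp[edges[, "row"]]
head_grp <- grp[edges[, "col"]]
same <- tail_grp == head_grp               # ties whose endpoints match

uniform <- sum(same)                       # diff=FALSE: a single count
differential <- table(factor(tail_grp[same],
                             levels = sort(unique(grp))))  # diff=TRUE: one count per level
```

The differential counts always add up to the uniform count, since each matching tie belongs to exactly one level.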
Evaluates the terms specified in formula on a network constructed by taking the original network and removing any edges (i,j) for which attrname(i)!=attrname(j).
# binary: NodematchFilter(formula, attrname)
formula |
formula to be evaluated |
attrname |
a character vector giving one or more names of attributes in the network's vertex attribute list. |
ergmTerm
for index of model terms currently visible to the package.
operator, binary
By default, this term adds one network statistic to the model for each possible pairing of attribute values. The statistic equals the number of edges in the network in which the nodes have that pairing of values. (When multiple attributes are specified, a statistic is added for each combination of attribute values for those attributes.) In other words, this term produces one statistic for every entry in the mixing matrix for the attribute(s). By default, the ordering of the attribute values is lexicographic: alphabetical (for nominal categories) or numerical (for ordered categories).
# binary: nodemix(attr, base=NULL, b1levels=NULL, b2levels=NULL, levels=NULL, levels2=-1)
# valued: nodemix(attr, base=NULL, b1levels=NULL, b2levels=NULL, levels=NULL, levels2=-1, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
b1levels , b2levels , levels
|
control what statistics are included in the model and the order in which they appear. |
levels2 |
similar to the other levels arguments above and applies to all networks. Optionally allows a factor or character matrix to be specified to group certain levels. Level combinations corresponding to |
form |
character how to aggregate tie values in a valued ERGM |
The argument base is retained for backwards compatibility and may be removed in a future version. When base is passed together with levels or levels2, the levels or levels2 argument overrides base.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, dyad-independent, frequently-used, undirected, binary, valued
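The mixing matrix that nodemix draws its statistics from can be tabulated by hand. The sketch below uses base R with a made-up adjacency matrix A and attribute grp; each cell of mix corresponds to one candidate statistic. Which cell the default levels2=-1 omits is decided by the package and is not reproduced here.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[2, 3] <- 1; A[3, 2] <- 1; A[4, 1] <- 1
grp <- c("a", "b", "b", "a")               # hypothetical categorical attribute
lv <- sort(unique(grp))                    # lexicographic ordering of the levels

edges <- which(A == 1, arr.ind = TRUE)     # (row = tail, col = head)
mix <- table(tail = factor(grp[edges[, "row"]], levels = lv),
             head = factor(grp[edges[, "col"]], levels = lv))
mix                                        # rows: sender level, columns: receiver level
```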
This term adds a single network statistic for each quantitative attribute or matrix column to the model equaling the total value of attr(i) for all edges (i,j) in the network. This term may only be used with directed networks. For categorical attributes, see nodeofactor.
# binary: nodeocov(attr)
# valued: nodeocov(attr, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
ergm versions 3.9.4 and earlier used different arguments for this
term. See ergm-options
for how to invoke the old behaviour.
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, quantitative nodal attribute, binary, valued
This term adds one statistic equal to
. This can be
viewed as a valued analog of the
ostar(2)
statistic.
# valued: nodeocovar(center, transform)
center |
whether the |
transform |
if |
Note that this term replaces nodeosqrtcovar
, which has been
deprecated in favor of nodeocovar(transform="sqrt")
.
ergmTerm
for index of model terms currently visible to the package.
directed, valued
This term adds a single network statistic equal to the sum, over all nodes, of the range of the attribute values of each node's neighbors.
# binary: nodeocovrange(attr)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
directed, quantitative nodal attribute, binary
This term adds multiple network
statistics to the model, one for each of (a subset of) the unique
values of the attr
attribute (or each combination of the
attributes given). Each of these statistics gives the number of
times a node with that attribute or those attributes appears as the
node of origin of a directed tie.
# binary: nodeofactor(attr, base=1, levels=-1)
# valued: nodeofactor(attr, base=1, levels=-1, form="sum")
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and levels
are passed,
levels
overrides base
.
To include all attribute values is usually not a good idea, because
the sum of all such statistics equals the number of edges and hence a linear
dependency would arise in any model also including edges
. The default,
levels=-1
, is therefore to omit the first (in lexicographic order)
attribute level. To include all levels, pass either levels=TRUE
(i.e., keep all levels) or levels=NULL
(i.e., do not filter levels).
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, dyad-independent, binary, valued
This term adds a single network statistic to the model, counting, for each node, the number of distinct values of the attribute found among its neighbors.
# binary: nodeofactordistinct(attr, levels=TRUE)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
levels |
this optional argument controls which levels of the attribute
attributes and Levels ( |
This is a network analogue of the statistic introduced by Hoffman et al. (2023).
Hoffman M, Block P, Snijders TAB (2023). “Modeling Partitions of Individuals.” Sociological Methodology, 53(1), 1–41. ISSN 1467-9531, doi:10.1177/00811750221145166.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, binary
This is a generic that returns the number of parameters associated with a model or a model fit.
nparam(object, ...)

## Default S3 method:
nparam(object, ...)

## S3 method for class 'ergm'
nparam(object, offset = NA, ...)
object |
An object for which number of parameters is defined. |
... |
Additional arguments to methods. |
offset |
If |
nparam(default): By default, the length of the coef() vector is returned.
nparam(ergm): A method to return the number of parameters of an ergm fit.
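The default behaviour described above (the length of the coef() vector) can be sketched without the package. The function nparam_sketch below is a hypothetical stand-in, not the package's generic, and an lm fit on the built-in cars data is used only because it has a coef() method.

```r
# Hypothetical stand-in for the default method: count the coefficients.
nparam_sketch <- function(object, ...) length(coef(object))

fit <- lm(dist ~ speed, data = cars)   # any fit with a coef() method will do
nparam_sketch(fit)                     # intercept plus slope
```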
This term adds one network statistic to the model for each element in d, where the $i$th such statistic equals the number of non-edges in the network with exactly d[i] shared partners.
# binary: dnsp(d, type="OTP")
# binary: nsp(d, type="OTP")
d |
a vector of distinct integers |
type |
A string indicating the type of shared partner or path to be considered for directed networks: |
While there is only one shared partner configuration in the undirected case, nine distinct configurations are possible for directed graphs, selected using the type argument. Currently, terms may be defined with respect to five of these configurations; they are defined here as follows (using terminology from Butts (2008) and the relevant package):
Outgoing Two-path ("OTP"): vertex $k$ is an OTP shared partner of ordered pair $(i,j)$ iff $i \to k \to j$. Also known as "transitive shared partner".
Incoming Two-path ("ITP"): vertex $k$ is an ITP shared partner of ordered pair $(i,j)$ iff $j \to k \to i$. Also known as "cyclical shared partner".
Reciprocated Two-path ("RTP"): vertex $k$ is an RTP shared partner of ordered pair $(i,j)$ iff $i \leftrightarrow k \leftrightarrow j$.
Outgoing Shared Partner ("OSP"): vertex $k$ is an OSP shared partner of ordered pair $(i,j)$ iff $i \to k, j \to k$.
Incoming Shared Partner ("ISP"): vertex $k$ is an ISP shared partner of ordered pair $(i,j)$ iff $k \to i, k \to j$.
By default, outgoing two-paths ("OTP"
) are calculated. Note that Robins et al. (2009)
define closely related statistics to several of the above, using slightly different terminology.
This term takes an additional term option (see
options?ergm
), cache.sp
, controlling whether
the implementation will cache the number of shared partners for
each dyad in the network; this is usually enabled by default.
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, binary
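For the default "OTP" type, the dnsp counts can be sketched in base R: the matrix product of the adjacency matrix with itself gives the number of outgoing two-paths between every ordered pair. The matrix A is a made-up example, and the sketch assumes that non-edges are counted as ordered pairs (i, j) with i != j; it does not reproduce the package's cached implementation.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[2, 3] <- 1; A[1, 4] <- 1; A[4, 3] <- 1; A[3, 1] <- 1

otp <- A %*% A                           # otp[i, j] = number of k with i -> k -> j
nonedge <- A == 0 & row(A) != col(A)     # ordered pairs (i, j), i != j, without a tie

d <- c(0, 1, 2)
dnsp_stats <- sapply(d, function(di) sum(nonedge & otp == di))
```

The counts across all possible values of d add up to the number of ordered non-edges, here 12 - 5 = 7.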
Preserve the observed dyads of the given network.
# observed
ergmConstraint
for index of constraints and hints currently visible to the package.
directed, dyad-independent, undirected
This term adds one network statistic to the model for each element of from (or to); the $i$th such statistic equals the number of nodes in the network of out-degree greater than or equal to from[i] but strictly less than to[i], i.e. with out-edge count in semiopen interval [from,to).
This term can only be used with directed networks; for undirected networks (bipartite and not) see degrange. For degrees of specific modes of bipartite networks, see b1degrange and b2degrange. For in-degrees, see idegrange.
# binary: odegrange(from, to=+Inf, by=NULL, homophily=FALSE, levels=NULL)
from , to
|
vectors of distinct integers. If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, the two vectors must have the same length. |
by , levels , homophily
|
the optional argument |
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, binary
This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of nodes in the network of out-degree d[i], i.e. the number of nodes with exactly d[i] out-edges.
This term can only be used with directed networks; for undirected networks see degree.
# binary: odegree(d, by=NULL, homophily=FALSE, levels=NULL)
d |
a vector of distinct integers |
by , levels , homophily
|
the optional argument |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, frequently-used, binary
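The counts behind odegree, and the interval version used by odegrange, can be reproduced by hand from the row sums of an adjacency matrix. The matrix A, the degree vector d, and the from/to vectors below are made-up values; the by, homophily, and levels arguments are not covered by this sketch.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[1, 4] <- 1; A[2, 3] <- 1; A[3, 1] <- 1; A[4, 3] <- 1
outdeg <- rowSums(A)                       # out-degree of each node

# odegree(d): number of nodes with exactly d[i] out-edges
d <- c(0, 1, 2)
odegree_stats <- sapply(d, function(di) sum(outdeg == di))

# odegrange(from, to): number of nodes with out-degree in [from[i], to[i])
from <- c(1, 2); to <- c(2, Inf)
odegrange_stats <- mapply(function(lo, hi) sum(outdeg >= lo & outdeg < hi),
                          from, to)
```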
This term adds one network statistic to the model equaling the sum over the actors of each actor's outdegree taken to the 3/2 power (or, equivalently, multiplied by its square root). This term is analogous to the term of Snijders et al. (2010), equation (12). This term can only be used with directed networks.
# binary: odegree1.5
ergmTerm
for index of model terms currently visible to the package.
directed, binary
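The equivalence stated for odegree1.5 (outdegree to the 3/2 power, or outdegree times its square root) is easy to confirm numerically. The adjacency matrix A below is a made-up example.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[1, 4] <- 1; A[2, 3] <- 1; A[3, 1] <- 1; A[4, 3] <- 1
outdeg <- rowSums(A)

stat_power <- sum(outdeg^1.5)              # outdegree to the 3/2 power
stat_sqrt  <- sum(outdeg * sqrt(outdeg))   # outdegree times its square root
all.equal(stat_power, stat_sqrt)
```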
Preserve the outdegree distribution of the given network.
# odegreedist
ergmConstraint
for index of constraints and hints currently visible to the package.
directed
For directed networks, preserve the outdegree of each vertex of the given network, while allowing indegree to vary.
# odegrees
ergmConstraint
for index of constraints and hints currently visible to the package.
directed
This operator is analogous to the offset()
wrapper, but the
coefficients are specified within the term and the curved ERGM
mechanism is used internally.
# binary: Offset(formula, coef, which)
formula |
a one-sided |
coef |
coefficients to the formula |
which |
used to specify which of the parameters in the formula are fixed. It can be a logical vector (recycled as needed), a numeric vector of indices of parameters to be fixed, or a character vector of parameter names. |
ergmTerm
for index of model terms currently visible to the package.
operator, binary
This term adds one statistic to the model equal to the number of 2-stars minus three times the number of triangles in the network. It is currently only implemented for undirected networks.
# binary: opentriad
ergmTerm
for index of model terms currently visible to the package.
triad-related, undirected, binary
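The opentriad formula (2-stars minus three times the triangles) can be verified by hand on a small undirected graph. The edge list und is a made-up example; the 2-star and triangle counts use standard identities (choose(degree, 2) summed over nodes, and the trace of the cubed adjacency matrix divided by 6).

```r
# Toy undirected network on 5 nodes, stored as a symmetric adjacency matrix.
und <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
A <- matrix(0, 5, 5)
A[und] <- 1
A[und[, 2:1]] <- 1                          # mirror each edge

deg <- rowSums(A)
two_stars <- sum(choose(deg, 2))            # pairs of edges sharing a node
triangles <- sum(diag(A %*% A %*% A)) / 6   # each triangle is counted 6 times in the trace
opentriad_stat <- two_stars - 3 * triangles
```

Here the three open triads are the paths 1-3-4, 2-3-4, and 3-4-5, whose end nodes are not tied to each other.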
This term adds one network statistic to the model for each element in k. The $i$th such statistic counts the number of distinct k[i]-outstars in the network, where a $k$-outstar is defined to be a node $N$ and a set of $k$ different nodes $\{O_1, \dots, O_k\}$ such that the ties $(N \to O_j)$ exist for $j = 1, \dots, k$.
This term can only be used with directed networks; for undirected networks see kstar.
# binary: ostar(k, attr=NULL, levels=NULL)
k |
a vector of distinct integers |
attr , levels
|
a vertex attribute specification; if |
ostar(1) is equal to both istar(1) and edges.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, binary
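Counting k-outstars amounts to choosing k of a node's out-ties, so the ostar statistics can be sketched from the out-degrees alone. The adjacency matrix A is a made-up example, and the attr and levels arguments are not covered; the last lines check the identity between ostar(1), istar(1), and edges.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[1, 4] <- 1; A[2, 3] <- 1; A[3, 1] <- 1; A[4, 3] <- 1
outdeg <- rowSums(A)
indeg  <- colSums(A)

k <- c(1, 2)
ostar_stats <- sapply(k, function(ki) sum(choose(outdeg, ki)))  # k-outstars
istar1 <- sum(choose(indeg, 1))                                 # 1-instars

ostar_stats[1] == sum(A)    # ostar(1) equals the number of edges
ostar_stats[1] == istar1    # and also istar(1)
```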
This is a generic that returns a vector giving the names of the parameters associated with a model or a model fit.
param_names(object, ...)

## Default S3 method:
param_names(object, ...)

param_names(object, ...) <- value
object |
An object for which parameter names are defined. |
... |
Additional arguments to methods. |
value |
Specification for the new parameter names. |
param_names(default): By default, the names of the coef() vector are returned.
param_names(object, ...) <- value: a method for modifying parameter names of an object.
Calculate model-predicted conditional and unconditional tie probabilities for dyads in the given network. Conditional probabilities of a dyad given the state of all the remaining dyads in the graph are computed exactly. Unconditional probabilities are computed through simulating networks using the given model. Currently there are two methods implemented:
Method for formula objects requires (1) an ERGM model formula with an existing network object on the left hand side and model terms on the right hand side, and (2) a vector of corresponding parameter values.
Method for ergm objects, as returned by ergm(), takes both the formula and parameter values from the fitted model object.
Both methods can limit calculations to a specific set of dyads of interest.
## S3 method for class 'formula'
predict(
  object,
  theta,
  conditional = TRUE,
  type = c("response", "link"),
  nsim = 100,
  output = c("data.frame", "matrix"),
  ...
)

## S3 method for class 'ergm'
predict(object, ...)
object |
a formula or a fitted ERGM model object |
theta |
numeric vector of ERGM model parameter values |
conditional |
logical whether to compute conditional or unconditional predicted probabilities |
type |
character element, one of |
nsim |
integer, number of simulated networks used for computing unconditional probabilities. Defaults to 100. |
output |
character, type of object returned. Defaults to |
... |
other arguments passed to/from other methods. For the |
Type of object returned depends on the argument output. If output="data.frame" the function will return a data frame with columns:
tail, head: indices of nodes identifying a dyad
p: predicted conditional tie probability
If output="matrix" the function will return an "adjacency matrix" with the predicted probabilities. Diagonal values are 0s.
# A three-node empty directed network
net <- network.initialize(3, directed=TRUE)

# In homogeneous Bernoulli model with odds of a tie of 1/5 all ties are
# equally likely
predict(net ~ edges, log(1/5))

# Let's add a tie so that `net` has 1 tie out of possible 6 (so odds of 1/5)
net[1,2] <- 1

# Fit the model
fit <- ergm(net ~ edges)

# The p's should be identical
predict(fit)
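The arithmetic behind this example can be checked without fitting anything. In an edges-only (homogeneous Bernoulli) model, each dyad's tie probability is the inverse logit of the edges coefficient, and the maximum-likelihood coefficient is the logit of the observed density. The variable names below are illustrative only.

```r
# Coefficient corresponding to odds of a tie of 1/5
theta <- log(1/5)
p_tie <- plogis(theta)        # inverse logit: odds / (1 + odds)

# One tie out of six possible dyads gives density 1/6;
# the logit of that density recovers the same coefficient.
theta_hat <- qlogis(1/6)
all.equal(theta_hat, theta)
```

This is why both predict() calls in the example are expected to report the same probabilities.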
This operator evaluates a list of formulas whose corresponding RHS statistics will be multiplied elementwise. They are required to be nonnegative.
# binary: Prod(formulas, label)
# valued: Prod(formulas, label)
formulas |
a list (constructed using If a formula in the list has an LHS, it is interpreted as follows:
|
label |
used to specify the names of the elements of the resulting term product vector. If |
Note that each formula must either produce the same number of statistics or be mapped through a matrix to produce the same number of statistics.
A single formula is also permitted. This can be useful if one wishes to, say, scale or multiply together the statistics returned by a formula.
Offsets are ignored unless there is only one formula and the transformation only scales the statistics (i.e., the effective transformation matrix is diagonal).
Curved models are supported, subject to some limitations. In particular, the first model's etamap will be used, overwriting the others. If label
is not of length 1, it should have an attr
-style attribute "curved"
specifying the names for the curved parameters.
The current implementation piggybacks on the Log, Exp, and Sum operators, essentially Exp(~Sum(~Log(formula), label)). This may result in loss of precision, particularly for extremely large or small statistics. The implementation may change in the future.
ergmTerm
for index of model terms currently visible to the package.
operator, binary, valued
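The log-sum-exp route described for Prod can be illustrated on plain numeric vectors. The vectors s1 and s2 are made-up stand-ins for the statistics produced by two formulas; the sketch shows only the arithmetic identity, not the operator itself.

```r
# Hypothetical nonnegative statistic vectors from two formulas
s1 <- c(2, 3, 4)
s2 <- c(5, 0.5, 1)

direct   <- s1 * s2                    # elementwise product
via_logs <- exp(log(s1) + log(s2))     # same product via Log, Sum, Exp

# The two agree up to floating-point error, hence all.equal()
# rather than an exact comparison.
all.equal(via_logs, direct)
```

The detour through logarithms is also why the statistics must be nonnegative: the logarithm of a negative number is undefined.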
This operator on a bipartite network evaluates the formula on the undirected, valued network constructed by projecting it onto its specified mode. Proj1(formula) and Proj2(formula) are aliases for Project(formula, 1) and Project(formula, 2), respectively.
# binary: Project(formula, mode)
# binary: Proj1(formula)
# binary: Proj2(formula)
formula |
a one-sided |
mode |
the mode onto which to project: 1 or 2 |
ergmTerm
for index of model terms currently visible to the package.
bipartite, operator, binary
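One common way to project a bipartite network onto a mode is to count, for each pair of nodes in that mode, the partners they share in the other mode. The base-R sketch below assumes that definition, which is not confirmed by the text above, and uses a made-up incidence matrix B.

```r
# Bipartite incidence matrix: 4 mode-1 nodes (rows) by 3 mode-2 nodes (columns)
B <- matrix(c(1, 1, 0,
              1, 0, 1,
              0, 1, 1,
              0, 0, 1), nrow = 4, byrow = TRUE)

P1 <- B %*% t(B)        # mode-1 projection: shared mode-2 partners
diag(P1) <- 0           # no self-ties
P2 <- t(B) %*% B        # mode-2 projection: shared mode-1 partners
diag(P2) <- 0
```

Both projections are symmetric, which matches the undirected, valued network the operator is described as evaluating.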
A simple test reporting the sample quantile of the observed network's probability in the distribution under the MLE. This is a conservative p-value for the null hypothesis of the observed network being a draw from the distribution of interest.
rank_test.ergm(x, plot = FALSE)
x |
an |
plot |
if |
The sample quantile of the observed network's probability among the predicted.
This term adds one network statistic for each node equal to the number of in-ties for that node. This measures the popularity of the node. The term for the first node is omitted by default because of linear dependence that arises if this term is used together with edges, but its coefficient can be computed as the negative of the sum of the coefficients of all the other actors. That is, the average coefficient is zero, following the Holland-Leinhardt parametrization of the $p_1$ model (Holland and Leinhardt, 1981). This term can only be used with directed networks. For undirected networks, see sociality.
# binary: receiver(base=1, nodes=-1)
# valued: receiver(base=1, nodes=-1, form="sum")
base |
deprecated |
nodes |
specify which nodes' statistics should be included or excluded (see Specifying Vertex attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and nodes
are passed,
nodes
overrides base
.
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, binary, valued
This operator takes a two-sided formula attrs whose LHS and RHS give the attribute or attribute function selecting, respectively, the tails and the heads that will be used to construct the induced subgraph. Each side must evaluate either to a logical vector equal in length to the number of tails (for LHS) or heads (for RHS), indicating which nodes are to be used to induce the subgraph, or to a numeric vector giving their indices.
# binary: S(formula, attrs)
formula |
a one-sided |
attrs |
a two-sided formula to be used. A one-sided formula (e.g., |
As with indexing vectors, the logical vector will be recycled to the size of the network or the size of the appropriate bipartition, and negative indices will deselect vertices.
When the two sets are identical, the induced subgraph retains the directedness of the original graph. Otherwise, an undirected bipartite graph is induced.
ergmTerm
for index of model terms currently visible to the package.
operator, binary
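The selection rules described for S (logical vectors recycled, negative indices deselecting vertices) mirror ordinary R indexing, so they can be illustrated on an adjacency matrix. The matrix A is a made-up example, and only the case where the tail and head sets are identical is shown; the bipartite case is not reproduced here.

```r
# Toy directed network; A[i, j] == 1 means a tie i -> j.
A <- matrix(0, 4, 4)
A[1, 2] <- 1; A[2, 3] <- 1; A[3, 4] <- 1; A[4, 1] <- 1

# Logical selector of length 2 is recycled to length 4: keeps vertices 1 and 3
sub_logical <- A[c(TRUE, FALSE), c(TRUE, FALSE)]

# Negative index deselects vertex 1: keeps vertices 2, 3, and 4
sub_negative <- A[-1, -1]
```

In this example the subgraph on vertices 1 and 3 has no ties, while the subgraph on vertices 2, 3, and 4 keeps the ties 2 -> 3 and 3 -> 4.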
Three network objects containing the "liking" nominations of Sampson's (1969) monks at the three time points.
data(samplk)
Sampson (1969) recorded the social interactions among a group of monks while he was a resident as an experimenter at the cloister. During his stay, a political "crisis in the cloister" resulted in the expulsion of four monks, namely the three "outcasts," Brothers Elias, Simplicius, Basil, and the leader of the "young Turks," Brother Gregory. Not long after Brother Gregory departed, all but one of the "young Turks" left voluntarily: Brothers John Bosco, Albert, Boniface, Hugh, and Mark. Then, all three of the "waverers" also left: first, Brothers Amand and Victor, then later Brother Romuald. Eventually, Brother Peter and Brother Winfrid also left, leaving only four of the original group.
Of particular interest are the data on positive affect relations ("liking," using the terminology later adopted by White et al. (1976)), in which each monk was asked if he had positive relations to each of the other monks. Each monk ranked only his top three choices (or four, in the case of ties) on "liking". Here, we consider a directed edge from monk A to monk B to exist if A nominated B among these top choices.
The data were gathered at three times to capture changes in group sentiment over time. They represent three time points in the period during which a new cohort had entered the monastery near the end of the study but before the major conflict began. These three time points are labeled T2, T3, and T4 in Tables D5 through D16 in the appendices of Sampson's 1969 dissertation, and the corresponding network data sets are named samplk1, samplk2, and samplk3, respectively.
See also the data set sampson
containing the time-aggregated
graph samplike
.
samplk3
is a data set of Hoff, Raftery and Handcock (2002).
The data sets are stored as network
objects with
three vertex attributes:
Groups of novices as classified by Sampson, that is, "Loyal", "Outcasts", and "Turks", but with a fourth group called the "Waverers" by White et al. (1976) that comprises two of the original Loyal opposition and one of the original Outcasts. See the samplike data set for the original classifications of these three waverers.
An indicator of attendance in the minor seminary of "Cloisterville" before coming to the monastery.
The
given names of the novices. NB: These names have been corrected as of
ergm
version 3.6.1.
This data set is standard in the social network analysis literature, having been modeled by Holland and Leinhardt (1981), Reitz (1982), Holland, Laskey and Leinhardt (1983), Fienberg, Meyer, and Wasserman (1981), and Hoff, Raftery, and Handcock (2002), among others. This is only a small piece of the data collected by Sampson.
This data set was updated for version 2.5 (March 2012) to add the
cloisterville
variable and refine the names. This information is from
de Nooy, Mrvar, and Batagelj (2005). The original vertex names were:
Romul_10, Bonaven_5, Ambrose_9, Berth_6, Peter_4, Louis_11, Victor_8,
Winf_12, John_1, Greg_2, Hugh_14, Boni_15, Mark_7, Albert_16, Amand_13,
Basil_3, Elias_17, Simp_18. The numbers indicate the ordering used in the
original dissertation of Sampson (1969).
In ergm versions 3.6.0 and earlier, the adjacency matrices of the samplike, samplk1, samplk2, and samplk3 networks reflected the original Sampson (1969) ordering of the names even
though the vertex labels used the name order of de Nooy, Mrvar, and Batagelj
(2005). That is, in ergm
version 3.6.0 and earlier, the vertices were
mislabeled. The correct order is the same one given in Tables D5, D9, and
D13 of Sampson (1969): John Bosco, Gregory, Basil, Peter, Bonaventure,
Berthold, Mark, Victor, Ambrose, Romauld (Sampson uses both spellings
"Romauld" and "Ramauld" in the dissertation), Louis, Winfrid, Amand, Hugh,
Boniface, Albert, Elias, Simplicius. By contrast, the order given in
ergm
version 3.6.0 and earlier is: Ramuald, Bonaventure, Ambrose,
Berthold, Peter, Louis, Victor, Winfrid, John Bosco, Gregory, Hugh,
Boniface, Mark, Albert, Amand, Basil, Elias, Simplicius.
Sampson, S. F. (1968), A novitiate in a period of change: An experimental and case study of relationships, Unpublished Ph.D. dissertation, Department of Sociology, Cornell University.
https://github.com/bavla/Nets/raw/refs/heads/master/data/Pajek/esna/Sampson.zip
White, H.C., Boorman, S.A. and Breiger, R.L. (1976). Social structure from multiple networks. I. Blockmodels of roles and positions. American Journal of Sociology, 81(4), 730-780.
Wouter de Nooy, Andrej Mrvar, Vladimir Batagelj (2005) Exploratory Social Network Analysis with Pajek, Cambridge: Cambridge University Press
sampson, florentine, network, plot.network, ergm
A network object containing the cumulative "liking" nominations of Sampson's (1969) monks over the three time points.
data(sampson)
Sampson (1969) recorded the social interactions among a group of monks while he was a resident as an experimenter at the cloister. During his stay, a political "crisis in the cloister" resulted in the expulsion of four monks, namely the three "outcasts," Brothers Elias, Simplicius, Basil, and the leader of the "young Turks," Brother Gregory. Not long after Brother Gregory departed, all but one of the "young Turks" left voluntarily: Brothers John Bosco, Albert, Boniface, Hugh, and Mark. Then, all three of the "waverers" also left: first, Brothers Amand and Victor, then later Brother Romuald. Eventually, Brother Peter and Brother Winfrid also left, leaving only four of the original group.
Of particular interest are the data on positive affect relations ("liking," using the terminology later adopted by White et al. (1976)), in which each monk was asked if he had positive relations to each of the other monks. Each monk ranked only his top three choices (or four, in the case of ties) on "liking". Here, we consider a directed edge from monk A to monk B to exist if A nominated B among these top choices.
The data were gathered at three times to capture changes in group sentiment
over time. They represent three time points in the period during which a new
cohort had entered the monastery near the end of the study but before the
major conflict began. These three time points are labeled T2, T3, and T4 in
Tables D5 through D16 in the appendices of Sampson's 1969 dissertation. The
samplike
data set is the time-aggregated network. Thus, a tie from
monk A to monk B exists if A nominated B as one of his three (or four, in
case of ties) best friends at any of the three time points.
See also the data sets samplk1
, samplk2
, and
samplk3
, containing the networks at each of the three
individual time points.
The data set is stored as a network
object with three
vertex attributes:
Groups of novices as classified by Sampson: "Loyal", "Outcasts", and "Turks".
An indicator of attendance in the minor seminary of "Cloisterville" before coming to the monastery.
The given names of the novices. NB:
These names have been corrected as of ergm
version 3.6.1; see details
below.
In addition, the data set has an edge attribute,
nominations
, giving the number of times (out of 3) that monk A
nominated monk B.
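As a quick orientation, the sketch below loads the data and inspects the attributes described above. The names cloisterville and nominations come from the text; the attribute name "group" is an assumption about how the group classification is stored, so check the output of list.vertex.attributes() before relying on it.

```r
library(ergm)   # also attaches the network package

data(sampson)   # loads the samplike network

samplike                            # print a summary of the network object
list.vertex.attributes(samplike)    # names of the stored vertex attributes
table(samplike %v% "group")         # assumed attribute name for Sampson's groups
samplike %v% "cloisterville"        # minor-seminary attendance indicator
table(samplike %e% "nominations")   # times (out of 3) each tie was nominated
```

The %v% and %e% operators are the vertex- and edge-attribute accessors provided by the network package.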
This data set is standard in the social network analysis literature, having been modeled by Holland and Leinhardt (1981), Reitz (1982), Holland, Laskey and Leinhardt (1983), Fienberg, Meyer, and Wasserman (1981), and Hoff, Raftery, and Handcock (2002), among others. This is only a small piece of the data collected by Sampson.
This data set was updated for version 2.5 (March 2012) to add the
cloisterville
variable and refine the names. This information is from
de Nooy, Mrvar, and Batagelj (2005). The original vertex names were:
Romul_10, Bonaven_5, Ambrose_9, Berth_6, Peter_4, Louis_11, Victor_8,
Winf_12, John_1, Greg_2, Hugh_14, Boni_15, Mark_7, Albert_16, Amand_13,
Basil_3, Elias_17, Simp_18. The numbers indicate the ordering used in the
original dissertation of Sampson (1969).
In ergm
version
3.6.0 and earlier, the adjacency matrices of the samplike
,
samplk1
, samplk2
, and samplk3
networks reflected the original Sampson (1969) ordering of the names even
though the vertex labels used the name order of de Nooy, Mrvar, and Batagelj
(2005). That is, in ergm
version 3.6.0 and earlier, the vertices were
mislabeled. The correct order is the same one given in Tables D5, D9, and
D13 of Sampson (1969): John Bosco, Gregory, Basil, Peter, Bonaventure,
Berthold, Mark, Victor, Ambrose, Romauld (Sampson uses both spellings
"Romauld" and "Ramauld" in the dissertation), Louis, Winfrid, Amand, Hugh,
Boniface, Albert, Elias, Simplicius. By contrast, the order given in
ergm
version 3.6.0 and earlier is: Ramuald, Bonaventure, Ambrose,
Berthold, Peter, Louis, Victor, Winfrid, John Bosco, Gregory, Hugh,
Boniface, Mark, Albert, Amand, Basil, Elias, Simplicius.
Sampson, S. F. (1968), A novitiate in a period of change: An experimental and case study of relationships, Unpublished Ph.D. dissertation, Department of Sociology, Cornell University.
https://github.com/bavla/Nets/raw/refs/heads/master/data/Pajek/esna/Sampson.zip
White, H.C., Boorman, S.A. and Breiger, R.L. (1976). Social structure from multiple networks. I. Blockmodels of roles and positions. American Journal of Sociology, 81(4), 730-780.
Wouter de Nooy, Andrej Mrvar, Vladimir Batagelj (2005) Exploratory Social Network Analysis with Pajek, Cambridge: Cambridge University Press
florentine, network, plot.network, ergm
This function attempts to find a network or networks whose statistics match
those passed in via the target.stats
vector.
san(object, ...)

## S3 method for class 'formula'
san(
  object,
  response = NULL,
  reference = ~Bernoulli,
  constraints = ~.,
  target.stats = NULL,
  nsim = NULL,
  basis = NULL,
  output = c("network", "edgelist", "ergm_state"),
  only.last = TRUE,
  control = control.san(),
  verbose = FALSE,
  offset.coef = NULL,
  ...
)

## S3 method for class 'ergm_model'
san(
  object,
  reference = ~Bernoulli,
  constraints = ~.,
  target.stats = NULL,
  nsim = NULL,
  basis = NULL,
  output = c("network", "edgelist", "ergm_state"),
  only.last = TRUE,
  control = control.san(),
  verbose = FALSE,
  offset.coef = NULL,
  ...
)
object |
Either a |
... |
Further arguments passed to other functions. |
response |
Either a character string, a formula, or
|
reference |
A one-sided formula specifying
the reference measure ( |
constraints |
A formula specifying one or more constraints
on the support of the distribution of the networks being modeled. Multiple constraints
may be given, separated by “+” and “-” operators. See
The default is to have no constraints except those provided through
the Together with the model terms in the formula and the reference measure, the constraints define the distribution of networks being modeled. It is also possible to specify a proposal function directly either
by passing a string with the function's name (in which case,
arguments to the proposal should be specified through the
Note that not all possible combinations of constraints and reference measures are supported. However, for relatively simple constraints (i.e., those that simply permit or forbid specific dyads or sets of dyads from changing), arbitrary combinations should be possible. |
target.stats |
A vector of the same length as the number of non-offset statistics implied by the formula. |
nsim |
Number of networks to generate. Deprecated: just use |
basis |
If not NULL, a |
output |
Character, one of |
only.last |
if |
control |
A list of control parameters for algorithm tuning,
typically constructed with |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
offset.coef |
A vector of offset coefficients; these must be passed in by the user.
Note that these should be the same set of coefficients one would pass to |
formula |
(By default, the |
The following description is an exegesis of section 4 of Krivitsky et al. (2022).
Let $\mathbf{g}$ be a vector of target statistics for the
network we wish to construct. That is, we are given an arbitrary network
$\mathbf{y}^0 \in \mathcal{Y}$, and we seek a network
$\mathbf{y} \in \mathcal{Y}$ such that
$\mathbf{g}(\mathbf{y}) \approx \mathbf{g}$; ideally equality is achieved,
but in practice we may have to settle for a close approximation. The
variant of simulated annealing is as follows.
The energy function is defined as
$$E_W(\mathbf{y}) = (\mathbf{g}(\mathbf{y}) - \mathbf{g})^\top W (\mathbf{g}(\mathbf{y}) - \mathbf{g}),$$
with $W$ a symmetric positive (barring multicollinearity in statistics)
definite matrix of weights. This function achieves 0 only if the target is
reached. A good choice of this matrix yields a more efficient search.
A standard simulated annealing loop is used, as described below, with some
modifications. In particular, we allow the user to specify a vector of
offsets $\eta$ to bias the annealing, with $\eta_k = 0$
denoting no offset. Offsets can be used with SAN to forbid certain
statistics from ever increasing or decreasing. As with ergm(), offset
terms are specified using the offset() decorator and their coefficients
specified with the offset.coef argument. By default, finite offsets are
ignored, but this can be overridden by setting the control.san()
argument SAN.ignore.finite.offsets = FALSE.
The number of simulated annealing runs is specified by the SAN.maxit
control parameter and the initial value of the temperature $T$ is set
to SAN.tau. The value of $T$ decreases linearly until $T = 0$
at the last run, which implies that all proposals that increase
$E_W(\mathbf{y})$ are rejected. The weight matrix $W$
is initially set to $I_p / p$, where $I_p$ is the identity matrix
of an appropriate dimension. For weight $W$ and temperature $T$,
the simulated annealing iteration proceeds as follows:
1. Test if $E_W(\mathbf{y}) = 0$. If so, then exit.
2. Generate a perturbed network $\mathbf{y}^*$ from a proposal that
respects the model constraints. (This is typically the same proposal as
that used for MCMC.)
3. Store the quantity $\mathbf{g}(\mathbf{y}^*) - \mathbf{g}(\mathbf{y})$
for later use.
4. Calculate acceptance probability
$$\alpha = \exp\left[ -\left(E_W(\mathbf{y}^*) - E_W(\mathbf{y})\right) / T + \eta^\top \left(\mathbf{g}(\mathbf{y}^*) - \mathbf{g}(\mathbf{y})\right) \right].$$
(If $|\eta_k| = \infty$ and $g_k(\mathbf{y}^*) - g_k(\mathbf{y}) = 0$,
their product is defined to be 0.)
5. Replace $\mathbf{y}$ with $\mathbf{y}^*$ with probability
$\min(1, \alpha)$.
After the specified number of iterations, $T$ is updated as described
above, and $W$ is recalculated by first computing a matrix $S$, the
sample covariance matrix of the proposed differences stored in Step 3
(i.e., whether or not they were rejected), then
$W = S^+ / \mathrm{tr}(S^+)$, where $S^+$ is the
Moore–Penrose pseudoinverse of $S$ and $\mathrm{tr}(S^+)$ is the
trace of $S^+$. The differences in Step 3 closely reflect the
relative variances and correlations among the network statistics.
In Step 2, the many options for MCMC proposals can provide for effective means of speeding the SAN algorithm's search for a viable network.
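The loop described above can be sketched in base R on a toy problem. This is a minimal illustration, not ergm's implementation: the function names (net_stats, energy, pinv, san_toy) are hypothetical, the statistics are just edge and triangle counts of a small undirected graph, the proposal toggles one random dyad, and offsets are left out entirely.

```r
# Toy base-R sketch of the SAN iteration (not ergm's implementation; no offsets).
set.seed(1)

# Statistics: edge and triangle counts of an undirected 0/1 adjacency matrix
net_stats <- function(A) {
  c(edges = sum(A) / 2, triangles = sum(diag(A %*% A %*% A)) / 6)
}

# Energy: weighted squared distance between current and target statistics
energy <- function(A, target, W) {
  d <- net_stats(A) - target
  drop(t(d) %*% W %*% d)
}

# Moore-Penrose pseudoinverse via the singular value decomposition
pinv <- function(S, tol = 1e-8) {
  s <- svd(S)
  keep <- s$d > tol * max(s$d)
  if (!any(keep)) return(matrix(0, nrow(S), ncol(S)))
  s$v[, keep, drop = FALSE] %*% diag(1 / s$d[keep], sum(keep)) %*%
    t(s$u[, keep, drop = FALSE])
}

san_toy <- function(A, target, maxit = 4, steps = 500, tau = 1) {
  p <- length(target)
  W <- diag(p) / p                               # initial weights: identity / p
  n <- nrow(A)
  for (run in seq_len(maxit)) {
    temp <- tau * (maxit - run) / (maxit - 1)    # linear cooling, 0 on the last run
    diffs <- matrix(0, 0, p)
    for (s in seq_len(steps)) {
      if (energy(A, target, W) == 0) return(A)   # step 1: target reached
      ij <- sample(n, 2)                         # step 2: toggle one random dyad
      B <- A
      B[ij[1], ij[2]] <- B[ij[2], ij[1]] <- 1 - A[ij[1], ij[2]]
      diffs <- rbind(diffs, net_stats(B) - net_stats(A))  # step 3: store difference
      dE <- energy(B, target, W) - energy(A, target, W)
      # step 4: at temperature 0 only non-increasing moves are accepted
      alpha <- if (temp > 0) exp(-dE / temp) else as.numeric(dE <= 0)
      if (runif(1) < min(1, alpha)) A <- B       # step 5: accept or reject
    }
    Sp <- pinv(cov(diffs))                       # re-weight from all proposed differences
    if (sum(diag(Sp)) > 0) W <- Sp / sum(diag(Sp))
  }
  A
}

y0 <- matrix(0, 10, 10)                          # start from an empty 10-node graph
target <- c(edges = 15, triangles = 5)
y <- san_toy(y0, target)
net_stats(y)                                     # compare with the target
```

The sketch assumes maxit is at least 2 so that the cooling schedule is well defined. As with san() itself, nothing guarantees that the target is hit exactly.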
A network or list of networks that hopefully have network
statistics close to the target.stats
vector. No guarantees
are provided about their probability distribution. Additionally,
attr()
-style attributes formula
and stats
are included.
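A short sketch of inspecting the returned object follows. It assumes the default output = "network" and a single-statistic model; the target of 40 edges is arbitrary, and the result is not guaranteed to match it exactly.

```r
library(ergm)

set.seed(1)
x <- network(20, density = 0.1, directed = FALSE)

# Ask for a network on the same 20 nodes with (about) 40 edges
y <- san(x ~ edges, target.stats = 40)

summary(y ~ edges)   # statistic recomputed from the returned network
attr(y, "stats")     # statistics attached by san(), as described above
attr(y, "formula")   # formula attached by san()
```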
san(formula)
: Sufficient statistics are specified by a formula
.
san(ergm_model)
: A lower-level function that expects a pre-initialized ergm_model
.
Krivitsky, P. N., Hunter, D. R., Morris, M., & Klumb, C. (2022). ergm 4: Computational Improvements. arXiv preprint arXiv:2203.08198.
# initialize x to a random undirected network with 50 nodes and a density of 0.05
x <- network(50, density = 0.05, directed = FALSE)

# try to find a network on 50 nodes with 300 edges, 150 triangles,
# and 1250 4-cycles, starting from the network x
y <- san(x ~ edges + triangles + cycle(4), target.stats = c(300, 150, 1250))

# check results
summary(y ~ edges + triangles + cycle(4))

# initialize x to a random directed network with 50 nodes
x <- network(50)

# add vertex attributes
x %v% 'give' <- runif(50, 0, 1)
x %v% 'take' <- runif(50, 0, 1)

# try to find a set of 100 directed edges making the outward sum of
# 'give' and the inward sum of 'take' both equal to 62.5, so in
# edges (i,j) the node i tends to have above average 'give' and j
# tends to have above average 'take'
y <- san(x ~ edges + nodeocov('give') + nodeicov('take'), target.stats = c(100, 62.5, 62.5))

# check results
summary(y ~ edges + nodeocov('give') + nodeicov('take'))

# initialize x to a random undirected network with 50 nodes
x <- network(50, directed = FALSE)

# add a vertex attribute
x %v% 'popularity' <- runif(50, 0, 1)

# try to find a set of 100 edges making the total sum of
# popularity(i) and popularity(j) over all edges (i,j) equal to
# 125, so nodes with higher popularity are more likely to be
# connected to other nodes
y <- san(x ~ edges + nodecov('popularity'), target.stats = c(100, 125))

# check results
summary(y ~ edges + nodecov('popularity'))

# creates a network with denser "core" spreading out to sparser
# "periphery"
plot(y)
Searches through the database of ergmTerm
s,
ergmConstraint
s, ergmReference
s, ergmHint
s, and
ergmProposal
s and prints out a list of terms and term-alikes
appropriate for the specified network's structural constraints,
optionally restricting by additional keywords and search term
matches.
search.ergmTerms(search, net, keywords, name, packages)

search.ergmConstraints(search, keywords, name, packages)

search.ergmReferences(search, keywords, name, packages)

search.ergmHints(search, keywords, name, packages)

search.ergmProposals(search, name, reference, constraints, packages)
search |
optional character search term to search for in the text of the term descriptions. Only matching terms will be returned. Matching is case insensitive. |
net |
a network object that the term would be applied to, used as a template to determine directedness, bipartiteness, etc. |
keywords |
optional character vector of keyword tags to use to restrict the results (e.g., 'curved', 'triad-related') |
name |
optional character name of a specific term to return |
packages |
optional character vector indicating the subset of packages in which to search |
reference , constraints
|
optional names of references and constraints to narrow down the proposal |
Uses grep()
internally to match the search terms against the term
description, so search
is currently matched as a single phrase.
Keyword tags will only return a match if all of the specified tags are
included in the term.
prints out the name and short description of matching terms, and
invisibly returns them as a list. If name
is specified, prints out
the full definition for the named term.
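Because the matches are returned invisibly, they can be captured for further processing. The sketch below assumes only what is stated above (a printed listing plus an invisibly returned collection of matches); the exact structure of the returned object may differ between versions, so it is inspected with str() rather than assumed.

```r
library(ergm)

# Capture the invisibly returned matches while the listing is printed
res <- search.ergmTerms("triangle")

length(res)               # number of matching entries
str(res, max.level = 1)   # inspect the structure before relying on it
```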
See also ergmTerm
,
ergmConstraint
, ergmReference
, ergmHint
, and
ergmProposal
, for lists of terms and term-alikes visible to ergm.
# find all of the terms that mention triangles
search.ergmTerms('triangle')

# two ways to search for bipartite terms:
# search using a bipartite net as a template
myNet <- network.initialize(5, bipartite=3)
search.ergmTerms(net=myNet)

# or request the bipartite keyword
search.ergmTerms(keywords='bipartite')

# search on multiple keywords
search.ergmTerms(keywords=c('bipartite','dyad-independent'))

# print out the content for a specific term
search.ergmTerms(name='b2factor')

# request the bipartite keyword in the ergm package
search.ergmTerms(keywords='bipartite', packages='ergm')

# find all of the constraints that mention degrees
search.ergmConstraints('degree')

# search for hints only
search.ergmConstraints(keywords='hint')

# search on multiple keywords
search.ergmConstraints(keywords=c('directed','dyad-independent'))

# print out the content for a specific constraint
search.ergmConstraints(name='b1degrees')

# request the directed keyword in the ergm package
search.ergmConstraints(keywords='directed', packages='ergm')

# find all discrete references
search.ergmReferences(keywords='discrete')

# find all of the hints that mention degrees
search.ergmHints('degree')

# find all of the proposals that mention "MH algorithm"
search.ergmProposals('MH algorithm')

# print out the content for a specific proposal
search.ergmProposals(name='randomtoggle')

# find all proposals with required or optional constraints
search.ergmProposals(constraints='.dyads')

# find all proposals with references
search.ergmProposals(reference='Bernoulli')

# request proposals that mention "MH algorithm" in the ergm package
search.ergmProposals('MH algorithm', packages='ergm')
This term adds one network statistic for each node equal to the number of
out-ties for that node. This measures the activity of the node. The term for
the first node is omitted by default because of linear dependence that
arises if this term is used together with edges
, but its coefficient
can be computed as the negative of the sum of the coefficients of all the
other actors. That is, the average coefficient is zero, following the
Holland-Leinhardt parametrization of the $p_1$ model (Holland and Leinhardt,
1981).
For undirected networks, see sociality
.
# binary: sender(base=1, nodes=-1)

# valued: sender(base=1, nodes=-1, form="sum")
base |
deprecated |
nodes |
specify which nodes' statistics should be included or excluded (see Specifying Vertex attributes and Levels ( |
form |
character how to aggregate tie values in a valued ERGM |
The argument base
is retained for backwards compatibility and may be
removed in a future version. When both base
and nodes
are passed,
nodes
overrides base
.
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, binary, valued
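A minimal sketch of the sender statistics on the samplike data, followed by the sum-to-zero arithmetic for the omitted node. The coefficient values at the end are made up purely for illustration.

```r
library(ergm)
data(sampson)

# One statistic per sender; the first node is omitted by default (nodes = -1)
s <- summary(samplike ~ sender)
length(s)

# Compare with out-degrees computed directly from the adjacency matrix
outdeg <- rowSums(as.matrix(samplike))
all.equal(unname(s), unname(outdeg[-1]))

# Recovering the omitted coefficient under the sum-to-zero convention,
# using hypothetical coefficients for three remaining nodes
other <- c(0.5, -0.25, 0.75)
first <- -sum(other)
mean(c(first, other))   # 0: the average coefficient is zero
```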
This term adds one statistic to the model equal to the number of Simmelian triads, as defined by Krackhardt and Handcock (2007). This is a complete sub-graph of size three.
# binary: simmelian
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, binary
This term adds one statistic to the model equal to the number of ties in the network that are associated with Simmelian triads, as defined by Krackhardt and Handcock (2007). Each Simmelian has six ties in it but, because Simmelians can overlap in terms of nodes (and associated ties), the total number of ties in these Simmelians is less than six times the number of Simmelians. Hence this is a measure of the clustering of Simmelians (given the number of Simmelians).
# binary: simmelianties
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, binary
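The two Simmelian statistics can be computed side by side to see the relationship described above. This sketch uses the samplike data only as a convenient directed network.

```r
library(ergm)
data(sampson)

s <- unname(summary(samplike ~ simmelian + simmelianties))
s   # number of Simmelian triads, then number of ties inside them

# Ties can be shared between overlapping Simmelian triads, so the tie
# count cannot exceed six times the triad count
s[2] <= 6 * s[1]
```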
simulate
is used to draw from exponential
family random network models. See ergm()
for more
information on these models.
The method for ergm
objects inherits the model,
the coefficients, the response attribute, the reference, the
constraints, and most simulation parameters from the model fit,
unless overridden by passing them explicitly. Unless overridden,
the simulation is initialized with either a random draw from near
the fitted model saved by ergm()
or, if unavailable, the
network to which the ERGM was fit.
## S3 method for class 'formula_lhs_network'
simulate(object, nsim = 1, seed = NULL, ...)

simulate_formula(object, ..., basis = eval_lhs.formula(object))

## S3 method for class 'network'
simulate_formula(
  object, nsim = 1, seed = NULL, coef,
  response = NULL, reference = ~Bernoulli, constraints = ~.,
  observational = FALSE, monitor = NULL, statsonly = FALSE, esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE, sequential = TRUE,
  control = control.simulate.formula(), verbose = FALSE, ...,
  basis = ergm.getnetwork(object), do.sim = NULL, return.args = NULL
)

## S3 method for class 'ergm_state'
simulate_formula(
  object, nsim = 1, seed = NULL, coef,
  response = NULL, reference = ~Bernoulli, constraints = ~.,
  observational = FALSE, monitor = NULL, statsonly = FALSE, esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE, sequential = TRUE,
  control = control.simulate.formula(), verbose = FALSE, ...,
  basis = ergm.getnetwork(object), do.sim = NULL, return.args = NULL
)

## S3 method for class 'ergm_model'
simulate(
  object, nsim = 1, seed = NULL, coef,
  reference = if (is(constraints, "ergm_proposal")) NULL else trim_env(~Bernoulli),
  constraints = trim_env(~.),
  observational = FALSE, monitor = NULL, basis = NULL, esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE, sequential = TRUE,
  control = control.simulate.formula(), verbose = FALSE, ...,
  do.sim = NULL, return.args = NULL
)

## S3 method for class 'ergm_state_full'
simulate(
  object, nsim = 1, seed = NULL, coef, esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE, sequential = TRUE,
  control = control.simulate.formula(), verbose = FALSE, ...,
  return.args = NULL
)

## S3 method for class 'ergm'
simulate(
  object, nsim = 1, seed = NULL, coef = coefficients(object),
  response = object$network %ergmlhs% "response",
  reference = object$reference,
  constraints = list(object$constraints, object$obs.constraints),
  observational = FALSE, monitor = NULL,
  basis = if (observational) object$network else NVL(object$newnetwork, object$network),
  statsonly = FALSE, esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE, sequential = TRUE,
  control = control.simulate.ergm(), verbose = FALSE, ...,
  return.args = NULL
)
object |
Either a |
nsim |
Number of networks to be randomly drawn from the given distribution on the set of all networks, returned by the Metropolis-Hastings algorithm. |
seed |
Seed value (integer) for the random number generator. See
|
... |
Further arguments passed to or used by methods. |
basis |
a value (usually a |
coef |
Vector of parameter values for the model from which the
sample is to be drawn. If |
response |
Either a character string, a formula, or
|
reference |
A one-sided formula specifying
the reference measure ( |
constraints |
A formula specifying one or more constraints
on the support of the distribution of the networks being modeled. Multiple constraints
may be given, separated by “+” and “-” operators. See
The default is to have no constraints except those provided through
the Together with the model terms in the formula and the reference measure, the constraints define the distribution of networks being modeled. It is also possible to specify a proposal function directly either
by passing a string with the function's name (in which case,
arguments to the proposal should be specified through the
Note that not all possible combinations of constraints and reference measures are supported. However, for relatively simple constraints (i.e., those that simply permit or forbid specific dyads or sets of dyads from changing), arbitrary combinations should be possible. |
observational |
Inherit observational constraints rather than model constraints. |
monitor |
A one-sided formula specifying one or more terms
whose value is to be monitored. These terms are appended to the
model, along with a coefficient of 0, so their statistics are
returned. An |
statsonly |
Logical: If TRUE, return only the network statistics, not
the network(s) themselves. Deprecated in favor of |
esteq |
Logical: If TRUE, compute the sample estimating equations of an ERGM: if the model is non-curved, all non-offset statistics are returned either way, but if the model is curved, the score estimating function values (3.1) by Hunter and Handcock (2006) are returned instead. |
output |
Normally character, one of Alternatively, a function with prototype
|
simplify |
Logical: If |
sequential |
Logical: If FALSE, each of the |
control |
A list of control parameters for algorithm tuning,
typically constructed with |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
do.sim |
Logical; a deprecated interface superseded by |
return.args |
Character; if not |
A sample of networks is randomly drawn from the specified model. The model
is specified by the first argument of the function. If the first argument
is a formula
then this defines the model. If the first
argument is the output of a call to ergm()
then the model used
for that call is the one fit, and unless coef
is specified, the
sample is from the MLE of the parameters. If neither of those is given as
the first argument then a Bernoulli network is generated with the
probability of ties defined by prob
or coef
.
Note that the first network is sampled after burnin
steps,
and any subsequent networks are sampled each interval
steps
after the first.
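To make the sampling schedule concrete, the sketch below works out the MCMC step at which each network is recorded and then runs a small simulation with the same settings. The values of burnin and interval are arbitrary, and the model (edges only, coefficient -1) is chosen just to keep the example fast.

```r
library(ergm)

burnin <- 1000
interval <- 100
nsim <- 3

# MCMC step at which each of the nsim networks is recorded
burnin + (seq_len(nsim) - 1) * interval   # 1000 1100 1200

set.seed(42)
sims <- simulate(network(16, directed = FALSE) ~ edges,
                 nsim = nsim, coef = -1,
                 control = control.simulate.formula(MCMC.burnin = burnin,
                                                    MCMC.interval = interval))
length(sims)   # one network per requested draw
```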
More information can be found by looking at the documentation of
ergm()
.
If output=="stats"
an mcmc
object containing the
simulated network statistics. If control$parallel>0
, an
mcmc.list
object. If simplify=TRUE
(the default), these
would then be "stacked" and converted to a standard matrix
. A
logical vector indicating whether or not the term had come from
the monitor=
formula is stored in attr()
-style attribute
"monitored"
.
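A small sketch of the statistics-only output with a monitored term. The coefficient -1.6 is an arbitrary value rather than a fitted estimate, and the example assumes the default simplify = TRUE with no parallel chains.

```r
library(ergm)
data(florentine)   # provides the flomarriage network

set.seed(1)
st <- simulate(flomarriage ~ edges, coef = -1.6, nsim = 5,
               monitor = ~triangles, output = "stats")

st                      # one row per draw: model statistic plus monitored term
attr(st, "monitored")   # flags which columns came from monitor=
```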
Otherwise, a representation of the simulated network is returned,
in the form specified by output
. In addition to a network
representation or a list thereof, they have the following
attr()
-style attributes:
formula
The formula
used to generate the
sample.
stats
control
Control parameters used to generate the sample.
constraints
Constraints used to generate the sample.
reference
The reference measure for the sample.
monitor
The monitoring formula.
response
The edge attribute used as a response.
The following are the permitted network formats:
"network"
If nsim==1
, an object of class
network
. If nsim>1
, it returns an object of class
network.list
(a list of networks) with the
above-listed additional attributes.
"edgelist"
An edgelist
representation of the network,
or a list thereof, depending on nsim
.
"ergm_state"
A semi-internal representation of
a network consisting of a network
object emptied of edges, with
an attached edgelist matrix, or a list thereof, depending on
nsim
.
If simplify==FALSE
, the networks are returned as a nested list,
with outer list being the parallel chain (including 1 for no
parallelism) and inner list being the samples within that chain
(including 1, if one network per chain). If TRUE
, they are
concatenated, and if a total of one network had been simulated, the
network itself will be returned.
simulate(ergm_state_full)
: a low-level function to simulate from an ergm_state
object.
The actual network
method for simulate_formula()
is
called .simulate_formula.network()
and is also
exported as an object. This allows it to be overridden by
extension packages, such as tergm
, but also accessed directly
when needed.
simulate.ergm_model()
is a lower-level interface, providing
a simulate()
method for the ergm_model
class. The basis
argument is required; monitor
, if passed, must be an
ergm_model
as well; and constraints
can be an
ergm_proposal
object instead.
ergm()
, network
,
ergm_MCMC_sample()
for a demonstration of return.args=
.
# # Let's draw from a Bernoulli model with 16 nodes # and density 0.5 (i.e., coef = c(0,0)) # g.sim <- simulate(network(16) ~ edges + mutual, coef=c(0, 0)) # # What are the statistics like? # summary(g.sim ~ edges + mutual) # # Now simulate a network with higher mutuality # g.sim <- simulate(network(16) ~ edges + mutual, coef=c(0,2)) # # How do the statistics look? # summary(g.sim ~ edges + mutual) # # Let's draw from a Bernoulli model with 16 nodes # and tie probability 0.1 # g.use <- network(16,density=0.1,directed=FALSE) # # Starting from this network let's draw 3 realizations # of a edges and 2-star network # g.sim <- simulate(~edges+kstar(2), nsim=3, coef=c(-1.8,0.03), basis=g.use, control=control.simulate( MCMC.burnin=1000, MCMC.interval=100)) g.sim summary(g.sim) # # attach the Florentine Marriage data # data(florentine) # # fit an edges and 2-star model using the ergm function # gest <- ergm(flomarriage ~ edges + kstar(2)) summary(gest) # # Draw from the fitted model (statistics only), and observe the number # of triangles as well. # g.sim <- simulate(gest, nsim=10, monitor=~triangles, output="stats", control=control.simulate.ergm(MCMC.burnin=1000, MCMC.interval=100)) g.sim # Custom output: store the edgecount (computed in R), iteration index, and chain index. output.f <- function(x, iter, chain, ...){ list(nedges = network.edgecount(as.network(x)), chain = chain, iter = iter) } g.sim <- simulate(gest, nsim=3, output=output.f, simplify=FALSE, control=control.simulate.ergm(MCMC.burnin=1000, MCMC.interval=100)) unclass(g.sim)
simulate Method for formula objects that dispatches based on the Left-Hand Side
This method evaluates the left-hand side (LHS) of the given formula and dispatches it to an appropriate method based on the result by setting a nonce class name on the formula.
## S3 method for class 'formula'
simulate(object, nsim = 1, seed = NULL, ..., basis, newdata, data)

## S3 method for class 'formula_lhs'
simulate(object, nsim = 1, seed = NULL, ...)
object |
a one- or two-sided |
nsim , seed
|
number of realisations to simulate and the random
seed to use; see |
... |
additional arguments to methods. |
basis |
if given, overrides the LHS of the formula for the purposes of dispatching. |
newdata , data
|
if passed, the
The dispatching works as follows:
A "method" to receive a formula whose LHS evaluates to CLASS
can therefore be implemented by a function
|
simulate(formula_lhs): A function to catch the situation when there is no method implemented for the class to which the LHS evaluates.
simulate.ergm() family of functions, which uses this interface.
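The dispatch-on-LHS idea can be illustrated with a toy generic in base R. This is a minimal sketch, not the package's code: the generic describe() and the class names formula_lhs_CLASS and formula_lhs used below are assumptions made for illustration only.

```r
# Toy generic; every name here is hypothetical and only illustrates the idea.
describe <- function(object, ...) UseMethod("describe")

# Entry point for formulas: evaluate the LHS, then re-dispatch on its class.
describe.formula <- function(object, ...) {
  if (length(object) < 3) stop("this sketch requires a two-sided formula")
  lhs <- eval(object[[2]], environment(object))
  # Prepend nonce classes derived from the class of the LHS value.
  class(object) <- c(paste0("formula_lhs_", class(lhs)), "formula_lhs", class(object))
  describe(object, ...)
}

# "Method" for formulas whose LHS evaluates to a data.frame.
describe.formula_lhs_data.frame <- function(object, ...) "LHS is a data.frame"

# Catch-all for LHS classes that have no dedicated method.
describe.formula_lhs <- function(object, ...) stop("no method for this LHS class")

d <- data.frame(x = 1:3)
res_df <- describe(d ~ x)
res_num <- tryCatch(describe(1 ~ x), error = function(e) "no method")
```

Because the nonce classes are placed ahead of "formula" in the class vector, the second call to the generic never returns to the formula method, so the sketch cannot loop.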
This term adds one statistic, having as its value the number of edges in the network for which the incident actors' attribute values differ by less than cutoff; that is, the number of edges between i and j such that abs(attr[i]-attr[j])<cutoff.
# binary: smalldiff(attr, cutoff)
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
cutoff |
maximum difference in attribute values for ties to be considered |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, quantitative nodal attribute, undirected, binary
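As a worked illustration of the smalldiff definition, the statistic can be computed by hand in base R. This is a sketch on a hypothetical 4-node undirected network with a made-up attribute; it does not call the package.

```r
# Hypothetical undirected 4-node network as an adjacency matrix.
adj <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(1, 4))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

age <- c(20, 22, 30, 31)   # hypothetical quantitative vertex attribute
cutoff <- 3

# Absolute attribute difference for every pair of vertices.
d <- abs(outer(age, age, "-"))

# Count each undirected edge once (upper triangle) when the difference is below the cutoff.
ut <- upper.tri(adj)
smalldiff_stat <- sum(adj[ut] == 1 & d[ut] < cutoff)
```

Here only the edges 1-2 (difference 2) and 3-4 (difference 1) fall below the cutoff.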
Adds one statistic for each element of threshold, each equal to the number of dyads whose values are less than the corresponding element of threshold.
# valued: smallerthan(threshold=0)
threshold |
vector of numerical values |
ergmTerm
for index of model terms currently visible to the package.
directed, dyad-independent, undirected, valued
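The smallerthan statistics can be reproduced by hand for a small case. The sketch below uses a hypothetical valued undirected network stored as a symmetric matrix and assumes that every dyad, including those with value 0, is compared against each threshold.

```r
# Hypothetical valued undirected network: symmetric matrix of dyad values.
y <- matrix(0, 4, 4)
y[upper.tri(y)] <- c(0, 2, 5, 1, 0, 3)
y <- y + t(y)

threshold <- c(1, 4)
dyads <- y[upper.tri(y)]   # each undirected dyad counted once

# One statistic per threshold element: number of dyads with value below it.
stats <- sapply(threshold, function(th) sum(dyads < th))
```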
A utility to facilitate argument completion of control lists, reexported from statnet.common.
This list is updated as packages are loaded and unloaded.
control.ergm
drop, init, init.method, main.method, force.main, main.hessian,
checkpoint, resume, MPLE.samplesize, init.MPLE.samplesize,
MPLE.type, MPLE.maxit, MPLE.nonvar, MPLE.nonident,
MPLE.nonident.tol, MPLE.covariance.samplesize,
MPLE.covariance.method, MPLE.covariance.sim.burnin,
MPLE.covariance.sim.interval, MPLE.check,
MPLE.constraints.ignore, MCMC.prop, MCMC.prop.weights,
MCMC.prop.args, MCMC.interval, MCMC.burnin,
MCMC.samplesize, MCMC.effectiveSize,
MCMC.effectiveSize.damp, MCMC.effectiveSize.maxruns,
MCMC.effectiveSize.burnin.pval,
MCMC.effectiveSize.burnin.min,
MCMC.effectiveSize.burnin.max,
MCMC.effectiveSize.burnin.nmin,
MCMC.effectiveSize.burnin.nmax,
MCMC.effectiveSize.burnin.PC,
MCMC.effectiveSize.burnin.scl,
MCMC.effectiveSize.order.max, MCMC.return.stats,
MCMC.runtime.traceplot, MCMC.maxedges, MCMC.addto.se,
MCMC.packagenames, SAN.maxit, SAN.nsteps.times, SAN,
MCMLE.termination, MCMLE.maxit, MCMLE.conv.min.pval,
MCMLE.confidence, MCMLE.confidence.boost,
MCMLE.confidence.boost.threshold,
MCMLE.confidence.boost.lag, MCMLE.NR.maxit,
MCMLE.NR.reltol, obs.MCMC.mul, obs.MCMC.samplesize.mul,
obs.MCMC.samplesize, obs.MCMC.effectiveSize,
obs.MCMC.interval.mul, obs.MCMC.interval,
obs.MCMC.burnin.mul, obs.MCMC.burnin, obs.MCMC.prop,
obs.MCMC.prop.weights, obs.MCMC.prop.args,
obs.MCMC.impute.min_informative,
obs.MCMC.impute.default_density, MCMLE.min.depfac,
MCMLE.sampsize.boost.pow, MCMLE.MCMC.precision,
MCMLE.MCMC.max.ESS.frac, MCMLE.metric, MCMLE.method,
MCMLE.dampening, MCMLE.dampening.min.ess,
MCMLE.dampening.level, MCMLE.steplength.margin,
MCMLE.steplength, MCMLE.steplength.parallel,
MCMLE.sequential, MCMLE.density.guard.min,
MCMLE.density.guard, MCMLE.effectiveSize,
obs.MCMLE.effectiveSize, MCMLE.interval, MCMLE.burnin,
MCMLE.samplesize.per_theta, MCMLE.samplesize.min,
MCMLE.samplesize, obs.MCMLE.samplesize.per_theta,
obs.MCMLE.samplesize.min, obs.MCMLE.samplesize,
obs.MCMLE.interval, obs.MCMLE.burnin,
MCMLE.steplength.solver, MCMLE.last.boost,
MCMLE.steplength.esteq, MCMLE.steplength.miss.sample,
MCMLE.steplength.min, MCMLE.effectiveSize.interval_drop,
MCMLE.save_intermediates, MCMLE.nonvar, MCMLE.nonident,
MCMLE.nonident.tol, SA.phase1_n, SA.initial_gain,
SA.nsubphases, SA.min_iterations, SA.max_iterations,
SA.phase3_n, SA.interval, SA.burnin, SA.samplesize,
CD.samplesize.per_theta, obs.CD.samplesize.per_theta,
CD.nsteps, CD.multiplicity, CD.nsteps.obs,
CD.multiplicity.obs, CD.maxit, CD.conv.min.pval,
CD.NR.maxit, CD.NR.reltol, CD.metric, CD.method,
CD.dampening, CD.dampening.min.ess, CD.dampening.level,
CD.steplength.margin, CD.steplength, CD.adaptive.epsilon,
CD.steplength.esteq, CD.steplength.miss.sample,
CD.steplength.min, CD.steplength.parallel,
CD.steplength.solver, loglik, term.options, seed,
parallel, parallel.type, parallel.version.check,
parallel.inherit.MT, ...
control.ergm.bridge
bridge.nsteps, bridge.target.se, bridge.bidirectional, drop,
MCMC.burnin, MCMC.burnin.between, MCMC.interval,
MCMC.samplesize, obs.MCMC.burnin,
obs.MCMC.burnin.between, obs.MCMC.interval,
obs.MCMC.samplesize, MCMC.prop, MCMC.prop.weights,
MCMC.prop.args, obs.MCMC.prop,
obs.MCMC.prop.weights, obs.MCMC.prop.args,
MCMC.maxedges, MCMC.packagenames, term.options,
seed, parallel, parallel.type,
parallel.version.check, parallel.inherit.MT, ...
control.ergm.godfather
term.options
control.gof.ergm
nsim, MCMC.burnin, MCMC.interval, MCMC.batch, MCMC.prop,
MCMC.prop.weights, MCMC.prop.args, MCMC.maxedges,
MCMC.packagenames, MCMC.runtime.traceplot,
network.output, seed, parallel, parallel.type,
parallel.version.check, parallel.inherit.MT
control.gof.formula
nsim, MCMC.burnin, MCMC.interval, MCMC.batch, MCMC.prop,
MCMC.prop.weights, MCMC.prop.args, MCMC.maxedges,
MCMC.packagenames, MCMC.runtime.traceplot,
network.output, seed, parallel, parallel.type,
parallel.version.check, parallel.inherit.MT
control.logLik.ergm
bridge.nsteps, bridge.target.se, bridge.bidirectional, drop,
MCMC.burnin, MCMC.interval, MCMC.samplesize,
obs.MCMC.samplesize, obs.MCMC.interval,
obs.MCMC.burnin, MCMC.prop, MCMC.prop.weights,
MCMC.prop.args, obs.MCMC.prop,
obs.MCMC.prop.weights, obs.MCMC.prop.args,
MCMC.maxedges, MCMC.packagenames, term.options,
seed, parallel, parallel.type,
parallel.version.check, parallel.inherit.MT, ...
control.san
SAN.maxit, SAN.tau, SAN.invcov, SAN.invcov.diag, SAN.nsteps.alloc,
SAN.nsteps, SAN.samplesize, SAN.prop, SAN.prop.weights,
SAN.prop.args, SAN.packagenames, SAN.ignore.finite.offsets,
term.options, seed, parallel, parallel.type,
parallel.version.check, parallel.inherit.MT
control.simulate
MCMC.burnin, MCMC.interval, MCMC.prop, MCMC.prop.weights,
MCMC.prop.args, MCMC.batch, MCMC.effectiveSize,
MCMC.effectiveSize.damp, MCMC.effectiveSize.maxruns,
MCMC.effectiveSize.burnin.pval,
MCMC.effectiveSize.burnin.min,
MCMC.effectiveSize.burnin.max,
MCMC.effectiveSize.burnin.nmin,
MCMC.effectiveSize.burnin.nmax,
MCMC.effectiveSize.burnin.PC,
MCMC.effectiveSize.burnin.scl,
MCMC.effectiveSize.order.max, MCMC.maxedges,
MCMC.packagenames, MCMC.runtime.traceplot,
network.output, term.options, parallel, parallel.type,
parallel.version.check, parallel.inherit.MT, ...
control.simulate.ergm
MCMC.burnin, MCMC.interval, MCMC.scale, MCMC.prop, MCMC.prop.weights,
MCMC.prop.args, MCMC.batch, MCMC.effectiveSize,
MCMC.effectiveSize.damp,
MCMC.effectiveSize.maxruns,
MCMC.effectiveSize.burnin.pval,
MCMC.effectiveSize.burnin.min,
MCMC.effectiveSize.burnin.max,
MCMC.effectiveSize.burnin.nmin,
MCMC.effectiveSize.burnin.nmax,
MCMC.effectiveSize.burnin.PC,
MCMC.effectiveSize.burnin.scl,
MCMC.effectiveSize.order.max, MCMC.maxedges,
MCMC.packagenames, MCMC.runtime.traceplot,
network.output, term.options, parallel,
parallel.type, parallel.version.check,
parallel.inherit.MT, ...
control.simulate.formula
MCMC.burnin, MCMC.interval, MCMC.prop, MCMC.prop.weights,
MCMC.prop.args, MCMC.batch,
MCMC.effectiveSize, MCMC.effectiveSize.damp,
MCMC.effectiveSize.maxruns,
MCMC.effectiveSize.burnin.pval,
MCMC.effectiveSize.burnin.min,
MCMC.effectiveSize.burnin.max,
MCMC.effectiveSize.burnin.nmin,
MCMC.effectiveSize.burnin.nmax,
MCMC.effectiveSize.burnin.PC,
MCMC.effectiveSize.burnin.scl,
MCMC.effectiveSize.order.max, MCMC.maxedges,
MCMC.packagenames, MCMC.runtime.traceplot,
network.output, term.options, parallel,
parallel.type, parallel.version.check,
parallel.inherit.MT, ...
control.simulate.formula.ergm
MCMC.burnin, MCMC.interval, MCMC.prop, MCMC.prop.weights,
MCMC.prop.args, MCMC.batch,
MCMC.effectiveSize,
MCMC.effectiveSize.damp,
MCMC.effectiveSize.maxruns,
MCMC.effectiveSize.burnin.pval,
MCMC.effectiveSize.burnin.min,
MCMC.effectiveSize.burnin.max,
MCMC.effectiveSize.burnin.nmin,
MCMC.effectiveSize.burnin.nmax,
MCMC.effectiveSize.burnin.PC,
MCMC.effectiveSize.burnin.scl,
MCMC.effectiveSize.order.max,
MCMC.maxedges, MCMC.packagenames,
MCMC.runtime.traceplot, network.output,
term.options, parallel, parallel.type,
parallel.version.check,
parallel.inherit.MT, ...
This term adds one network statistic for each node equal to the number of ties of that node. For directed networks, see sender and receiver.
# binary: sociality(attr=NULL, base=1, levels=NULL, nodes=-1)
# valued: sociality(attr=NULL, base=1, levels=NULL, nodes=-1, form="sum")
attr , levels
|
this optional argument is deprecated and will be replaced with a more elegant implementation in a future release. In the meantime, it specifies a categorical vertex attribute (see Specifying Vertex attributes and Levels ( |
base |
deprecated |
nodes |
By default, |
form |
character how to aggregate tie values in a valued ERGM |
The argument base is retained for backwards compatibility and may be removed in a future version. When both base and levels are passed, levels overrides base.
The argument base is retained for backwards compatibility and may be removed in a future version. When both base and nodes are passed, nodes overrides base.
This term can only be used with undirected networks.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, dyad-independent, undirected, binary, valued
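For a binary undirected network, the per-node tie counts that sociality describes are simply the vertex degrees. The sketch below computes them for a hypothetical network in base R; it shows the statistic for every node and does not reproduce the term's selection of nodes.

```r
# Hypothetical undirected 4-node network.
adj <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(2, 3))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Number of ties incident on each node.
sociality_stats <- rowSums(adj)
```

The degrees sum to twice the number of edges, which is why including a statistic for every node alongside an edges term would be linearly dependent.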
The network is sparse. This typically results in a Tie-Non-Tie (TNT) proposal regime.
# sparse
ergmHint
for index of constraints and hints currently visible to the package.
dyad-independent
Multivariate version of coda's spectrum0.ar().
Its return value, divided by nrow(cbind(x)), is the estimated variance-covariance matrix of the sampling distribution of the mean of x if x is a multivariate time series with AR($p$) structure, with $p$ determined by AIC.
spectrum0.mvar(
  x,
  order.max = NULL,
  aic = is.null(order.max),
  tol = .Machine$double.eps^0.5,
  ...
)
x |
a matrix with observations in rows and variables in columns. |
order.max |
maximum (or fixed) order for the AR model. |
aic |
use AIC to select the order (up to |
tol |
tolerance used in detecting multicollinearity. See Note below. |
... |
additional arguments to |
A square matrix with dimension equal to the number of columns of x, with an additional attribute "infl" giving the factor by which the effective sample size is reduced due to autocorrelation, according to the Vats, Flegal, and Jones (2015) estimate for ESS.
ar() fails if crossprod(x) is singular. This is remedied as follows:
1. Standardize the variables.
2. Use the eigenvectors to map the variables onto their principal components.
3. Use the eigenvalues to standardize the principal components.
4. Drop those components whose standard deviation differs from 1 by more than tol. This should filter out redundant components or those too numerically unstable.
5. Call ar() and calculate the variance.
6. Reverse the mapping in steps 1-4 to obtain the variance of the original data.
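The first four steps above can be sketched in base R as follows. This is a simplified illustration and not the package's implementation: the helper pc_reduce() is hypothetical, and it drops components by relative eigenvalue size rather than by checking the standardized components' standard deviations.

```r
# Hypothetical helper illustrating the standardize / rotate / rescale / drop steps.
pc_reduce <- function(x, tol = .Machine$double.eps^0.5) {
  x <- scale(cbind(x))                           # standardize the variables
  e <- eigen(stats::cov(x), symmetric = TRUE)    # eigen-decomposition of the correlation matrix
  # Simplified drop rule (an assumption): keep components with non-negligible eigenvalues.
  keep <- e$values > tol * max(e$values)
  pcs <- x %*% e$vectors[, keep, drop = FALSE]   # map onto principal components
  sweep(pcs, 2, sqrt(e$values[keep]), "/")       # rescale each component to unit variance
}

set.seed(1)
a <- rnorm(200)
b <- rnorm(200)
x <- cbind(a, b, a + b)   # the third column is redundant, so crossprod(x) is singular
z <- pc_reduce(x)
```

The reduced matrix z has full column rank, so it could then be passed to an AR fitting routine such as stats::ar(); mapping the resulting variance back to the original variables is not shown.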
Specifies each dyad's baseline distribution to be the normal distribution with mean 0 and variance 1.
# StdNormal
ergmReference
for index of reference distributions currently visible to the package.
continuous
Proposed toggles are stratified according to mixing type on a vertex attribute.
# strat(attr=NULL, pmat=NULL, empirical=FALSE)
The user may pass a vertex attribute attr
as an argument
(the default for attr
gives every vertex the same attribute
value), and may also pass a matrix of weights pmat
(the default
for pmat
gives equal weight to each mixing type). See
Specifying Vertex Attributes and Levels for details on specifying vertex attributes. The
matrix pmat
, if specified, must have the same dimensions as a
mixing matrix for the network and vertex attribute under
consideration, and the correspondence between rows and columns of
pmat
and values of attr
is the same as for a mixing matrix.
The interpretation is that pmat[i,j]/sum(pmat)
is the probability of
proposing a toggle for mixing type (i,j)
. (For undirected, unipartite
networks, pmat
is first symmetrized, and then entries below the diagonal
are set to zero. Only entries on or above the diagonal of the symmetrized
pmat
are considered when making proposals. This accounts for the
convention that mixing is undirected in an undirected, unipartite network:
a tail of type i
and a head of type j
has the same mixing type
as a tail of type j
and a head of type i
.)
As an alternative way of specifying pmat
, the user may pass
empirical = TRUE
to use the mixing matrix of the network beginning
the MCMC chain as pmat
. In order for this to work, that network should
have a reasonable (in particular, nonempty) edge set.
While some mixing types may be assigned zero proposal probability
(either with a direct specification of pmat
or with empirical = TRUE
),
this will not be recognized as a constraint by all components of ergm
,
and should be used with caution.
ergmHint
for index of constraints and hints currently visible to the package.
dyad-independent
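To make the interpretation of pmat under strat concrete, the sketch below turns a hypothetical weight matrix into proposal probabilities for an undirected, unipartite network. The averaging rule used to symmetrize is an assumption; the text above does not state how the symmetrization is carried out.

```r
# Hypothetical weights for a vertex attribute with three levels.
pmat <- matrix(c(4, 1, 0,
                 3, 2, 1,
                 0, 1, 2), 3, 3, byrow = TRUE)

# Assumed symmetrization rule: average pmat with its transpose.
sym <- (pmat + t(pmat)) / 2

# Only entries on or above the diagonal are used for an undirected network.
sym[lower.tri(sym)] <- 0

# Probability of proposing a toggle for each mixing type.
prob <- sym / sum(sym)
```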
This term adds one statistic equal to the sum of
dyad values taken to the power pow
.
# valued: sum(pow=1)
pow |
power of dyad values. Defaults to 1. |
ergmTerm
for index of model terms currently visible to the package.
directed, undirected, valued
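As a small worked example of the sum term, the statistic for a hypothetical directed valued network can be computed directly from its matrix of dyad values in base R.

```r
# Hypothetical directed valued network (rows send, columns receive).
y <- matrix(c(0, 2, 0,
              1, 0, 3,
              0, 0, 0), 3, 3, byrow = TRUE)
pow <- 2

# Sum of off-diagonal dyad values raised to the power pow: 2^2 + 1^2 + 3^2.
sum_stat <- sum(y[row(y) != col(y)]^pow)
```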
This operator sums up the RHS statistics of the input formulas elementwise.
# binary: Sum(formulas, label)
# valued: Sum(formulas, label)
formulas |
a list (constructed using If a formula in the list has an LHS, it is interpreted as follows:
|
label |
used to specify the names of the elements of the resulting term sum vector. If |
Note that each formula must either produce the same number of statistics or be mapped through a matrix to produce the same number of statistics.
A single formula is also permitted. This can be useful if one wishes to, say, scale or sum up the statistics returned by a formula.
Offsets are ignored unless there is only one formula and the transformation only scales the statistics (i.e., the effective transformation matrix is diagonal).
Curved models are supported, subject to some limitations. In particular, the first model's etamap will be used, overwriting the others. If label is not of length 1, it should have an attr-style attribute "curved" specifying the names for the curved parameters.
ergmTerm
for index of model terms currently visible to the package.
operator, binary, valued
base::summary()
method for ergm()
fits.
## S3 method for class 'ergm'
summary(
  object,
  ...,
  correlation = FALSE,
  covariance = FALSE,
  total.variation = TRUE
)

## S3 method for class 'summary.ergm'
print(
  x,
  digits = max(3, getOption("digits") - 3),
  correlation = x$correlation,
  covariance = x$covariance,
  signif.stars = getOption("show.signif.stars"),
  eps.Pvalue = 1e-04,
  print.formula = FALSE,
  print.fitinfo = TRUE,
  print.coefmat = TRUE,
  print.message = TRUE,
  print.deviances = TRUE,
  print.drop = TRUE,
  print.offset = TRUE,
  print.call = TRUE,
  ...
)
object |
an object of class |
... |
For |
correlation |
logical; if |
covariance |
logical; if |
total.variation |
logical; if |
x |
object of class |
digits |
significant digits for coefficients |
signif.stars |
whether to print dots and stars to signify
statistical significance. See |
eps.Pvalue |
|
print.formula , print.fitinfo , print.coefmat , print.message , print.deviances , print.drop , print.offset , print.call
|
which components of the fit summary to print. |
summary.ergm()
tries to be smart about formatting the
coefficients, standard errors, etc.
The default printout of the summary object contains the call, number of iterations used, null and residual deviances, and the values of AIC and BIC (and their MCMC standard errors, if applicable). The coefficient table contains the following columns:
Estimate, Std. Error - parameter estimates and their standard errors
MCMC % - if total.variation=TRUE (default) the percentage of standard error attributable to MCMC estimation process rounded to an integer. See also vcov.ergm() and its sources argument.
z value, Pr(>|z|) - z-test and p-values
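The relationship between the columns of the coefficient table can be sketched with made-up numbers. The estimates and standard errors below are hypothetical, and the p-value is assumed to be the usual two-sided normal tail probability suggested by the column name Pr(>|z|).

```r
est <- c(edges = -1.6, kstar2 = 0.05)   # hypothetical parameter estimates
se  <- c(edges = 0.4,  kstar2 = 0.10)   # hypothetical standard errors

z <- est / se                     # z value
p <- 2 * stats::pnorm(-abs(z))    # two-sided p-value (assumed form)

coef_table <- cbind(Estimate = est, `Std. Error` = se,
                    `z value` = z, `Pr(>|z|)` = p)
```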
The returned object is a list of class "summary.ergm" with the following elements:
formula |
ERGM model formula |
call |
R call used to fit the model |
correlation , covariance
|
whether to print correlation/covariance matrices of the estimated parameters |
pseudolikelihood |
was the model estimated with MPLE |
independence |
is the model dyad-independent |
control |
the |
samplesize |
MCMC sample size |
message |
optional message on the validity of the standard error estimates |
null.lik.0 |
It is |
devtext , devtable
|
Deviance type and table |
aic , bic
|
values of AIC and BIC |
coefficients |
matrices with model parameters and associated statistics |
asycov |
asymptotic covariance matrix |
asyse |
asymptotic standard error matrix |
offset , drop , estimate , iterations , mle.lik , null.lik
|
see documentation of the object returned by |
The model fitting function ergm()
, print.ergm()
, and
base::summary()
. Function stats::coef()
will extract the matrix of
coefficients with standard errors, t-statistics and p-values.
data(florentine)
x <- ergm(flomarriage ~ density)
summary(x)
Most generally, this function computes those summaries of the object on the LHS of the formula that are specified by its RHS. In particular, if given a network as its LHS and ergm terms (see ergmTerm) on its RHS, it computes the sufficient statistics associated with those terms.
## S3 method for class 'formula'
summary(object, ...)
object |
A formula having as its LHS a
|
... |
further arguments passed to or used by methods. |
In practice, summary.formula()
is a thin wrapper around the
summary_formula()
generic, which dispatches methods based on the
class of the LHS of the formula.
A vector of statistics specified in the RHS of the formula.
# Let's look at the Florentine marriage data
data(florentine)

# Test the summary_formula function
summary(flomarriage ~ edges + kstar(2))
m <- as.matrix(flomarriage)
summary(m ~ edges)                  # twice as large as it should be
summary(m ~ edges, directed=FALSE)  # Now it's correct
Evaluates the terms in formula on an undirected network constructed by symmetrizing the LHS network using one of four rules:
"weak" A tie $(i,j)$ is present in the constructed network if the LHS network has either tie $(i,j)$ or $(j,i)$ (or both).
"strong" A tie $(i,j)$ is present in the constructed network if the LHS network has both tie $(i,j)$ and tie $(j,i)$.
"upper" A tie $(i,j)$, $i<j$, is present in the constructed network if the LHS network has tie $(i,j)$: the upper triangle of the LHS network.
"lower" A tie $(i,j)$, $i<j$, is present in the constructed network if the LHS network has tie $(j,i)$: the lower triangle of the LHS network.
# binary: Symmetrize(formula, rule="weak")
formula |
a one-sided |
rule |
one of |
ergmTerm
for index of model terms currently visible to the package.
directed, operator, binary
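The four symmetrization rules can be illustrated on a plain adjacency matrix. This is a base R sketch on a hypothetical directed network, reading "upper" and "lower" as the upper and lower triangles of the adjacency matrix; the helper sym() is made up for the example.

```r
# Hypothetical directed network with ties 1->2, 2->3, 3->1 and 3->2.
m <- matrix(c(0, 1, 0,
              0, 0, 1,
              1, 1, 0), 3, 3, byrow = TRUE)
up <- upper.tri(m)

# Hypothetical helper: build a symmetric matrix from upper-triangle values.
sym <- function(u) {
  s <- matrix(0, nrow(m), ncol(m))
  s[up] <- u
  s + t(s)
}

weak   <- sym(as.numeric(m[up] | t(m)[up]))   # tie in either direction
strong <- sym(as.numeric(m[up] & t(m)[up]))   # ties in both directions
upper  <- sym(m[up])                          # upper triangle only
lower  <- sym(t(m)[up])                       # lower triangle only
```

In this example the weak rule yields three undirected edges, the strong rule yields one, and the upper and lower rules yield two each.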
For an undirected network, this term adds one statistic equal to the number of 3-trails, where a 3-trail is defined as a trail of length three that traverses three distinct edges. Note that a 3-trail need not include four distinct nodes; in particular, a triangle counts as three 3-trails. For a directed network, this term adds four statistics (or some subset of these four), one for each of the four distinct types of directed three-paths. If the nodes of the path are written from left to right such that the middle edge points to the right (R), then the four types are RRR, RRL, LRR, and LRL. That is, an RRR 3-trail is of the form $i \to j \to k \to l$, an RRL 3-trail is of the form $i \to j \to k \leftarrow l$, etc.
Like in the undirected case, there is no requirement that the nodes be distinct in a directed 3-trail. However, the three edges must all be distinct. Thus, a mutual tie $i \leftrightarrow j$ does not count as a 3-trail of the form $i \to j \to i \leftarrow j$; however, in the subnetwork $i \leftrightarrow j \to k$, there are two directed 3-trails, one LRR ($k \leftarrow j \to i \to j$) and one RRR ($j \to i \to j \to k$).
# binary: threetrail(keep=NULL, levels=NULL)
# binary: threepath(keep=NULL, levels=NULL)
keep |
deprecated |
levels |
specify a subset of the four statistics for directed networks. (See Specifying Vertex
attributes and Levels ( |
The argument keep
is retained for backwards compatibility and may be
removed in a future version. When both keep
and levels
are passed,
levels
overrides keep
.
This term used to be (inaccurately) called threepath
. That
name has been deprecated and may be removed in a future version.
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, undirected, binary
This term adds one statistic to the model, equal to the number of triads in
the network that are transitive. The transitive triads are those of type
120D
, 030T
, 120U
, or 300
in the categorization
of Davis and Leinhardt (1972). For details on the 16 possible triad types,
see ?triad.classify
in the sna package.
Note the distinction from the ttriple
term. This term can only be
used with directed networks.
# binary: transitive
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, binary
This term adds one statistic, equal to the number of ties $i \to j$ such that there exists a two-path from $i$ to $j$. (Related to the ttriple term.)
# binary: transitiveties(attr=NULL, levels=NULL)
attr |
quantitative attribute (see Specifying Vertex attributes and Levels ( |
levels |
TODO (See Specifying Vertex
attributes and Levels ( |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, triad-related, undirected, binary
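Reading the transitiveties statistic as the number of ties from i to j for which some third node k gives a two-path i to k to j, it can be computed by hand with a matrix product. The sketch below uses a hypothetical directed network without an attribute.

```r
# Hypothetical directed network with ties 1->2, 2->3, 1->3 and 3->4.
a <- matrix(0, 4, 4)
a[rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4))] <- 1

# twopaths[i, j] is the number of intermediate nodes k with i -> k -> j.
twopaths <- a %*% a

# Ties that are closed by at least one two-path.
transitive_ties <- sum(a == 1 & twopaths > 0)
```

Only the tie from 1 to 3 qualifies here, through the two-path via node 2.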
This statistic implements the transitive weights statistic defined by Krivitsky (2012), Equation 13. For each of these options, the first (and the default) is more stable but also more conservative, while the second is more sensitive but more likely to induce a multimodal distribution of networks.
# valued: transitiveweights(twopath="min", combine="max", affect="min")
twopath |
the minimum
of the constituent dyads ( |
combine |
the maximum of the
2-path strengths ( |
affect |
the minimum of the focus dyad and the
combined strength of the two paths ( |
ergmTerm
for index of model terms currently visible to the package.
directed, nonnegative, triad-related, undirected, valued
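With the default options, the transitiveweights description amounts to summing, over ordered pairs, the minimum of the focus dyad's value and the strongest two-path between its endpoints, where a two-path is as strong as its weaker dyad. The base R loop below is a sketch of that reading on a hypothetical valued network; it is not the package's code.

```r
# Hypothetical directed valued network.
y <- matrix(c(0, 3, 1,
              0, 0, 2,
              0, 0, 0), 3, 3, byrow = TRUE)
n <- nrow(y)

tw <- 0
for (i in seq_len(n)) for (j in seq_len(n)) {
  if (i == j) next
  ks <- setdiff(seq_len(n), c(i, j))             # candidate intermediate nodes
  strength <- max(pmin(y[i, ks], y[ks, j]))      # twopath = "min", combine = "max"
  tw <- tw + min(y[i, j], strength)              # affect = "min"
}
```

The only contribution comes from the pair (1, 3): its value 1 is capped by the two-path through node 2, whose strength is min(3, 2) = 2.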
For a directed network, this term adds one network statistic for each of
an arbitrary subset of the 16 possible types of triads categorized by
Davis and Leinhardt (1972) as 003, 012, 102, 021D, 021U, 021C, 111D,
111U, 030T, 030C, 201, 120D, 120U, 120C, 210,
and 300
. Note that at
least one category should be dropped; otherwise a linear dependency will
exist among the 16 statistics, since they must sum to the total number of
three-node sets. By default, the category 003
, which is the category
of completely empty three-node sets, is dropped. This is considered category
zero, and the others are numbered 1 through 15 in the order given above. Each statistic is the count of the corresponding triad
type in the network. For details on the 16 types, see ?triad.classify
in the sna package, on which this code is based. For an undirected
network, the triad census is over the four types defined by the number of
ties (i.e., 0, 1, 2, and 3).
# binary: triadcensus(levels)
levels |
For directed networks, specify a set of terms to add other than the default value of |
ergmTerm
for index of model terms currently visible to the package.
directed, triad-related, undirected, binary
The network has a high clustering coefficient. This typically results in alternating between the Tie-Non-Tie (TNT) proposal and a triad-focused proposal along the lines of that of Wang and Atchadé (2013).
# triadic(triFocus = 0.25, type="OTP")
# .triadic(triFocus = 0.25, type = "OTP")
triFocus |
A number between 0 and 1, indicating how often triad-focused proposals should be made relative to the standard proposals. |
type |
A string indicating the type of shared partner or path to be considered for directed networks: |
While there is only one shared partner configuration in the undirected
case, nine distinct configurations are possible for directed graphs, selected
using the type
argument. Currently, terms may be defined with respect to
five of these configurations; they are defined here as follows (using
terminology from Butts (2008) and the relevent
package):
Outgoing Two-path ("OTP"): vertex $k$ is an OTP shared partner of ordered pair $(i,j)$ iff $i \to k \to j$. Also known as "transitive shared partner".
Incoming Two-path ("ITP"): vertex $k$ is an ITP shared partner of ordered pair $(i,j)$ iff $j \to k \to i$. Also known as "cyclical shared partner".
Reciprocated Two-path ("RTP"): vertex $k$ is an RTP shared partner of ordered pair $(i,j)$ iff $i \leftrightarrow k \leftrightarrow j$.
Outgoing Shared Partner ("OSP"): vertex $k$ is an OSP shared partner of ordered pair $(i,j)$ iff $i \to k, j \to k$.
Incoming Shared Partner ("ISP"): vertex $k$ is an ISP shared partner of ordered pair $(i,j)$ iff $i \leftarrow k, j \leftarrow k$.
By default, outgoing two-paths ("OTP"
) are calculated. Note that Robins et al. (2009)
define closely related statistics to several of the above, using slightly different terminology.
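Shared partner counts for four of these configurations can be obtained from products of the adjacency matrix. The sketch below assumes the usual readings (OTP: i to k to j; ITP: j to k to i; OSP: i and j both send to k; ISP: k sends to both i and j) and uses a hypothetical directed network.

```r
# Hypothetical directed network with ties 1->3, 3->2, 2->3, 1->4 and 2->4.
a <- matrix(0, 4, 4)
a[rbind(c(1, 3), c(3, 2), c(2, 3), c(1, 4), c(2, 4))] <- 1

otp <- a %*% a        # otp[i, j]: number of k with i -> k -> j
itp <- t(a %*% a)     # itp[i, j]: number of k with j -> k -> i
osp <- a %*% t(a)     # osp[i, j]: number of k with i -> k and j -> k
isp <- t(a) %*% a     # isp[i, j]: number of k with k -> i and k -> j
```

Only off-diagonal entries are meaningful as shared partner counts; the diagonals of these products hold degree-like quantities instead.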
.triadic()
versus triadic()
If given a bipartite network, the dotted form will skip silently, whereas the plain form will raise an error, since triadic effects are not possible in bipartite networks. The dotted form is thus suitable as a default argument when the bipartitedness of the network is not known a priori.
Wang J, Atchadé YF (2013). “Approximate Bayesian Computation for Exponential Random Graph Models for Large Social Networks.” Communications in Statistics - Simulation and Computation, 43(2), 359–377. ISSN 1532-4141, doi:10.1080/03610918.2012.703359.
ergmHint
for index of constraints and hints currently visible to the package.
dyad-dependent
By default, this term adds one statistic to the model equal to the number of triangles in the network. For an undirected network, a triangle is defined to be any set $\{(i,j), (j,k), (k,i)\}$ of three edges. For a directed network, a triangle is defined as any set of three edges $(i \to j)$ and $(j \to k)$ and either $(i \to k)$ or $(k \to i)$. The former case is called a "transitive triple" and the latter is called a "cyclic triple", so in the case of a directed network, triangle equals ttriple plus ctriple; thus at most two of these three terms can be in a model.
# binary: triangle(attr=NULL, diff=FALSE, levels=NULL)
# binary: triangles(attr=NULL, diff=FALSE, levels=NULL)
attr , diff
|
quantitative attribute (see Specifying Vertex attributes and Levels ( |
levels |
add one statistic for each value specified if |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, frequently-used, triad-related, undirected, binary
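The triangle counts described for this term can be checked by hand with matrix products. The sketch below counts transitive triples (i to j, j to k, i to k) and cyclic triples (i to j, j to k, k to i) in a hypothetical directed network, and triangles in its undirected version; it illustrates the definitions only.

```r
# Hypothetical directed network with ties 1->2, 2->3, 1->3, 3->4 and 4->1.
a <- matrix(0, 4, 4)
a[rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4), c(4, 1))] <- 1

a2 <- a %*% a
ttriple <- sum(a2 * a)               # two-paths i -> j -> k closed by a tie i -> k
ctriple <- sum(diag(a2 %*% a)) / 3   # each 3-cycle is a closed walk from 3 starting nodes
triangle_directed <- ttriple + ctriple

# Undirected version: a tie in either direction becomes an edge.
u <- ((a + t(a)) > 0) * 1
triangle_undirected <- sum(diag(u %*% u %*% u)) / 6
```

Here the transitive triple is on nodes 1, 2, 3 and the cyclic triple is on nodes 1, 3, 4, and the same two node sets form the undirected triangles.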
By default, this term adds one statistic to the model equal to 100 times the ratio of
the number of triangles in the network to the sum of the number of triangles
and the number of 2-stars not in triangles (the latter is considered a
potential but incomplete triangle). In case the denominator equals zero,
the statistic is defined to be zero. For the definition of triangle, see
triangle
. This is often called
the mean correlation coefficient. This term can only be
used with undirected networks; for directed networks, it is difficult to
define the numerator and denominator in a consistent and meaningful way.
# binary: tripercent(attr=NULL, diff=FALSE, levels=NULL)
attr , diff
|
quantitative attribute (see Specifying Vertex attributes and Levels ( |
levels |
add one statistic for each value specified if |
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, triad-related, undirected, binary
By default, this term adds one statistic to the model, equal to the number of transitive triples in the network, defined as a set of edges $\{(i \to j), (j \to k), (i \to k)\}$. Note that triangle equals ttriple+ctriple for a directed network, so at most two of the three terms can be in a model.
# binary: ttriple(attr=NULL, diff=FALSE, levels=NULL)
# binary: ttriad
attr |
a vertex attribute specification (see Specifying Vertex attributes and Levels ( |
diff |
If |
levels |
add one statistic for each value specified if |
This term can only be used with directed networks.
ergmTerm
for index of model terms currently visible to the package.
categorical nodal attribute, directed, triad-related, binary
This term adds one statistic to the model, equal to the number of 2-paths in the network. For a directed network this is defined as a pair of edges $(i \rightarrow j), (j \rightarrow k)$, where $i$ and $k$ must be distinct. That is, it is a directed path of length 2 from $i$ to $k$ via $j$. For directed networks a 2-path is also a mixed 2-star but the interpretation is usually different; see m2star.
For undirected networks a twopath is defined as a pair of edges $\{i,j\}, \{j,k\}$. That is, it is an undirected path of length 2 from $i$ to $k$ via $j$, also known as a 2-star.
# binary: twopath
ergmTerm
for index of model terms currently visible to the package.
directed, undirected, binary
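As a rough check on the two definitions, 2-paths can be counted directly from an adjacency matrix in base R. The helper names below are hypothetical and the sketch assumes a 0/1 matrix with a zero diagonal; it illustrates the counting rule only and is not ergm's code.

```r
# Hypothetical helpers for illustration only; not part of ergm.

# Directed: walks i->j->k of length 2, excluding those that return to i.
count_twopath_directed <- function(A) {
  P <- A %*% A
  sum(P) - sum(diag(P))
}

# Undirected: pairs of edges sharing a node, i.e. 2-stars.
count_twopath_undirected <- function(A) {
  sum(choose(rowSums(A), 2))
}

# Directed example with edges 1->2, 2->3, 1->3, 2->4, 4->1.
D <- matrix(0, 4, 4)
D[cbind(c(1, 2, 1, 2, 4), c(2, 3, 3, 4, 1))] <- 1
n_directed <- count_twopath_directed(D)

# A single mutual dyad 1<->2 has no 2-path with distinct endpoints.
M <- matrix(c(0, 1, 1, 0), 2, 2)
n_mutual <- count_twopath_directed(M)

# Undirected example: triangle 1-2-3 plus a pendant edge 3-4.
U <- matrix(0, 4, 4)
U[cbind(c(1, 2, 1, 3), c(2, 3, 3, 4))] <- 1
U <- U + t(U)
n_undirected <- count_twopath_undirected(U)
```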
Specifies each dyad's baseline distribution to be continuous uniform between a and b, with the support being [a, b].
# Unif(a,b)
a , b
|
minimum and maximum of the baseline continuous uniform distribution, both inclusive. Both values must be finite. |
ergmReference
for index of reference distributions currently visible to the package.
continuous
Replaces the edges in a network object with the edges corresponding to the sociomatrix or edge list specified by new.
## S3 method for class 'network'
update(object, ...)

update_network(object, new, ...)

## S3 method for class 'matrix_edgelist'
update_network(object, new, attrname = if (ncol(new) > 2) names(new)[3], ...)

## S3 method for class 'data.frame'
update_network(object, new, attrname = if (ncol(new) > 2) names(new)[3], ...)

## S3 method for class 'matrix'
update_network(object, new, matrix.type = NULL, attrname = NULL, ...)

## S3 method for class 'ergm_state'
update_network(object, new, ...)
object |
a |
... |
Additional arguments; currently unused. |
new |
Either an adjacency matrix (a matrix of values indicating the presence and/or the value of a tie from i to j) or an edge list (a two-column matrix listing origin and destination node numbers for each edge, with an optional third column for the value of the edge). |
attrname |
For a network with edge weights, gives the name of the edge attribute whose values to set. |
matrix.type |
One of |
A new network object with the edges specified by new and network and vertex attributes copied from the input network object. The input network is not modified.
update_network(): dispatcher for network update based on the type of updating information.
update_network(matrix_edgelist): a method for updating a network based on a matrix-form edgelist.
update_network(data.frame): a method for updating a network based on an edgelist.
update_network(matrix): a method for updating a network based on a matrix.
update_network(ergm_state): a method for updating a network based on an ergm_state object.
#
data(florentine)
#
# test the network.update function
#
# Create a Bernoulli network
rand.net <- network(network.size(flomarriage))
# store the sociomatrix
rand.mat <- rand.net[,]
# Update the network
update(flomarriage, rand.mat, matrix.type="adjacency")
# Try this with an edgelist
rand.mat <- as.matrix.network.edgelist(flomarriage)[1:5,]
update(flomarriage, rand.mat, matrix.type="edgelist")
Compute weighted median.
wtd.median(x, na.rm = FALSE, weight = FALSE)
x |
Vector of data, same length as |
na.rm |
Logical: Should NAs be stripped before computation proceeds? |
weight |
Vector of weights |
Uses a simple algorithm based on sorting.
Returns an empirical .5 quantile from a weighted sample.
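A sorting-based weighted median of the kind described here might look like the following base R sketch. It is not the package's code: the function name `weighted_median_sketch`, its default of equal weights, and the tie-breaking rule (take the first sorted value whose cumulative weight share reaches 0.5) are all assumptions made for illustration, and `wtd.median()` itself may resolve ties or missing values differently.

```r
# Hypothetical sketch; not the implementation of wtd.median().
weighted_median_sketch <- function(x, weight = rep(1, length(x)), na.rm = FALSE) {
  if (na.rm) {
    keep <- !is.na(x) & !is.na(weight)
    x <- x[keep]
    weight <- weight[keep]
  }
  ord <- order(x)                # sort the data ...
  x <- x[ord]
  w <- weight[ord]               # ... and carry the weights along
  share <- cumsum(w) / sum(w)    # cumulative share of total weight
  x[which(share >= 0.5)[1]]      # first value reaching half the weight
}

m_equal <- weighted_median_sketch(c(1, 2, 3))
m_heavy <- weighted_median_sketch(c(1, 2, 3), weight = c(1, 1, 10))
m_na    <- weighted_median_sketch(c(NA, 1, 2, 3), na.rm = TRUE)
```

With equal weights the result coincides with the ordinary median of an odd-length sample; putting most of the weight on the largest observation pulls the weighted median up to that observation.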
sociality(attr, base, levels, nodes, form) (val)