API for MCMC Proposal Selection

Summary

This document describes the process by which ergm and related packages select the MCMC proposal for a particular analysis. Note that it is not intended to be a tutorial as much as a description of what inputs and outputs different parts of the system expect. Nor does it cover the C API.

Description

Inputs

There is a number of factors that can affect MCMC sampling, some of them historical and some of them new:

Globals

functions and other structures defined in an accessible namespace

  • ergm_proposal_table() a function that if called with no arguments returns a table of registered proposals and updates it otherwise. See ? ergm_proposal_table for documentation and the meaning of its columns. Of particular interest is its Constraints column, which encodes which constraints the proposal does (always) enforce and which it can enforce.
  • InitErgmReference.<REFERENCE> a family of initializers for the reference distribution. For the purposes of the proposal selection, among its outputs should be $name specifying the name of the reference distribution.
  • InitErgmConstraint.<CONSTRAINT> a family of initializers for constraints, weightings, and other high-level specifiers of the proposal distribution. Hard constraints, probabilistic weights, and hints all use this API. For the purposes of the proposal selection, its outputs include
    • $constrain (defaulting to <CONSTRAINT>) a character vector specifying which constraints are enforced, and can include several semantically nested elements;
    • $dependence (defaulting to TRUE) specifying whether the constraint is dyad-dependent;
    • $priority (defaulting to Inf) specifying how important it is that the constraint is met (with Inf meaning that it must be met); and
    • $implies/$impliedby specifying which other constraints this constraint enforces or is enforced by, and this can include itself for constraints, such as edges that can only be applied once.
    • $free_dyads either an RLEBDM or a function with no arguments that returns an RLEBDM specifying which dyads are not constrained by this constraint.
Arguments

arguments and settings passed to the call or as control parameters.

  • constraints= argument (top-level): A one-sided formula containing a +- or --separated list of constraints. + terms add additional constraints to the model whereas - constraints relax them. - constraints are primarily used internally for observational process estimation and are not described in detail, except to note that 1) they must be dyad-independent and 2) they necessitate falling back to the RLEBDM sampling API. A term_list or a list of formulas or term_lists is also supported.
  • reference= argument (top-level): A one-sided formula specifying the ERGM reference distribution, usually as a name with parameters if appropriate.
  • control$MCMC.prop= control parameter: A formula whose RHS containing +-separated “hints” to the sampler; an optional LHS may contain the proposal name directly.
  • control$MCMC.prop.weights= control parameter: A string selecting proposal weighting (probably deprecated)
  • control$MCMC.prop.args= control parameter: A list specifying information to be passed to the proposal

Code Path

Most of this is implemented in the ergm_proposal.formula() method:

  1. InitErgmReference.<REFERENCE> is called with arguments of reference=’s LHS, obtaining the name of the reference.
  2. For each term in the following formula’s RHS, the corresponding InitErgmConstraint.<CONSTRAINT> function is called and their outputs are stored in a list of initialized constraints (an ergm_conlist object). .dyads pseudo-constraint is added to dyad-independent constraints (not to hints with $priority < Inf). For hints, $dependence element is overwritten to FALSE. The list is named, with the name taken from the first element of the $constrains.
    1. constraints=
    2. MCMC.prop=
  3. Constraint lists from the previous two steps are concatenated, with redundant constraints removed based on their $implies/$impliedby settings.
  4. Proposal candidates returned by ergm_proposal_table() are filtered by Class, Reference, Weights (if MCMC.prop.weights differs from "default"), and Proposal (if the LHS of MCMC.prop is provided).
  5. Each candidate proposal is “scored” as follows:
    1. If a proposal does enforce a constraint that is not among the requested by the constraints list, it is discarded.
    2. If a proposal cannot enforce a constraint that is among the requested with priority==Inf, it is discarded.
    3. For each constraint that is among requested with priority<Inf and that the proposal doesn’t and can’t enforce, its (innate, specified in the column of the ergm_proposal_table()) Priority value is penalised by the priority of that constraint. Proposals whose Constraints include * are exempt from this penalty.
  6. If there are no candidate proposals left, an error is raised.
  7. If more than one is left,
    1. Calls to InitErgmProposal.* functions are attempted. If a call returns NULL, next proposal is attempted. (This can be useful if a proposal handles a particular special case that is not accounted for by constraints.)