Index

Types and functions

Pigeons.AAPSType

The Apogee to Apogee Path Sampler (AAPS) by Sherlock et al. (2022).

AAPS is a simple alternative to the No U-Turn Sampler (NUTS). It serves a similar purpose as NUTS: the method should be robust to its choice of tuning parameters when compared to standard HMC. For a given starting position and momentum (x, v), AAPS explores both forward and backward trajectories. The trajectories are divided into segments, with segments being separated by apogees (local maxima) in the energy landscape of -log pi(x). The tuning parameter K defines the number of segments to explore.

source
Pigeons.AugmentationType

A state augmentation used by explorers. Internally, it hijacks the recorders machinery to store (usually volatile) data in a Replica. This helps with writing allocation-light code by pre-allocating objects inside the Augmentation, while avoiding race-conditions between replicas. For an application, see buffers.

  • contents: The payload. Can be nothing for efficiency purposes.
  • serialize: When it is volatile, i.e. can be reconstructed on the fly and is only stored for efficiency purpose, it is not worth serializing it
source
Pigeons.AutoMALAType

The Metropolis-Adjusted Langevin Algorithm with automatic step size selection.

Briefly, at each iteration, the step size is exponentially shrunk or grown until the acceptance rate is in a reasonable range. A reversibility check ensures that the move is reversible with respect to the target. The process is started at step_size, which at the end of each round is set to the average exponent used across all chains.

The number of steps per exploration is set to base_n_refresh * ceil(Int, dim^exponent_n_refresh).

At each round, an empirical diagonal marginal standard deviation matrix is estimated. At each step, a random interpolation between the identity and the estimated standard deviation is used to condition the problem.

In normal circumstance, there should not be a need for tuning, however the following optional keyword parameters are available:

  • base_n_refresh: The base number of steps (equivalently, momentum refreshments) between swaps. This base number gets multiplied by ceil(Int, dim^(exponent_n_refresh)).
  • exponent_n_refresh: Used to scale the increase in number of refreshment with dimensionality.
  • default_autodiff_backend: The default backend to use for autodiff. See https://github.com/tpapp/LogDensityProblemsAD.jl#backends

    Certain targets may ignore it, e.g. if a manual differential is offered or when calling an external program such as Stan.

    For tape-based AD backends like ReverseDiff, compilation can be controlled using Pigeons.set_tape_compilation_strategy!.

  • step_size: Starting point for the automatic step size algorithm. Gets updated automatically between each round.
  • preconditioner: A strategy for building a preconditioner.
  • estimated_target_std_deviations: This gets updated after first iteration; initially nothing in which case an identity mass matrix is used.

Reference: Biron-Lattes, M., Surjanovic, N., Syed, S., Campbell, T., & Bouchard-Côté, A.. (2024). autoMALA: Locally adaptive Metropolis-adjusted Langevin algorithm. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4600-4608.

source
Pigeons.BlangTargetType

A StreamTarget delegating exploration to Blang worker processes.

using Pigeons

Pigeons.setup_blang("blangDemos", "UBC-Stat-ML") # pre-compile the blang models in the github repo UBC-Stat-ML/blangDemos
pigeons(target = Pigeons.blang_ising());

Type Pigeons.blang followed by tab to find other examples.

source
Pigeons.BufferedADType

Holds a buffer for in-place auto-differentiation. For example, used by stan log potentials.

Fields:

  • enclosed: A struct satisfying the LogDensityProblems informal interface.

  • buffer: The buffer used for in-place gradient computation.

  • logd_buffer: A buffer for logdensity eval.

  • err_buffer: A buffer to hold error flags.

source
Pigeons.ChildProcessType

Flag to run to a new julia process. Useful e.g. to dynamically control the number of threads to use or to use MPI on a single machine.

Fields:

  • n_threads: The number of threads to provide in the child julia process, the same as the current process by default.
  • dependencies: Julia modules (if of type Module) or paths to include (if of type String) needed by the child process.
  • n_local_mpi_processes: If greater than one, run the code locally over MPI using that many MPI processes. In most cases, this is useful only for debugging purpose, as multi-threading should typically perform better. This could also potentially be useful if using a third-party target distribution which somehow does not support multi-threading.
  • wait: If wait is false, the process runs asynchronously. When wait is false, the process' I/O streams are directed to devnull.
  • mpiexec_args: Extra arguments passed to mpiexec.
source
Pigeons.ComposeType

A deterministic composition of two explorers. E.g. Compose(SliceSampler(), AutoMALA())

source
Pigeons.EntangledReplicasType

An implementation of replicas for distributed PT. Contains:

  • locals: The subset of replicas hosted in this process
  • chain_to_replica_global_indices: A specialized distributed array that maps chain indices to replica indices (global indices). This corresponds to the mapping $\boldsymbol{j}$ in line 2 of Algorithm 5 in Syed et al, 2021.
source
Pigeons.EntanglerType

Assume all the MPI processes linked by this communicator will all call the key operations listed below the same number of times in their lifetime, at logically related occasions (e.g. a set number of times per iteration for algorithms running the same number of iterations). We call these 'occasions' a micro-iteration.

This datastructure keeps track internally of appropriate unique tags to coordinate the communication between MPI processes without having to do any explicit synchronization.

This struct contains:

  • communicator: An MPI Comm object (or nothing if a single process is involved).
  • load: How a set of tasks or "global indices" are distributed across processes.
  • current_received_bits: An internal datastructure used during MPI calls.
  • n_transmits: The current micro-iteration. Do not rely on it to count logical steps as it is reset to zero after transmit_counter_bound micor-iterations to avoid underflows to negative tags which cause MPI to crash.
  • transmit_counter_bound: Calculated from MPI.tagub and nglobal_indices to ensure MPI tags stay valid (i.e. do not overflow into negative values).

The key operations supported:

source
Pigeons.IndexerType

A bijection between integers and some type T. T is assumed to have consistent hash and ==. The two sides of the bijection can be obtained with the fields:

  • i2t: A Vector mapping integers to objects t of type T.
  • t2i: A Dict mapping objects t of type T to integers.
source
Pigeons.InputsType

A Base.@kwdef struct used to create Parallel Tempering algorithms.

Fields (see source file for default values):

  • target: The target distribution.

  • seed: The master random seed.

  • n_rounds: The number of rounds to run.

  • n_chains: The number of chains to use (but see also n_chains_variational).

  • n_chains_variational: The number of chains to use for a variational leg of parallel tempering. The default value is zero. If the variational argument is not specified, this argument does nothing. Otherwise, a value of zero corresponds to basic single-leg variational parallel tempering, where the number of chains is taken from the n_chains argument. A positive value yields "stabilized" two-leg variational parallel tempering. In that case, the number of chains for the prior-reference leg is taken from n_chains, and the number of chains for the variational leg is taken from this argument. See https://arxiv.org/abs/2206.00080

  • reference: The reference distribution (e.g. a prior), or if nothing and a fixed reference is needed (i.e. variational inference is disabled or two-legged variational inference is used), then default_reference() will be called to automatically determine the reference based on the type of the target.

  • variational: The variational reference family, or nothing to disable variational inference.

  • checkpoint: Whether a checkpoint should be written to disk at the end of each round.

  • record: Determine what should be stored from the simulation. A Vector with elements of type recorder_builder.
  • checked_round: The round index where run_checks() will be performed. Set to 0 to skip these checks.
  • multithreaded: If multithreaded explorers should be allowed. False by default since it incurs an overhead.
  • explorer: The explorer to use, or if nothing, will use default_explorer() to automatically determine the explorer based on the type of the target.
  • show_report: Show sampling report?
  • extended_traces: Whether the traces and disk recorders will store samples for all the chains (extended = true) or just for the target(s) (extended = false).
source
Pigeons.InterpolatedADType

The target and reference may used different autodiff frameworks; provided both are non-allocating, this allows autodiff of InterpolatedLogPotential's to also be non-allocating. For example, this is useful when the target is a stan log potential and the reference is a variational distribution with a hand-crafted, also allocation-free differentiation.

Fields:

  • enclosed: The enclosed InterpolatedLogPotential.

  • ref_ad: The result of LogDensityProblemsAD.ADgradient() on the reference, often a BufferedAD.

  • target_ad: The same as ref_ad but with the target.
  • buffer: An extra buffer to combine the two distribution endpoints gradients.
source
Pigeons.IteratorsType

Iterators used in Parallel Tempering. Stored in a struct so that recorder's can access it when outputting sample statistics.

Fields:

  • scan: Number of (exploration, communication) pairs performed so far, corresponds to $n$ in Algorithm 1 of Syed et al., 2021. Round $i$ typically performs $2^i$ scans. Set to zero when runoneround!() is not yet started.
source
Pigeons.LazyTargetType

Use when a target contains information that cannot be serialized, e.g. FFT plans (https://discourse.julialang.org/t/distributing-a-function-that-uses-fftw/69564) so that the target is constructed just in time by each MPI node.

# in a script.jl:
struct MyTargetFlag end 
import Pigeons.instantiate_target
Pigeons.instantiate_target(flag::MyTargetFlag) = toy_mvn_target(1)

# to run
pigeons(target = Pigeons.LazyTarget(MyTargetFlag()), on = ChildProcess(dependencies = ["script.jl"]))
source
Pigeons.LoadBalanceType

Split a list of indices across processes. These indices are denoted $1, 2, .., N$. They are usually some kind of task, for example in the context of parallel tempering, two kinds of tasks arise:

  • in replicas.state, task $i$ consists in keeping track of the state of replica $i$.
  • in replicas.chain_to_replica_global_indices, task $i$ consists in storing which replica index corresponds to chain $i$.

One such task index is called a global_index.

LoadBalance splits the global indices among n_processes. LoadBalance is constructed so that the difference in the number of global indices a process is responsible of (its "load") is at most one.

A LoadBalance contains:

  • my_process_index: A unique index for this process. We use 1-indexed, i.e. hide MPI's 0-indexed ranks.
  • n_processes: Total number of processes involved.
  • n_global_indices: The total number of global indices shared between all the processes.

The set {1, 2, .., load()} is called a set of local indices. A local index indexes a slice in {1, 2, ..., n_global_indices}. Collectively over the n_processes, these slices form a partition of the global indices.

Key functions to utilize a LoadBalance struct:

source
Pigeons.MALAType

The Metropolis-Adjusted Langevin Algorithm (MALA).

MALA is based on an approximation to overdamped Langevin dynamics followed by a Metropolis-Hastings correction to ensure that we target the correct distribution.

This round-based version of MALA allows for the use of a preconditioner, which is updated after every PT tuning round. This setting can also be turned off by specifying the type of preconditioner to use. However, MALA will not automatically adjust the step size. For such functionality, use autoMALA.

As for autoMALA, the number of steps per exploration is base_n_refresh * ceil(Int, dim^exponent_n_refresh).

source
Pigeons.MPIProcessesType

Flag to run on MPI. Before using, you have to call once setup_mpi.

Fields:

  • n_threads: The number of threads per MPI process, 1 by default.
  • walltime: The walltime limit, 00:30:00 by default (i.e., 30 minutes).
  • n_mpi_processes: The number of MPI processes, 2 by default.
  • memory: The memory allocated to each MPI process, 8gb by default.
  • dependencies: Julia modules (if of type Module) or paths to include (if of type String) needed by the child process.
  • mpiexec_args: Extra arguments passed to mpiexec.
source
Pigeons.MPISettingsType

Global settings needed for MPI job submission:

  • submission_system: E.g.: :pbs, :slurm, etc

    Use Pigeons.supported_submission_systems() to see the list of available options.

  • add_to_submission: Add lines to the submission scripts.

    E.g. used in UBC Sockeye for custom allocation code via

    add_to_submission = ["#PBS -A my_user_allocation_code"]

    or in Compute Canada (optional if member of only one account, see https://docs.alliancecan.ca/wiki/Running_jobs):

    add_to_submission = ["#SBATCH --account=my_user_name"]`

  • environment_modules: "Envirnonment modules" to load (not to be confused with Julia modules). Run module avail in the HPC login node to see what is available on your HPC. For example: ["git", "gcc", "intel-mkl", "openmpi"] on Sockeye, and ["intel", "openmpi", "julia"] on Compute Canada
  • library_name: In most case, leave empty as MPIPreferences.usesystembinary() will autodetect, but if it does not, the path to libmpi.so can be specified this way, e.g. this is needed on compute Canada clusters (as they are not setting that environment variable correctly) where it needs to be set to "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Compiler/intel2020/openmpi/4.0.3/lib/libmpi" (notice the .so is not included).
source
Pigeons.MixType

Randomly alternate between different explorers.

  • explorers: A tuple consisting of exploration kernels
source
Pigeons.MixDiagonalPreconditionerType

Similar to DiagonalPreconditioner but the actual preconditioner used at each iteration is a random mixture of the identity and the adapted diagonal matrix. This helps with targets featuring distantly separated modes, which induces average standard deviations that are much higher than the ones within each mode. Even in the family of Gaussian targets, Hird & Livingstone (2023) identify cases where a fixed diagonal preconditioner performs worse than using no preconditioner at all. We use a zero-one-inflated Uniform(0,1) distribution for the mixing proportion in order to make the preconditioner robust to extreme mismatch of scales (see the automala paper for more details).

  • p0: Proportion of zeros

  • p1: Proportion of ones

source
Pigeons.PTType

Storage involved in PT algorithms:

  • inputs: The user-provided Inputs that determine the execution of a PT algorithm.
  • replicas: The replicas held by this machine.
  • shared: Information shared across all machines, updated between rounds.
  • exec_folder: Either a path to a folder shared by all processes, which is used to save information to disk (checkpoints, samples etc); or nothing if a completely in-memory algorithm is used.
source
Pigeons.PTMethod
PT(source_exec_folder; round, exec_folder)

Create a PT struct from a saved checkpoint. The input source_exec_folder should point to a folder of the form results/all/[exec_folder].

The checkpoint carries all the information stored in a PT struct. It is possible for an MPI-based execution to load a checkpoint written by a single-process execution and vice versa.

A new unique folder will be created with symlinks to the source one, so that e.g. running more rounds of PT will results in a new space-efficient checkpoint containing all the information for the new run.

source
Pigeons.PTMethod
PT(inputs; exec_folder)

Create a PT struct from provided Inputs. Optionally, provide a specific exec_folder path (AbstractString), if not one will be created via next_exec_folder().

source
Pigeons.PermutedDistributedArrayType

A distributed array making special assumptions on how it will be accessed and written to. The indices of this distributed array correspond to the notion of "global indices" defined in LoadBalance. Several MPI processes cooperate, each processing storing data for a slice of this distributed array.

We make the following assumptions:

  • Each MPI process will set/get entries the same number of times in their lifetime, at logically related episodes (e.g. a set number of times per iteration for algorithms running the same number of iterations). These episodes are called micro-iterations as in Entangler, which this datastructure is built on.

  • Moreover, at each time all processes perform a get or a set, we assume that each global index is manipulated by exactly one process (i.e. an implicit permutation of the global indices).

We use these assumptions to achieve read/write costs that are near-constant in the number of machines participating.

This struct contains:

  • local_data: The slice of the distributed array maintained by this MPI process.
  • entangler: An Entangler used to coordinate communication.

The operations supported are:

source
Pigeons.ReplicaType

One of the $N$ components that forms the state maintained by a PT algorithm. A Replica contains:

  • state: Configuration in the state space.

  • chain: The index of the distribution currently associated with this replica, modified during swaps.

  • rng: Random operations involving this state should use only this random number generator.
  • recorders: Records statistics. Each replica carries its own for thread safety/distribution; then they are reduced at end of each round.
  • replica_index: A global id associated with this replica.
source
Pigeons.ResultType

A link to an execution folder able to deserialize type T via a string constructor.

source
Pigeons.SampleArrayType
struct SampleArray{T, PT} <: AbstractArray{T, 1}

Array convience wrapper for traces reduced recorder. We require a PT object, and the chain number which specifies the chain index (has to be a target chain) you wish to extract.

This should not be called directly and the user should instead look at get_sample.

source
Pigeons.ScheduleType

A partition of $[0, 1]$ encoded by monotonically increasing grid points starting at zero and ending at one.

source
Pigeons.SharedType

Information shared by all processes involved in a round of distributed parallel tempering. This is updated between rounds but only read during a round.

Fields:

  • reports: Named tuple of DataFrame's

Only one instance maintained per process.

source
Pigeons.SliceSamplerType

Slice sampler based on Neal, 2003.

Fields:

  • w: Initial slice size.

  • p: Slices are no larger than 2^p * w

  • n_passes: Number of passes through all variables per exploration step.

  • max_iter: Maximum number of interations inside shrink_slice! before erroring out

source
Pigeons.StabilizedPTType

Stabilized Variational Parallel Tempering as described in Surjanovic et al., 2022.

Fields:

  • variational_leg: The variational leg of stabilized PT.

  • swap_graphs: A swap_graphs spanning both legs.

  • log_potentials: The log_potentials.

  • indexer: An Indexer mapping between global indices and leg-specific indices.

source
Pigeons.StanLogPotentialType

Uses BridgeStan to perform efficient ccall loglikelihood and allcoation-free gradient calls to a Stan model.

To work with Pigeons BridgeStan needs to be imported into the current session.

source
Pigeons.StanStateType

A state for stan target. Holds a vector in BridgeStan's unconstrained parameterization.

source
Pigeons.StreamTargetType

A target based on running worker processes, one for each replica, each communicating with Pigeons using standard streams. These worker processes can be implemented in an arbitrary programming language.

StreamTarget implements log_potential and explorer by invoking worker processes via standard stream communication. The standard stream is less efficient than alternatives such as protobuff, but it has the advantage of being supported by nearly all programming languages in existence. Also in many practical cases, since the worker process is invoked only three times per chain per iteration, it is unlikely to be the bottleneck (overhead is in the order of 0.1ms).

The worker process should be able to reply to commands of the following forms (one command per line):

  • log_potential(0.6) in the worker's stdin to which it should return a response of the form response(-124.23) in its stdout, providing in this example the joint log density at beta = 0.6;
  • call_sampler!(0.4) signaling that one round of local exploration should be performed at beta = 0.4, after which the worker should signal it is done with response().
source
Pigeons.TestSwapperType

For testing/benchmarking purposes, a simple pair_swapper where all swaps have equal acceptance probability.

Could also be used to warm-start swap connections during exploration phase by setting that constant probability to zero.

source
Pigeons.TuringLogPotentialType

Uses DynamicPPL i.e. Turing's backend to construct the log density.

To work with Pigeons DynamicPPL or Turing needs to be imported into the current session.

  • model: A DynamicPPL.Model.
  • context: Either DynamicPPL.DefaultContext for evaluating the full joint, or DynamicPPL.PriorContext for evaluating only the prior.
  • dimension: The total number of scalar values observed in a single random sample from model. It is used by the LogDensityProblems and LogDensityProblemsAD interfaces when a gradient-based sampler is used as explorer in models with static computational graphs.

    Warning

    Explorers targeting models with dynamic computational graphs should not depend on the value of this field.

source
Pigeons._transformed_onlineMethod

Online statistics on potentially transformed samples for the target chain. For example, if a gradient-based method is used, the target is often transformed to be defined on an unconstrained space. This is used internally by explorer's for adaptation purposes (in particular, pre-conditioning and variational references).

source
Pigeons.aaps!Method

Main function for AAPS. Note that this implementation uses scheme (1) from the AAPS paper, which results in an acceptance probability of one.

source
Pigeons.adapt_explorerMethod
adapt_explorer(
    explorer,
    reduced_recorders,
    current_pt,
    new_tempering
)

Called between successive rounds (run_one_round!).

By default, return the explorer without further adaptation.

source
Pigeons.all_reduce_deterministicallyMethod
all_reduce_deterministically(operation, source_data, e)

Same as reduce_deterministically() except that the result at the root of the tree is then broadcast to all machines so that the output of all_reduce_deterministically() is the root of the reduction tree for all MPI processes involved.

source
Pigeons.all_reportsMethod
all_reports()

The iterim diagnostics computed and printed to standard out at the end of every iteration (this can be disabled using show_report = false).

source
Pigeons.blang_isingMethod
blang_ising(model_options)

Two-dimensional Ising model.

For more information:

using Pigeons

Pigeons.setup_blang("blangDemos") 
run(Pigeons.blang_ising(`--help`).command);

E.g., use arguments `model.N` to set the size 
of the grid. 
source
Pigeons.blang_sitkaMethod
blang_sitka(model_options)

Model for phylogenetic inference from single-cell copy-number alteration from Salehi et al., 2020.

For more information:

using Pigeons

Pigeons.setup_blang("nowellpack") 
run(Pigeons.blang_sitka(`--help`).command);
source
Pigeons.buffersMethod
buffers()
buffers()

A buffering system used internally by explorers in Pigeons.

source
Pigeons.check_against_serialMethod
check_against_serial(pt)

Run a separate, fully serial version of the PT algorithm, and compare the checkpoint files to ensure the two produce exactly the same output.

source
Pigeons.communication_barriersMethod
communication_barriers(intensity, schedule)

Compute the local communication barrier and cumulative barrier functions from the intensity rates (i.e. rejection rates in the context of Parallel Tempering) and the current annealing schedule. The estimation of the barriers is based on Fritsch-Carlson monotonic interpolation.

Returns a NamedTuple with fields:

  • localbarrier
  • cumulativebarrier
  • globalbarrier
source
Pigeons.create_explorerMethod
create_explorer(inputs)

Given an Inputs object, either use inputs.explorer, of if it is equal to nothing dispatch on default_explorer(inputs.target) to construct the explorer associated with the input target distribution.

source
Pigeons.create_replica_indexerMethod
create_replica_indexer(n_chains_fixed, n_chains_var)

Create an Indexer for stabilized variational PT. Given a chain number, return a tuple indicating the relative chain number within a leg of PT and the leg in which it is located. Given a tuple, return the global chain number.

source
Pigeons.create_replicasFunction
create_replicas(inputs, shared)
create_replicas(inputs, shared, source)

Create replicas, detecting automatically if MPI is needed.

Argument source is either nothing, when creating new states, or FromCheckpoint to load from a saved checkpoint.

source
Pigeons.diskMethod

Save the full trace for the target chain to disk.

The disk recorders are safe to use in a multi-threaded and/or distributed context as each replica uses its own file.

To post-process files in the correct order, use process_sample.

source
Pigeons.energy_ac1sFunction
energy_ac1s(reduced_recorders)
energy_ac1s(reduced_recorders, skip_reference)
energy_ac1s(reduced_recorders, skip_reference, pt)
source
Pigeons.energy_ac1sFunction
energy_ac1s(pt)
energy_ac1s(pt, skip_reference)

Auto-correlations between energy before and after an exploration step, for each chain. Organized as a Vector where component i corresponds to chain i.

It is often useful to skip the reference chain, for two reasons, first, exploration should be iid there, second, if the prior is flat the auto-correlation of the energy will be NaN for the reference.

source
Pigeons.explore!Method
explore!(pt, explorer, multithreaded)

The @threads macro brings a large overhead even when Threads.nthreads == 1, so a separate method is used for the single thread mode.

source
Pigeons.explorer_stepMethod
explorer_step(rng, target, explorer, init_state)

Utility for taking a single step of an explorer under a given target and initial state init_state. Returns the modified state.

Taking multiple steps

Note that taking more than a single step can be achieved in many Pigeons explorers by modifying their arguments for number of passes or refreshements. For example, SliceSampler takes an n_passes::Int argument, while AutoMALA takes the base_n_refresh::Int argument.

source
Pigeons.extract_sampleMethod
extract_sample(state, log_potential, extractor)

Extract a sample for postprocessing. By default, calls copy() but many overloads are defined for different kinds of states.

Typically, this will be a flattened vector (i.e. concatenation of all variables, with discrete ones converted to Float64) ready for post-processing.

The corresponding un-normalized log density might be appended at the very end.

If the state is transformed (e.g. for HMC), this will create a fresh vector with an un-transformed (i.e. original parameterization) state in it.

The argument extractor is passed from the Inputs.

source
Pigeons.find_local_indexMethod
find_local_index(lb, global_idx)

Find the local index corresponding to the given global_index. Assumes the given global_index is one of this process'.

source
Pigeons.find_processMethod
find_process(lb, global_idx)

Find the process id (1-indexed) responsible for the given global_idx.

source
Pigeons.forward_sample_condition_and_exploreFunction

The workhorse under invariance_test. It starts with a full forward pass for the probabilistic model underlying target, thats simulates latent variables and observations. Then a modified model is created that conditions the original model on the observations produced. Finally, the function takes a step using the explorer targetting the conditioned model and the final state is returned. The exploration can be optionally disabled by passing run_explorer=false, in which case the initial simulated state is returned.

source
Pigeons.get_bufferMethod
get_buffer(a, key, args)

Return a Pigeons.BufferedAD if it exists in the Augmentation. Otherwise it constructs one and then stores it to avoid reconstructing it in the future.

Note

This implementation is not type stable (the value type of the Dict is not concrete). However, the runtime dispatch cost incurred should be more than compensated by the ability to avoid reconstructing AD objects at each exploration step.

source
Pigeons.get_bufferMethod
get_buffer(a, key, dims)

Return an array in the buffer. Allocating only the first time; after that, the buffer is recycled and stored in the Replica's recorders.

source
Pigeons.get_sampleFunction
get_sample(pt)
get_sample(pt, chain)

The chain option can be omitted and by default the first chain targetting the distribution of interest will be used (in many cases, there will be only one, in variational cases, two).

source
Pigeons.global_barrierMethod
global_barrier(pt)

The global communication barrier. If the PT algorithm has both a fixed and variational references, return the barrier to the fixed one.

source
Pigeons.informal_docMethod
informal_doc(doc_dir, mod)

Generate informal interface documentation, e.g.:

makedocs(;
    ...
    pages=[
        "Home" => "index.md", 
        "Interfaces" => informal_doc(@__DIR__, MyModuleName),
        ...
    ]
)
source
Pigeons.initializationMethod
initialization(target, rng, replica_index)

Create a fresh state used to populate the states at the beginning of the first round of Parallel Tempering.

source
Pigeons.initializationMethod
initialization(target, rng, replica_index)

Return StreamState by following these steps:

  1. create a Cmd that uses the provided rng to set the random seed properly, as well as target-specific configurations provided by target.
  2. Create StreamState from the Cmd created in step 1 and return it.
source
Pigeons.invariance_testFunction

Run an invariance test of explorer on the provided target. Corresponds to a modified Geweke test where the simulated data is kept fixed.

References

Bouchard-Côté, A., Chern, K., Cubranic, D., Hosseini, S., Hume, J., Lepur, M., Ouyang, Z., & Sgarbi, G. (2022). Blang: Bayesian Declarative Modeling of General Data Structures and Inference via Algorithms Based on Distribution Continua. Journal of Statistical Software, 103(11), 1–98.

Geweke, J. (2004). Getting It Right: Joint Distribution Tests of Posterior Simulators. Journal of the American Statistical Association, 99(467), 799–804.

source
Pigeons.is_finishedMethod
is_finished(checkpoint_folder, inputs)

Is the provided path to a checkpoint folder complete? I.e. check in the .signal subfolder that all MPI processes have signaled that they are done.

source
Pigeons.jsonMethod
json(; variables...)

Create a JSON string based on the scalar or array variables provided.

source
Pigeons.localsMethod
locals(replicas)

Return the replica's that are stored in this machine

source
Pigeons.log_sum_ratioMethod

Log of the sum of density ratios between neighbour chains, used to compute stepping stone estimators of lognormalization contants.

source
Pigeons.log_unnormalized_ratioMethod
log_unnormalized_ratio(
    log_potentials,
    numerator,
    denominator,
    state
)

Assumes the input log_potentials is a vector where each element is a log_potential.

This default implementation is sufficient in most cases, but in less standard scenarios, e.g. where the state space is infinite dimensional, this can be overridden.

source
Pigeons.log_unnormalized_ratioMethod
log_unnormalized_ratio(
    log_potentials,
    numerator,
    denominator,
    state
)

The argument numerator selects one distribution $\pi_i$ from the collection log_potentials, and similarly denominator selects $\pi_j$. Let $x$ denote the input state. The ratio:

\[f(x) = \frac{\text{d}\pi_i}{\text{d}\pi_j}(x)\]

may only be known up to a normalization constant which can depend on $i$ and $j$ but not $x$, $g(x) = C_{i,j} f(x)$.

This function should return $\log g$ evaluated at state.

source
Pigeons.mpi_activeMethod
mpi_active()

A flag is set by launch scripts (see ChildProcess.jl) to indicate if this process is a child MPI process under an mpiexec. Otherwise, that flag is false by default.

This function retrieves the value of that flag.

source
Pigeons.my_loadMethod
my_load(lb)

Return the number of indices (task) this process is responsible for.

source
Pigeons.next_exec_folderMethod

Return a unique subfolder of results/all/, making sure the unique folder and its parents are created. It will also create a soft symlink to it called results/latest`

source
Pigeons.onlineMethod

Online statistics on the target chain. The samples are processed in the original model parameterization.

source
Pigeons.only_one_processMethod
only_one_process(task, pt)

A task that should be ran on only one of the processes. Using the do .. end syntax, this can be used as:

only_one_process(pt) do 
    ...
end
source
Pigeons.partner_chainMethod
partner_chain(swap_graph, chain)

For a given swap_graph and input chain index, what chain will it interact with at the current iteration? Convention: if a chain is not interacting, return its index.

source
Pigeons.permuted_getMethod
permuted_get(p, indices)

Retreive the values for the given indices, using MPI communication when needed.

We make the following assumptions:

  • length(indices) == my_load(p.entangler.load)
  • the indices across all participating processes form a permutation of the global indices.
source
Pigeons.permuted_set!Method
permuted_set!(p, indices, new_values)

Set the values for the given indices to the given new_values, using MPI communication when needed.

We make the same assumptions as in permuted_get().

source
Pigeons.pigeonsMethod
pigeons(pt_arguments; on)

pt_arguments can be either an Inputs, to start a new Parallel Tempering algorithm, or a string pointing to an execution to resume.

source
Pigeons.pigeonsMethod
pigeons(; on, args...)

Passes the args... to Inputs and start a new Parallel Tempering algorithm with that inputs.

source
Pigeons.process_sampleMethod
process_sample(processor, exec_folder, round)

Process samples that were saved to disk using the disk recorder, at the given round.

Each sample is passed to the processor function, by calling processor(chain_index, scan_index, sample) where chain_index is the index of the target chain (in classical parallel tempering, there is only one chain at target temperature, so in that case it can be ignored, but it will be non-trivial in e.g. stabilized variational parallel tempering), scan_index is the iteration index within the round, starting at 1, and sample is the deserialized sample.

This iterates over the samples in increasing order, looping over chain_index in the outer loop and scan_index in the inner loop.

source
Pigeons.providersMethod
providers(mod, name)

Provides a Set{Expr} containing all the providers of the given name in the given module.

source
Pigeons.record!Method
record!(recorder, value)

Given a value, a pair (a, b), and a Dict{K, Vector{V}} backed recorder, append b to the vector corresponding to a, inserting an empty vector into the dictionary first if needed.

source
Pigeons.record_swap_stats!Method
record_swap_stats!(
    pair_swapper,
    recorders,
    chain1,
    stat1,
    chain2,
    stat2
)

Given a pair_swapper, a recorders, the provided chain indices, and the sufficient statistics computed by swap_stat(), record statistics.

To avoid accumulating twice the same statistic with (chain1, chain2) and (chain2, chain2), swap!() only calls this for the pair with chain1 < chain2.

source
Pigeons.recorder_valuesMethod
recorder_values(pt, recorder_name)

Returns a generator for the values of a recorder of type OnlineStatsBase.GroupBy.

source
Pigeons.recursive_equalMethod
recursive_equal(a, b)

Recursively check equality between two objects by comparing their fields. By default calls == but for certain types we dispatch a custom method. This is necessary because for some mutable structs (and even immutable ones with mutable fields) == actually dispatches ===. The latter is too strict for the purpose of checking that two checkpoints are equal.

If you are using custom struct and encounter a failed correctness check, you may need to provide a special equality check for this type. In most cases it will be enough to overload recursive_equal as follows

Pigeons.recursive_equal(a::MyType, b::MyType) = Pigeons._recursive_equal(a,b)

For examples of more specific checks, refer to the code of PigeonsBridgeStanExt.

source
Pigeons.reduce_deterministicallyMethod
reduce_deterministically(operation, source_data, e)

Perform a binary reduction of the source_data, using MPI when needed.

Consider the binary tree with leaves given by the global indices specified in e.load and stored in the different MPI processes' input source_data vectors. At each node of the tree, a reduction is performed using operation, i.e. by calling operation(left_child, right_child). When, and only when a branch of the tree crosses from one MPI process to another one, MPI communication is used to transmit the intermediate reduction.

At the end, for process 1, reduce_deterministically() will return the root of the binary tree, and for the other processes, reduce_deterministically() will return nothing.

Note that even when the operation is only approximately associative (typical situation for floating point reductions), the output of this function is invariant to the number of MPI processes involved (hence the terminology 'deterministically'). This contrasts to direct use of MPI collective communications where the leaves are MPI processes and hence will give slightly different outputs given different numbers of MPI processes. In the context of randomized algorithms, these minor differences are then amplified.

In contrast to transmit!(), we do not assume isbitstype(T) == true and use serialization when messages are transmitted over MPI.

source
Pigeons.reduce_recorders!Method
reduce_recorders!(pt, replicas)

Perform a reduction across all the replicas' individual recorders, using Base.merge() on each individual recorder held. Returns a recorders with all the information merged.

Will reset the replicas' recorders at the same time using Base.empty!().

Since this uses all_reduce_deterministically, the output is identical, no matter how many MPI processes are used, even when the reduction involves only approximately associative Base.merge() operations (e.g. most floating point ones).

source
Pigeons.register_online_typeMethod
register_online_type(type)

Register an additional OnlineStat sub-types to be computed when the [online()] recorder is enabled.

The provided type should have a zero-argument constructor.

source
Pigeons.rejectionsMethod

Similar to above except that instead of the number of chains, provide the full vector of chain indices. Note that chain_indices starts at the reference and ends at the chain one before the target.

source
Pigeons.run_one_round!Method
run_one_round!(pt)

From a PT object, run one round of a generalized version of Algorithm 1 in Syed et al., 2021.

Alternates between communicate!(), which consists of any pairwise communicating moves and [explore!()], which consists of moves independent to each chain.

Concrete specification of how to communicate and explore are specified by the field of type Shared contained in the provided PT.

source
Pigeons.safelinkMethod
safelink(target, link)

Work around two issues with symlink():

  • naively calling symlink() when there are relative paths leads to broken links
  • on windows, one needs admin permission to do symlinks, so print a helpful error message in that case
source
Pigeons.sample_arrayMethod
sample_array(pt)

Copy the target chain(s) samples into an array with axes: iteration x variable x target chain. For example, with StabilizedPT there are two target chains. By default, there is only one chain produced.

See extract_sample() for information how the variables are flattened, and use sample_names() to obtain string names for the flattened variables.

The combination of this function and sample_names() is useful for creating MCMCChains which can then be used to obtain summary statistics, diagnostics, create trace plots, and pair plots (via PairPlots).

source
Pigeons.sample_namesMethod
sample_names(state, log_potential, extractor)

A list of string labels for the flattened vectors returned by extract_sample().

The key :log_density is used when the un-normalized log density is included.

source
Pigeons.setup_blangFunction
setup_blang(repo_name)
setup_blang(repo_name, organization)

Download the github repo with the given repo_name and organization in ~.pigeons, and compile the blang code.

source
Pigeons.setup_mpiMethod
setup_mpi(settings)

Run this function once before running MPI jobs. This should be done on the head node of a compute cluster. The setting are permanently saved. See MPISettings.

source
Pigeons.setup_mpiMethod
setup_mpi(; args...)

Look first at the list of clusters that have "presets" available, by typing Pigeons.setup_mpi_ and then tab. These are the most straightforward to use.

If presets are not available, use setup_mpi(). To see the documentation of the arguments of setup_mpi(), see MPISettings (i.e. args... are passed to the constructor of MPISettings).

Pull requests to Pigeons/src/submission/presets.jl are welcome if you would like to add a new "preset" functions of the form Pigeons.setup_mpi_...().

source
Pigeons.sort_includes!Method
sort_includes!(main)

Heuristic to automate the process of sorting include()'s.

Topological sorting of the source files under src (excluding main) is attempted, if successful, print the include string to copy and paste to the main file, otherwise, print the detected loops.

Internally, this function will:

  1. Construct a graph where the vertices are the .jl files under src, excluding the provided main file (i.e. where the module is defined and the includes will sit in).
  2. Each file starting with a capital letter is assumed to contain a struct with the same name as the file after removal of the .jl suffix. Similarly, files starting with @ are assumed to contain a macro with the similarly obtained name.
  3. Each source file is inspected to see if the above struct and macro strings are detected. This defines edges in the graph. (known limitation: this includes spurious edges when e.g. the string occurs in a comment).
source
Pigeons.split_sliceMethod
split_slice(slice, rng)

From one splittable random object, one can conceptualize an infinite list of splittable random objects. Return a slice from this infinite list.

source
Pigeons.stepping_stone_pairMethod
stepping_stone_pair(pt)

Return a pair, one such that its exponential is unbiased under Assumptions (A1-2) in Syed et al., 2021 for $Z$ and the other, for $1/Z$. Both are consistent in the number of MCMC iterations without these strong assumptions.

source
Pigeons.swap!Method
swap!(pair_swapper, replicas, swap_graph)

For each pair of chains encoded in swap_graph, use pair_swapper to decide if the pair will swap or not, and write the changes in-place into replicas (i.e. exchanging the Replica's chain fields for those that swapped.)

source
Pigeons.swap!Method
swap!(pair_swapper, replicas, swap_graph)

Entangled MPI swap! implementation.

This implementation is designed to support distributed PT with the following guarantees

  • The running time is independent of the size of the state space ('swapping annealing parameters rather than states')
  • The output is identical no matter how many MPI processes are used. In particular, this means that we can check correctness by comparing to the serial, single-process version.
  • Scalability to 1000s of processes communicating over MPI (see details below).
  • The same function can be used when a single process is used and MPI is not available.
  • Flexibility to extend PT to e.g. networks of targets and general paths.

Running time analysis:

Let $N$ denote the number of chains, $P$, the number of processes, and $K = \text{ceil}(N/P)$, the maximum number of chains held by one process. Assuming the running time is dominated by communication latency and a constant time for the latency of each peer-to-peer communication, the theoretical running time is $O(K)$. In practice, latency will grow as a function of $P$, but empirically, this growth appears to be slow enough that for say $P = N =$ a few 1000s, swapping will not be the computational bottleneck.

source
Pigeons.swap!Method
swap!(pair_swapper, replicas, swap_graph)

Single process, non-allocating swap! implementation.

source
Pigeons.swap_decisionMethod
swap_decision(pair_swapper, chain1, stat1, chain2, stat2)

Given a pair_swapper, a recorders, the provided chain indices, and the sufficient statistics computed by swap_stat(), make a swap decision.

By default, this is done as follows:

  1. compute the standard swap acceptance probability min(1, exp(stat1.log_ratio + stat2.log_ratio))
  2. make sure the two chains share the same uniform by picking the uniform from the chain with the smallest chain index
  3. swap if the shared uniform is smaller than the swap acceptance probability.
source
Pigeons.swap_statMethod
swap_stat(pair_swapper, replica, partner_chain)

By default, two sufficient statistics are computed and stored in the SwapStat struct:

This can be extended by dispatching on other pair_swapper types, with the constraint that the returned sufficient statistics should satisfy isbitstype().

source
Pigeons.tracesMethod

Save the full trace for the target chain in memory. Call copy() on each state on the target chain. Index them by the (chain index, scan index).

source
Pigeons.transmit!Method
transmit!(
    e,
    source_data,
    to_global_indices,
    write_received_data_here
)

Use MPI point-to-point communication to permute the contents of source_data across MPI processes, writing the permuted data into write_received_data_here. The permutation is specified by the load balance in the input argument e as well as the argument to_global_indices.

More precisely, assume the Vectors source_data, to_global_indices, and write_received_data_here are all of the length specified in my_load(e.load).

For each i, source_data[i] is sent to MPI process p = find_process(e.load, g), where g = to_global_indices[i] and written into this p 's write_received_data_here[j], where j = find_local_index(e.load, g)

See Entangler's comments regarding the requirement that all machines call transmit() the same number of times and at logically related intervals.

Additionally, at each micro-iteration, we assume that {to_global_indices_p : p ranges over the different processes} forms a partition of {1, ..., e.load.n_global_indices} If ran in single-process mode, this 'partition property' is checked; if ran in multi-process, opportunistic checks will be made, namely when several entries in to_global_indices lie in the same process, but systematic checks are not made for performance reasons.

We also assume isbitstype(T) == true.

source
Pigeons.transmitMethod
transmit(e, source_data, to_global_indices)

The same as transmit!() but instead of writing the result to an input argument, provide the result as a returned Vector.

source
Pigeons.update_reference!Method
update_reference!(reduced_recorders, variational, state)

Update the variational reference and the annealing path. Returns the new annealing path.

source
Pigeons.update_state!Method
update_state!(state, name, index, value)

Update the state's entry at symbol name and index with value.

source
Pigeons.variableMethod
variable(state, name)

The storage within the state of the variable of the given name, typically an Array.

source
Pigeons.watchMethod
watch(result; machine, last, interactive)

Print the queue status as well as the standard out and error streams (merged) for the given machine.

Note: when using control-c on interactive = true, julia tends to crash as of version 1.8.

source
Pigeons.write_checkpointMethod
write_checkpoint(pt)

If pt.inputs.checkpoint == true, save a checkpoint under [pt.exec_folder]/[unique folder]/round=[x]/checkpoint.

By default, pt.exec_folder is results/all/[unique folder].

In an MPI context, each MPI process will write its local replicas, while only one of the MPI processes will write the Shared and reduced recorders data. Moreover, only one MPI process will write once at the first round the Inputs data.

In cases where the sampled model contains large immutable data, consider using Immutable's to save disk space (Immutables will be written only by one MPI process at the first round).

source
RecipesBase.apply_recipeMethod
using Pigeons
using Plots 
pt = pigeons(
        target = toy_mvn_target(1), 
        record = [index_process], 
        n_rounds = 5)
plot(pt.reduced_recorders.index_process)
source
RecipesBase.apply_recipeMethod
using Pigeons
using Plots 
pt = pigeons(target = toy_mvn_target(1))
plot(pt.shared.tempering.communication_barriers.localbarrier)
source
Pigeons.@abstractMacro
my_fct() = @abstract()

Define an abstract function (i.e. which gives an error message if calling it is attempted).

source
Pigeons.@autoMacro

Based on ConcreteStruct.jl, but (1) with a more descriptive name and (2) outputs elided type information (ConcreteStruct.jl has this feature but does not seem to work at the moment).

source
Pigeons.@informalMacro
@informal name begin ... end

Document an informal interface with provided name, and functions specified in a begin .. end block.

@informal will spit back the contents of the begin .. end block so this macro can be essentially ignored at first read.

When building documentation, this allows us to use the function informal_doc() to automatically document the informal interface.

source