Pigeons

Summary

Pigeons is a Julia package to approximate challenging posterior distributions, and more broadly, Lebesgue integration problems. Pigeons can be used in a multi-threaded context, and/or distributed over hundreds or thousands of MPI-communicating machines.

Pigeons supports many different ways to specify integration/expectation problems and provides rich and configurable output.

Pigeons' core algorithm is a distributed and parallel implementation of the following algorithms:

These algorithms achieve state-of-the-art performance for approximation of challenging probability distributions.

Note

We are recruiting graduate students! Click here for more information.

Installing Pigeons

  1. If you have not done so, install Julia. Julia 1.8 and higher are supported.
  2. Install Pigeons using
using Pkg; Pkg.add("Pigeons")

Basic usage

Specify the target distribution and, optionally, parameters like random seed, etc by creating an Inputs:

using Pigeons

inputs = Inputs(target = toy_mvn_target(100))
Inputs{Pigeons.ScaledPrecisionNormalPath, Nothing, Nothing, Nothing, Nothing}(Pigeons.ScaledPrecisionNormalPath(1.0, 10.0, 100), 1, 10, 10, 0, nothing, nothing, false, Function[Pigeons.log_sum_ratio, Pigeons.timing_extrema, Pigeons.allocation_extrema], 0, false, nothing, nothing, true, false)

Have a look at the Inputs documentation for an overview of the many options available to configure pigeons. You will find information there on setting the random seed, controlling the number of iterations (via n_rounds), and many more options

Then, run parallel tempering (PT) locally on one process using the function pigeons():

pt = pigeons(inputs);
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)
────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2        7.5      3e-05   1.28e+04       -122   2.66e-15      0.167
        4       5.59   3.86e-05   1.47e+04       -119    2.6e-07      0.378
        8       6.04   5.97e-05   1.86e+04       -115    0.00129      0.329
       16       7.27    0.00012   2.42e+04       -118     0.0134      0.193
       32       6.97   0.000192   3.05e+04       -114      0.107      0.225
       64       7.03   0.000367   4.13e+04       -117     0.0531      0.219
      128       7.23   0.000731   6.08e+04       -114     0.0944      0.196
      256       7.05    0.00138   6.77e+04       -115       0.13      0.217
      512       7.14    0.00273   7.18e+04       -115      0.171      0.207
 1.02e+03       7.19    0.00542   7.91e+04       -115      0.172      0.201
────────────────────────────────────────────────────────────────────────────

This runs PT on a 100-dimensional MVN toy example with 10 chains for $2047 = 2^{11} - 1$ iterations, and returns a PT struct containing the results of this run (more later on how to access information inside a PT struct). Each line in the output provides information on a round, where the number of iteration per round doubles at each round and adaptation is performed between rounds.

Since the above two julia lines are the most common operations in this package, creating inputs and running PT can be done in one line as follows:

pt = pigeons(target = toy_mvn_target(100));
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)
────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2        7.5   2.34e-05   1.28e+04       -122   2.66e-15      0.167
        4       5.59   3.62e-05   1.47e+04       -119    2.6e-07      0.378
        8       6.04   5.76e-05   1.86e+04       -115    0.00129      0.329
       16       7.27   0.000103   2.42e+04       -118     0.0134      0.193
       32       6.97   0.000191   3.05e+04       -114      0.107      0.225
       64       7.03   0.000367   4.13e+04       -117     0.0531      0.219
      128       7.23   0.000711   6.08e+04       -114     0.0944      0.196
      256       7.05    0.00138   6.77e+04       -115       0.13      0.217
      512       7.14    0.00276   7.18e+04       -115      0.171      0.207
 1.02e+03       7.19    0.00534   7.91e+04       -115      0.172      0.201
────────────────────────────────────────────────────────────────────────────

where the args... passed to pigeons are forwarded to Inputs.

Continuing on the above example, to perform two additional rounds of sampling, use the following (see also: more advanced checkpoint/resume options at the checkpoint page):

pt = increment_n_rounds!(pt, 2)
pigeons(pt)
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
┌ Warning: The set of successful reports changed
@ Pigeons ~/work/Pigeons.jl/Pigeons.jl/src/pt/report.jl:46
────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)
────────── ────────── ────────── ────────── ────────── ────────── ──────────
 2.05e+03       7.19     0.0106   7.79e+04       -115      0.188      0.201
  4.1e+03       7.16     0.0211   7.79e+04       -115      0.184      0.205
────────────────────────────────────────────────────────────────────────────

Scope

We describe here the class of problems that can be approached using Pigeons. In summary: computational Lebesgue integration.

Let $\pi(x)$ denote a probability density called the target. In many problems, e.g. in Bayesian statistics, the density $\pi$ is typically known only up to a normalization constant,

\[\pi(x) = \frac{\gamma(x)}{Z},\]

where $\gamma$ can be evaluated pointwise, but $Z$ is unknown. Pigeons takes as input the function $\gamma$.

log_potential

Since we work in log-scale, we use the terminology log_potential as a shorthand for the unnormalized log density $\log \gamma(x)$. See informal interface log_potential.

Pigeons' outputs can be used for two tasks:

Pigeons shines in the following scenarios:

How to cite Pigeons

Our team works hard to maintain and improve the Pigeons package. Please consider citing our work by referring to our Pigeons paper.

BibTeX code for citing Pigeons

@article{surjanovic2023pigeons,
  title={Pigeons.jl: {D}istributed sampling from intractable distributions},
  author={Surjanovic, Nikola and Biron-Lattes, Miguel and Tiede, Paul and Syed, Saifuddin and Campbell, Trevor and Bouchard-C{\^o}t{\'e}, Alexandre},
  journal={arXiv:2308.09769},
  year={2023}
}

APA

Surjanovic, N., Biron-Lattes, M., Tiede, P., Syed, S., Campbell, T., & Bouchard-Côté, A. (2023). Pigeons.jl: Distributed sampling from intractable distributions. arXiv:2308.09769.