Julia code as input to pigeons

In typical Bayesian statistics applications, it is easiest to specify the model in a modelling language, such as Turing, but sometimes to get more flexibility or speed it is useful to implement the density evaluation manually as a "black-box" Julia function.

Here we show how this is done using our familiar unidentifiable toy example.

We first create a custom type, MyLogPotential to control dispatch on the interface target.

using Pigeons
using Random
using StatsFuns

struct MyLogPotential
    n_trials::Int
    n_successes::Int
end

Next, we make MyLogPotential a function-like object, so that we can write expressions of the form my_log_potential([0.5, 0.5]) and hence MyLogPotential satisfies the log_potential interface:

function (log_potential::MyLogPotential)(x)
    p1, p2 = x
    if !(0 < p1 < 1) || !(0 < p2 < 1)
        return -Inf64
    end
    p = p1 * p2
    return StatsFuns.binomlogpdf(log_potential.n_trials, p, log_potential.n_successes)
end

# e.g.:
my_log_potential = MyLogPotential(100, 50)
my_log_potential([0.5, 0.5])
-16.91498002656617

Next, we need to specify how to create fresh state objects:

Pigeons.initialization(::MyLogPotential, ::AbstractRNG, ::Int) = [0.5, 0.5]

We can now run the sampler:

pt = pigeons(
        target = MyLogPotential(100, 50),
        reference = MyLogPotential(0, 0)
    )
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
┌ Warning: It looks like sample_iid!() is not implemented for a
reference_log_potential of type Main.MyLogPotential.
Instead, using step!().
@ Pigeons ~/work/Pigeons.jl/Pigeons.jl/src/targets/target.jl:51
──────────────────────────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)    min(αₑ)   mean(αₑ)
────────── ────────── ────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2       2.03   0.000432   1.84e+05      -8.65   6.03e-05      0.775          1          1
        4      0.994   0.000283   3.27e+04      -4.82      0.578       0.89          1          1
        8       2.25   0.000469   5.92e+04      -5.81      0.201       0.75          1          1
       16       1.62   0.000762   7.25e+04         -5      0.609       0.82          1          1
       32       1.25    0.00144   1.39e+05      -4.69      0.706      0.861          1          1
       64       1.52    0.00257   1.26e+05      -5.05      0.684      0.831          1          1
      128       1.67    0.00476   1.22e+05      -5.16      0.661      0.815          1          1
      256       1.56    0.00956   1.22e+05      -5.09      0.762      0.827          1          1
      512       1.54     0.0183   1.22e+05      -5.07      0.786      0.829          1          1
 1.02e+03       1.51     0.0364   1.22e+05      -4.91      0.807      0.832          1          1
──────────────────────────────────────────────────────────────────────────────────────────────────

Notice that we have specified a reference distribution, in this case the same model but with no observations (hence the prior). Indeed, in contrast to targets specified using Turing.jl, it is not possible to construct a reference automatically from Julia "black-box" targets.

The default_explorer() is the SliceSampler.

Sampling from the reference distribution

Ability to sample from the reference distribution can be beneficial, e.g. to jump modes in multi-modal distribution. For black-box Julia function targets, this is done as follows:

function Pigeons.sample_iid!(::MyLogPotential, replica, shared)
    state = replica.state
    rng = replica.rng
    rand!(rng, state)
end

pt = pigeons(
        target = MyLogPotential(100, 50),
        reference = MyLogPotential(0, 0)
    )
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
──────────────────────────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)    min(αₑ)   mean(αₑ)
────────── ────────── ────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2       2.03   0.000167   2.37e+04      -12.1   5.75e-08      0.775          1          1
        4       1.13   0.000262   3.08e+04      -5.26     0.0854      0.874          1          1
        8      0.989   0.000393   5.56e+04      -4.09      0.509       0.89          1          1
       16        1.5   0.000734   9.34e+04      -5.16      0.675      0.834          1          1
       32       1.58    0.00137   1.17e+05      -5.29       0.73      0.825          1          1
       64       1.34    0.00248   1.25e+05      -4.79      0.758      0.852          1          1
      128       1.65     0.0046   1.18e+05      -4.91      0.739      0.817          1          1
      256       1.46    0.00885   1.18e+05      -5.01      0.732      0.838          1          1
      512       1.52     0.0178   1.18e+05       -5.1      0.759      0.831          1          1
 1.02e+03       1.55     0.0348   1.18e+05         -5      0.796      0.828          1          1
──────────────────────────────────────────────────────────────────────────────────────────────────

Changing the explorer

Here is an example using AutoMALA instead of the default SliceSampler. We only need to add methods to make our custom type MyLogPotential conform the LogDensityProblems interface:

using LogDensityProblems

LogDensityProblems.dimension(lp::MyLogPotential) = 2
LogDensityProblems.logdensity(lp::MyLogPotential, x) = lp(x)

pt = pigeons(
        target = MyLogPotential(100, 50),
        reference = MyLogPotential(0, 0),
        explorer = AutoMALA(default_autodiff_backend = :ForwardDiff)
    )
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
──────────────────────────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)    min(αₑ)   mean(αₑ)
────────── ────────── ────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2       2.07       1.47   1.56e+08        -12   4.85e-08      0.771      0.266      0.508
        4       1.24      0.112   6.68e+06      -5.87      0.282      0.862      0.369      0.549
        8       1.28     0.0263   2.71e+06      -5.71      0.554      0.858      0.413      0.511
       16       1.18     0.0071    5.4e+06      -4.92       0.63      0.869      0.426      0.519
       32       1.61     0.0162    1.1e+07      -4.57      0.391      0.821       0.41      0.525
       64       1.18     0.0307   2.16e+07      -4.91      0.753      0.869      0.455      0.531
      128       1.42     0.0841   4.17e+07      -4.84      0.767      0.842      0.475      0.544
      256       1.45      0.122   8.27e+07      -4.83      0.763      0.839      0.501       0.55
      512       1.55      0.219   1.65e+08      -4.93      0.769      0.828        0.5       0.55
 1.02e+03       1.53      0.439   3.28e+08         -5      0.808       0.83      0.498      0.554
──────────────────────────────────────────────────────────────────────────────────────────────────

Pigeons have several built-in explorer kernels such as AutoMALA and a SliceSampler. However when the state space is neither the reals nor the integers, or for performance reasons, it may be necessary to create custom exploration MCMC kernels. This is described on the custom explorers page.

Manipulating the output

Some common post-processing are shown below, see the section on output processing for more information.

using MCMCChains
using StatsPlots
plotlyjs()

pt = pigeons(
        target = MyLogPotential(100, 50),
        reference = MyLogPotential(0, 0),
        explorer = AutoMALA(default_autodiff_backend = :ForwardDiff),
        record = [traces])
samples = Chains(pt)
my_plot = StatsPlots.plot(samples)
StatsPlots.savefig(my_plot, "julia_posterior_densities_and_traces.html");

samples
Chains MCMC chain (1024×3×1 Array{Float64, 3}):

Iterations        = 1:1:1024
Number of chains  = 1
Samples per chain = 1024
parameters        = param_1, param_2
internals         = log_density

Summary Statistics
  parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   e     Symbol   Float64   Float64   Float64    Float64    Float64   Float64     ⋯

     param_1    0.7187    0.1466    0.0060   613.9823   744.7188    0.9992     ⋯
     param_2    0.7134    0.1486    0.0061   618.1775   741.6100    1.0024     ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

     param_1    0.4669    0.6015    0.7062    0.8433    0.9822
     param_2    0.4699    0.5940    0.6965    0.8393    0.9771