Turing.jl model as input to pigeons

To target the posterior distribution specified by a Turing.jl model first load Turing or DynamicPPL and use TuringLogPotential:

using DynamicPPL, Pigeons, Distributions

DynamicPPL.@model function my_turing_model(n_trials, n_successes)
    p1 ~ Uniform(0, 1)
    p2 ~ Uniform(0, 1)
    n_successes ~ Binomial(n_trials, p1*p2)
    return n_successes
end

my_turing_target = TuringLogPotential(my_turing_model(100, 50))
pt = pigeons(target = my_turing_target);
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
──────────────────────────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)    min(αₑ)   mean(αₑ)
────────── ────────── ────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2       3.24      0.008   1.13e+06      -8.14    0.00178       0.64          1          1
        4       1.64    0.00222   2.08e+06      -5.04     0.0352      0.818          1          1
        8       1.17    0.00401   4.07e+06      -4.42      0.708      0.871          1          1
       16        1.2    0.00807   8.59e+06      -4.03      0.549      0.867          1          1
       32       1.11     0.0157   1.68e+07      -4.77      0.754      0.877          1          1
       64       1.35     0.0648   3.37e+07      -4.79      0.698       0.85          1          1
      128        1.6      0.062   6.68e+07      -4.97      0.725      0.823          1          1
      256       1.51      0.159   1.32e+08      -4.92      0.758      0.832          1          1
      512       1.46      0.342   2.66e+08         -5      0.806      0.838          1          1
 1.02e+03       1.49       0.63   5.33e+08      -4.92      0.798      0.834          1          1
──────────────────────────────────────────────────────────────────────────────────────────────────

At the moment, only Turing models with fixed dimensionality are supported. Both real and integer-valued random variables are supported. For a TuringLogPotential, the default_explorer() is the SliceSampler and the default_reference() is the prior distribution encoded in the Turing model.

Gradient-based sampling with AutoMALA

For Turing models with fully continuous state-spaces—as is the case for my_turing_model defined above—AutoMALA can be an effective alternative to SliceSampler—especially for high-dimensional problems. Because Turing targets conform to the LogDensityProblemsAD.jl interface, Automatic Differentiation (AD) backends can be used to obtain the gradients needed by AutoMALA.

The default AD backend for AutoMALA is ForwardDiff. However, when the Turing model does not involve branching decisions (if, while, etc...) depending on latent variables, ReverseDiff can provide accelerated performance. Since my_turing_target satisfies this criterion, we can use AutoMALA with the ReverseDiff AD backend via

using ReverseDiff
pt = pigeons(
    target = my_turing_target,
    explorer = AutoMALA(default_autodiff_backend = :ReverseDiff)
);
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
┌ Info: 
Using ReverseDiff with tape compilation, which usually results in huge performance gains.
However, if your model does branching on latent variables, you will get inconsistent results.
You can turn this feature off using `Pigeons.set_tape_compilation_strategy!(false)`.
──────────────────────────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)    min(αₑ)   mean(αₑ)
────────── ────────── ────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2       1.88       4.41   3.14e+08      -7.83    0.00499      0.791      0.355      0.665
        4       2.07      0.123   7.36e+06      -4.75      0.344       0.77     0.0842      0.573
        8       1.79     0.0187   6.36e+06      -6.21      0.433      0.801      0.546      0.601
       16       1.57     0.0358   1.26e+07      -4.79      0.545      0.826      0.559      0.628
       32       1.44     0.0734   2.58e+07      -5.09      0.655       0.84      0.577      0.638
       64       1.38      0.152   5.32e+07      -4.66      0.771      0.847      0.558      0.631
      128       1.45      0.378   1.05e+08       -4.9      0.774      0.838      0.544      0.631
      256       1.56      0.638   2.11e+08      -5.02      0.753      0.827      0.572      0.631
      512       1.54       1.33   4.21e+08      -5.05      0.779      0.829      0.524      0.627
 1.02e+03       1.54       2.75   8.51e+08      -4.98      0.797      0.829      0.529       0.62
──────────────────────────────────────────────────────────────────────────────────────────────────

Using DynamicPPL.@addlogprob!

The macro DynamicPPL.@addlogprob! is sometimes used when additional flexibility is needed while incrementing the log probability. To do so with Pigeons.jl, you will need to enclose the call to DynamicPPL.@addlogprob! within an if statement as shown below. Failing to do so will lead to invalid results.

DynamicPPL.@model function my_turing_model(my_data)
    # code here
    if DynamicPPL.leafcontext(__context__) !== DynamicPPL.PriorContext() 
        DynamicPPL.@addlogprob! logpdf(MyDist(parms), my_data)
    end
end

Manipulating the output

Internally, Turing target's states (of type DynamicPPL.TypedVarInfo) are stored in an unconstrained parameterization provided by Turing (for example, bounded support variables are mapped to the full real line). However, sample post-processing functions such as sample_array() and process_sample() convert back to the original ("constrained") parameterization via extract_sample().

As a result parameterization issues can be essentially ignored when post-processing, for example some common post-processing are shown below, see the section on output processing for more information.

using MCMCChains
using StatsPlots
plotlyjs()

pt = pigeons(target = my_turing_target, record = [traces])
samples = Chains(pt)
my_plot = StatsPlots.plot(samples)
StatsPlots.savefig(my_plot, "turing_posterior_densities_and_traces.html");

samples
Chains MCMC chain (1024×3×1 Array{Float64, 3}):

Iterations        = 1:1:1024
Number of chains  = 1
Samples per chain = 1024
parameters        = p1, p2
internals         = log_density

Summary Statistics
  parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   e     Symbol   Float64   Float64   Float64    Float64    Float64   Float64     ⋯

          p1    0.7239    0.1520    0.0065   574.1139   703.2334    0.9999     ⋯
          p2    0.7086    0.1507    0.0062   614.1593   920.7039    0.9998     ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

          p1    0.4735    0.5934    0.7108    0.8487    0.9874
          p2    0.4647    0.5850    0.6959    0.8270    0.9815

Custom initialization

It is sometimes useful to provide a custom initialization, for example to start in a feasible region. This can be done as follows:

using DynamicPPL, Pigeons, Distributions, Random

DynamicPPL.@model function toy_beta_binom_model(n_trials, n_successes)
    p ~ Uniform(0, 1)
    n_successes ~ Binomial(n_trials, p)
    return n_successes
end

function toy_beta_binom_target(n_trials = 10, n_successes = 2)
    return Pigeons.TuringLogPotential(toy_beta_binom_model(n_trials, n_successes))
end

const ToyBetaBinomType = typeof(toy_beta_binom_target())

function Pigeons.initialization(target::ToyBetaBinomType, rng::AbstractRNG, ::Int64)
    result = DynamicPPL.VarInfo(rng, target.model, DynamicPPL.SampleFromPrior(), DynamicPPL.PriorContext())
    DynamicPPL.link!!(result, DynamicPPL.SampleFromPrior(), target.model)

    # custom init goes here: for example here setting the variable p to 0.5
    Pigeons.update_state!(result, :p, 1, 0.5)

    return result
end

pt = pigeons(target = toy_beta_binom_target(), n_rounds = 0)
@assert Pigeons.variable(pt.replicas[1].state, :p) == [0.5]
┌ Info: Neither traces, disk, nor online recorders included.
   You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
   To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])