Targeting a non-Julian model

Suppose you have some code implementing vanilla MCMC, written in an arbitrary "foreign" language such as C++, Python, R, Java, etc. You would like to turn this vanilla MCMC code into a Parallel Tempering algorithm able to harness large numbers of cores, including distributing this algorithm over MPI. However, you do not wish to learn anything about MPI/multi-threading/Parallel Tempering.

Surprisingly, it is very simple to bridge such code with Pigeons. The only requirement on the "foreign" language is that it supports reading from standard input and writing to standard output, so virtually any language can be interfaced in this fashion. Based on this minimalist "standard stream bridge" with worker processes running foreign code (one such process per replica; not necessarily running on the same machine), Pigeons will coordinate the execution of an adaptive non-reversible parallel tempering algorithm.

This behaviour is implemented in StreamTarget; see its documentation for details. In a nutshell, there will be one child process for each PT chain. These processes will not necessarily be on the same machine: indeed, distributed sampling is the key use case of this bridge. Pigeons will do some lightweight coordination with these child processes to orchestrate non-reversible parallel tempering. Interprocess communication only involves Pigeons telling each child process to perform exploration at a Pigeons-provided annealing parameter.
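To make the worker side of this bridge concrete, here is a minimal sketch of the loop a child process could run. It is written in Julia only for consistency with the rest of this page; a real worker would be written in the foreign language. The line format (`log_potential(β)`, `call_sampler!(β)`, `response(...)`) is an assumption made for illustration, so check the StreamTarget documentation for the actual protocol. The names `ToyWorker`, `toy_log_potential`, `explore!`, `handle_command` and `serve`, as well as the toy annealing path, are hypothetical.

```julia
using Random

# Hypothetical worker state: one real-valued parameter and a private RNG.
mutable struct ToyWorker
    x::Float64
    rng::MersenneTwister
end

# Toy annealed log density: standard normal at beta = 0, N(3, 1) at beta = 1.
# This path is an illustrative assumption, not something Pigeons prescribes.
toy_log_potential(x, beta) = -(1 - beta) * x^2 / 2 - beta * (x - 3)^2 / 2

# One random-walk Metropolis step targeting the annealed density at `beta`.
function explore!(w::ToyWorker, beta)
    proposal = w.x + randn(w.rng)
    log_ratio = toy_log_potential(proposal, beta) - toy_log_potential(w.x, beta)
    if log(rand(w.rng)) < log_ratio
        w.x = proposal
    end
    return nothing
end

# Parse one command line and build the reply line.
# The textual command and reply formats are assumed for illustration only.
function handle_command(w::ToyWorker, line::AbstractString)
    m = match(r"^(log_potential|call_sampler!)\((.*)\)$", strip(line))
    m === nothing && error("unrecognized command: $line")
    beta = parse(Float64, m.captures[2])
    if m.captures[1] == "log_potential"
        return "response($(toy_log_potential(w.x, beta)))"
    else
        explore!(w, beta)
        return "response()"
    end
end

# Worker main loop: one command per input line, one reply per output line.
function serve(w::ToyWorker, input::IO = stdin, output::IO = stdout)
    for line in eachline(input)
        println(output, handle_command(w, line))
        flush(output)  # make the reply visible to the parent immediately
    end
end
```

Calling `serve(ToyWorker(0.0, MersenneTwister(1)))` would then answer commands arriving on standard input until the stream is closed. Keeping the parsing in `handle_command` separate from the I/O loop makes the logic easy to exercise with in-memory buffers before wiring it to a parent process.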

StreamTarget implements log_potential and explorer by invoking worker processes via standard stream communication. Standard streams are less efficient than alternatives such as protobuf, but they have the advantage of being supported by nearly all programming languages in existence. Also, in many practical cases, since the worker process is invoked only three times per chain per iteration, this communication is unlikely to be the bottleneck (overhead is on the order of 0.1 ms per interprocess call).
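As a back-of-the-envelope check of the overhead claim, the sketch below multiplies out the quoted figures (three calls per chain per iteration, about 0.1 ms per call). The helper `stream_overhead_seconds` is hypothetical, the chain and scan counts are made up, and treating one "iteration" as one scan is an assumption.

```julia
# Rough time spent on stream communication, summed over all chains.
# The defaults (3 calls, 1e-4 s per call) are the figures quoted above;
# the chain and scan counts passed in below are purely illustrative.
function stream_overhead_seconds(n_chains, n_scans;
                                 calls_per_chain_per_scan = 3,
                                 seconds_per_call = 1e-4)
    n_calls = calls_per_chain_per_scan * n_chains * n_scans
    return n_calls * seconds_per_call
end

total = stream_overhead_seconds(8, 1_000)  # 24_000 calls across all chains
per_chain = total / 8                      # share attributable to each chain
println("total ≈ ", round(total; digits = 2), " s, per chain ≈ ",
        round(per_chain; digits = 2), " s")
```

With these made-up numbers the communication cost is roughly 2.4 s summed over all chains, or about 0.3 s per chain. Whether that is negligible depends on how expensive one exploration step of the foreign model is.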

Usage example

To demonstrate this capability, we show here how it enables running Blang models in Pigeons. Blang is a Bayesian modelling language designed for sampling combinatorial spaces such as phylogenetic trees.

We first set up Blang as follows (assuming Java 11 is available on the PATH):

using Pigeons

redirect_stdout(devnull) do
    Pigeons.setup_blang("blangDemos")
end

┌ Warning: Using a precompiled build for blangDemos: double check it is up to date
└ @ Pigeons ~/work/Pigeons.jl/Pigeons.jl/src/targets/BlangTarget.jl:141

Next, we run a Blang implementation of our usual unidentifiable toy example:

using Pigeons

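# Build a BlangTarget from the command line that launches the Blang model.
# Note: the second argument is forwarded to the model's `--model.nFails` option.
blang_unidentifiable_example(n_trials, n_successes) =
    Pigeons.BlangTarget(
        `$(Pigeons.blang_executable("blangDemos", "demos.UnidentifiableProduct")) --model.nTrials $n_trials --model.nFails $n_successes`
    )
# Run parallel tempering; each chain is backed by its own child process.
pt = pigeons(target = blang_unidentifiable_example(100, 50))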

┌ Info: Neither traces, disk, nor online recorders included.
│    You may not have access to your samples (unless you are using a custom recorder, or maybe you just want log(Z)).
└    To add recorders, use e.g. pigeons(target = ..., record = [traces; record_default()])
Preprocess {
  2 samplers constructed with following prototypes:
    RealScalar sampled via: [RealSliceSampler]
} [ endingBlock=Preprocess blockTime=362.1ms blockNErrors=0 ]
Inference {
  ────────────────────────────────────────────────────────────────────────────
  scans        Λ        time(s)    allc(B)  log(Z₁/Z₀)   min(α)     mean(α)
────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2      0.992       4.18   1.53e+07      -4.05      0.621       0.89
        4       1.42     0.0365   8.67e+06      -4.43      0.192      0.842
        8       1.44     0.0638   1.78e+07      -5.35      0.581       0.84
       16        1.7      0.144   3.49e+07      -4.72      0.577      0.811
       32       1.51      0.225   7.05e+07       -4.8      0.602      0.833
       64       1.52      0.348   1.43e+08      -4.95      0.727      0.831
      128       1.56      0.514   2.83e+08      -4.93      0.713      0.827
      256       1.52      0.923   5.55e+08      -5.06       0.78      0.831
      512       1.52       1.56   1.07e+09      -4.97      0.793      0.831
 1.02e+03       1.54       2.77    2.1e+09      -4.97      0.789      0.829
────────────────────────────────────────────────────────────────────────────

As shown above, creating a StreamTarget amounts to specifying the command used to launch each child process.

To terminate the child processes associated with a stream target, use:

Pigeons.kill_child_processes(pt)