Saving traces
The traces
refer to the list of samples $X_1, X_2, \dots, X_n$ from which we can approximate expectations of the form $E[f(X)]$, where $X \sim \pi$ via a Monte Carlo average of the form $\sum_i f(X_i) / n$.
To indicate that the traces should be saved, use
using DynamicPPL
using Pigeons
target = Pigeons.toy_turing_unid_target(100, 50)
pt = pigeons(; target,
n_rounds = 3,
# make sure to record the trace:
record = [traces; round_trip; record_default()])
PT(checkpoint = false, ...)
Note that there are more memory efficient alternatives to saving the full traces: see online (constant-memory) statistics and off-memory processing.
Accessing traces
Use get_sample
to access the list of samples:
get_sample(pt)
8-element Pigeons.SampleArray{Vector{Float64}, PT}:
[0.5480531395881381, 0.7111587788265167, -8.001156932600198]
[0.6417578593639657, 0.8193098317534669, -6.044521492146245]
[0.5635640490721443, 0.9512261759202737, -7.265017408328339]
[0.5372315596959659, 0.9417959416828765, -6.833608590115077]
[0.6424119190840571, 0.6614625444898359, -6.638153033637418]
[0.7108478937373857, 0.7534993485537478, -6.0508454283065785]
[0.6351986896221519, 0.7398042493665612, -5.8220440272267]
[0.8062385791654489, 0.6066887061763869, -5.843884557029044]
In the special case where each state is a vector, use sample_names
to obtain description of the vector components:
sample_names(pt)
3-element Vector{Symbol}:
:p1
:p2
:log_density
Still in the special case where each state is a vector, it is often convenient to organize the result into a single array, this is done using sample_array
:
sample_array(pt)
8×3×1 Array{Float64, 3}:
[:, :, 1] =
0.548053 0.711159 -8.00116
0.641758 0.81931 -6.04452
0.563564 0.951226 -7.26502
0.537232 0.941796 -6.83361
0.642412 0.661463 -6.63815
0.710848 0.753499 -6.05085
0.635199 0.739804 -5.82204
0.806239 0.606689 -5.84388
Customizing what is saved in the traces
You may want to save only some statistics of interest, or a subset of the dimensions to take up less memory.
We show here an example saving only the value of the first coordinate:
struct OnlyFirstExtractor end
Pigeons.extract_sample(state, log_potential, extractor::OnlyFirstExtractor) =
Pigeons.extract_sample(state, log_potential)[1:1]
pt = pigeons(; target,
n_rounds = 3,
# custom method to extract samples:
extractor = OnlyFirstExtractor(),
# make sure to record the trace:
record = [traces; round_trip; record_default()])
sample_array(pt)
8×1×1 Array{Float64, 3}:
[:, :, 1] =
0.5480531395881381
0.6417578593639657
0.5635640490721443
0.5372315596959659
0.6424119190840571
0.7108478937373857
0.6351986896221519
0.8062385791654489
Optionally, it is a good idea to also adjust the behaviour of sample_names
accordingly. For example, variables_names
gets called when creating MCMCChains object so that e.g. plots are labelled correctly.
Pigeons.sample_names(state, log_potential, extractor::OnlyFirstExtractor) =
Pigeons.sample_names(state, log_potential)[1:1]
To keep only the value of the log potential, you can use the following built-in LogPotentialExtractor
:
pt = pigeons(; target,
n_rounds = 3,
# custom method to extract samples:
extractor = Pigeons.LogPotentialExtractor(),
# make sure to record the trace:
record = [traces; round_trip; record_default()])
sample_array(pt)
8×1×1 Array{Float64, 3}:
[:, :, 1] =
-8.001156932600198
-6.044521492146245
-7.265017408328354
-6.8336085901150785
-6.638153033637418
-6.0508454283065785
-5.8220440272267
-5.843884557029044