Nested Sampling with Ernest
This vignette provides a hands-on introduction to nested sampling (NS) using the ernest package. You will learn how to:
- Understand the role of model evidence in Bayesian inference and why it is difficult to compute.
- Use prior transforms to define parameter spaces for NS.
- Set up and run a nested sampling analysis in R with ernest.
- Inspect, summarise, and visualise results from an NS run, including the evidence estimates and posterior samples.
Bayesian Model Evidence and Nested Sampling
In Bayesian inference, we use probabilities to represent our current beliefs about a model's unobservable parameters $\theta$ (Ashton et al. 2022). When we gather new data $D$, we update these beliefs using Bayes' theorem:

$$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)},$$

where $p(\theta)$ is the prior distribution, $p(D \mid \theta)$ is the likelihood of the model given parameters $\theta$ and the data $D$, and $p(\theta \mid D)$ is the posterior distribution of the parameters after our beliefs have been updated.
The denominator $p(D)$ is called the Bayesian evidence or marginal likelihood, written $\mathcal{Z}$. In isolation, $\mathcal{Z}$ serves to normalise $p(\theta \mid D)$ so that it is a well-conditioned probability distribution. If we reorganise Bayes' theorem to isolate $\mathcal{Z}$, we see that calculating a model's evidence involves integrating the likelihood over all possible values of $\theta$:

$$\mathcal{Z} = p(D) = \int p(D \mid \theta)\, p(\theta)\, \mathrm{d}\theta.$$

This allows us to use $\mathcal{Z}$ as a parameter-independent measure of a model's overall plausibility given some data. When comparing two models, the ratio of their respective evidences (called the Bayes factor) shows how much more the data support one model over the other, which forms the foundation of Bayesian model selection.
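For example, for two models $M_1$ and $M_2$ fit to the same data $D$, the Bayes factor is

$$B_{12} = \frac{\mathcal{Z}_1}{\mathcal{Z}_2} = \frac{p(D \mid M_1)}{p(D \mid M_2)},$$

so $B_{12} = 10$ would mean the observed data are ten times more probable under $M_1$ than under $M_2$.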
For most data and models, the evidence integral cannot be solved directly. Instead, researchers rely on estimation methods. Nested sampling (NS), introduced by Skilling (2004) and Skilling (2006), is designed to estimate $\mathcal{Z}$ even when the posterior distribution is poorly conditioned (e.g., if $p(\theta \mid D)$ has multiple peaks or discontinuities along values of $\theta$). It accomplishes this by dividing the prior space into many small nested shells or volumes. These shells are defined by the smallest likelihood value $\lambda$ they contain, such that the prior volume containing all points with likelihood above $\lambda$ is given by

$$X(\lambda) = \int_{\mathcal{L}(\theta) > \lambda} p(\theta)\, \mathrm{d}\theta.$$

If we build many shells across different values of $\lambda$, we can approximate the original multidimensional integral across the parameter space of $\theta$ with a one-dimensional integral over the sequence of volumes $X$:

$$\mathcal{Z} = \int_0^1 \mathcal{L}(X)\, \mathrm{d}X \approx \sum_i \mathcal{L}_i\, \Delta X_i,$$

where the inverse function $\mathcal{L}(X)$ exists because $X(\lambda)$ is monotonically decreasing. This requires us to estimate the volume of each shell, $\Delta X_i = X_{i-1} - X_i$, which we can do using the properties of uniform order statistics (Speagle 2020).
NS generates this sequence of shells by drawing a set number of live points from the prior, then repeatedly replacing the worst of these points with a new point drawn from the prior under the constraint that its likelihood exceed that of the discarded point. This lets NS handle complicated likelihood surfaces, including those with multiple peaks or sharp transitions. In addition, NS provides natural stopping rules, meaning the algorithm knows when it has gathered enough information to accurately estimate the evidence. As a bonus, the same samples used for evidence estimation can be repurposed to estimate the posterior distribution.
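To make this concrete, the following toy sketch runs the core NS loop on a one-dimensional problem with a Uniform(0, 1) prior, using simple rejection sampling for the likelihood-constrained draws. It is purely illustrative and is not ernest's implementation (ernest uses a random-walk sampler, and this sketch also omits the final live-point correction that a real run applies):
# Toy nested sampling loop: theta ~ Uniform(0, 1), Gaussian likelihood.
# Illustrative only; not ernest's internals.
log_lik <- function(theta) dnorm(theta, mean = 0.5, sd = 0.05, log = TRUE)
log_sum_exp <- function(a, b) max(a, b) + log1p(exp(-abs(a - b)))
n_live <- 100
theta <- runif(n_live)         # live points drawn from the prior
ll <- log_lik(theta)
log_z <- -Inf                  # running log-evidence
log_vol <- 0                   # log prior volume still enclosed
for (i in 1:600) {
  worst <- which.min(ll)
  log_vol_next <- -i / n_live  # expected shrinkage, E[log X_i] = -i / n_live
  # shell weight L_i * (X_{i-1} - X_i), accumulated on the log scale
  log_z <- log_sum_exp(log_z, ll[worst] + log(exp(log_vol) - exp(log_vol_next)))
  # replace the worst point with a draw whose likelihood exceeds its own
  repeat {
    cand <- runif(1)
    if (log_lik(cand) > ll[worst]) break
  }
  theta[worst] <- cand
  ll[worst] <- log_lik(cand)
  log_vol <- log_vol_next
}
log_z  # close to 0, since this likelihood integrates to ~1 over the prior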
Nested Sampling with ernest
Here, we use an example from the documentation for the Python NS package dynesty (Speagle 2020) to demonstrate how to use ernest to design, perform, and report nested sampling runs.
Defining Priors
Nested sampling operates by drawing samples from the prior, but for efficiency, ernest represents the prior space as points in a [0, 1) unit hypercube. A prior transformation function must be specified to translate points from the hypercube into valid parameter values $\theta$.
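Concretely, the transformation applies each marginal prior's quantile function (inverse CDF) to the coordinates of the hypercube point. As a standalone base-R illustration (not yet an ernest object), the transform for independent Uniform(-10, 10) marginals is just qunif():
# A prior transform maps u in [0, 1)^d to parameter values theta
unit_to_theta <- function(u) qunif(u, min = -10, max = 10)
unit_to_theta(c(0.5, 0.25, 0.9))
#> [1]  0 -5  8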
In ernest, you define priors using functions like create_uniform_prior() or by supplying a custom transformation to create_prior(). In addition to organising the prior into an object that ernest can use during NS, these functions also perform checks to help ensure your prior transformation function is size- and type-stable.
Example: Uniform Prior
In many cases, it is sufficient to define a prior with independently distributed uniform or normal distributions. Ernest provides convenience functions to build such priors with efficient prior transformation functions.
The following defines a uniform prior over [-10, 10] for each parameter in 3D space:
prior <- create_uniform_prior(
lower = -10,
upper = 10,
varnames = c("x", "y", "z")
)
prior
#> Prior distribution <uniform_prior/ernest_prior>
#>
#> Names: "x", "y", and "z"
#> Bounds:
#> → Lower: -10, -10, and -10
#> → Upper: 10, 10, and 10
Example: Custom/Conditional Prior
For more complex priors, you must provide a custom function. In the case of prior spaces with independent marginals, this amounts to specifying a function that applies the inverse CDF for each component of $u$.
Consider the following five-dimensional prior space: the first two parameters are drawn from a bivariate normal distribution, the third from a Beta distribution, the fourth from a Gamma distribution, and the fifth from a truncated normal distribution.
five_dim <- function(u) {
  x <- double(5)
  # MVN(mu = c(5, 2), Sigma = [5, 4; 4, 5])
  t <- qnorm(u[1:2])
  # sigma_sqrt is a matrix square root of Sigma
  sigma_sqrt <- matrix(c(2, 1, 1, 2), nrow = 2, byrow = TRUE)
  mu <- c(5, 2)
  x[1:2] <- drop(t %*% sigma_sqrt) + mu
  # Beta
  x[3] <- qbeta(u[3], shape1 = 2.31, shape2 = 0.627)
  # Gamma
  x[4] <- qgamma(u[4], shape = 5)
  # Truncated normal (qtruncnorm() from the truncnorm package)
  x[5] <- truncnorm::qtruncnorm(u[5], a = 2, b = 10, mean = 5, sd = 2)
  return(x)
}
create_prior(
fn = five_dim,
.n_dim = 5,
.varnames = c("MVN", "MVN", "Beta", "Gamma", "Norm[2, 10]")
)
#> New names:
#> • `MVN` -> `MVN...1`
#> • `MVN` -> `MVN...2`
#> Prior distribution <ernest_prior>
#>
#> Names: "MVN...1", "MVN...2", "Beta", "Gamma", and "Norm[2, 10]"
For more sophisticated priors (such as those for hierarchical models), you will need to build more involved prior transformation functions.
hierarchical <- function(u) {
# mu ~ N(5, 1)
mu <- qnorm(u[1], mean = 5, sd = 1)
# log10(sd) ~ U[-1, 1]
sd <- 10 ^ qunif(u[2], -1, 1)
# x ~ N(mu, sd^2)
x <- qnorm(u[3], mean = mu, sd = sd)
c(mu, sd, x)
}
create_prior(
fn = hierarchical,
.n_dim = 3,
.varnames = c("mu", "sigma", "x"),
.lower = c(-Inf, 0, -Inf)
)
#> Prior distribution <ernest_prior>
#>
#> Names: "mu", "sigma", and "x"
#> Bounds:
#> → Lower: -Inf, 0, and -Inf
#> → Upper: Inf, Inf, and Inf
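As a quick sanity check, the midpoint of the unit cube maps to the marginal medians: mu = 5, sd = 10^0 = 1, and x equal to mu:
hierarchical(c(0.5, 0.5, 0.5))
#> [1] 5 1 5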
Likelihood Function
Model log-likelihoods are represented in ernest with functions. These functions are expected to return a single scalar value for each possible $\theta$ within the prior space. If, for any reason, the prior space contains regions where the likelihood is invalid, ensure your likelihood function returns -Inf there.
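For instance, a hand-written log-likelihood for three independent standard-normal observations might guard against invalid inputs like this (a generic sketch, separate from the example below):
# Return one finite scalar per theta, or -Inf where the density is undefined
loglike_fn <- function(theta) {
  if (any(!is.finite(theta))) return(-Inf)
  sum(dnorm(theta, mean = 0, sd = 1, log = TRUE))
}
loglike_fn(c(0, 0, 0))
#> [1] -2.756816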
In this example, we use create_likelihood() to assign fixed parameters to dmvn(), the LaplacesDemon density function for a multivariate normal distribution (the !! operator injects the current values of mu and C when the likelihood is created):
mu <- c(0, 0, 0)
C <- diag(1, 3)
C[C == 0] <- 0.95  # unit variances with 0.95 pairwise correlation
# dmvn() is provided by the LaplacesDemon package
loglike <- create_likelihood(
  rowwise_fn = dmvn,
  mu = !!mu,
  Sigma = !!C,
  log = TRUE
)
Setting Up and Running the Sampler
Initialise the sampler with your likelihood and prior. The number of live points (n_points) controls the resolution of the sampling, with more points leading to more accurate estimates in exchange for longer run times.
sampler <- ernest_sampler(
log_lik = loglike,
prior = prior,
n_points = 500
)
sampler
#> Nested sampling specification <ernest_sampler>
#> No. Points: 500
#>
#> ── Sampling Method
#> • Random Walk in Unit Cube LRPS <rwmh_cube/ernest_lrps>
#> • No. Dimensions: 3
#> • No. Calls Since Update: 0
#> • No. Accepted Since Update: 0
#> • Current Step Size: 1
Run nested sampling for a fixed number of iterations or until the evidence estimate converges:
run <- generate(
sampler,
max_iterations = 2000,
seed = 123,
show_progress = FALSE
)
#> ✔ Creating new live points.
#> ✔ `max_iterations` reached (2000).
run
#> Nested sampling run <ernest_run/ernest_sampler>
#> No. Points: 500
#>
#> ── Sampling Method
#> • Random Walk in Unit Cube LRPS <rwmh_cube/ernest_lrps>
#> • No. Dimensions: 3
#> • No. Calls Since Update: 0
#> • No. Accepted Since Update: 0
#> • Current Step Size: 0.0823
#>
#> ── Results
#> No. Iterations: 2000
#> No. Calls: 35578
#> Log. Evidence: -8.9122 (± 1.6336)
generate() produces an ernest_run object that can be saved to disk. You can continue a run by calling generate() on a previously created ernest_run:
tmp_name <- tempfile("ernest_run", fileext = ".rds")
saveRDS(run, tmp_name)
continued_run <- readRDS(tmp_name)
run2 <- generate(continued_run, min_logz = 0.01, show_progress = FALSE)
#> ✔ Restoring live points from a previous run.
#> ✔ `min_logz` reached (0.01 < 0.01).
run2
#> Nested sampling run <ernest_run/ernest_sampler>
#> No. Points: 500
#>
#> ── Sampling Method
#> • Random Walk in Unit Cube LRPS <rwmh_cube/ernest_lrps>
#> • No. Dimensions: 3
#> • No. Calls Since Update: 0
#> • No. Accepted Since Update: 0
#> • Current Step Size: 0.004
#>
#> ── Results
#> No. Iterations: 6699
#> No. Calls: 153053
#> Log. Evidence: -9.0751 (± 0.1247)
Inspecting and Summarising Results
The result object has a summary() method for viewing evidence estimates, posterior samples, and diagnostics as a tidy tibble:
summary(run2)
#>
#> ── Nested sampling results <ernest_run> ────────────────────────────────────────
#> No. Points: 500
#> No. Iterations: 6699
#> No. Lik. Calls: 153053
#> Log. Evidence: -9.0751 (± 0.1247)
summary(run2)$run
#> # A tibble: 7,199 × 7
#> call log_lik log_volume log_weight log_evidence log_evidence_err information
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 -2309. -0.002 -2306. -2315. 0 0
#> 2 2 -2150. -0.004 -2147. -2156. 0 0
#> 3 3 -2134. -0.006 -2131. -2140. 0 0
#> 4 4 -2119. -0.008 -2116. -2125. 0 0
#> 5 5 -2077. -0.01 -2074. -2083. 0 0
#> 6 6 -2072. -0.012 -2069. -2078. 0 0
#> 7 7 -2025. -0.014 -2022. -2031. 0 0
#> 8 8 -2002. -0.016 -1999. -2008. 0 0
#> 9 9 -1927. -0.018 -1924. -1934. 0 0
#> 10 10 -1907. -0.02 -1904. -1913. 0 0
#> # ℹ 7,189 more rows
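Because the log_evidence column accumulates the estimate iteration by iteration, the final row should match the headline figure reported by summary():
run_df <- summary(run2)$run
# the last cumulative value equals the reported Log. Evidence, -9.0751
tail(run_df$log_evidence, 1)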
The posterior package offers methods for inspecting the points generated during a run.
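First, convert the run into a draws object. This sketch assumes ernest registers a method for posterior's as_draws() generic; consult the package reference if the conversion call differs:
library(posterior)
# assumed conversion: each sampled point becomes one (unweighted) draw
unweighted_post <- as_draws(run2)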
You can view the importance weight of each point and re-weight the sample to estimate the posterior distribution:
weights(unweighted_post) |> head()
#> [1] 0 0 0 0 0 0
weighted_post <- unweighted_post |>
resample_draws()
posterior::summarise_draws(weighted_post)
#> # A tibble: 3 × 10
#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 x -0.135 -0.139 0.964 0.988 -1.69 1.41 1.15 1743. 14.4
#> 2 y -0.145 -0.158 0.982 0.994 -1.71 1.51 1.13 1685. 14.6
#> 3 z -0.143 -0.141 0.972 1.00 -1.70 1.43 1.18 1719. 14.1
Visualising Results
Ernest provides plotting utilities for evidence estimates and the posterior distribution.
Uncertainty and Resampling
You can simulate the uncertainty of an NS run by generating random draws of the log-volume estimate at each iteration (Higson et al. 2019). You can then visualise this uncertainty with the estimate's plot() method.
calc_sim <- calculate(run2, ndraws = 500)
calc_sim
#> Nested sampling estimates <ernest_estimate>
#> No. of Simulated Draws: 500
#> Log. Volume: -20 ± 1.3
#> Log. Evidence: -9.1 ± 0.12
#> # A tibble: 7,199 × 4
#> log_lik log_volume log_weight log_evidence
#> <rvar[1d]> <rvar[1d]> <rvar[1d]> <rvar[1d]>
#> 1 -2309 ± NA -0.0020 ± 0.0020 -2315 ± 0.88 -2315 ± 0.88
#> 2 -2150 ± NA -0.0039 ± 0.0029 -2156 ± 0.80 -2156 ± 0.80
#> 3 -2134 ± NA -0.0059 ± 0.0035 -2140 ± 0.81 -2140 ± 0.81
#> 4 -2119 ± NA -0.0077 ± 0.0042 -2125 ± 0.80 -2125 ± 0.80
#> 5 -2077 ± NA -0.0098 ± 0.0046 -2083 ± 0.80 -2083 ± 0.80
#> 6 -2072 ± NA -0.0119 ± 0.0050 -2078 ± 0.82 -2078 ± 0.80
#> 7 -2025 ± NA -0.0138 ± 0.0054 -2031 ± 0.84 -2031 ± 0.84
#> 8 -2002 ± NA -0.0160 ± 0.0059 -2008 ± 0.84 -2008 ± 0.84
#> 9 -1927 ± NA -0.0181 ± 0.0061 -1934 ± 0.76 -1934 ± 0.76
#> 10 -1907 ± NA -0.0201 ± 0.0065 -1913 ± 0.76 -1913 ± 0.76
#> # ℹ 7,189 more rows
plot(calc_sim)
For more details on nested sampling, please refer to ernest’s documentation.