MCMC with VPLanet and `alabi`

This tutorial demonstrates how to run Bayesian parameter inference on a VPLanet forward model using a custom likelihood function and the alabi surrogate-model framework.

Evaluating a VPLanet model can take seconds to minutes per call. Running a standard MCMC sampler (e.g. emcee or dynesty) directly on the forward model requires millions of evaluations, which is computationally prohibitive. alabi addresses this by:

Building a Gaussian Process surrogate of the log-posterior surface using a small number of active-learning iterations.
Running MCMC on the fast surrogate instead of the expensive forward model.

We also show how to run MCMC directly using dynesty or emcee — useful for fast models or as a baseline comparison.

Problem: inferring TRAPPIST-1 stellar properties

We fit four stellar parameters of TRAPPIST-1 using observed luminosity and XUV luminosity:

Parameter	Symbol	Prior range
Stellar mass	\(m_\star\)	\([0.07, 0.11]\, M_\odot\)
XUV saturation fraction	\(f_{\rm sat}\)	\([-5, -1]\, {\rm dex}\)
XUV saturation timescale	\(t_{\rm sat}\)	\([0.1, 12]\, {\rm Gyr}\)
XUV power-law exponent	\(\beta_{\rm XUV}\)	\([-2, 0]\)

The age is constrained by a Gaussian prior \(t_{\rm age} = 7.6 \pm 2.2\, {\rm Gyr}\).

Dependencies: vplanet_inference, numpy, astropy, dynesty, emcee, matplotlib

Optional (for surrogate MCMC): alabi (see installation note below)

References:

Birky et al. (2021) — alabi method paper
Speagle (2020) — dynesty nested sampling
Foreman-Mackey et al. (2013) — emcee affine-invariant MCMC
VPLanet stellar module

[ ]:

import os
import numpy as np
import matplotlib.pyplot as plt
import astropy.units as u
from functools import partial
import multiprocessing as mp
import warnings
warnings.filterwarnings("ignore")

import vplanet_inference as vpi

1. Configure the VPLanet Forward Model

We use the bundled stellar infiles — a single low-mass star with the stellar module activated. The model outputs bolometric luminosity and XUV luminosity at the end of the simulation.

[ ]:

INFILE_PATH = os.path.join(vpi.INFILE_DIR, "stellar")
print(f"Infile path: {INFILE_PATH}")
print("Files:", os.listdir(INFILE_PATH))

[ ]:

# Input parameters: what to vary and in what units
# theta = [mass, log(fsat), tsat, age, beta]
inparams = {
    "star.dMass":       u.Msun,
    "star.dSatXUVFrac": u.dex(u.dimensionless_unscaled),  # log scale
    "star.dSatXUVTime": u.Gyr,
    "vpl.dStopTime":    u.Gyr,
    "star.dXUVBeta":    -u.dimensionless_unscaled,         # stored negative in vpl
}

# Output parameters: what to collect from the log file
outparams = {
    "final.star.Luminosity":  u.Lsun,
    "final.star.LXUVStellar": u.Lsun,
}

vpm = vpi.VplanetModel(
    inparams=inparams,
    outparams=outparams,
    inpath=INFILE_PATH,
    outpath="output/mcmc",
    verbose=False,
)

print("Model inputs :", vpm.inparams)
print("Model outputs:", vpm.outparams)

Sanity check: single run at known TRAPPIST-1 values

[ ]:

# Approximate TRAPPIST-1 values (Burgasser & Mamajek 2017; Wheatley et al. 2017)
theta_true = np.array([
    0.089,   # mass [Msun]
    -2.92,   # log(fsat) [dex]
    1.0,     # tsat [Gyr]
    7.6,     # age [Gyr]
    -1.18,   # beta (stored negative)
])

result_true = vpm.run_model(theta_true)

print("Model output at TRAPPIST-1 parameters:")
for name, val, unit in zip(vpm.outparams, result_true, vpm.out_units):
    print(f"  {name:<35s} = {val:.4e} [{unit}]")

2. Observational Data and Priors

We constrain the inference with two observational measurements:

Observable	Symbol	Value	Reference
Bolometric luminosity	\(L_{\rm bol}\)	\((5.22 \pm 0.19) \times 10^{-4}\, L_\odot\)	Gillon et al. (2017)
XUV-to-bol. ratio	\(L_{\rm XUV}/L_{\rm bol}\)	\((7.5 \pm 1.5) \times 10^{-4}\)	Wheatley et al. (2017)

Five parameters have the following priors:

Parameter	Prior	Values
\(m_\star\)	Uniform	\([0.07, 0.11]\, M_\odot\)
\(\log f_{\rm sat}\)	Gaussian	\(-2.92 \pm 0.26\)
\(t_{\rm sat}\)	Uniform	\([0.1, 12]\, {\rm Gyr}\)
Age	Gaussian	\(7.6 \pm 2.2\, {\rm Gyr}\)
\(\beta_{\rm XUV}\)	Gaussian	\(-1.18 \pm 0.31\)

[ ]:

# Observational data: shape (n_obs, 2) = [[mean, std], ...]
# We observe Lbol and Lxuv/Lbol (not Lxuv directly)
like_data = np.array([
    [5.22e-4, 0.19e-4],   # Lbol [Lsun]
    [7.5e-4,  1.5e-4],    # Lxuv/Lbol [dimensionless]
])

# Prior configuration: (mean, sigma) for Gaussian; None for Uniform
# Order matches inparams: [mass, log_fsat, tsat, age, beta]
prior_data = [
    (None,   None),   # mass — uniform
    (-2.92,  0.26),   # log(fsat) — Gaussian
    (None,   None),   # tsat — uniform
    (7.6,    2.2),    # age — Gaussian
    (-1.18,  0.31),   # beta — Gaussian
]

# Hard bounds (uniform prior limits for all parameters)
bounds = [
    (0.07, 0.11),     # mass [Msun]
    (-5.0, -1.0),     # log(fsat)
    (0.1,  12.0),     # tsat [Gyr]
    (0.1,  12.0),     # age [Gyr]
    (-2.0,  0.0),     # beta
]

labels = [
    r"$m_\star\,[M_\odot]$",
    r"$\log f_{\rm sat}$",
    r"$t_{\rm sat}\,[{\rm Gyr}]$",
    r"Age [Gyr]",
    r"$\beta_{\rm XUV}$",
]

3. Custom Likelihood and Prior Functions

The likelihood compares model predictions to observations assuming Gaussian errors:

\[\ln \mathcal{L}(\theta) = -\frac{1}{2} \sum_k \left(\frac{y_k^{\rm model}(\theta) - y_k^{\rm obs}}{\sigma_k^{\rm obs}}\right)^2\]

Note that the model returns \(L_{\rm bol}\) and \(L_{\rm XUV}\) as separate quantities, but the observation constrains \(L_{\rm XUV}/L_{\rm bol}\). This transformation is applied inside lnlike.

3.1 Custom Log-Likelihood

[ ]:

def lnlike(theta):
    """
    Log-likelihood comparing VPLanet outputs to observations.

    theta : array-like, shape (5,)
        [mass (Msun), log_fsat, tsat (Gyr), age (Gyr), beta]

    Returns
    -------
    float : log-likelihood value (−inf if model fails)
    """
    try:
        out = vpm.run_model(np.array(theta))
    except Exception:
        return -np.inf

    lbol = out[0]   # final.star.Luminosity  [Lsun]
    lxuv = out[1]   # final.star.LXUVStellar [Lsun]

    # Guard against non-physical outputs
    if lbol <= 0 or lxuv <= 0 or not np.isfinite(lbol) or not np.isfinite(lxuv):
        return -np.inf

    # Model observables: [Lbol, Lxuv/Lbol]
    model = np.array([lbol, lxuv / lbol])

    # Gaussian log-likelihood
    lnl = -0.5 * np.sum(((model - like_data[:, 0]) / like_data[:, 1]) ** 2)

    return lnl


# Test: likelihood at true parameters
lnl_true = lnlike(theta_true)
print(f"ln L at true parameters: {lnl_true:.4f}")

3.2 Log-Prior

We combine Gaussian and uniform priors. Parameters with (None, None) in prior_data get a flat (uniform) prior within bounds.

[ ]:

def lnprior(theta):
    """
    Log-prior: uniform bounds + Gaussian constraints where specified.

    Returns −inf if any parameter is outside its hard bounds.
    """
    lnp = 0.0
    for i, (val, (lo, hi), gauss) in enumerate(zip(theta, bounds, prior_data)):
        # Hard bounds (uniform component)
        if not (lo < val < hi):
            return -np.inf
        # Gaussian component
        mu, sigma = gauss
        if mu is not None:
            lnp += -0.5 * ((val - mu) / sigma) ** 2
    return lnp


# Test: prior at true parameters
lnp_true = lnprior(theta_true)
print(f"ln prior at true parameters: {lnp_true:.4f}")

3.3 Log-Posterior

[ ]:

def lnpost(theta):
    """Log-posterior = log-likelihood + log-prior."""
    lnp = lnprior(theta)
    if not np.isfinite(lnp):
        return -np.inf
    return lnlike(theta) + lnp


print(f"ln posterior at true parameters: {lnpost(theta_true):.4f}")

4. Option A — Direct MCMC with `dynesty`

For a quick test or for fast models, you can run dynesty nested sampling directly on the VPLanet forward model (no surrogate needed).

dynesty requires a prior transform \(\pi^{-1}: [0,1]^d \to \theta\) that maps the unit hypercube to the prior distribution.

Note: Direct nested sampling on VPLanet is only feasible for simple models with short simulation times (< 1 s per call). For realistic stellar/tidal evolution simulations (seconds to minutes per call), use the alabi surrogate approach in Section 5.

[ ]:

import dynesty
from dynesty import plotting as dyplot
from scipy.special import ndtri  # inverse normal CDF


def prior_transform(u_vec):
    """
    Map unit-hypercube samples u ∈ [0,1]^d to the parameter prior.

    For uniform priors: theta = lo + u * (hi - lo)
    For Gaussian priors (truncated to [lo, hi]): theta = mu + sigma * Phi^{-1}(u)
    """
    theta = np.zeros(len(bounds))
    for i, (u_i, (lo, hi), gauss) in enumerate(zip(u_vec, bounds, prior_data)):
        mu, sigma = gauss
        if mu is None:
            # Uniform prior
            theta[i] = lo + u_i * (hi - lo)
        else:
            # Gaussian prior truncated to [lo, hi]
            # Map [0,1] -> CDF of truncated normal
            from scipy.stats import norm
            cdf_lo = norm.cdf(lo, loc=mu, scale=sigma)
            cdf_hi = norm.cdf(hi, loc=mu, scale=sigma)
            theta[i] = norm.ppf(cdf_lo + u_i * (cdf_hi - cdf_lo), loc=mu, scale=sigma)
    return theta


# Quick test of prior transform
u_test = np.full(len(bounds), 0.5)
print("Prior transform at u=0.5 (median of prior):")
for name, val in zip(labels, prior_transform(u_test)):
    print(f"  {name}: {val:.4f}")

[ ]:

# Likelihood function for dynesty (takes only theta, returns ln-likelihood)
def lnlike_dynesty(theta):
    lnp = lnprior(theta)
    if not np.isfinite(lnp):
        return -1e300
    lnl = lnlike(theta)
    if not np.isfinite(lnl):
        return -1e300
    return lnl

[ ]:

# Run static nested sampling
# nlive=50 is very small (use ≥200 for production); this is just for illustration
sampler = dynesty.NestedSampler(
    loglikelihood=lnlike_dynesty,
    prior_transform=prior_transform,
    ndim=len(bounds),
    nlive=50,
)

sampler.run_nested(print_progress=True)
results_dynesty = sampler.results
print(f"\nTotal likelihood calls: {results_dynesty.ncall}")
print(f"log Evidence: {results_dynesty.logz[-1]:.2f} ± {results_dynesty.logzerr[-1]:.2f}")

[ ]:

# Corner plot of posterior samples
fig, axes = dyplot.cornerplot(
    results_dynesty,
    labels=labels,
    show_titles=True,
    title_kwargs={"fontsize": 10},
    label_kwargs={"fontsize": 10},
    truths=theta_true,
    fig=plt.subplots(len(bounds), len(bounds), figsize=(10, 10)),
)
plt.suptitle("dynesty posterior — TRAPPIST-1 stellar parameters", y=1.01, fontsize=12)
plt.tight_layout()
plt.show()

[ ]:

# Extract posterior samples and compute parameter summaries
from dynesty import utils as dyfunc

samples_eq, weights = results_dynesty.samples, np.exp(results_dynesty.logwt - results_dynesty.logz[-1])
samples_dynesty = dyfunc.resample_equal(results_dynesty.samples, weights)

print("Posterior summaries (median ± 1-sigma):")
for i, label in enumerate(labels):
    lo, med, hi = np.percentile(samples_dynesty[:, i], [16, 50, 84])
    print(f"  {label}: {med:.4f} +{hi-med:.4f} -{med-lo:.4f}")

5. Option B — Direct MCMC with `emcee`

emcee uses an ensemble of walkers and is simpler to set up than nested sampling. It requires the log-posterior directly (no prior transform needed).

Tip: Initialize walkers in a small ball around the known maximum-likelihood estimate for faster convergence.

[ ]:

import emcee

N_WALKERS = 20
N_STEPS   = 500   # increase to ≥2000 for production
N_DIM     = len(bounds)

# Initialize walkers in a small Gaussian ball around the nominal parameters
rng = np.random.default_rng(42)
theta_init = theta_true + 1e-3 * rng.standard_normal((N_WALKERS, N_DIM))

# Clip to within bounds
for i, (lo, hi) in enumerate(bounds):
    theta_init[:, i] = np.clip(theta_init[:, i], lo + 1e-6, hi - 1e-6)

print(f"Running emcee: {N_WALKERS} walkers × {N_STEPS} steps = {N_WALKERS * N_STEPS} total calls")

[ ]:

# Parallelise with multiprocessing
with mp.Pool(mp.cpu_count()) as pool:
    sampler_emcee = emcee.EnsembleSampler(
        N_WALKERS,
        N_DIM,
        lnpost,
        pool=pool,
    )
    sampler_emcee.run_mcmc(theta_init, N_STEPS, progress=True)

[ ]:

# Discard burn-in and thin the chain
try:
    tau = sampler_emcee.get_autocorr_time(quiet=True)
    burn = int(2 * np.max(tau))
    thin = max(1, int(0.5 * np.min(tau)))
except Exception:
    burn = N_STEPS // 4
    thin = 1
    print("Note: autocorrelation estimation failed (chain too short). Using fallback burn-in.")

flat_samples = sampler_emcee.get_chain(discard=burn, thin=thin, flat=True)
print(f"Burn-in: {burn} steps, thin: {thin}")
print(f"Posterior samples: {flat_samples.shape[0]}")

[ ]:

# Trace plot: walker trajectories over time
chain = sampler_emcee.get_chain()

fig, axes = plt.subplots(N_DIM, 1, figsize=(10, 8), sharex=True)
for i, (ax, label) in enumerate(zip(axes, labels)):
    for walker in chain[:, :, i].T:
        ax.plot(walker, alpha=0.3, lw=0.5, color="steelblue")
    ax.axhline(theta_true[i], color="red", lw=1.5, ls="--", label="true" if i == 0 else None)
    ax.axvline(burn, color="k", lw=1, ls=":", label="burn-in" if i == 0 else None)
    ax.set_ylabel(label, fontsize=9)
axes[0].legend(fontsize=8)
axes[-1].set_xlabel("Step", fontsize=10)
fig.suptitle("emcee walker traces — TRAPPIST-1", fontsize=11)
plt.tight_layout()
plt.show()

[ ]:

# Corner plot using matplotlib
try:
    import corner
    fig = corner.corner(
        flat_samples,
        labels=labels,
        truths=theta_true,
        quantiles=[0.16, 0.5, 0.84],
        show_titles=True,
        title_kwargs={"fontsize": 9},
    )
    plt.suptitle("emcee posterior — TRAPPIST-1 stellar parameters", y=1.01, fontsize=11)
    plt.show()
except ImportError:
    print("Install 'corner' for a corner plot: pip install corner")

print("\nPosterior summaries (median ± 1-sigma):")
for i, label in enumerate(labels):
    lo, med, hi = np.percentile(flat_samples[:, i], [16, 50, 84])
    print(f"  {label}: {med:.4f} +{hi-med:.4f} -{med-lo:.4f}")

6. Option C — Surrogate MCMC with `alabi`

alabi (Active Learning Approach for Bayesian Inference) accelerates inference by replacing the expensive forward model with a Gaussian Process surrogate trained on a small number of strategically chosen model evaluations.

Installation

git clone https://github.com/jbirky/alabi
cd alabi
pip install -e .

Workflow

Initial samples  →  Fit GP  →  Active learning (BAPE)  →  Dynesty/emcee on surrogate

The BAPE (Bayesian Active Posterior Estimation) algorithm selects the next point to evaluate by maximising the expected information gain about the posterior, concentrating model evaluations in regions of high posterior probability.

[ ]:

try:
    import alabi
    from alabi import SurrogateModel
    from alabi import utility as ut
    ALABI_AVAILABLE = True
    print(f"alabi version: {alabi.__version__}")
except ImportError:
    ALABI_AVAILABLE = False
    print(
        "alabi is not installed. To install:\n"
        "  git clone https://github.com/jbirky/alabi\n"
        "  cd alabi && pip install -e .\n"
        "\nThe cells below show the alabi workflow but will be skipped."
    )

6.1 Setting Up the Prior Sampler and Transform

alabi needs two functions:

``prior_sampler``: draws random samples from the prior (for seeding GP training data)
``prior_transform``: maps the unit hypercube to the prior (for dynesty)

[ ]:

if ALABI_AVAILABLE:
    # alabi uses its own prior utilities
    # prior_data and bounds must match the parameter ordering of theta

    # Sampler for drawing training/test points from the prior
    prior_sampler_alabi = partial(
        ut.prior_sampler_normal,
        prior_data=prior_data,
        bounds=bounds,
    )

    # Prior transform for dynesty
    prior_transform_alabi = partial(
        ut.prior_transform_normal,
        bounds=bounds,
        data=prior_data,
    )

    # Log-prior for emcee
    lnprior_alabi = partial(
        ut.lnprior_normal,
        bounds=bounds,
        data=prior_data,
    )

    # Test
    sample = prior_sampler_alabi(n=3)
    print("Three prior samples:")
    print(sample)

6.2 Initialising the Surrogate Model

[ ]:

if ALABI_AVAILABLE:
    SAVEDIR = "output/alabi_trappist"

    sm = SurrogateModel(
        fn=lnpost,                     # function to emulate (log-posterior)
        bounds=bounds,                 # hard parameter bounds
        prior_sampler=prior_sampler_alabi,
        savedir=SAVEDIR,
        labels=labels,
        scale="nlog",                  # nlog = negative log (for a log-posterior)
    )

    # Draw initial training and test sets from the prior
    sm.init_samples(ntrain=100, ntest=100, reload=False)

    # Fit an initial GP to the training data
    sm.init_gp(
        kernel="ExpSquaredKernel",
        fit_amp=False,
        fit_mean=True,
        white_noise=-15,
    )

    print("Surrogate model initialized.")
    print(f"Training set size: {len(sm.training_x)}")

6.3 Active Learning

The surrogate is improved iteratively using the BAPE algorithm. Each iteration selects the point with maximum expected information gain, evaluates the true forward model there, and updates the GP.

For a production run, use niter=500; here we use a small number for demonstration.

[ ]:

if ALABI_AVAILABLE:
    sm.active_train(
        niter=50,           # number of active-learning iterations (use 500+ in production)
        algorithm="bape",  # Bayesian Active Posterior Estimation
        gp_opt_freq=10,    # re-optimise GP hyperparameters every 10 iterations
        save_progress=True,
    )

    # Plot GP diagnostics (training set, test predictions, residuals)
    sm.plot(plots=["gp_all"])

6.4 MCMC on the Surrogate

Once the surrogate is trained, we run dynesty on the GP emulator instead of the expensive forward model. This is orders of magnitude faster.

[ ]:

if ALABI_AVAILABLE:
    # Static nested sampling on the GP surrogate
    sm.run_dynesty(
        ptform=prior_transform_alabi,
        mode="static",
        multi_proc=True,
        save_iter=100,
    )

    # Plot corner plot and run statistics
    sm.plot(plots=["dynesty_all"])

[ ]:

if ALABI_AVAILABLE:
    # You can also run emcee on the surrogate for a different sampling approach
    sm.run_emcee(
        lnprior=lnprior_alabi,
        nwalkers=50,
        nsteps=10000,    # surrogate calls are fast, so we can afford many steps
        opt_init=False,
    )
    sm.plot(plots=["emcee_corner"])

6.5 Reloading a Saved Model

alabi automatically saves the training set and GP model to savedir at each iteration. You can reload a completed run without re-training:

[ ]:

if ALABI_AVAILABLE and os.path.exists(SAVEDIR):
    sm_reload = alabi.cache_utils.load_model_cache(SAVEDIR)
    print(f"Reloaded model from {SAVEDIR}")
    print(f"Training set size: {len(sm_reload.training_x)}")

Summary

Task	Code
Define custom likelihood	`def lnlike(theta): out = vpm.run_model(theta); return -0.5 * chi2`
Combine with prior	`def lnpost(theta): return lnlike(theta) + lnprior(theta)`
Run `dynesty` (direct)	`NestedSampler(lnlike, prior_transform, ndim).run_nested()`
Run `emcee` (direct)	`EnsembleSampler(nwalk, ndim, lnpost, pool=pool).run_mcmc(init, nsteps)`
Run `alabi` surrogate	`SurrogateModel(fn=lnpost, ...).init_samples().init_gp().active_train()`
MCMC on surrogate	`sm.run_dynesty(ptform=...)` or `sm.run_emcee(lnprior=...)`
Reload saved run	`alabi.cache_utils.load_model_cache(savedir)`

Key takeaways:

Write your own lnlike to handle any transformation of model outputs to observables
Always guard against failed model runs (try/except, finite checks)
For slow models, alabi active learning dramatically reduces the number of VPLanet calls needed
alabi is model-agnostic — the same SurrogateModel interface works with any Python callable

MCMC with VPLanet and alabi