{ "cells": [ { "cell_type": "markdown", "id": "b1000001", "metadata": {}, "source": [ "# MCMC with VPLanet and `alabi`\n", "\n", "This tutorial demonstrates how to run **Bayesian parameter inference** on a VPLanet forward model\n", "using a **custom likelihood function** and the `alabi` surrogate-model framework.\n", "\n", "Evaluating a VPLanet model can take seconds to minutes per call.\n", "Running a standard MCMC sampler (e.g. `emcee` or `dynesty`) directly on the forward model\n", "requires millions of evaluations, which is computationally prohibitive.\n", "`alabi` addresses this by:\n", "\n", "1. Building a **Gaussian Process surrogate** of the log-posterior surface using a small number of\n", " active-learning iterations.\n", "2. Running MCMC on the **fast surrogate** instead of the expensive forward model.\n", "\n", "We also show how to run MCMC **directly** using `dynesty` or `emcee` — useful for fast models or\n", "as a baseline comparison.\n", "\n", "### Problem: inferring TRAPPIST-1 stellar properties\n", "\n", "We fit four stellar parameters of TRAPPIST-1 using observed luminosity and XUV luminosity:\n", "\n", "| Parameter | Symbol | Prior range |\n", "|-----------|--------|-------------|\n", "| Stellar mass | $m_\\star$ | $[0.07, 0.11]\\, M_\\odot$ |\n", "| XUV saturation fraction | $f_{\\rm sat}$ | $[-5, -1]\\, {\\rm dex}$ |\n", "| XUV saturation timescale | $t_{\\rm sat}$ | $[0.1, 12]\\, {\\rm Gyr}$ |\n", "| XUV power-law exponent | $\\beta_{\\rm XUV}$ | $[-2, 0]$ |\n", "\n", "The age is constrained by a Gaussian prior $t_{\\rm age} = 7.6 \\pm 2.2\\, {\\rm Gyr}$.\n", "\n", "---\n", "\n", "**Dependencies:** `vplanet_inference`, `numpy`, `astropy`, `dynesty`, `emcee`, `matplotlib`\n", "\n", "**Optional (for surrogate MCMC):** `alabi` (see installation note below)\n", "\n", "**References:**\n", "- [Birky et al. (2021)](https://doi.org/10.3847/1538-4357/abded5) — `alabi` method paper\n", "- [Speagle (2020)](https://doi.org/10.1093/mnras/staa278) — `dynesty` nested sampling\n", "- [Foreman-Mackey et al. (2013)](https://doi.org/10.1086/670067) — `emcee` affine-invariant MCMC\n", "- [VPLanet stellar module](https://virtualplanetarylaboratory.github.io/vplanet/stellar.html)" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000002", "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import astropy.units as u\n", "from functools import partial\n", "import multiprocessing as mp\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")\n", "\n", "import vplanet_inference as vpi" ] }, { "cell_type": "markdown", "id": "b1000003", "metadata": {}, "source": [ "## 1. Configure the VPLanet Forward Model\n", "\n", "We use the bundled `stellar` infiles — a single low-mass star with the `stellar` module activated.\n", "The model outputs bolometric luminosity and XUV luminosity at the end of the simulation." ] }, { "cell_type": "code", "execution_count": null, "id": "b1000004", "metadata": {}, "outputs": [], "source": [ "INFILE_PATH = os.path.join(vpi.INFILE_DIR, \"stellar\")\n", "print(f\"Infile path: {INFILE_PATH}\")\n", "print(\"Files:\", os.listdir(INFILE_PATH))" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000005", "metadata": {}, "outputs": [], "source": [ "# Input parameters: what to vary and in what units\n", "# theta = [mass, log(fsat), tsat, age, beta]\n", "inparams = {\n", " \"star.dMass\": u.Msun,\n", " \"star.dSatXUVFrac\": u.dex(u.dimensionless_unscaled), # log scale\n", " \"star.dSatXUVTime\": u.Gyr,\n", " \"vpl.dStopTime\": u.Gyr,\n", " \"star.dXUVBeta\": -u.dimensionless_unscaled, # stored negative in vpl\n", "}\n", "\n", "# Output parameters: what to collect from the log file\n", "outparams = {\n", " \"final.star.Luminosity\": u.Lsun,\n", " \"final.star.LXUVStellar\": u.Lsun,\n", "}\n", "\n", "vpm = vpi.VplanetModel(\n", " inparams=inparams,\n", " outparams=outparams,\n", " inpath=INFILE_PATH,\n", " outpath=\"output/mcmc\",\n", " verbose=False,\n", ")\n", "\n", "print(\"Model inputs :\", vpm.inparams)\n", "print(\"Model outputs:\", vpm.outparams)" ] }, { "cell_type": "markdown", "id": "b1000006", "metadata": {}, "source": [ "### Sanity check: single run at known TRAPPIST-1 values" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000007", "metadata": {}, "outputs": [], "source": [ "# Approximate TRAPPIST-1 values (Burgasser & Mamajek 2017; Wheatley et al. 2017)\n", "theta_true = np.array([\n", " 0.089, # mass [Msun]\n", " -2.92, # log(fsat) [dex]\n", " 1.0, # tsat [Gyr]\n", " 7.6, # age [Gyr]\n", " -1.18, # beta (stored negative)\n", "])\n", "\n", "result_true = vpm.run_model(theta_true)\n", "\n", "print(\"Model output at TRAPPIST-1 parameters:\")\n", "for name, val, unit in zip(vpm.outparams, result_true, vpm.out_units):\n", " print(f\" {name:<35s} = {val:.4e} [{unit}]\")" ] }, { "cell_type": "markdown", "id": "b1000008", "metadata": {}, "source": [ "## 2. Observational Data and Priors\n", "\n", "We constrain the inference with two observational measurements:\n", "\n", "| Observable | Symbol | Value | Reference |\n", "|-----------|--------|-------|-----------|\n", "| Bolometric luminosity | $L_{\\rm bol}$ | $(5.22 \\pm 0.19) \\times 10^{-4}\\, L_\\odot$ | Gillon et al. (2017) |\n", "| XUV-to-bol. ratio | $L_{\\rm XUV}/L_{\\rm bol}$ | $(7.5 \\pm 1.5) \\times 10^{-4}$ | Wheatley et al. (2017) |\n", "\n", "Five parameters have the following priors:\n", "\n", "| Parameter | Prior | Values |\n", "|-----------|-------|--------|\n", "| $m_\\star$ | Uniform | $[0.07, 0.11]\\, M_\\odot$ |\n", "| $\\log f_{\\rm sat}$ | Gaussian | $-2.92 \\pm 0.26$ |\n", "| $t_{\\rm sat}$ | Uniform | $[0.1, 12]\\, {\\rm Gyr}$ |\n", "| Age | Gaussian | $7.6 \\pm 2.2\\, {\\rm Gyr}$ |\n", "| $\\beta_{\\rm XUV}$ | Gaussian | $-1.18 \\pm 0.31$ |" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000009", "metadata": {}, "outputs": [], "source": [ "# Observational data: shape (n_obs, 2) = [[mean, std], ...]\n", "# We observe Lbol and Lxuv/Lbol (not Lxuv directly)\n", "like_data = np.array([\n", " [5.22e-4, 0.19e-4], # Lbol [Lsun]\n", " [7.5e-4, 1.5e-4], # Lxuv/Lbol [dimensionless]\n", "])\n", "\n", "# Prior configuration: (mean, sigma) for Gaussian; None for Uniform\n", "# Order matches inparams: [mass, log_fsat, tsat, age, beta]\n", "prior_data = [\n", " (None, None), # mass — uniform\n", " (-2.92, 0.26), # log(fsat) — Gaussian\n", " (None, None), # tsat — uniform\n", " (7.6, 2.2), # age — Gaussian\n", " (-1.18, 0.31), # beta — Gaussian\n", "]\n", "\n", "# Hard bounds (uniform prior limits for all parameters)\n", "bounds = [\n", " (0.07, 0.11), # mass [Msun]\n", " (-5.0, -1.0), # log(fsat)\n", " (0.1, 12.0), # tsat [Gyr]\n", " (0.1, 12.0), # age [Gyr]\n", " (-2.0, 0.0), # beta\n", "]\n", "\n", "labels = [\n", " r\"$m_\\star\\,[M_\\odot]$\",\n", " r\"$\\log f_{\\rm sat}$\",\n", " r\"$t_{\\rm sat}\\,[{\\rm Gyr}]$\",\n", " r\"Age [Gyr]\",\n", " r\"$\\beta_{\\rm XUV}$\",\n", "]" ] }, { "cell_type": "markdown", "id": "b1000010", "metadata": {}, "source": [ "## 3. Custom Likelihood and Prior Functions\n", "\n", "The likelihood compares model predictions to observations assuming Gaussian errors:\n", "\n", "$$\n", "\\ln \\mathcal{L}(\\theta) = -\\frac{1}{2}\n", "\\sum_k \\left(\\frac{y_k^{\\rm model}(\\theta) - y_k^{\\rm obs}}{\\sigma_k^{\\rm obs}}\\right)^2\n", "$$\n", "\n", "Note that the model returns $L_{\\rm bol}$ and $L_{\\rm XUV}$ as separate quantities,\n", "but the observation constrains $L_{\\rm XUV}/L_{\\rm bol}$.\n", "This transformation is applied inside `lnlike`.\n", "\n", "### 3.1 Custom Log-Likelihood" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000011", "metadata": {}, "outputs": [], "source": [ "def lnlike(theta):\n", " \"\"\"\n", " Log-likelihood comparing VPLanet outputs to observations.\n", "\n", " theta : array-like, shape (5,)\n", " [mass (Msun), log_fsat, tsat (Gyr), age (Gyr), beta]\n", "\n", " Returns\n", " -------\n", " float : log-likelihood value (−inf if model fails)\n", " \"\"\"\n", " try:\n", " out = vpm.run_model(np.array(theta))\n", " except Exception:\n", " return -np.inf\n", "\n", " lbol = out[0] # final.star.Luminosity [Lsun]\n", " lxuv = out[1] # final.star.LXUVStellar [Lsun]\n", "\n", " # Guard against non-physical outputs\n", " if lbol <= 0 or lxuv <= 0 or not np.isfinite(lbol) or not np.isfinite(lxuv):\n", " return -np.inf\n", "\n", " # Model observables: [Lbol, Lxuv/Lbol]\n", " model = np.array([lbol, lxuv / lbol])\n", "\n", " # Gaussian log-likelihood\n", " lnl = -0.5 * np.sum(((model - like_data[:, 0]) / like_data[:, 1]) ** 2)\n", "\n", " return lnl\n", "\n", "\n", "# Test: likelihood at true parameters\n", "lnl_true = lnlike(theta_true)\n", "print(f\"ln L at true parameters: {lnl_true:.4f}\")" ] }, { "cell_type": "markdown", "id": "b1000012", "metadata": {}, "source": [ "### 3.2 Log-Prior\n", "\n", "We combine Gaussian and uniform priors.\n", "Parameters with `(None, None)` in `prior_data` get a flat (uniform) prior within `bounds`." ] }, { "cell_type": "code", "execution_count": null, "id": "b1000013", "metadata": {}, "outputs": [], "source": [ "def lnprior(theta):\n", " \"\"\"\n", " Log-prior: uniform bounds + Gaussian constraints where specified.\n", "\n", " Returns −inf if any parameter is outside its hard bounds.\n", " \"\"\"\n", " lnp = 0.0\n", " for i, (val, (lo, hi), gauss) in enumerate(zip(theta, bounds, prior_data)):\n", " # Hard bounds (uniform component)\n", " if not (lo < val < hi):\n", " return -np.inf\n", " # Gaussian component\n", " mu, sigma = gauss\n", " if mu is not None:\n", " lnp += -0.5 * ((val - mu) / sigma) ** 2\n", " return lnp\n", "\n", "\n", "# Test: prior at true parameters\n", "lnp_true = lnprior(theta_true)\n", "print(f\"ln prior at true parameters: {lnp_true:.4f}\")" ] }, { "cell_type": "markdown", "id": "b1000014", "metadata": {}, "source": [ "### 3.3 Log-Posterior" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000015", "metadata": {}, "outputs": [], "source": [ "def lnpost(theta):\n", " \"\"\"Log-posterior = log-likelihood + log-prior.\"\"\"\n", " lnp = lnprior(theta)\n", " if not np.isfinite(lnp):\n", " return -np.inf\n", " return lnlike(theta) + lnp\n", "\n", "\n", "print(f\"ln posterior at true parameters: {lnpost(theta_true):.4f}\")" ] }, { "cell_type": "markdown", "id": "b1000016", "metadata": {}, "source": [ "## 4. Option A — Direct MCMC with `dynesty`\n", "\n", "For a quick test or for fast models, you can run `dynesty` nested sampling directly\n", "on the VPLanet forward model (no surrogate needed).\n", "\n", "`dynesty` requires a **prior transform** $\\pi^{-1}: [0,1]^d \\to \\theta$ that maps\n", "the unit hypercube to the prior distribution.\n", "\n", "> **Note:** Direct nested sampling on VPLanet is only feasible for simple models with short\n", "> simulation times (< 1 s per call). For realistic stellar/tidal evolution simulations\n", "> (seconds to minutes per call), use the `alabi` surrogate approach in Section 5." ] }, { "cell_type": "code", "execution_count": null, "id": "b1000017", "metadata": {}, "outputs": [], "source": [ "import dynesty\n", "from dynesty import plotting as dyplot\n", "from scipy.special import ndtri # inverse normal CDF\n", "\n", "\n", "def prior_transform(u_vec):\n", " \"\"\"\n", " Map unit-hypercube samples u ∈ [0,1]^d to the parameter prior.\n", "\n", " For uniform priors: theta = lo + u * (hi - lo)\n", " For Gaussian priors (truncated to [lo, hi]): theta = mu + sigma * Phi^{-1}(u)\n", " \"\"\"\n", " theta = np.zeros(len(bounds))\n", " for i, (u_i, (lo, hi), gauss) in enumerate(zip(u_vec, bounds, prior_data)):\n", " mu, sigma = gauss\n", " if mu is None:\n", " # Uniform prior\n", " theta[i] = lo + u_i * (hi - lo)\n", " else:\n", " # Gaussian prior truncated to [lo, hi]\n", " # Map [0,1] -> CDF of truncated normal\n", " from scipy.stats import norm\n", " cdf_lo = norm.cdf(lo, loc=mu, scale=sigma)\n", " cdf_hi = norm.cdf(hi, loc=mu, scale=sigma)\n", " theta[i] = norm.ppf(cdf_lo + u_i * (cdf_hi - cdf_lo), loc=mu, scale=sigma)\n", " return theta\n", "\n", "\n", "# Quick test of prior transform\n", "u_test = np.full(len(bounds), 0.5)\n", "print(\"Prior transform at u=0.5 (median of prior):\")\n", "for name, val in zip(labels, prior_transform(u_test)):\n", " print(f\" {name}: {val:.4f}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000018", "metadata": {}, "outputs": [], "source": [ "# Likelihood function for dynesty (takes only theta, returns ln-likelihood)\n", "def lnlike_dynesty(theta):\n", " lnp = lnprior(theta)\n", " if not np.isfinite(lnp):\n", " return -1e300\n", " lnl = lnlike(theta)\n", " if not np.isfinite(lnl):\n", " return -1e300\n", " return lnl" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000019", "metadata": {}, "outputs": [], "source": [ "# Run static nested sampling\n", "# nlive=50 is very small (use ≥200 for production); this is just for illustration\n", "sampler = dynesty.NestedSampler(\n", " loglikelihood=lnlike_dynesty,\n", " prior_transform=prior_transform,\n", " ndim=len(bounds),\n", " nlive=50,\n", ")\n", "\n", "sampler.run_nested(print_progress=True)\n", "results_dynesty = sampler.results\n", "print(f\"\\nTotal likelihood calls: {results_dynesty.ncall}\")\n", "print(f\"log Evidence: {results_dynesty.logz[-1]:.2f} ± {results_dynesty.logzerr[-1]:.2f}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000020", "metadata": {}, "outputs": [], "source": [ "# Corner plot of posterior samples\n", "fig, axes = dyplot.cornerplot(\n", " results_dynesty,\n", " labels=labels,\n", " show_titles=True,\n", " title_kwargs={\"fontsize\": 10},\n", " label_kwargs={\"fontsize\": 10},\n", " truths=theta_true,\n", " fig=plt.subplots(len(bounds), len(bounds), figsize=(10, 10)),\n", ")\n", "plt.suptitle(\"dynesty posterior — TRAPPIST-1 stellar parameters\", y=1.01, fontsize=12)\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000021", "metadata": {}, "outputs": [], "source": [ "# Extract posterior samples and compute parameter summaries\n", "from dynesty import utils as dyfunc\n", "\n", "samples_eq, weights = results_dynesty.samples, np.exp(results_dynesty.logwt - results_dynesty.logz[-1])\n", "samples_dynesty = dyfunc.resample_equal(results_dynesty.samples, weights)\n", "\n", "print(\"Posterior summaries (median ± 1-sigma):\")\n", "for i, label in enumerate(labels):\n", " lo, med, hi = np.percentile(samples_dynesty[:, i], [16, 50, 84])\n", " print(f\" {label}: {med:.4f} +{hi-med:.4f} -{med-lo:.4f}\")" ] }, { "cell_type": "markdown", "id": "b1000022", "metadata": {}, "source": [ "## 5. Option B — Direct MCMC with `emcee`\n", "\n", "`emcee` uses an ensemble of walkers and is simpler to set up than nested sampling.\n", "It requires the **log-posterior** directly (no prior transform needed).\n", "\n", "> **Tip:** Initialize walkers in a small ball around the known maximum-likelihood\n", "> estimate for faster convergence." ] }, { "cell_type": "code", "execution_count": null, "id": "b1000023", "metadata": {}, "outputs": [], "source": [ "import emcee\n", "\n", "N_WALKERS = 20\n", "N_STEPS = 500 # increase to ≥2000 for production\n", "N_DIM = len(bounds)\n", "\n", "# Initialize walkers in a small Gaussian ball around the nominal parameters\n", "rng = np.random.default_rng(42)\n", "theta_init = theta_true + 1e-3 * rng.standard_normal((N_WALKERS, N_DIM))\n", "\n", "# Clip to within bounds\n", "for i, (lo, hi) in enumerate(bounds):\n", " theta_init[:, i] = np.clip(theta_init[:, i], lo + 1e-6, hi - 1e-6)\n", "\n", "print(f\"Running emcee: {N_WALKERS} walkers × {N_STEPS} steps = {N_WALKERS * N_STEPS} total calls\")" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000024", "metadata": {}, "outputs": [], "source": [ "# Parallelise with multiprocessing\n", "with mp.Pool(mp.cpu_count()) as pool:\n", " sampler_emcee = emcee.EnsembleSampler(\n", " N_WALKERS,\n", " N_DIM,\n", " lnpost,\n", " pool=pool,\n", " )\n", " sampler_emcee.run_mcmc(theta_init, N_STEPS, progress=True)" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000025", "metadata": {}, "outputs": [], "source": [ "# Discard burn-in and thin the chain\n", "try:\n", " tau = sampler_emcee.get_autocorr_time(quiet=True)\n", " burn = int(2 * np.max(tau))\n", " thin = max(1, int(0.5 * np.min(tau)))\n", "except Exception:\n", " burn = N_STEPS // 4\n", " thin = 1\n", " print(\"Note: autocorrelation estimation failed (chain too short). Using fallback burn-in.\")\n", "\n", "flat_samples = sampler_emcee.get_chain(discard=burn, thin=thin, flat=True)\n", "print(f\"Burn-in: {burn} steps, thin: {thin}\")\n", "print(f\"Posterior samples: {flat_samples.shape[0]}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000026", "metadata": {}, "outputs": [], "source": [ "# Trace plot: walker trajectories over time\n", "chain = sampler_emcee.get_chain()\n", "\n", "fig, axes = plt.subplots(N_DIM, 1, figsize=(10, 8), sharex=True)\n", "for i, (ax, label) in enumerate(zip(axes, labels)):\n", " for walker in chain[:, :, i].T:\n", " ax.plot(walker, alpha=0.3, lw=0.5, color=\"steelblue\")\n", " ax.axhline(theta_true[i], color=\"red\", lw=1.5, ls=\"--\", label=\"true\" if i == 0 else None)\n", " ax.axvline(burn, color=\"k\", lw=1, ls=\":\", label=\"burn-in\" if i == 0 else None)\n", " ax.set_ylabel(label, fontsize=9)\n", "axes[0].legend(fontsize=8)\n", "axes[-1].set_xlabel(\"Step\", fontsize=10)\n", "fig.suptitle(\"emcee walker traces — TRAPPIST-1\", fontsize=11)\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000027", "metadata": {}, "outputs": [], "source": [ "# Corner plot using matplotlib\n", "try:\n", " import corner\n", " fig = corner.corner(\n", " flat_samples,\n", " labels=labels,\n", " truths=theta_true,\n", " quantiles=[0.16, 0.5, 0.84],\n", " show_titles=True,\n", " title_kwargs={\"fontsize\": 9},\n", " )\n", " plt.suptitle(\"emcee posterior — TRAPPIST-1 stellar parameters\", y=1.01, fontsize=11)\n", " plt.show()\n", "except ImportError:\n", " print(\"Install 'corner' for a corner plot: pip install corner\")\n", "\n", "print(\"\\nPosterior summaries (median ± 1-sigma):\")\n", "for i, label in enumerate(labels):\n", " lo, med, hi = np.percentile(flat_samples[:, i], [16, 50, 84])\n", " print(f\" {label}: {med:.4f} +{hi-med:.4f} -{med-lo:.4f}\")" ] }, { "cell_type": "markdown", "id": "b1000028", "metadata": {}, "source": [ "## 6. Option C — Surrogate MCMC with `alabi`\n", "\n", "`alabi` (**A**ctive **L**earning **A**pproach for **B**ayesian **I**nference) accelerates\n", "inference by replacing the expensive forward model with a Gaussian Process surrogate\n", "trained on a small number of strategically chosen model evaluations.\n", "\n", "### Installation\n", "\n", "```bash\n", "git clone https://github.com/jbirky/alabi\n", "cd alabi\n", "pip install -e .\n", "```\n", "\n", "### Workflow\n", "\n", "```\n", "Initial samples → Fit GP → Active learning (BAPE) → Dynesty/emcee on surrogate\n", "```\n", "\n", "The **BAPE** (Bayesian Active Posterior Estimation) algorithm selects the next point\n", "to evaluate by maximising the expected information gain about the posterior,\n", "concentrating model evaluations in regions of high posterior probability." ] }, { "cell_type": "code", "execution_count": null, "id": "b1000029", "metadata": {}, "outputs": [], "source": [ "try:\n", " import alabi\n", " from alabi import SurrogateModel\n", " from alabi import utility as ut\n", " ALABI_AVAILABLE = True\n", " print(f\"alabi version: {alabi.__version__}\")\n", "except ImportError:\n", " ALABI_AVAILABLE = False\n", " print(\n", " \"alabi is not installed. To install:\\n\"\n", " \" git clone https://github.com/jbirky/alabi\\n\"\n", " \" cd alabi && pip install -e .\\n\"\n", " \"\\nThe cells below show the alabi workflow but will be skipped.\"\n", " )" ] }, { "cell_type": "markdown", "id": "b1000030", "metadata": {}, "source": [ "### 6.1 Setting Up the Prior Sampler and Transform\n", "\n", "`alabi` needs two functions:\n", "- **`prior_sampler`**: draws random samples from the prior (for seeding GP training data)\n", "- **`prior_transform`**: maps the unit hypercube to the prior (for dynesty)" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000031", "metadata": {}, "outputs": [], "source": [ "if ALABI_AVAILABLE:\n", " # alabi uses its own prior utilities\n", " # prior_data and bounds must match the parameter ordering of theta\n", "\n", " # Sampler for drawing training/test points from the prior\n", " prior_sampler_alabi = partial(\n", " ut.prior_sampler_normal,\n", " prior_data=prior_data,\n", " bounds=bounds,\n", " )\n", "\n", " # Prior transform for dynesty\n", " prior_transform_alabi = partial(\n", " ut.prior_transform_normal,\n", " bounds=bounds,\n", " data=prior_data,\n", " )\n", "\n", " # Log-prior for emcee\n", " lnprior_alabi = partial(\n", " ut.lnprior_normal,\n", " bounds=bounds,\n", " data=prior_data,\n", " )\n", "\n", " # Test\n", " sample = prior_sampler_alabi(n=3)\n", " print(\"Three prior samples:\")\n", " print(sample)" ] }, { "cell_type": "markdown", "id": "b1000032", "metadata": {}, "source": [ "### 6.2 Initialising the Surrogate Model" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000033", "metadata": {}, "outputs": [], "source": [ "if ALABI_AVAILABLE:\n", " SAVEDIR = \"output/alabi_trappist\"\n", "\n", " sm = SurrogateModel(\n", " fn=lnpost, # function to emulate (log-posterior)\n", " bounds=bounds, # hard parameter bounds\n", " prior_sampler=prior_sampler_alabi,\n", " savedir=SAVEDIR,\n", " labels=labels,\n", " scale=\"nlog\", # nlog = negative log (for a log-posterior)\n", " )\n", "\n", " # Draw initial training and test sets from the prior\n", " sm.init_samples(ntrain=100, ntest=100, reload=False)\n", "\n", " # Fit an initial GP to the training data\n", " sm.init_gp(\n", " kernel=\"ExpSquaredKernel\",\n", " fit_amp=False,\n", " fit_mean=True,\n", " white_noise=-15,\n", " )\n", "\n", " print(\"Surrogate model initialized.\")\n", " print(f\"Training set size: {len(sm.training_x)}\")" ] }, { "cell_type": "markdown", "id": "b1000034", "metadata": {}, "source": [ "### 6.3 Active Learning\n", "\n", "The surrogate is improved iteratively using the **BAPE** algorithm.\n", "Each iteration selects the point with maximum expected information gain,\n", "evaluates the true forward model there, and updates the GP.\n", "\n", "For a production run, use `niter=500`; here we use a small number for demonstration." ] }, { "cell_type": "code", "execution_count": null, "id": "b1000035", "metadata": {}, "outputs": [], "source": [ "if ALABI_AVAILABLE:\n", " sm.active_train(\n", " niter=50, # number of active-learning iterations (use 500+ in production)\n", " algorithm=\"bape\", # Bayesian Active Posterior Estimation\n", " gp_opt_freq=10, # re-optimise GP hyperparameters every 10 iterations\n", " save_progress=True,\n", " )\n", "\n", " # Plot GP diagnostics (training set, test predictions, residuals)\n", " sm.plot(plots=[\"gp_all\"])" ] }, { "cell_type": "markdown", "id": "b1000036", "metadata": {}, "source": [ "### 6.4 MCMC on the Surrogate\n", "\n", "Once the surrogate is trained, we run `dynesty` on the **GP emulator** instead of the\n", "expensive forward model. This is orders of magnitude faster." ] }, { "cell_type": "code", "execution_count": null, "id": "b1000037", "metadata": {}, "outputs": [], "source": [ "if ALABI_AVAILABLE:\n", " # Static nested sampling on the GP surrogate\n", " sm.run_dynesty(\n", " ptform=prior_transform_alabi,\n", " mode=\"static\",\n", " multi_proc=True,\n", " save_iter=100,\n", " )\n", "\n", " # Plot corner plot and run statistics\n", " sm.plot(plots=[\"dynesty_all\"])" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000038", "metadata": {}, "outputs": [], "source": [ "if ALABI_AVAILABLE:\n", " # You can also run emcee on the surrogate for a different sampling approach\n", " sm.run_emcee(\n", " lnprior=lnprior_alabi,\n", " nwalkers=50,\n", " nsteps=10000, # surrogate calls are fast, so we can afford many steps\n", " opt_init=False,\n", " )\n", " sm.plot(plots=[\"emcee_corner\"])" ] }, { "cell_type": "markdown", "id": "b1000039", "metadata": {}, "source": [ "### 6.5 Reloading a Saved Model\n", "\n", "`alabi` automatically saves the training set and GP model to `savedir` at each iteration.\n", "You can reload a completed run without re-training:" ] }, { "cell_type": "code", "execution_count": null, "id": "b1000040", "metadata": {}, "outputs": [], "source": [ "if ALABI_AVAILABLE and os.path.exists(SAVEDIR):\n", " sm_reload = alabi.cache_utils.load_model_cache(SAVEDIR)\n", " print(f\"Reloaded model from {SAVEDIR}\")\n", " print(f\"Training set size: {len(sm_reload.training_x)}\")" ] }, { "cell_type": "markdown", "id": "b1000047", "metadata": {}, "source": [ "## Summary\n", "\n", "| Task | Code |\n", "|------|------|\n", "| Define custom likelihood | `def lnlike(theta): out = vpm.run_model(theta); return -0.5 * chi2` |\n", "| Combine with prior | `def lnpost(theta): return lnlike(theta) + lnprior(theta)` |\n", "| Run `dynesty` (direct) | `NestedSampler(lnlike, prior_transform, ndim).run_nested()` |\n", "| Run `emcee` (direct) | `EnsembleSampler(nwalk, ndim, lnpost, pool=pool).run_mcmc(init, nsteps)` |\n", "| Run `alabi` surrogate | `SurrogateModel(fn=lnpost, ...).init_samples().init_gp().active_train()` |\n", "| MCMC on surrogate | `sm.run_dynesty(ptform=...)` or `sm.run_emcee(lnprior=...)` |\n", "| Reload saved run | `alabi.cache_utils.load_model_cache(savedir)` |\n", "\n", "**Key takeaways:**\n", "\n", "- Write your own `lnlike` to handle any transformation of model outputs to observables\n", "- Always guard against failed model runs (`try/except`, finite checks)\n", "- For slow models, `alabi` active learning dramatically reduces the number of VPLanet calls needed\n", "- `alabi` is model-agnostic — the same `SurrogateModel` interface works with any Python callable" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.10.19" } }, "nbformat": 4, "nbformat_minor": 5 }