{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b1000001",
   "metadata": {},
   "source": [
    "# MCMC with VPLanet and `alabi`\n",
    "\n",
    "This tutorial demonstrates how to run **Bayesian parameter inference** on a VPLanet forward model\n",
    "using a **custom likelihood function** and the `alabi` surrogate-model framework.\n",
    "\n",
    "Evaluating a VPLanet model can take seconds to minutes per call.\n",
    "Running a standard MCMC sampler (e.g. `emcee` or `dynesty`) directly on the forward model\n",
    "requires millions of evaluations, which is computationally prohibitive.\n",
    "`alabi` addresses this by:\n",
    "\n",
    "1. Building a **Gaussian Process surrogate** of the log-posterior surface using a small number of\n",
    "   active-learning iterations.\n",
    "2. Running MCMC on the **fast surrogate** instead of the expensive forward model.\n",
    "\n",
    "We also show how to run MCMC **directly** using `dynesty` or `emcee` — useful for fast models or\n",
    "as a baseline comparison.\n",
    "\n",
    "### Problem: inferring TRAPPIST-1 stellar properties\n",
    "\n",
    "We fit four stellar parameters of TRAPPIST-1 using observed luminosity and XUV luminosity:\n",
    "\n",
    "| Parameter | Symbol | Prior range |\n",
    "|-----------|--------|-------------|\n",
    "| Stellar mass | $m_\\star$ | $[0.07, 0.11]\\, M_\\odot$ |\n",
    "| XUV saturation fraction | $f_{\\rm sat}$ | $[-5, -1]\\, {\\rm dex}$ |\n",
    "| XUV saturation timescale | $t_{\\rm sat}$ | $[0.1, 12]\\, {\\rm Gyr}$ |\n",
    "| XUV power-law exponent | $\\beta_{\\rm XUV}$ | $[-2, 0]$ |\n",
    "\n",
    "The age is constrained by a Gaussian prior $t_{\\rm age} = 7.6 \\pm 2.2\\, {\\rm Gyr}$.\n",
    "\n",
    "---\n",
    "\n",
    "**Dependencies:** `vplanet_inference`, `numpy`, `astropy`, `dynesty`, `emcee`, `matplotlib`\n",
    "\n",
    "**Optional (for surrogate MCMC):** `alabi` (see installation note below)\n",
    "\n",
    "**References:**\n",
    "- [Birky et al. (2021)](https://doi.org/10.3847/1538-4357/abded5) — `alabi` method paper\n",
    "- [Speagle (2020)](https://doi.org/10.1093/mnras/staa278) — `dynesty` nested sampling\n",
    "- [Foreman-Mackey et al. (2013)](https://doi.org/10.1086/670067) — `emcee` affine-invariant MCMC\n",
    "- [VPLanet stellar module](https://virtualplanetarylaboratory.github.io/vplanet/stellar.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000002",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import astropy.units as u\n",
    "from functools import partial\n",
    "import multiprocessing as mp\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "\n",
    "import vplanet_inference as vpi"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000003",
   "metadata": {},
   "source": [
    "## 1. Configure the VPLanet Forward Model\n",
    "\n",
    "We use the bundled `stellar` infiles — a single low-mass star with the `stellar` module activated.\n",
    "The model outputs bolometric luminosity and XUV luminosity at the end of the simulation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000004",
   "metadata": {},
   "outputs": [],
   "source": [
    "INFILE_PATH = os.path.join(vpi.INFILE_DIR, \"stellar\")\n",
    "print(f\"Infile path: {INFILE_PATH}\")\n",
    "print(\"Files:\", os.listdir(INFILE_PATH))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000005",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Input parameters: what to vary and in what units\n",
    "# theta = [mass, log(fsat), tsat, age, beta]\n",
    "inparams = {\n",
    "    \"star.dMass\":       u.Msun,\n",
    "    \"star.dSatXUVFrac\": u.dex(u.dimensionless_unscaled),  # log scale\n",
    "    \"star.dSatXUVTime\": u.Gyr,\n",
    "    \"vpl.dStopTime\":    u.Gyr,\n",
    "    \"star.dXUVBeta\":    -u.dimensionless_unscaled,         # stored negative in vpl\n",
    "}\n",
    "\n",
    "# Output parameters: what to collect from the log file\n",
    "outparams = {\n",
    "    \"final.star.Luminosity\":  u.Lsun,\n",
    "    \"final.star.LXUVStellar\": u.Lsun,\n",
    "}\n",
    "\n",
    "vpm = vpi.VplanetModel(\n",
    "    inparams=inparams,\n",
    "    outparams=outparams,\n",
    "    inpath=INFILE_PATH,\n",
    "    outpath=\"output/mcmc\",\n",
    "    verbose=False,\n",
    ")\n",
    "\n",
    "print(\"Model inputs :\", vpm.inparams)\n",
    "print(\"Model outputs:\", vpm.outparams)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000006",
   "metadata": {},
   "source": [
    "### Sanity check: single run at known TRAPPIST-1 values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000007",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Approximate TRAPPIST-1 values (Burgasser & Mamajek 2017; Wheatley et al. 2017)\n",
    "theta_true = np.array([\n",
    "    0.089,   # mass [Msun]\n",
    "    -2.92,   # log(fsat) [dex]\n",
    "    1.0,     # tsat [Gyr]\n",
    "    7.6,     # age [Gyr]\n",
    "    -1.18,   # beta (stored negative)\n",
    "])\n",
    "\n",
    "result_true = vpm.run_model(theta_true)\n",
    "\n",
    "print(\"Model output at TRAPPIST-1 parameters:\")\n",
    "for name, val, unit in zip(vpm.outparams, result_true, vpm.out_units):\n",
    "    print(f\"  {name:<35s} = {val:.4e} [{unit}]\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000008",
   "metadata": {},
   "source": [
    "## 2. Observational Data and Priors\n",
    "\n",
    "We constrain the inference with two observational measurements:\n",
    "\n",
    "| Observable | Symbol | Value | Reference |\n",
    "|-----------|--------|-------|-----------|\n",
    "| Bolometric luminosity | $L_{\\rm bol}$ | $(5.22 \\pm 0.19) \\times 10^{-4}\\, L_\\odot$ | Gillon et al. (2017) |\n",
    "| XUV-to-bol. ratio | $L_{\\rm XUV}/L_{\\rm bol}$ | $(7.5 \\pm 1.5) \\times 10^{-4}$ | Wheatley et al. (2017) |\n",
    "\n",
    "Five parameters have the following priors:\n",
    "\n",
    "| Parameter | Prior | Values |\n",
    "|-----------|-------|--------|\n",
    "| $m_\\star$ | Uniform | $[0.07, 0.11]\\, M_\\odot$ |\n",
    "| $\\log f_{\\rm sat}$ | Gaussian | $-2.92 \\pm 0.26$ |\n",
    "| $t_{\\rm sat}$ | Uniform | $[0.1, 12]\\, {\\rm Gyr}$ |\n",
    "| Age | Gaussian | $7.6 \\pm 2.2\\, {\\rm Gyr}$ |\n",
    "| $\\beta_{\\rm XUV}$ | Gaussian | $-1.18 \\pm 0.31$ |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000009",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Observational data: shape (n_obs, 2) = [[mean, std], ...]\n",
    "# We observe Lbol and Lxuv/Lbol (not Lxuv directly)\n",
    "like_data = np.array([\n",
    "    [5.22e-4, 0.19e-4],   # Lbol [Lsun]\n",
    "    [7.5e-4,  1.5e-4],    # Lxuv/Lbol [dimensionless]\n",
    "])\n",
    "\n",
    "# Prior configuration: (mean, sigma) for Gaussian; None for Uniform\n",
    "# Order matches inparams: [mass, log_fsat, tsat, age, beta]\n",
    "prior_data = [\n",
    "    (None,   None),   # mass — uniform\n",
    "    (-2.92,  0.26),   # log(fsat) — Gaussian\n",
    "    (None,   None),   # tsat — uniform\n",
    "    (7.6,    2.2),    # age — Gaussian\n",
    "    (-1.18,  0.31),   # beta — Gaussian\n",
    "]\n",
    "\n",
    "# Hard bounds (uniform prior limits for all parameters)\n",
    "bounds = [\n",
    "    (0.07, 0.11),     # mass [Msun]\n",
    "    (-5.0, -1.0),     # log(fsat)\n",
    "    (0.1,  12.0),     # tsat [Gyr]\n",
    "    (0.1,  12.0),     # age [Gyr]\n",
    "    (-2.0,  0.0),     # beta\n",
    "]\n",
    "\n",
    "labels = [\n",
    "    r\"$m_\\star\\,[M_\\odot]$\",\n",
    "    r\"$\\log f_{\\rm sat}$\",\n",
    "    r\"$t_{\\rm sat}\\,[{\\rm Gyr}]$\",\n",
    "    r\"Age [Gyr]\",\n",
    "    r\"$\\beta_{\\rm XUV}$\",\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000010",
   "metadata": {},
   "source": [
    "## 3. Custom Likelihood and Prior Functions\n",
    "\n",
    "The likelihood compares model predictions to observations assuming Gaussian errors:\n",
    "\n",
    "$$\n",
    "\\ln \\mathcal{L}(\\theta) = -\\frac{1}{2}\n",
    "\\sum_k \\left(\\frac{y_k^{\\rm model}(\\theta) - y_k^{\\rm obs}}{\\sigma_k^{\\rm obs}}\\right)^2\n",
    "$$\n",
    "\n",
    "Note that the model returns $L_{\\rm bol}$ and $L_{\\rm XUV}$ as separate quantities,\n",
    "but the observation constrains $L_{\\rm XUV}/L_{\\rm bol}$.\n",
    "This transformation is applied inside `lnlike`.\n",
    "\n",
    "### 3.1 Custom Log-Likelihood"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000011",
   "metadata": {},
   "outputs": [],
   "source": [
    "def lnlike(theta):\n",
    "    \"\"\"\n",
    "    Log-likelihood comparing VPLanet outputs to observations.\n",
    "\n",
    "    theta : array-like, shape (5,)\n",
    "        [mass (Msun), log_fsat, tsat (Gyr), age (Gyr), beta]\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    float : log-likelihood value (−inf if model fails)\n",
    "    \"\"\"\n",
    "    try:\n",
    "        out = vpm.run_model(np.array(theta))\n",
    "    except Exception:\n",
    "        return -np.inf\n",
    "\n",
    "    lbol = out[0]   # final.star.Luminosity  [Lsun]\n",
    "    lxuv = out[1]   # final.star.LXUVStellar [Lsun]\n",
    "\n",
    "    # Guard against non-physical outputs\n",
    "    if lbol <= 0 or lxuv <= 0 or not np.isfinite(lbol) or not np.isfinite(lxuv):\n",
    "        return -np.inf\n",
    "\n",
    "    # Model observables: [Lbol, Lxuv/Lbol]\n",
    "    model = np.array([lbol, lxuv / lbol])\n",
    "\n",
    "    # Gaussian log-likelihood\n",
    "    lnl = -0.5 * np.sum(((model - like_data[:, 0]) / like_data[:, 1]) ** 2)\n",
    "\n",
    "    return lnl\n",
    "\n",
    "\n",
    "# Test: likelihood at true parameters\n",
    "lnl_true = lnlike(theta_true)\n",
    "print(f\"ln L at true parameters: {lnl_true:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000012",
   "metadata": {},
   "source": [
    "### 3.2 Log-Prior\n",
    "\n",
    "We combine Gaussian and uniform priors.\n",
    "Parameters with `(None, None)` in `prior_data` get a flat (uniform) prior within `bounds`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000013",
   "metadata": {},
   "outputs": [],
   "source": [
    "def lnprior(theta):\n",
    "    \"\"\"\n",
    "    Log-prior: uniform bounds + Gaussian constraints where specified.\n",
    "\n",
    "    Returns −inf if any parameter is outside its hard bounds.\n",
    "    \"\"\"\n",
    "    lnp = 0.0\n",
    "    for i, (val, (lo, hi), gauss) in enumerate(zip(theta, bounds, prior_data)):\n",
    "        # Hard bounds (uniform component)\n",
    "        if not (lo < val < hi):\n",
    "            return -np.inf\n",
    "        # Gaussian component\n",
    "        mu, sigma = gauss\n",
    "        if mu is not None:\n",
    "            lnp += -0.5 * ((val - mu) / sigma) ** 2\n",
    "    return lnp\n",
    "\n",
    "\n",
    "# Test: prior at true parameters\n",
    "lnp_true = lnprior(theta_true)\n",
    "print(f\"ln prior at true parameters: {lnp_true:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000014",
   "metadata": {},
   "source": [
    "### 3.3 Log-Posterior"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000015",
   "metadata": {},
   "outputs": [],
   "source": [
    "def lnpost(theta):\n",
    "    \"\"\"Log-posterior = log-likelihood + log-prior.\"\"\"\n",
    "    lnp = lnprior(theta)\n",
    "    if not np.isfinite(lnp):\n",
    "        return -np.inf\n",
    "    return lnlike(theta) + lnp\n",
    "\n",
    "\n",
    "print(f\"ln posterior at true parameters: {lnpost(theta_true):.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000016",
   "metadata": {},
   "source": [
    "## 4. Option A — Direct MCMC with `dynesty`\n",
    "\n",
    "For a quick test or for fast models, you can run `dynesty` nested sampling directly\n",
    "on the VPLanet forward model (no surrogate needed).\n",
    "\n",
    "`dynesty` requires a **prior transform** $\\pi^{-1}: [0,1]^d \\to \\theta$ that maps\n",
    "the unit hypercube to the prior distribution.\n",
    "\n",
    "> **Note:** Direct nested sampling on VPLanet is only feasible for simple models with short\n",
    "> simulation times (< 1 s per call). For realistic stellar/tidal evolution simulations\n",
    "> (seconds to minutes per call), use the `alabi` surrogate approach in Section 5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000017",
   "metadata": {},
   "outputs": [],
   "source": [
    "import dynesty\n",
    "from dynesty import plotting as dyplot\n",
    "from scipy.special import ndtri  # inverse normal CDF\n",
    "\n",
    "\n",
    "def prior_transform(u_vec):\n",
    "    \"\"\"\n",
    "    Map unit-hypercube samples u ∈ [0,1]^d to the parameter prior.\n",
    "\n",
    "    For uniform priors: theta = lo + u * (hi - lo)\n",
    "    For Gaussian priors (truncated to [lo, hi]): theta = mu + sigma * Phi^{-1}(u)\n",
    "    \"\"\"\n",
    "    theta = np.zeros(len(bounds))\n",
    "    for i, (u_i, (lo, hi), gauss) in enumerate(zip(u_vec, bounds, prior_data)):\n",
    "        mu, sigma = gauss\n",
    "        if mu is None:\n",
    "            # Uniform prior\n",
    "            theta[i] = lo + u_i * (hi - lo)\n",
    "        else:\n",
    "            # Gaussian prior truncated to [lo, hi]\n",
    "            # Map [0,1] -> CDF of truncated normal\n",
    "            from scipy.stats import norm\n",
    "            cdf_lo = norm.cdf(lo, loc=mu, scale=sigma)\n",
    "            cdf_hi = norm.cdf(hi, loc=mu, scale=sigma)\n",
    "            theta[i] = norm.ppf(cdf_lo + u_i * (cdf_hi - cdf_lo), loc=mu, scale=sigma)\n",
    "    return theta\n",
    "\n",
    "\n",
    "# Quick test of prior transform\n",
    "u_test = np.full(len(bounds), 0.5)\n",
    "print(\"Prior transform at u=0.5 (median of prior):\")\n",
    "for name, val in zip(labels, prior_transform(u_test)):\n",
    "    print(f\"  {name}: {val:.4f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000018",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Likelihood function for dynesty (takes only theta, returns ln-likelihood)\n",
    "def lnlike_dynesty(theta):\n",
    "    lnp = lnprior(theta)\n",
    "    if not np.isfinite(lnp):\n",
    "        return -1e300\n",
    "    lnl = lnlike(theta)\n",
    "    if not np.isfinite(lnl):\n",
    "        return -1e300\n",
    "    return lnl"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000019",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run static nested sampling\n",
    "# nlive=50 is very small (use ≥200 for production); this is just for illustration\n",
    "sampler = dynesty.NestedSampler(\n",
    "    loglikelihood=lnlike_dynesty,\n",
    "    prior_transform=prior_transform,\n",
    "    ndim=len(bounds),\n",
    "    nlive=50,\n",
    ")\n",
    "\n",
    "sampler.run_nested(print_progress=True)\n",
    "results_dynesty = sampler.results\n",
    "print(f\"\\nTotal likelihood calls: {results_dynesty.ncall}\")\n",
    "print(f\"log Evidence: {results_dynesty.logz[-1]:.2f} ± {results_dynesty.logzerr[-1]:.2f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000020",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Corner plot of posterior samples\n",
    "fig, axes = dyplot.cornerplot(\n",
    "    results_dynesty,\n",
    "    labels=labels,\n",
    "    show_titles=True,\n",
    "    title_kwargs={\"fontsize\": 10},\n",
    "    label_kwargs={\"fontsize\": 10},\n",
    "    truths=theta_true,\n",
    "    fig=plt.subplots(len(bounds), len(bounds), figsize=(10, 10)),\n",
    ")\n",
    "plt.suptitle(\"dynesty posterior — TRAPPIST-1 stellar parameters\", y=1.01, fontsize=12)\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000021",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extract posterior samples and compute parameter summaries\n",
    "from dynesty import utils as dyfunc\n",
    "\n",
    "samples_eq, weights = results_dynesty.samples, np.exp(results_dynesty.logwt - results_dynesty.logz[-1])\n",
    "samples_dynesty = dyfunc.resample_equal(results_dynesty.samples, weights)\n",
    "\n",
    "print(\"Posterior summaries (median ± 1-sigma):\")\n",
    "for i, label in enumerate(labels):\n",
    "    lo, med, hi = np.percentile(samples_dynesty[:, i], [16, 50, 84])\n",
    "    print(f\"  {label}: {med:.4f} +{hi-med:.4f} -{med-lo:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000022",
   "metadata": {},
   "source": [
    "## 5. Option B — Direct MCMC with `emcee`\n",
    "\n",
    "`emcee` uses an ensemble of walkers and is simpler to set up than nested sampling.\n",
    "It requires the **log-posterior** directly (no prior transform needed).\n",
    "\n",
    "> **Tip:** Initialize walkers in a small ball around the known maximum-likelihood\n",
    "> estimate for faster convergence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000023",
   "metadata": {},
   "outputs": [],
   "source": [
    "import emcee\n",
    "\n",
    "N_WALKERS = 20\n",
    "N_STEPS   = 500   # increase to ≥2000 for production\n",
    "N_DIM     = len(bounds)\n",
    "\n",
    "# Initialize walkers in a small Gaussian ball around the nominal parameters\n",
    "rng = np.random.default_rng(42)\n",
    "theta_init = theta_true + 1e-3 * rng.standard_normal((N_WALKERS, N_DIM))\n",
    "\n",
    "# Clip to within bounds\n",
    "for i, (lo, hi) in enumerate(bounds):\n",
    "    theta_init[:, i] = np.clip(theta_init[:, i], lo + 1e-6, hi - 1e-6)\n",
    "\n",
    "print(f\"Running emcee: {N_WALKERS} walkers × {N_STEPS} steps = {N_WALKERS * N_STEPS} total calls\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000024",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Parallelise with multiprocessing\n",
    "with mp.Pool(mp.cpu_count()) as pool:\n",
    "    sampler_emcee = emcee.EnsembleSampler(\n",
    "        N_WALKERS,\n",
    "        N_DIM,\n",
    "        lnpost,\n",
    "        pool=pool,\n",
    "    )\n",
    "    sampler_emcee.run_mcmc(theta_init, N_STEPS, progress=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000025",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Discard burn-in and thin the chain\n",
    "try:\n",
    "    tau = sampler_emcee.get_autocorr_time(quiet=True)\n",
    "    burn = int(2 * np.max(tau))\n",
    "    thin = max(1, int(0.5 * np.min(tau)))\n",
    "except Exception:\n",
    "    burn = N_STEPS // 4\n",
    "    thin = 1\n",
    "    print(\"Note: autocorrelation estimation failed (chain too short). Using fallback burn-in.\")\n",
    "\n",
    "flat_samples = sampler_emcee.get_chain(discard=burn, thin=thin, flat=True)\n",
    "print(f\"Burn-in: {burn} steps, thin: {thin}\")\n",
    "print(f\"Posterior samples: {flat_samples.shape[0]}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000026",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Trace plot: walker trajectories over time\n",
    "chain = sampler_emcee.get_chain()\n",
    "\n",
    "fig, axes = plt.subplots(N_DIM, 1, figsize=(10, 8), sharex=True)\n",
    "for i, (ax, label) in enumerate(zip(axes, labels)):\n",
    "    for walker in chain[:, :, i].T:\n",
    "        ax.plot(walker, alpha=0.3, lw=0.5, color=\"steelblue\")\n",
    "    ax.axhline(theta_true[i], color=\"red\", lw=1.5, ls=\"--\", label=\"true\" if i == 0 else None)\n",
    "    ax.axvline(burn, color=\"k\", lw=1, ls=\":\", label=\"burn-in\" if i == 0 else None)\n",
    "    ax.set_ylabel(label, fontsize=9)\n",
    "axes[0].legend(fontsize=8)\n",
    "axes[-1].set_xlabel(\"Step\", fontsize=10)\n",
    "fig.suptitle(\"emcee walker traces — TRAPPIST-1\", fontsize=11)\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000027",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Corner plot using matplotlib\n",
    "try:\n",
    "    import corner\n",
    "    fig = corner.corner(\n",
    "        flat_samples,\n",
    "        labels=labels,\n",
    "        truths=theta_true,\n",
    "        quantiles=[0.16, 0.5, 0.84],\n",
    "        show_titles=True,\n",
    "        title_kwargs={\"fontsize\": 9},\n",
    "    )\n",
    "    plt.suptitle(\"emcee posterior — TRAPPIST-1 stellar parameters\", y=1.01, fontsize=11)\n",
    "    plt.show()\n",
    "except ImportError:\n",
    "    print(\"Install 'corner' for a corner plot: pip install corner\")\n",
    "\n",
    "print(\"\\nPosterior summaries (median ± 1-sigma):\")\n",
    "for i, label in enumerate(labels):\n",
    "    lo, med, hi = np.percentile(flat_samples[:, i], [16, 50, 84])\n",
    "    print(f\"  {label}: {med:.4f} +{hi-med:.4f} -{med-lo:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000028",
   "metadata": {},
   "source": [
    "## 6. Option C — Surrogate MCMC with `alabi`\n",
    "\n",
    "`alabi` (**A**ctive **L**earning **A**pproach for **B**ayesian **I**nference) accelerates\n",
    "inference by replacing the expensive forward model with a Gaussian Process surrogate\n",
    "trained on a small number of strategically chosen model evaluations.\n",
    "\n",
    "### Installation\n",
    "\n",
    "```bash\n",
    "git clone https://github.com/jbirky/alabi\n",
    "cd alabi\n",
    "pip install -e .\n",
    "```\n",
    "\n",
    "### Workflow\n",
    "\n",
    "```\n",
    "Initial samples  →  Fit GP  →  Active learning (BAPE)  →  Dynesty/emcee on surrogate\n",
    "```\n",
    "\n",
    "The **BAPE** (Bayesian Active Posterior Estimation) algorithm selects the next point\n",
    "to evaluate by maximising the expected information gain about the posterior,\n",
    "concentrating model evaluations in regions of high posterior probability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000029",
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    import alabi\n",
    "    from alabi import SurrogateModel\n",
    "    from alabi import utility as ut\n",
    "    ALABI_AVAILABLE = True\n",
    "    print(f\"alabi version: {alabi.__version__}\")\n",
    "except ImportError:\n",
    "    ALABI_AVAILABLE = False\n",
    "    print(\n",
    "        \"alabi is not installed. To install:\\n\"\n",
    "        \"  git clone https://github.com/jbirky/alabi\\n\"\n",
    "        \"  cd alabi && pip install -e .\\n\"\n",
    "        \"\\nThe cells below show the alabi workflow but will be skipped.\"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000030",
   "metadata": {},
   "source": [
    "### 6.1 Setting Up the Prior Sampler and Transform\n",
    "\n",
    "`alabi` needs two functions:\n",
    "- **`prior_sampler`**: draws random samples from the prior (for seeding GP training data)\n",
    "- **`prior_transform`**: maps the unit hypercube to the prior (for dynesty)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000031",
   "metadata": {},
   "outputs": [],
   "source": [
    "if ALABI_AVAILABLE:\n",
    "    # alabi uses its own prior utilities\n",
    "    # prior_data and bounds must match the parameter ordering of theta\n",
    "\n",
    "    # Sampler for drawing training/test points from the prior\n",
    "    prior_sampler_alabi = partial(\n",
    "        ut.prior_sampler_normal,\n",
    "        prior_data=prior_data,\n",
    "        bounds=bounds,\n",
    "    )\n",
    "\n",
    "    # Prior transform for dynesty\n",
    "    prior_transform_alabi = partial(\n",
    "        ut.prior_transform_normal,\n",
    "        bounds=bounds,\n",
    "        data=prior_data,\n",
    "    )\n",
    "\n",
    "    # Log-prior for emcee\n",
    "    lnprior_alabi = partial(\n",
    "        ut.lnprior_normal,\n",
    "        bounds=bounds,\n",
    "        data=prior_data,\n",
    "    )\n",
    "\n",
    "    # Test\n",
    "    sample = prior_sampler_alabi(n=3)\n",
    "    print(\"Three prior samples:\")\n",
    "    print(sample)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000032",
   "metadata": {},
   "source": [
    "### 6.2 Initialising the Surrogate Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000033",
   "metadata": {},
   "outputs": [],
   "source": [
    "if ALABI_AVAILABLE:\n",
    "    SAVEDIR = \"output/alabi_trappist\"\n",
    "\n",
    "    sm = SurrogateModel(\n",
    "        fn=lnpost,                     # function to emulate (log-posterior)\n",
    "        bounds=bounds,                 # hard parameter bounds\n",
    "        prior_sampler=prior_sampler_alabi,\n",
    "        savedir=SAVEDIR,\n",
    "        labels=labels,\n",
    "        scale=\"nlog\",                  # nlog = negative log (for a log-posterior)\n",
    "    )\n",
    "\n",
    "    # Draw initial training and test sets from the prior\n",
    "    sm.init_samples(ntrain=100, ntest=100, reload=False)\n",
    "\n",
    "    # Fit an initial GP to the training data\n",
    "    sm.init_gp(\n",
    "        kernel=\"ExpSquaredKernel\",\n",
    "        fit_amp=False,\n",
    "        fit_mean=True,\n",
    "        white_noise=-15,\n",
    "    )\n",
    "\n",
    "    print(\"Surrogate model initialized.\")\n",
    "    print(f\"Training set size: {len(sm.training_x)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000034",
   "metadata": {},
   "source": [
    "### 6.3 Active Learning\n",
    "\n",
    "The surrogate is improved iteratively using the **BAPE** algorithm.\n",
    "Each iteration selects the point with maximum expected information gain,\n",
    "evaluates the true forward model there, and updates the GP.\n",
    "\n",
    "For a production run, use `niter=500`; here we use a small number for demonstration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000035",
   "metadata": {},
   "outputs": [],
   "source": [
    "if ALABI_AVAILABLE:\n",
    "    sm.active_train(\n",
    "        niter=50,           # number of active-learning iterations (use 500+ in production)\n",
    "        algorithm=\"bape\",  # Bayesian Active Posterior Estimation\n",
    "        gp_opt_freq=10,    # re-optimise GP hyperparameters every 10 iterations\n",
    "        save_progress=True,\n",
    "    )\n",
    "\n",
    "    # Plot GP diagnostics (training set, test predictions, residuals)\n",
    "    sm.plot(plots=[\"gp_all\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000036",
   "metadata": {},
   "source": [
    "### 6.4 MCMC on the Surrogate\n",
    "\n",
    "Once the surrogate is trained, we run `dynesty` on the **GP emulator** instead of the\n",
    "expensive forward model. This is orders of magnitude faster."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000037",
   "metadata": {},
   "outputs": [],
   "source": [
    "if ALABI_AVAILABLE:\n",
    "    # Static nested sampling on the GP surrogate\n",
    "    sm.run_dynesty(\n",
    "        ptform=prior_transform_alabi,\n",
    "        mode=\"static\",\n",
    "        multi_proc=True,\n",
    "        save_iter=100,\n",
    "    )\n",
    "\n",
    "    # Plot corner plot and run statistics\n",
    "    sm.plot(plots=[\"dynesty_all\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000038",
   "metadata": {},
   "outputs": [],
   "source": [
    "if ALABI_AVAILABLE:\n",
    "    # You can also run emcee on the surrogate for a different sampling approach\n",
    "    sm.run_emcee(\n",
    "        lnprior=lnprior_alabi,\n",
    "        nwalkers=50,\n",
    "        nsteps=10000,    # surrogate calls are fast, so we can afford many steps\n",
    "        opt_init=False,\n",
    "    )\n",
    "    sm.plot(plots=[\"emcee_corner\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000039",
   "metadata": {},
   "source": [
    "### 6.5 Reloading a Saved Model\n",
    "\n",
    "`alabi` automatically saves the training set and GP model to `savedir` at each iteration.\n",
    "You can reload a completed run without re-training:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1000040",
   "metadata": {},
   "outputs": [],
   "source": [
    "if ALABI_AVAILABLE and os.path.exists(SAVEDIR):\n",
    "    sm_reload = alabi.cache_utils.load_model_cache(SAVEDIR)\n",
    "    print(f\"Reloaded model from {SAVEDIR}\")\n",
    "    print(f\"Training set size: {len(sm_reload.training_x)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1000047",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "| Task | Code |\n",
    "|------|------|\n",
    "| Define custom likelihood | `def lnlike(theta): out = vpm.run_model(theta); return -0.5 * chi2` |\n",
    "| Combine with prior | `def lnpost(theta): return lnlike(theta) + lnprior(theta)` |\n",
    "| Run `dynesty` (direct) | `NestedSampler(lnlike, prior_transform, ndim).run_nested()` |\n",
    "| Run `emcee` (direct) | `EnsembleSampler(nwalk, ndim, lnpost, pool=pool).run_mcmc(init, nsteps)` |\n",
    "| Run `alabi` surrogate | `SurrogateModel(fn=lnpost, ...).init_samples().init_gp().active_train()` |\n",
    "| MCMC on surrogate | `sm.run_dynesty(ptform=...)` or `sm.run_emcee(lnprior=...)` |\n",
    "| Reload saved run | `alabi.cache_utils.load_model_cache(savedir)` |\n",
    "\n",
    "**Key takeaways:**\n",
    "\n",
    "- Write your own `lnlike` to handle any transformation of model outputs to observables\n",
    "- Always guard against failed model runs (`try/except`, finite checks)\n",
    "- For slow models, `alabi` active learning dramatically reduces the number of VPLanet calls needed\n",
    "- `alabi` is model-agnostic — the same `SurrogateModel` interface works with any Python callable"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10.19"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}