Frequentist uncertainty vs. Bayesian uncertainty analysis#

Mark Bakker, TU Delft & Raoul Collenteur, Eawag, February, 2025

In this notebook, the fit and uncertainty are compared for Pastas models solved with least squares (frequentist uncertainty) and with MCMC (Bayesian uncertainty). Besides Pastas, the following Python packages have to be installed to run this notebook: emcee and corner.

Note: The EmceeSolve solver is still an experimental feature and some of its arguments may change before the official release. We welcome testing and feedback on this new feature!
import corner
import emcee
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import pastas as ps

ps.set_log_level("ERROR")
ps.show_versions()
Pastas version: 1.11.0
Python version: 3.11.12
NumPy version: 2.2.6
Pandas version: 2.3.2
SciPy version: 1.16.1
Matplotlib version: 3.10.6
Numba version: 0.61.2

1. A ‘regular’ Pastas Model#

The first step is to create a Pastas Model with a linear RechargeModel and a Gamma response function to simulate the effect of precipitation and evaporation on the heads. The AR1 noise model is used. We first estimate the model parameters using the standard least-squares approach.

head = pd.read_csv(
    "data/B32C0639001.csv", parse_dates=["date"], index_col="date"
).squeeze()
head = head["1990":]  # use data from 1990 on for this example

evap = pd.read_csv("data/evap_260.csv", index_col=0, parse_dates=[0]).squeeze()
rain = pd.read_csv("data/rain_260.csv", index_col=0, parse_dates=[0]).squeeze()

ml1 = ps.Model(head)
ml1.add_noisemodel(ps.ArNoiseModel())

rm = ps.RechargeModel(
    rain, evap, recharge=ps.rch.Linear(), rfunc=ps.Gamma(), name="rch"
)
ml1.add_stressmodel(rm)

ml1.solve()

ax = ml1.plots.results(figsize=(10, 4))
Fit report head                   Fit Statistics
================================================
nfev    29                     EVP         84.76
nobs    351                    R2           0.85
noise   True                   RMSE         0.08
tmin    1990-01-02 00:00:00    AICc     -1976.57
tmax    2005-10-14 00:00:00    BIC      -1953.65
freq    D                      Obj          0.61
warmup  3650 days 00:00:00     ___              
solver  LeastSquares           Interp.        No

Parameters (6 optimized)
================================================
                optimal    initial  vary
rch_A          0.306929   0.198424  True
rch_n          0.873052   1.000000  True
rch_a        145.630837  10.000000  True
rch_f         -0.637846  -1.000000  True
constant_d     0.935049   1.338063  True
noise_alpha   41.304942  15.000000  True
../_images/7b4c3ff11766a68e4cb43ae7fecda6b8d5e37a52547c8cf28f935932e9763ac5.png
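To get a feel for what the fitted response parameters mean, the step response can be evaluated directly. The sketch below assumes that the Gamma response function is a gamma distribution scaled by the gain A, so that its step response is A times the regularized lower incomplete gamma function. The helper name `gamma_step` is made up for this illustration and is not part of Pastas.

```python
import numpy as np
from scipy.special import gammainc


def gamma_step(t, A, n, a):
    """Step response of a gamma-shaped response function with gain A.

    Assumption: step(t) = A * P(n, t / a), where P is the regularized
    lower incomplete gamma function (scipy.special.gammainc).
    """
    t = np.asarray(t, dtype=float)
    return A * gammainc(n, t / a)


# Optimal values copied from the fit report above
A, n, a = 0.306929, 0.873052, 145.630837

t = np.array([0.0, 30.0, 365.0, 3650.0])  # time in days
step = gamma_step(t, A, n, a)

for ti, si in zip(t, step):
    print(f"t = {ti:6.0f} d  step response = {si:.4f}")
```

Under this assumption the response starts at zero and levels off at the gain rch_A, which is why a warmup period of several years is more than sufficient for rch_a of about 146 days.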

The diagnostics show that the noise meets the statistical requirements for uncertainty analysis reasonably well.

ml1.plots.diagnostics();
../_images/566d82789915b8e82bf856e33bc676d6adb49f362496183ade7d94f2b8e43303.png

The estimated least-squares parameters and their standard errors are stored for later reference.

ls_params = ml1.parameters[["optimal", "stderr"]].copy()
ls_params.rename(columns={"optimal": "ls_opt", "stderr": "ls_sig"}, inplace=True)
ls_params
                 ls_opt     ls_sig
rch_A          0.306929   0.021354
rch_n          0.873052   0.026222
rch_a        145.630837  18.352590
rch_f         -0.637846   0.069334
constant_d     0.935049   0.048544
noise_alpha   41.304942   6.342921
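As a small worked example, the standard errors can be turned into approximate 95% confidence intervals. This sketch assumes that the parameter estimates are approximately normally distributed, so that the interval is the optimal value plus or minus 1.96 standard errors. The dictionaries below are hand-copied from the table and are not Pastas objects.

```python
# Optimal values and standard errors copied from the table above
ls_opt = {"rch_A": 0.306929, "rch_n": 0.873052, "rch_a": 145.630837}
ls_sig = {"rch_A": 0.021354, "rch_n": 0.026222, "rch_a": 18.352590}

z = 1.96  # two-sided 95% quantile of the standard normal distribution

# Assumption: estimates are approximately normal, so CI = optimal +/- z * stderr
intervals = {
    name: (ls_opt[name] - z * ls_sig[name], ls_opt[name] + z * ls_sig[name])
    for name in ls_opt
}

for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.3f}, {high:.3f}]")
```

For rch_A this gives an interval of roughly 0.265 to 0.349.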
# Compute the prediction interval with Pastas
pi = ml1.solver.prediction_interval(n=1000)
ax = ml1.plot(figsize=(10, 3))
ax.fill_between(pi.index, pi.iloc[:, 0], pi.iloc[:, 1], color="lightgray")
ax.legend(["Observations", "Simulation", "95% Prediction interval"], ncol=3, loc=2)
pi_pasta = np.mean(pi[0.975] - pi[0.025])
print(f"Mean prediction interval width: {pi_pasta:.3f} m")
print(f"Prediction interval coverage probability: {ps.stats.picp(head, pi): .3f}")
Mean prediction interval width: 0.325 m
Prediction interval coverage probability:  0.940
../_images/5fb3e6bd696c90f5944f19dfb48a94526a3109973b95cd6bdda5c6549bd5c713.png
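The two numbers printed above can be illustrated with a few lines of NumPy. The sketch assumes that the coverage probability is simply the fraction of observations that fall inside the interval, and that the width is the mean distance between the upper and lower bound. The function names and the small arrays are made up for illustration; this is not the code behind ps.stats.picp.

```python
import numpy as np


def coverage_probability(obs, lower, upper):
    """Fraction of observations inside [lower, upper] (assumed PICP definition)."""
    obs = np.asarray(obs, dtype=float)
    inside = (obs >= np.asarray(lower)) & (obs <= np.asarray(upper))
    return float(inside.mean())


def mean_interval_width(lower, upper):
    """Average distance between the upper and lower interval bounds."""
    return float(np.mean(np.asarray(upper) - np.asarray(lower)))


# Hypothetical observations and interval bounds (not the notebook data)
obs = np.array([1.0, 1.2, 0.9, 1.6, 1.1])
lower = np.array([0.8, 0.9, 0.8, 0.9, 0.9])
upper = np.array([1.3, 1.4, 1.3, 1.4, 1.4])

picp = coverage_probability(obs, lower, upper)
width = mean_interval_width(lower, upper)
print(f"coverage: {picp:.2f}, mean width: {width:.2f}")
```

A well-calibrated 95% prediction interval should give a coverage close to 0.95, which is what the value of 0.940 reported above suggests for the least-squares model.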

2. Use the EmceeSolve solver#

We will now use MCMC to estimate the model parameters and their uncertainties. The EmceeSolve solver wraps the Emcee package, which implements several MCMC algorithms. A good understanding of Emcee helps when using this solver, so we recommend reading its documentation as well.

We start by making a Pastas model with a linear recharge model and a Gamma response function. No noise model is added, as autocorrelation is taken care of in the likelihood function. The model is first solved with the regular least-squares solver to obtain good starting values for the parameters.

ml2 = ps.Model(head)
rm = ps.RechargeModel(
    rain, evap, recharge=ps.rch.Linear(), rfunc=ps.Gamma(), name="rch"
)
ml2.add_stressmodel(rm)
ml2.solve()
Fit report head                   Fit Statistics
================================================
nfev    16                     EVP         86.58
nobs    351                    R2           0.87
noise   False                  RMSE         0.08
tmin    1990-01-02 00:00:00    AICc     -1797.03
tmax    2005-10-14 00:00:00    BIC      -1777.90
freq    D                      Obj          1.02
warmup  3650 days 00:00:00     ___              
solver  LeastSquares           Interp.        No

Parameters (5 optimized)
================================================
               optimal    initial  vary
rch_A         0.314024   0.198424  True
rch_n         0.800951   1.000000  True
rch_a       235.012025  10.000000  True
rch_f        -0.898120  -1.000000  True
constant_d    1.051881   1.338063  True

To set up the EmceeSolve solver, a number of decisions need to be made:

  • Select the priors of the parameters

  • Select a (log) likelihood function

  • Select the number of steps and thinning

2a. Priors#

The first step is to select the priors of the parameters. This is done with the ml.set_parameter method and its dist argument (short for distribution). Any distribution from scipy.stats can be chosen, for example uniform, norm, or lognorm. Here, we select normal distributions for the priors. Currently, Pastas uses the initial value of the parameter as the loc argument of the distribution (e.g., the mean of a normal distribution) and the stderr as the scale argument (e.g., the standard deviation of a normal distribution). Only for parameters with a uniform distribution are the pmin and pmax values used to define the prior. By default, all parameters are assigned a uniform prior.

Note: This means that either `pmin` and `pmax` should be set for uniform distributions, or `stderr` for any other distribution. That is why, in this example, the model was first solved using LeastSquares: to obtain estimates for the `stderr`. In practice, these could also be set based on expert judgement or other information about the parameters.
# Prepare normal priors for all parameters
# Use the least-squares optimum as the initial value for a good starting point
ml2.parameters["initial"] = ml2.parameters["optimal"]
# The stderr column is used (for now) to set the scale of the normal distribution
ml2.parameters["stderr"] = 2 * ml2.parameters["stderr"]

for name in ml2.parameters.index:
    ml2.set_parameter(
        name,
        dist="norm",
    )

ml2.parameters
               initial      pmin          pmax  vary      name  dist     stderr     optimal
rch_A         0.314024   0.00001     19.842364  True       rch  norm   0.022942    0.314024
rch_n         0.800951   0.10000      5.000000  True       rch  norm   0.049725    0.800951
rch_a       235.012025   0.01000  10000.000000  True       rch  norm  46.256299  235.012025
rch_f        -0.898120  -2.00000      0.000000  True       rch  norm   0.107650   -0.898120
constant_d    1.051881       NaN           NaN  True  constant  norm   0.057945    1.051881
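The mapping from the parameter table to a prior, as described in this section, can be sketched with scipy.stats. The helper `log_prior` is hypothetical and only covers the uniform and norm cases; it follows the description in the text (loc from initial, scale from stderr, bounds from pmin and pmax) and is not the Pastas implementation.

```python
import numpy as np
from scipy import stats


def log_prior(value, dist, initial=None, stderr=None, pmin=None, pmax=None):
    """Log prior density for one parameter (illustrative helper, not Pastas)."""
    if dist == "uniform":
        # scipy's uniform is defined on [loc, loc + scale]
        return stats.uniform.logpdf(value, loc=pmin, scale=pmax - pmin)
    if dist == "norm":
        # loc is the mean, scale is the standard deviation
        return stats.norm.logpdf(value, loc=initial, scale=stderr)
    raise ValueError(f"distribution {dist!r} is not covered by this sketch")


# Values for rch_A copied from the table above
initial, stderr = 0.314024, 0.022942
pmin, pmax = 0.00001, 19.842364

lp_mode = log_prior(initial, "norm", initial=initial, stderr=stderr)
lp_tail = log_prior(initial + 3 * stderr, "norm", initial=initial, stderr=stderr)
lp_out = log_prior(-1.0, "uniform", pmin=pmin, pmax=pmax)

print(lp_mode, lp_tail, lp_out)
```

A normal prior penalizes values far from the least-squares optimum gradually, whereas a uniform prior gives a log density of minus infinity outside the bounds, so such proposals are always rejected.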

2b. Create the solver instance#

The next step is to create an instance of the EmceeSolve solver class. At this stage, all settings that determine how the Ensemble Sampler is created must be provided (https://emcee.readthedocs.io/en/stable/user/sampler/). Important settings are nwalkers, moves, and objective_function. More advanced options are to parallelize the MCMC algorithm (parallel=True) and to set a backend to store the results. Here’s an example:

# Choose the objective function
ln_prob = ps.objfunc.GaussianLikelihoodAr1()

# Create the EmceeSolve solver with some settings
s = ps.EmceeSolve(
    nwalkers=20,
    moves=emcee.moves.DEMove(),
    objective_function=ln_prob,
    progress_bar=True,
    parallel=False,
)

In the above code we created an EmceeSolve instance with 20 walkers, which take steps according to the DEMove move algorithm (see Emcee docs), and a Gaussian likelihood function that assumes AR1 correlated errors. Different objective functions are available, see the Pastas documentation on the different options.

Depending on the likelihood function, a number of additional parameters need to be inferred. These parameters are not added to the Pastas Model instance, but are available from the solver object. They can be changed with the set_parameter method of the solver. In this example, where we use the GaussianLikelihoodAr1 function, \(\sigma^2\) and \(\phi\) are estimated: the unknown variance of the errors and the autoregressive parameter, respectively.

s.parameters
        initial          pmin     pmax  vary  stderr  name     dist
ln_var     0.05  1.000000e-10  1.00000  True    0.01    ln  uniform
ln_phi     0.50  1.000000e-10  0.99999  True    0.20    ln  uniform
sigsq = ml1.noise().std() ** 2
s.set_parameter("ln_var", initial=sigsq, vary=True)
s.parameters.loc["ln_var", "stderr"] = stderr = sigsq / 8
s.parameters
        initial          pmin     pmax  vary    stderr  name     dist
ln_var  0.00347  1.000000e-10  1.00000  True  0.000434    ln  uniform
ln_phi  0.50000  1.000000e-10  0.99999  True  0.200000    ln  uniform
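The role of the two extra parameters can be illustrated with a textbook Gaussian log-likelihood for AR(1)-correlated residuals. This is a sketch and not the Pastas implementation: it assumes residuals at a regular time step and treats `var` as the variance of the uncorrelated innovations, so the exact expression used by GaussianLikelihoodAr1 may differ. The function name and the residual values are made up for illustration.

```python
import numpy as np


def ar1_gaussian_loglike(res, var, phi):
    """Exact Gaussian log-likelihood of residuals following a stationary AR(1).

    Assumptions: regular time step, |phi| < 1, and `var` is the variance of
    the innovations e_t = r_t - phi * r_{t-1}.
    """
    res = np.asarray(res, dtype=float)
    n = res.size
    # The first residual follows the stationary distribution of the process
    var0 = var / (1.0 - phi**2)
    ll_first = -0.5 * (np.log(2 * np.pi * var0) + res[0] ** 2 / var0)
    # All later residuals are judged by their innovations
    innov = res[1:] - phi * res[:-1]
    ll_rest = -0.5 * ((n - 1) * np.log(2 * np.pi * var) + np.sum(innov**2) / var)
    return ll_first + ll_rest


# Hypothetical residuals in metres (not the notebook data)
residuals = np.array([0.10, -0.05, 0.02, 0.08, -0.03])

ll_white = ar1_gaussian_loglike(residuals, var=0.00347, phi=0.0)
ll_ar1 = ar1_gaussian_loglike(residuals, var=0.00347, phi=0.5)
print(ll_white, ll_ar1)
```

With phi equal to zero the expression reduces to the usual likelihood for independent normal errors, which is a convenient sanity check for any implementation of this kind.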

2c. Run the solver and solve the model#

After setting the parameters and creating an EmceeSolve solver instance, we are ready to run the MCMC analysis. We can do this by running ml.solve. We can pass the same arguments that we normally provide to this method (e.g., tmin or fit_constant). Here we use the initial parameters from our least-squares solution and do not fit a noise model, because autocorrelated errors are taken into account through the likelihood function.

All arguments that are not used by ml.solve, for example steps and tune, are passed on to the run_mcmc method of the sampler (see Emcee docs). The most important one is the steps argument, which determines how many steps each of the walkers takes.
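How the number of steps and thinning translate into posterior samples can be sketched with plain NumPy. Emcee stores a chain with shape (steps, nwalkers, ndim); here ndim is 7, the five model parameters plus the two likelihood parameters. In the sketch below random numbers stand in for a real chain, and the burn-in length and thinning factor are arbitrary assumptions chosen for illustration.

```python
import numpy as np

steps, nwalkers, ndim = 1000, 20, 7

# Stand-in for a real chain; a sampler would fill this array with parameter draws
rng = np.random.default_rng(42)
chain = rng.normal(size=(steps, nwalkers, ndim))

burn = 200  # assumed number of initial steps to discard
thin = 10   # assumed thinning factor: keep every 10th step

# Drop the burn-in, thin the remaining steps, and merge all walkers
kept = chain[burn::thin]
flat = kept.reshape(-1, ndim)
print(flat.shape)

# Posterior summary for the first parameter: median and 95% credible interval
p2_5, p50, p97_5 = np.percentile(flat[:, 0], [2.5, 50, 97.5])
print(f"median = {p50:.3f}, 95% interval = [{p2_5:.3f}, {p97_5:.3f}]")
```

With a real sampler the same selection is available through emcee's get_chain method with the discard, thin, and flat arguments. Fewer steps or stronger thinning leave fewer samples, so the tails of the posterior are estimated less precisely.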

# Use the solver to run MCMC
ml2.solve(
    solver=s,
    initial=False,
    tmin="1990",
    steps=1000,
    tune=True,
    report=False,
)
emcee: Exception while calling your likelihood function:
  params: [ 3.03721704e-01  8.38272778e-01  1.84133330e+02 -7.82482828e-01
  1.03450968e+00  4.01019955e-03  6.74997734e-01]
  args: (False, None, None)
  kwargs: {}
  exception:
Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/emcee/ensemble.py", line 640, in __call__
    return self.f(x, *self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/solver.py", line 1003, in log_probability
    return lp + self.log_likelihood(
                ^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/solver.py", line 1040, in log_likelihood
    rv = self.misfit(
         ^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/solver.py", line 121, in misfit
    rv = self.ml.residuals(p)
         ^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/model.py", line 523, in residuals
    sim = self.simulate(p, tmin, tmax, freq, warmup, return_warmup=False)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/model.py", line 454, in simulate
    sim = sim + self.constant.simulate(p[istart])
          ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/ops/common.py", line 76, in new_method
    return method(self, other)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/arraylike.py", line 186, in __add__
    return self._arith_method(other, operator.add)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/series.py", line 6146, in _arith_method
    return base.IndexOpsMixin._arith_method(self, other, op)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/base.py", line 1393, in _arith_method
    return self._construct_result(result, name=res_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/series.py", line 6242, in _construct_result
    out = self._constructor(result, index=self.index, dtype=dtype, copy=False)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/series.py", line 588, in __init__
    data = SingleBlockManager.from_array(data, index, refs=refs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 1872, in from_array
    block = new_block(array, placement=bp, ndim=1, refs=refs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/docs/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pandas/core/internals/blocks.py", line 2789, in new_block
    def new_block(
    
KeyboardInterrupt
 91%|█████████▏| 913/1000 [00:29<00:02, 30.48it/s]
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[11], line 2
      1 # Use the solver to run MCMC
----> 2 ml2.solve(
      3     solver=s,
      4     initial=False,
      5     tmin="1990",
      6     steps=1000,
      7     tune=True,
      8     report=False,
      9 )

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/model.py:935, in Model.solve(self, tmin, tmax, freq, warmup, noise, solver, report, initial, weights, fit_constant, freq_obs, initialize, **kwargs)
    932     self.add_solver(solver=solver)
    934 # Solve model
--> 935 success, optimal, stderr = self.solver.solve(
    936     noise=self.settings["noise"], weights=weights, **kwargs
    937 )
    938 if not success:
    939     logger.warning("Model parameters could not be estimated well.")

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/solver.py:955, in EmceeSolve.solve(self, noise, weights, steps, callback, **kwargs)
    944 else:
    945     self.sampler = emcee.EnsembleSampler(
    946         nwalkers=self.nwalkers,
    947         ndim=ndim,
   (...)    952         args=(noise, weights, callback),
    953     )
--> 955     self.sampler.run_mcmc(pinit, steps, progress=self.progress_bar, **kwargs)
    957 # Get optimal values
    958 optimal = self.initial.copy()

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/emcee/ensemble.py:450, in EnsembleSampler.run_mcmc(self, initial_state, nsteps, **kwargs)
    447     initial_state = self._previous_state
    449 results = None
--> 450 for results in self.sample(initial_state, iterations=nsteps, **kwargs):
    451     pass
    453 # Store so that the ``initial_state=None`` case will work

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/emcee/ensemble.py:409, in EnsembleSampler.sample(self, initial_state, log_prob0, rstate0, blobs0, iterations, tune, skip_initial_state_check, thin_by, thin, store, progress, progress_kwargs)
    406 move = self._random.choice(self._moves, p=self._weights)
    408 # Propose
--> 409 state, accepted = move.propose(model, state)
    410 state.random_state = self.random_state
    412 if tune:

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/emcee/moves/red_blue.py:93, in RedBlueMove.propose(self, model, state)
     90 q, factors = self.get_proposal(s, c, model.random)
     92 # Compute the lnprobs of the proposed position.
---> 93 new_log_probs, new_blobs = model.compute_log_prob_fn(q)
     95 # Loop over the walkers and update them accordingly.
     96 for i, (j, f, nlp) in enumerate(
     97     zip(all_inds[S1], factors, new_log_probs)
     98 ):

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/emcee/ensemble.py:496, in EnsembleSampler.compute_log_prob(self, coords)
    494     else:
    495         map_func = map
--> 496     results = list(map_func(self.log_prob_fn, p))
    498 try:
    499     # perhaps log_prob_fn returns blobs?
    500 
   (...)    504     # l is a length-1 array, np.array([1.234]). In that case blob
    505     # will become an empty list.
    506     blob = [l[1:] for l in results if len(l) > 1]

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/emcee/ensemble.py:640, in _FunctionWrapper.__call__(self, x)
    638 def __call__(self, x):
    639     try:
--> 640         return self.f(x, *self.args, **self.kwargs)
    641     except:  # pragma: no cover
    642         import traceback

File ~/checkouts/readthedocs.org/user_builds/pastas/envs/v1.11.0/lib/python3.11/site-packages/pastas/solver.py:1003, in EmceeSolve.log_probability(self, p, noise, weights, callback)
   1001     return -np.inf
   1002 else:
-> 1003     return lp + self.log_likelihood(
   1004         p, noise=noise, weights=weights, callback=callback
   1005     )

KeyboardInterrupt: 

3. Posterior parameter distributions#

The results of the MCMC analysis are stored in the sampler object, which is accessible through ml.solver.sampler. The attribute ml.solver.sampler.flatchain contains an array with \(n\) parameter samples, where \(n\) is calculated as follows:

\(n = \frac{\left(\text{steps}-\text{burn}\right)\cdot\text{nwalkers}}{\text{thin}} \)
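
As a quick check of this formula, the sketch below computes \(n\) for the settings of the uniform-prior run further down in this notebook (1000 steps, 20 walkers, first 500 steps discarded). The thinning factor of 1 is an assumption, and the variable names are illustrative only; they are not part of the Pastas or emcee API.

```python
# Number of posterior samples left after burn-in and thinning
steps = 1000   # total number of steps per walker
burn = 500     # steps discarded as burn-in
nwalkers = 20  # number of walkers in the ensemble
thin = 1       # keep every thin-th sample (assumed: no thinning)

n = (steps - burn) * nwalkers // thin
print(n)  # 10000
```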

Corner.py#

Corner is a simple but great Python package that makes it easy to create corner plots. A few lines of code suffice to plot the distributions of the parameters and the covariances between them.

# Corner plot of the results
fig = plt.figure(figsize=(8, 8))

labels = list(ml2.parameters.index[ml2.parameters.vary]) + list(
    ml2.solver.parameters.index[ml2.solver.parameters.vary]
)
labels = [label.split("_")[1] for label in labels]

best = list(ml2.parameters[ml2.parameters.vary].optimal) + list(
    ml2.solver.parameters[ml2.solver.parameters.vary].optimal
)

axes = corner.corner(
    ml2.solver.sampler.get_chain(flat=True, discard=500),
    quantiles=[0.025, 0.5, 0.975],
    labelpad=0.1,
    show_titles=True,
    title_kwargs=dict(fontsize=10),
    label_kwargs=dict(fontsize=10),
    max_n_ticks=3,
    fig=fig,
    labels=labels,
    truths=best,
)

plt.show()

4. The trace shows when MCMC converges#

In every step, each walker moves to a new position in parameter space. After a number of steps, the direction of the steps is expected to become random, which is a sign that an optimum has been found. This can be checked by looking at the autocorrelation of the chains, which should be insignificant after a number of steps. Below we only show how to obtain and plot the different chains; their interpretation is outside the scope of this notebook.

fig, axes = plt.subplots(len(labels), figsize=(10, 7), sharex=True)

# Keep the walker dimension so that the x-axis really is the step number
samples = ml2.solver.sampler.get_chain()  # shape: (steps, nwalkers, nparams)
for i in range(len(labels)):
    ax = axes[i]
    ax.plot(samples[:, :, i], "k", alpha=0.5)  # one line per walker
    ax.set_xlim(0, len(samples))
    ax.set_ylabel(labels[i])
    ax.yaxis.set_label_coords(-0.1, 0.5)

axes[-1].set_xlabel("step number")
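
The autocorrelation check mentioned above can be sketched with NumPy alone. The example uses a synthetic AR(1) series as a stand-in for the chain of a single walker (an assumption, not output from this notebook) and a hypothetical helper `autocorr`. emcee ships its own autocorrelation tools, which are not used here.

```python
import numpy as np


def autocorr(x, max_lag):
    """Normalized sample autocorrelation of a 1D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)  # sum of squares, so that the value at lag 0 equals 1
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


# Synthetic chain: AR(1) process with a lag-one correlation of 0.9
rng = np.random.default_rng(1)
phi = 0.9
chain = np.zeros(20000)
for t in range(1, len(chain)):
    chain[t] = phi * chain[t - 1] + rng.normal()

acf = autocorr(chain, max_lag=100)
# First lag at which the autocorrelation drops below 0.05
first_small = int(np.argmax(acf < 0.05))
print(f"lag-1 autocorrelation: {acf[1]:.2f}, below 0.05 from lag {first_small}")
```

For a real chain, the same helper could be applied to one walker and one parameter at a time, for example to a single column of the array returned by `get_chain()`.
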

# Median and standard deviation of the posterior samples
mcn_params = pd.DataFrame(index=ls_params.index, columns=["mcn_opt", "mcn_sig"])
params = ml2.solver.sampler.get_chain(
    flat=True, discard=500
)  # discard first 500 steps of every chain
for iparam in range(params.shape[1] - 1):
    mcn_params.iloc[iparam] = np.median(params[:, iparam]), np.std(params[:, iparam])
mean_time_diff = head.index.to_series().diff().mean().total_seconds() / 86400

# Translate phi into the value of alpha also used by the noisemodel.
# Transform every sample first: the standard deviation does not carry over
# through the nonlinear transformation.
alpha_samples = -mean_time_diff / np.log(params[:, -1])
mcn_params.loc["noise_alpha", "mcn_opt"] = np.median(alpha_samples)
mcn_params.loc["noise_alpha", "mcn_sig"] = np.std(alpha_samples)
pd.concat((ls_params, mcn_params), axis=1)
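
The translation from \(\phi\) to \(\alpha\) in the cell above follows from \(\phi = e^{-\Delta t / \alpha}\), so that \(\alpha = -\Delta t / \ln \phi\). Because this transformation is nonlinear, a standard deviation for \(\alpha\) is most safely obtained by transforming every sample and computing the statistic afterwards. The median is unaffected by the order of the two operations, because the transformation is monotonic. The sketch below illustrates this with made-up \(\phi\) samples and an assumed time step of 14 days; none of these values are results from this notebook.

```python
import numpy as np

dt = 14.0  # assumed mean time step between observations, in days

# Synthetic samples of phi in (0, 1); illustrative values only
rng = np.random.default_rng(0)
phi_samples = rng.uniform(0.6, 0.8, size=5001)

# Transform every sample, then summarize
alpha_samples = -dt / np.log(phi_samples)
alpha_median = np.median(alpha_samples)
alpha_std = np.std(alpha_samples)

# The median commutes with the monotonic transformation (odd sample size)
alpha_from_median_phi = -dt / np.log(np.median(phi_samples))
print(alpha_median, alpha_from_median_phi, alpha_std)
```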

Repeat with uniform priors#

We now repeat the analysis with more or less uninformative uniform priors. This time, \(\sigma^2\) is also included.

ml3 = ps.Model(head)
rm = ps.RechargeModel(
    rain, evap, recharge=ps.rch.Linear(), rfunc=ps.Gamma(), name="rch"
)
ml3.add_stressmodel(rm)
ml3.solve(report=False)

The uniform priors range from 0.25 to 4 times the optimal values.

# Set uniform priors around the least-squares optimum.
# Use the optimal values from least squares as a good starting point.
ml3.parameters["initial"] = ml3.parameters["optimal"]
for name in ml3.parameters.index:
    if ml3.parameters.loc[name, "optimal"] > 0:
        ml3.set_parameter(
            name,
            dist="uniform",
            pmin=0.25 * ml3.parameters.loc[name, "optimal"],
            pmax=4 * ml3.parameters.loc[name, "optimal"],
        )
    else:
        ml3.set_parameter(
            name,
            dist="uniform",
            pmin=4 * ml3.parameters.loc[name, "optimal"],
            pmax=0.25 * ml3.parameters.loc[name, "optimal"],
        )

ml3.parameters
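
The sign handling in the loop above can be isolated in a small helper. This is a sketch only: `uniform_bounds` is a hypothetical function, not part of Pastas. Rejecting an optimal value of zero is a choice made here, because scaling zero would give a prior of zero width.

```python
def uniform_bounds(optimal, factor=4.0):
    """Return (pmin, pmax) spanning optimal / factor to optimal * factor.

    For negative values the two bounds are swapped so that pmin < pmax.
    """
    if optimal == 0:
        raise ValueError("cannot scale bounds around an optimal value of zero")
    lo, hi = optimal / factor, optimal * factor
    return (lo, hi) if optimal > 0 else (hi, lo)


print(uniform_bounds(2.0))   # (0.5, 8.0)
print(uniform_bounds(-2.0))  # (-8.0, -0.5)
```
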
# Choose the objective function
ln_prob = ps.objfunc.GaussianLikelihoodAr1()

# Create the EmceeSolver with some settings
s = ps.EmceeSolve(
    nwalkers=20,
    moves=emcee.moves.DEMove(),
    objective_function=ln_prob,
    progress_bar=True,
    parallel=False,
)

s.parameters.loc["ln_var", "initial"] = 0.05**2
s.parameters.loc["ln_var", "pmin"] = 0.05**2 / 4
s.parameters.loc["ln_var", "pmax"] = 4 * 0.05**2

# Use the solver to run MCMC
ml3.solve(
    solver=s,
    initial=False,
    tmin="1990",
    steps=1000,
    tune=True,
    report=False,
)
s.parameters
# Corner plot of the results
fig = plt.figure(figsize=(8, 8))

labels = list(ml3.parameters.index[ml3.parameters.vary]) + list(
    ml3.solver.parameters.index[ml3.solver.parameters.vary]
)
labels = [label.split("_")[1] for label in labels]

best = list(ml3.parameters[ml3.parameters.vary].optimal) + list(
    ml3.solver.parameters[ml3.solver.parameters.vary].optimal
)

axes = corner.corner(
    ml3.solver.sampler.get_chain(flat=True, discard=500),
    quantiles=[0.025, 0.5, 0.975],
    labelpad=0.1,
    show_titles=True,
    title_kwargs=dict(fontsize=10),
    label_kwargs=dict(fontsize=10),
    max_n_ticks=3,
    fig=fig,
    labels=labels,
    truths=best,
)

plt.show()

fig, axes = plt.subplots(len(labels), figsize=(10, 7), sharex=True)

# Keep the walker dimension so that the x-axis really is the step number
samples = ml3.solver.sampler.get_chain()  # shape: (steps, nwalkers, nparams)
for i in range(len(labels)):
    ax = axes[i]
    ax.plot(samples[:, :, i], "k", alpha=0.5)  # one line per walker
    ax.set_xlim(0, len(samples))
    ax.set_ylabel(labels[i])
    ax.yaxis.set_label_coords(-0.1, 0.5)

axes[-1].set_xlabel("step number")

# Median and standard deviation of the posterior samples
mcu_params = pd.DataFrame(index=ls_params.index, columns=["mcu_opt", "mcu_sig"])
params = ml3.solver.sampler.get_chain(
    flat=True, discard=500
)  # discard first 500 steps of every chain
for iparam in range(params.shape[1] - 1):
    mcu_params.iloc[iparam] = np.median(params[:, iparam]), np.std(params[:, iparam])
mean_time_diff = head.index.to_series().diff().mean().total_seconds() / 86400

# Translate phi into alpha; transform every sample before summarizing
alpha_samples = -mean_time_diff / np.log(params[:, -1])
mcu_params.loc["noise_alpha", "mcu_opt"] = np.median(alpha_samples)
mcu_params.loc["noise_alpha", "mcu_sig"] = np.std(alpha_samples)
pd.concat((ls_params, mcn_params, mcu_params), axis=1)

5. Compute prediction interval#

params = ml3.solver.sampler.get_chain(flat=True, discard=500)
sim = {}
# compute for 1000 random samples of the chain
np.random.seed(1)
for i in np.random.choice(len(params), size=1000, replace=False):
    h = ml3.simulate(p=params[i, :-2])  # the last two columns are solver parameters
    res = ml3.residuals(p=params[i, :-2])
    # add random noise with the standard deviation of the residuals
    h += np.random.normal(loc=0, scale=np.std(res), size=len(h))
    sim[i] = h
simdf = pd.DataFrame.from_dict(sim, orient="columns", dtype=float)
alpha = 0.05
q = [alpha / 2, 1 - alpha / 2]
pi = simdf.quantile(q, axis=1).transpose()
pimean = np.mean(pi[0.975] - pi[0.025])  # mean width of the 95% interval
print(f"prediction interval emcee with uniform priors: {pimean:.3f} m")
print(f"PICP: {ps.stats.picp(head, pi):.3f}")
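
The PICP (prediction interval coverage probability) reported above is the fraction of observations that fall inside the prediction interval. The sketch below computes that fraction with pandas for synthetic data. `coverage` is a hypothetical helper and may differ in detail from `ps.stats.picp`, for example in how indices are aligned or missing values are treated.

```python
import numpy as np
import pandas as pd


def coverage(obs, lower, upper):
    """Fraction of observations lying within [lower, upper]."""
    inside = (obs >= lower) & (obs <= upper)
    return float(inside.mean())


# Synthetic example: standard normal observations and a fixed 95% interval
rng = np.random.default_rng(1)
index = pd.date_range("2000-01-01", periods=2000, freq="D")
obs = pd.Series(rng.normal(size=len(index)), index=index)
lower = pd.Series(-1.96, index=index)
upper = pd.Series(1.96, index=index)

# For a well-calibrated 95% interval this should be close to 0.95
print(f"coverage: {coverage(obs, lower, upper):.3f}")
```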

For this example, the prediction interval is dominated by the residuals, not by the uncertainty of the parameters. In the code cell below, the parameter uncertainty is left out by using only the parameter set with the largest log-probability. The coverage changes only slightly, and the change is mostly caused by the different random draws of the residuals.

logprob = ml3.solver.sampler.compute_log_prob(
    ml3.solver.sampler.get_chain(flat=True, discard=500)
)[0]
imax = np.argmax(logprob)  # parameter set with the largest log-probability

params = ml3.solver.sampler.get_chain(flat=True, discard=500)
sim = {}
# compute for 1000 random samples of residuals, but one parameter set
h = ml3.simulate(p=params[imax, :-2])
res = ml3.residuals(p=params[imax, :-2])
np.random.seed(1)
for i in range(1000):
    sim[i] = h + np.random.normal(loc=0, scale=np.std(res), size=len(h))
simdf = pd.DataFrame.from_dict(sim, orient="columns", dtype=float)
alpha = 0.05
q = [alpha / 2, 1 - alpha / 2]
pi = simdf.quantile(q, axis=1).transpose()
pimean = np.mean(pi[0.975] - pi[0.025])  # mean width of the 95% interval
print(f"prediction interval emcee, single parameter set: {pimean:.3f} m")
print(f"PICP: {ps.stats.picp(head, pi):.3f}")