Frequentist uncertainty vs. Bayesian uncertainty analysis#
Mark Bakker, TU Delft & Raoul Collenteur, Eawag, February, 2025
In this notebook, the fit and uncertainty are compared for pastas
models solved with least squares (frequentist uncertainty) and with MCMC (Bayesian uncertainty).
Besides Pastas, the following Python packages have to be installed to run this notebook (e.g., with `pip install emcee corner`):
import corner
import emcee
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pastas as ps
ps.set_log_level("ERROR")
ps.show_versions()
Pastas version: 1.9.0
Python version: 3.11.10
NumPy version: 2.2.5
Pandas version: 2.2.3
SciPy version: 1.15.2
Matplotlib version: 3.10.1
Numba version: 0.61.2
1. A ‘regular’ Pastas Model#
The first step is to create a Pastas Model with a linear RechargeModel and a Gamma response function to simulate the effect of precipitation and evaporation on the heads. The AR1 noise model is used. We first estimate the model parameters using the standard least-squares approach.
head = pd.read_csv(
"data/B32C0639001.csv", parse_dates=["date"], index_col="date"
).squeeze()
head = head["1990":] # use data from 1990 on for this example
evap = pd.read_csv("data/evap_260.csv", index_col=0, parse_dates=[0]).squeeze()
rain = pd.read_csv("data/rain_260.csv", index_col=0, parse_dates=[0]).squeeze()
ml1 = ps.Model(head)
ml1.add_noisemodel(ps.ArNoiseModel())
rm = ps.RechargeModel(
rain, evap, recharge=ps.rch.Linear(), rfunc=ps.Gamma(), name="rch"
)
ml1.add_stressmodel(rm)
ml1.solve()
ax = ml1.plots.results(figsize=(10, 4))
Fit report head Fit Statistics
================================================
nfev 23 EVP 84.78
nobs 351 R2 0.85
noise True RMSE 0.08
tmin 1990-01-02 00:00:00 AICc -1976.58
tmax 2005-10-14 00:00:00 BIC -1953.65
freq D Obj 0.61
warmup 3650 days 00:00:00 ___
solver LeastSquares Interp. No
Parameters (6 optimized)
================================================
optimal initial vary
rch_A 0.307036 0.198424 True
rch_n 0.872701 1.000000 True
rch_a 145.956334 10.000000 True
rch_f -0.639200 -1.000000 True
constant_d 0.935562 1.338063 True
noise_alpha 41.250126 15.000000 True
Warnings! (1)
================================================
Response tmax for 'rch' > than warmup period.

The diagnostics show that the noise meets the statistical requirements for uncertainty analysis reasonably well.
ml1.plots.diagnostics();

The estimated least-squares parameters and standard errors are stored for later reference.
ls_params = ml1.parameters[["optimal", "stderr"]].copy()
ls_params.rename(columns={"optimal": "ls_opt", "stderr": "ls_sig"}, inplace=True)
ls_params
| | ls_opt | ls_sig |
|---|---|---|
| rch_A | 0.307036 | 0.021350 |
| rch_n | 0.872701 | 0.026214 |
| rch_a | 145.956334 | 18.398060 |
| rch_f | -0.639200 | 0.069405 |
| constant_d | 0.935562 | 0.048531 |
| noise_alpha | 41.250126 | 6.332870 |
# Compute the prediction interval with Pastas
pi = ml1.solver.prediction_interval(n=1000)
ax = ml1.plot(figsize=(10, 3))
ax.fill_between(pi.index, pi.iloc[:, 0], pi.iloc[:, 1], color="lightgray")
ax.legend(["Observations", "Simulation", "95% Prediction interval"], ncol=3, loc=2)
pi_pasta = np.mean(pi[0.975] - pi[0.025])
print(f"Mean prediction interval width: {pi_pasta:.3f} m")
print(f"Prediction interval coverage probability: {ps.stats.picp(head, pi):.3f}")
Mean prediction interval width: 0.324 m
Prediction interval coverage probability: 0.949
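The prediction interval coverage probability (PICP) is the fraction of observations that falls within the prediction interval; a value close to the nominal 0.95 indicates that the 95% prediction interval is well calibrated.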

2. Use the EmceeSolver#
We will now use MCMC to estimate the model parameters and their uncertainties. The `EmceeSolve` solver wraps the emcee package, which implements different versions of MCMC. A good understanding of emcee helps when using this solver, so it is recommended to check out its documentation as well.
We start by making a Pastas model with a linear recharge model and a Gamma response function. No noise model is added, as autocorrelation is taken care of in the likelihood function. The model is solved using the regular solver (least squares) to obtain good starting values for the parameters.
ml2 = ps.Model(head)
rm = ps.RechargeModel(
rain, evap, recharge=ps.rch.Linear(), rfunc=ps.Gamma(), name="rch"
)
ml2.add_stressmodel(rm)
ml2.solve()
Fit report head Fit Statistics
================================================
nfev 18 EVP 86.58
nobs 351 R2 0.87
noise False RMSE 0.08
tmin 1990-01-02 00:00:00 AICc -1797.03
tmax 2005-10-14 00:00:00 BIC -1777.90
freq D Obj 1.02
warmup 3650 days 00:00:00 ___
solver LeastSquares Interp. No
Parameters (5 optimized)
================================================
optimal initial vary
rch_A 0.314026 0.198424 True
rch_n 0.800945 1.000000 True
rch_a 235.019665 10.000000 True
rch_f -0.898129 -1.000000 True
constant_d 1.051883 1.338063 True
Warnings! (1)
================================================
Response tmax for 'rch' > than warmup period.
To set up the `EmceeSolve` solver, a number of decisions need to be made:

- Select the priors of the parameters
- Select a (log) likelihood function
- Select the number of steps and thinning
2a. Priors#
The first step is to select the priors of the parameters. This is done using the `ml.set_parameter` method and the `dist` argument (short for distribution). Any distribution from `scipy.stats` can be chosen, for example `uniform`, `norm`, or `lognorm`. Here, we select normal distributions for the priors. Currently, Pastas uses the `initial` value of the parameter as the `loc` argument of the distribution (e.g., the mean of a normal distribution), and the `stderr` as the `scale` argument (e.g., the standard deviation of a normal distribution). Only for parameters with a `uniform` distribution are the `pmin` and `pmax` values used to determine the prior. By default, all parameters are assigned a `uniform` prior.
# Set normal distributions as priors for the parameters
ml2.parameters["initial"] = ml2.parameters[
    "optimal"
]  # set initial value to the least-squares optimum for a good starting point
ml2.parameters["stderr"] = (
    2 * ml2.parameters["stderr"]
)  # this column is used (for now) to set the scale of the normal distribution
for name in ml2.parameters.index:
    ml2.set_parameter(
        name,
        dist="norm",
    )
ml2.parameters
| | initial | pmin | pmax | vary | name | dist | stderr | optimal |
|---|---|---|---|---|---|---|---|---|
| rch_A | 0.314026 | 0.00001 | 19.842364 | True | rch | norm | 0.022943 | 0.314026 |
| rch_n | 0.800945 | 0.01000 | 100.000000 | True | rch | norm | 0.049724 | 0.800945 |
| rch_a | 235.019665 | 0.01000 | 10000.000000 | True | rch | norm | 46.257951 | 235.019665 |
| rch_f | -0.898129 | -2.00000 | 0.000000 | True | rch | norm | 0.107651 | -0.898129 |
| constant_d | 1.051883 | NaN | NaN | True | constant | norm | 0.057945 | 1.051883 |
2b. Create the solver instance#
The next step is to create an instance of the `EmceeSolve` solver class. At this stage, all settings for creating the Ensemble Sampler need to be provided (https://emcee.readthedocs.io/en/stable/user/sampler/). Important settings are `nwalkers`, the `moves`, and the `objective_function`. More advanced options are to parallelize the MCMC algorithm (`parallel=True`) and to set a backend to store the results. Here's an example:
# Choose the objective function
ln_prob = ps.objfunc.GaussianLikelihoodAr1()
# Create the EmceeSolver with some settings
s = ps.EmceeSolve(
nwalkers=20,
moves=emcee.moves.DEMove(),
objective_function=ln_prob,
progress_bar=True,
parallel=False,
)
In the above code we created an `EmceeSolve` instance with 20 walkers, which take steps according to the `DEMove` move algorithm (see the emcee docs), and a Gaussian likelihood function that assumes AR(1)-correlated errors. Different objective functions are available; see the Pastas documentation for the options.
Depending on the likelihood function, a number of additional parameters need to be inferred. These parameters are not added to the Pastas Model instance, but are available from the solver object; they can be changed using the `set_parameter` method of the solver. In this example, where we use the `GaussianLikelihoodAr1` function, the variance of the errors \(\sigma^2\) and the autoregressive parameter \(\phi\) are estimated.
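For reference, the log-likelihood maximized by `GaussianLikelihoodAr1` is, as a sketch (the exact treatment of the first residual may differ), the Gaussian log-likelihood of the AR(1) innovations \(\epsilon_t - \phi\,\epsilon_{t-1}\):
\(\ln L = -\frac{n-1}{2}\ln(2\pi\sigma^2) - \sum_{t=2}^{n} \frac{\left(\epsilon_t - \phi\,\epsilon_{t-1}\right)^2}{2\sigma^2}\)
where \(\epsilon_t\) are the model residuals.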
s.parameters
| | initial | pmin | pmax | vary | stderr | name | dist |
|---|---|---|---|---|---|---|---|
| ln_var | 0.05 | 1.000000e-10 | 1.00000 | True | 0.01 | ln | uniform |
| ln_phi | 0.50 | 1.000000e-10 | 0.99999 | True | 0.20 | ln | uniform |
# Use the noise variance of the least-squares model as the initial value of ln_var
sigsq = ml1.noise().std() ** 2
s.set_parameter("ln_var", initial=sigsq, vary=True)
s.parameters.loc["ln_var", "stderr"] = sigsq / 8
s.parameters
| | initial | pmin | pmax | vary | stderr | name | dist |
|---|---|---|---|---|---|---|---|
| ln_var | 0.00347 | 1.000000e-10 | 1.00000 | True | 0.000434 | ln | uniform |
| ln_phi | 0.50000 | 1.000000e-10 | 0.99999 | True | 0.200000 | ln | uniform |
2c. Run the solver and solve the model#
After setting the parameters and creating an `EmceeSolve` solver instance, we are ready to run the MCMC analysis by calling `ml.solve`. We can pass the same arguments that we normally provide to this method (e.g., `tmin` or `fit_constant`). Here we use the initial parameters from our least-squares solve, and do not add a noise model, because autocorrelated errors are taken into account through the likelihood function.
All arguments that are not used by `ml.solve`, for example `steps` and `tune`, are passed on to the `run_mcmc` method of the sampler (see the emcee docs). The most important is the `steps` argument, which determines how many steps each of the walkers takes.
# Use the solver to run MCMC
ml2.solve(
solver=s,
initial=False,
tmin="1990",
steps=1000,
tune=True,
report=False,
)
3. Posterior parameter distributions#
The results of the MCMC analysis are stored in the `sampler` object, accessible through the `ml.solver.sampler` variable. The object `ml.solver.sampler.flatchain` contains a Pandas DataFrame with the \(n\) parameter samples, where \(n\) is computed as:
\(n = \frac{\left(\text{steps}-\text{burn}\right)\cdot\text{nwalkers}}{\text{thin}} \)
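As a quick check with the settings used in this notebook (steps=1000, a burn-in of 500 steps, 20 walkers, and the default thinning of 1):
steps, burn, nwalkers, thin = 1000, 500, 20, 1
n = (steps - burn) * nwalkers // thin
print(n)  # 10000 samples remain in the flattened chain after the burn-in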
Corner.py#
Corner is a simple but great Python package that makes creating corner plots easy. A couple of lines of code suffice to create a plot of the parameter distributions and the covariances between the parameters.
# Corner plot of the results
fig = plt.figure(figsize=(8, 8))
labels = list(ml2.parameters.index[ml2.parameters.vary]) + list(
ml2.solver.parameters.index[ml2.solver.parameters.vary]
)
labels = [label.split("_")[1] for label in labels]
best = list(ml2.parameters[ml2.parameters.vary].optimal) + list(
ml2.solver.parameters[ml2.solver.parameters.vary].optimal
)
axes = corner.corner(
ml2.solver.sampler.get_chain(flat=True, discard=500),
quantiles=[0.025, 0.5, 0.975],
labelpad=0.1,
show_titles=True,
title_kwargs=dict(fontsize=10),
label_kwargs=dict(fontsize=10),
max_n_ticks=3,
fig=fig,
labels=labels,
truths=best,
)
plt.show()
4. The trace shows when MCMC converges#
The walkers take a step in a different direction at each iteration. After a number of steps, the directions are expected to become random, a sign that an optimum has been found. This can be checked by looking at the autocorrelation of the chains, which should become insignificant after a number of steps. Below we show how to obtain the different chains; their interpretation is outside the scope of this notebook.
fig, axes = plt.subplots(len(labels), figsize=(10, 7), sharex=True)
samples = ml2.solver.sampler.get_chain(flat=True)
for i in range(len(labels)):
    ax = axes[i]
    ax.plot(samples[:, i], "k", alpha=0.5)
    ax.set_xlim(0, len(samples))
    ax.set_ylabel(labels[i])
    ax.yaxis.set_label_coords(-0.1, 0.5)
axes[-1].set_xlabel("step number")
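As a rough numerical check (a sketch, not part of the original analysis), emcee can estimate the integrated autocorrelation time of the chains, which should be much smaller than the chain length:
# tol=0 suppresses the error that emcee raises when the chains are too
# short for a reliable estimate of the autocorrelation time
tau = ml2.solver.sampler.get_autocorr_time(tol=0)
print(tau)  # one estimate per parameter, in number of steps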
mcn_params = pd.DataFrame(index=ls_params.index, columns=["mcn_opt", "mcn_sig"])
params = ml2.solver.sampler.get_chain(
    flat=True, discard=500
)  # discard first 500 steps of every chain
for iparam in range(params.shape[1] - 1):
    mcn_params.iloc[iparam] = np.median(params[:, iparam]), np.std(params[:, iparam])
mean_time_diff = head.index.to_series().diff().mean().total_seconds() / 86400
# Translate phi into the alpha used by the noise model:
# phi = exp(-dt / alpha), so alpha = -dt / ln(phi)
mcn_params.loc["noise_alpha", "mcn_opt"] = -mean_time_diff / np.log(
    np.median(params[:, -1])
)
mcn_params.loc["noise_alpha", "mcn_sig"] = -mean_time_diff / np.log(
    np.std(params[:, -1])
)
pd.concat((ls_params, mcn_params), axis=1)
Repeat with uniform priors#
Set more or less uninformative uniform priors; this time, \(\sigma^2\) is also included.
ml3 = ps.Model(head)
rm = ps.RechargeModel(
rain, evap, recharge=ps.rch.Linear(), rfunc=ps.Gamma(), name="rch"
)
ml3.add_stressmodel(rm)
ml3.solve(report=False)
The uniform priors are selected to range from 0.25 to 4 times the optimal values.
# Set uniform priors around the least-squares estimates
ml3.parameters["initial"] = ml3.parameters[
    "optimal"
]  # set initial value to the least-squares optimum for a good starting point
for name in ml3.parameters.index:
    if ml3.parameters.loc[name, "optimal"] > 0:
        ml3.set_parameter(
            name,
            dist="uniform",
            pmin=0.25 * ml3.parameters.loc[name, "optimal"],
            pmax=4 * ml3.parameters.loc[name, "optimal"],
        )
    else:
        ml3.set_parameter(
            name,
            dist="uniform",
            pmin=4 * ml3.parameters.loc[name, "optimal"],
            pmax=0.25 * ml3.parameters.loc[name, "optimal"],
        )
ml3.parameters
# Choose the objective function
ln_prob = ps.objfunc.GaussianLikelihoodAr1()
# Create the EmceeSolver with some settings
s = ps.EmceeSolve(
nwalkers=20,
moves=emcee.moves.DEMove(),
objective_function=ln_prob,
progress_bar=True,
parallel=False,
)
s.parameters.loc["ln_var", "initial"] = 0.05**2
s.parameters.loc["ln_var", "pmin"] = 0.05**2 / 4
s.parameters.loc["ln_var", "pmax"] = 4 * 0.05**2
# Use the solver to run MCMC
ml3.solve(
solver=s,
initial=False,
tmin="1990",
steps=1000,
tune=True,
report=False,
)
s.parameters
# Corner plot of the results
fig = plt.figure(figsize=(8, 8))
labels = list(ml3.parameters.index[ml3.parameters.vary]) + list(
ml3.solver.parameters.index[ml3.solver.parameters.vary]
)
labels = [label.split("_")[1] for label in labels]
best = list(ml3.parameters[ml3.parameters.vary].optimal) + list(
ml3.solver.parameters[ml3.solver.parameters.vary].optimal
)
axes = corner.corner(
ml3.solver.sampler.get_chain(flat=True, discard=500),
quantiles=[0.025, 0.5, 0.975],
labelpad=0.1,
show_titles=True,
title_kwargs=dict(fontsize=10),
label_kwargs=dict(fontsize=10),
max_n_ticks=3,
fig=fig,
labels=labels,
truths=best,
)
plt.show()
fig, axes = plt.subplots(len(labels), figsize=(10, 7), sharex=True)
samples = ml3.solver.sampler.get_chain(flat=True)
for i in range(len(labels)):
    ax = axes[i]
    ax.plot(samples[:, i], "k", alpha=0.5)
    ax.set_xlim(0, len(samples))
    ax.set_ylabel(labels[i])
    ax.yaxis.set_label_coords(-0.1, 0.5)
axes[-1].set_xlabel("step number")
mcu_params = pd.DataFrame(index=ls_params.index, columns=["mcu_opt", "mcu_sig"])
params = ml3.solver.sampler.get_chain(
    flat=True, discard=500
)  # discard first 500 steps of every chain
for iparam in range(params.shape[1] - 1):
    mcu_params.iloc[iparam] = np.median(params[:, iparam]), np.std(params[:, iparam])
mean_time_diff = head.index.to_series().diff().mean().total_seconds() / 86400
# Translate phi into the alpha used by the noise model, as before
mcu_params.loc["noise_alpha", "mcu_opt"] = -mean_time_diff / np.log(
    np.median(params[:, -1])
)
mcu_params.loc["noise_alpha", "mcu_sig"] = -mean_time_diff / np.log(
    np.std(params[:, -1])
)
pd.concat((ls_params, mcn_params, mcu_params), axis=1)
5. Compute prediction interval#
nobs = len(head)
params = ml3.solver.sampler.get_chain(flat=True, discard=500)
sim = {}
# compute simulations for 1000 random samples from the chain
# (the flattened chain contains (1000 - 500) * 20 = 10000 samples)
np.random.seed(1)
for i in np.random.choice(np.arange(10000), size=1000, replace=False):
    h = ml3.simulate(p=params[i, :-2])
    res = ml3.residuals(p=params[i, :-2])
    h += np.random.normal(loc=0, scale=np.std(res), size=len(h))
    sim[i] = h
simdf = pd.DataFrame.from_dict(sim, orient="columns", dtype=float)
alpha = 0.05
q = [alpha / 2, 1 - alpha / 2]
pi = simdf.quantile(q, axis=1).transpose()
pimean = np.mean(pi[0.975] - pi[0.025])
print(f"prediction interval emcee with uniform priors: {pimean:.3f} m")
print(f"PICP: {ps.stats.picp(head, pi):.3f}")
For this example, the prediction interval is dominated by the residuals, not by the uncertainty of the parameters. In the code cell below, the parameter uncertainty is not included: the coverage changes only slightly and is mostly affected by the randomness in drawing the residuals.
logprob = ml3.solver.sampler.compute_log_prob(
    ml3.solver.sampler.get_chain(flat=True, discard=500)
)[0]
imax = np.argmax(logprob)  # parameter set with the largest likelihood
nobs = len(head)
params = ml3.solver.sampler.get_chain(flat=True, discard=500)
sim = {}
# compute for 1000 random draws of the residuals, but one parameter set
h = ml3.simulate(p=params[imax, :-2])
res = ml3.residuals(p=params[imax, :-2])
np.random.seed(1)
for i in range(1000):
    sim[i] = h + np.random.normal(loc=0, scale=np.std(res), size=len(h))
simdf = pd.DataFrame.from_dict(sim, orient="columns", dtype=float)
alpha = 0.05
q = [alpha / 2, 1 - alpha / 2]
pi = simdf.quantile(q, axis=1).transpose()
pimean = np.mean(pi[0.975] - pi[0.025])
print(f"prediction interval emcee with uniform priors: {pimean:.3f} m")
print(f"PICP: {ps.stats.picp(head, pi):.3f}")