Checking Pastas models#
Developed by D.A. Brakenhoff, Artesia
This notebooks showcases the pastas.check submodule. This module can be used to check your time series models with pre-defined checks, or users can define their own checks.
For more elaborate discussions on why checking different aspects of your model is a good idea, we refer you to some of the other notebooks in the documentation, e.g. diagnostic_checking.ipynb and/or stowa_calibration.ipynb. In this notebook we focus on the pastas.check tools.
import pandas as pd
import pastas as ps
Build a simple model#
First let’s load some head data and build a simple time series model.
obs = pd.read_csv("data/head_nb1.csv", index_col=0, parse_dates=True).squeeze("columns")
rain = pd.read_csv("data/rain_nb1.csv", index_col=0, parse_dates=True).squeeze(
"columns"
)
evap = pd.read_csv("data/evap_nb1.csv", index_col=0, parse_dates=True).squeeze(
"columns"
)
We use a linear recharge model, so \(R = P - f \cdot E\). Build the model and solve it.
ml = ps.Model(obs, name="groundwater_head")
sm = ps.RechargeModel(prec=rain, evap=evap, rfunc=ps.Gamma(), name="recharge")
ml.add_stressmodel(sm)
ml.solve()
Fit report groundwater_head Fit Statistics
==================================================
nfev 14 EVP 93.28
nobs 644 R2 0.93
noise False RMSE 0.11
tmin 1985-11-14 00:00:00 AICc -2816.26
tmax 2015-06-28 00:00:00 BIC -2794.01
freq D Obj 4.00
freq_obs None ___
warmup 3650 days 00:00:00 Interp. No
solver LeastSquares weights No
Parameters (5 optimized)
==================================================
optimal initial vary
recharge_A 618.958765 215.674528 True
recharge_n 1.049253 1.000000 True
recharge_a 146.196518 10.000000 True
recharge_f -1.408118 -1.000000 True
constant_d 28.019661 27.900078 True
ax = ml.plot(figsize=(10, 3))
Checking the model#
The pastas check module contains check functions and a convenience function if you
want to do multiple checks on a single model.
Let’s inspect the module:
ps.check?
What is a check function?#
Check functions are the methods that determine whether a pastas Model meets certain criteria (passes the check) or fails to meet those criteria (fails the check). The criteria can be anything, from goodness-of-fit statistics to the magnitude of the variation of some contribution to the head. It is up to the modeller to decide which criteria their model should be subjected to.
Check functions follow a specific template:
A
pastas.Modelas the first argumentAny number of keyword arguments after that
Return a DataFrame containing information about the performed check. The following columns are defined:
statistic: the test statistic
operator: the operator, e.g. >, ==, within, etc.
threshold: the user-specified threshold or comparison values
dimensions: units or dimensions of statistic (useful for interpretation)
pass: whether the check is passed or not
comment: any additional comments relevant to the check
Pre-defined checks#
The pastas.check module contains several pre-defined checks that are frequently
applied to time series models. These can be listed with ps.check.checks which is a
dictionary containing the names and functions.
list(ps.check.checks.keys())
['stat_ufunc_threshold',
'rsq_geq_threshold',
'parameter_ufunc_threshold',
'parameters_leq_threshold',
'response_memory',
'response_memory_vs_warmup',
'uncertainty_gain',
'parameter_bounds',
'uncertainty_parameters',
'acf_runs_test',
'acf_stoffer_toloi_test',
'acf_ljung_box_test',
'correlation_sim_vs_res']
Check: \(R^2 \geq\) threshold#
Let’s try applying the rsq_geq_threshold check to our model, which tests whether the
\(R^2 \geq s\), where \(s\) is some user-defined threshold. We can already see the \(R^2\)
fit in the plot of the model simulation above, so let’s say we want our fit to be
greater or equal to 0.9. The model should obviously pass this check.
ps.check.rsq_geq_threshold(ml, threshold=0.9)
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| rsq>=0.9 | 0.932792 | >= | 0.9 | - | True |
This example isn’t very interesting, but you can already see that when we want to apply multiple checks to all of our models, having each of these checks contained within a function can be useful.
Writing your own checks#
Now as an extra example let’s write our own check function. This time we want to check whether our model is sufficiently good at simulating the low groundwater levels in summer.
We’ll do this by checking the goodness-of-fit for the summer periods. We can write a function that accepts a list of months in which to consider the residuals in order to calculate the fit statistics.
def rsq_geq_threshold_in_months(
ml: ps.Model, threshold: float, months: list[int]
) -> pd.DataFrame:
"""Check if the R² of the model is >= to a threshold in specific months.
Parameters
----------
ml : pastas.Model
The Pastas model to check.
threshold : float
The R² threshold value.
months : list of ints
The month numbers to consider in each year.
Returns
-------
pd.DataFrame
A DataFrame showing the check results
"""
res = ml.residuals() # get the model residuals
mask = res.index.month.isin(months) # define a mask for the selected months
rsq = ps.stats.rsq(obs=ml.observations().loc[mask], res=res.loc[mask]) # compute R²
# store results and context
df = ps.check.get_empty_check_dataframe()
df.loc["rsq_geq_threshold_in_months"] = (
rsq,
">=",
threshold,
"-",
rsq >= threshold,
f"in months: {months}",
)
return df
rsq_geq_threshold_in_months(ml, threshold=0.8, months=[6, 7, 8])
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| rsq_geq_threshold_in_months | 0.839281 | >= | 0.8 | - | True | in months: [6, 7, 8] |
Applying multiple checks to a model#
Often we want to check multiple criteria when deciding whether our model is fit for purpose.
Pre-defined checks#
Some of the pre-defined checks already check multiple criteria, for example whether the parameters do not lie on the parameter bounds. These methods return a DataFrame with multiple rows, with the results of each check on a separate row.
Check: Parameter bounds#
This check is automatically performed when solving a time series model, and any warnings are logged and reported at the bottom of the fit report. But it can be useful to include it in your list of checks. When an optimal parameter lies on a boundary, the check fails.
ps.check.parameter_bounds(ml)
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| Bounds: recharge_A | 618.958765 | within | (1e-05, 21567.452806178622) | [L] / [unit 'recharge' stress] | True | |
| Bounds: recharge_n | 1.049253 | within | (0.1, 5.0) | True | ||
| Bounds: recharge_a | 146.196518 | within | (0.01, 10000.0) | True | ||
| Bounds: recharge_f | -1.408118 | within | (-2.0, 0.0) | [-] | True | |
| Bounds: constant_d | 28.019661 | within | (nan, nan) | [L] | True |
Check: Uncertainty parameters#
This check tests whether the absolute value of the optimal parameters is larger than some factor times the estimated standard deviation of the parameter. Basically, it checks whether the parameter can be estimated with sufficient certainty. This check requires that the estimate of \(\sigma\) is reliable, which is the case when the noise meets certain requirements. This check always includes this comment, to remind users to check the noise prior to trusting the estimated parameter uncertainties.
ps.check.uncertainty_parameters(ml, n_std=1.96)
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| |recharge_A| > 1.96σ | 618.958765 | > | 33.851962 | [L] / [unit 'recharge' stress] | True | Assumes estimate of σ is reliable. |
| |recharge_n| > 1.96σ | 1.049253 | > | 0.042542 | True | Assumes estimate of σ is reliable. | |
| |recharge_a| > 1.96σ | 146.196518 | > | 16.967470 | True | Assumes estimate of σ is reliable. | |
| |recharge_f| > 1.96σ | -1.408118 | > | 0.087412 | [-] | True | Assumes estimate of σ is reliable. |
| |constant_d| > 1.96σ | 28.019661 | > | 0.085939 | [L] | True | Assumes estimate of σ is reliable. |
Check: Length response relative to calibratiod period or warmup#
The memory of the response function indicates the time it takes until the effect of some change has taken place. For example, the time it takes for the groundwater level to rise after 1 mm of rain today. This time is commonly expressed as a percentage e.g. \(t_{95}\), the time it takes until 95% of the rise has taken place. (Since the response is asymptotic, the 100% lies at time infinitiy.)
This memory should never be longer than the calibration period, and preferably be significantly shorter. By default the \(t_{95}\) is used and it is compared to half the length of the calibration period.
ps.check.response_memory(ml, cutoff=0.95, factor_length_oseries=0.5)
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| t95_recharge < 0.5 Δt_calib | 451.811512 | < | 5409.0 | [T] | True |
The memory can also compared to the warmup period. When the memory exceeds the warmup period, the model is not yet done warming up by the time the simulation starts, which can cause issues when calibrating or simulating the model for different time periods.
ps.check.response_memory_vs_warmup(ml, cutoff=0.95)
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| t95_recharge < warmup | 451.811512 | < | 3650 | [T] | True |
The checklist function for performing multiple checks#
The function ps.check.checklist is a convenience function for applying multiple
checks to a model. This function accepts a list of checks. This list can consist of the
following items:
the name of a built-in check, e.g. “rsq_geq_threshold”
any function that requires only a model as its input
a dictionary containing any function and any additional arguments to be passed to the check function.
Note that in the first two cases it is not possible to alter any additional arguments to the functions. If relevant it will use default values. Additionally, each check function should return a DataFrame with the expected columns, so the results can be combined.
The example below show-cases the application of 3 checks in each of the ways described above. Optionally, a report is shown (basically the resulting DataFrame) in which the checks (pass/fail) are colored for quick visual inspection.
# add checks to a list
checklist = [
# name of a built-in check
"rsq_geq_threshold",
# any check function that only requires the model as input
ps.check.parameter_bounds,
# check function, with additional arguments
{
"func": rsq_geq_threshold_in_months,
"threshold": 0.8,
"months": [6, 7, 8],
},
]
# perform all checks on a model, and optionally display a report
checks = ps.check.checklist(ml, checklist, report=True)
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| rsq>=0.7 | 0.932792 | >= | 0.700000 | - | True | |
| Bounds: recharge_A | 618.958765 | within | (np.float64(1e-05), np.float64(21567.452806178622)) | [L] / [unit 'recharge' stress] | True | |
| Bounds: recharge_n | 1.049253 | within | (np.float64(0.1), np.float64(5.0)) | True | ||
| Bounds: recharge_a | 146.196518 | within | (np.float64(0.01), np.float64(10000.0)) | True | ||
| Bounds: recharge_f | -1.408118 | within | (np.float64(-2.0), np.float64(0.0)) | [-] | True | |
| Bounds: constant_d | 28.019661 | within | (np.float64(nan), np.float64(nan)) | [L] | True | |
| rsq_geq_threshold_in_months | 0.839281 | >= | 0.800000 | - | True | in months: [6, 7, 8] |
Checks in literature#
Currently, the ps.check module contains two pre-defined checklists taken from two articles on time series analysis:
the checks used in Brakenhoff et al. (2022).
the checks used in Zaadnoordijk et al. (2019).
These articles used certain checks to determine whether a time series model was reliable or not. Note that these checklists were developed to address specific research questions and hydrogeological contexts. They are provided as reference implementations but may not be universally applicable across all modeling scenarios.
These checklists can be passed to the ps.check.checklist() function. Note that some
of these checks require the noise to meet the requirement of white noise. Our model
currently does not have a noise model, so let’s see what that does to the checks.
checklist_literature = ps.check.get_checks_literature("brakenhoff_2022")
checklist_literature
[{'func': <function pastas.check.rsq_geq_threshold(ml: pastas.model.Model, threshold: float = 0.7)>,
'threshold': 0.7},
{'func': <function pastas.check.response_memory(ml: pastas.model.Model, cutoff: float = 0.95, factor_length_oseries: float = 0.5, names: list[str] | str | None = None)>,
'cutoff': 0.95,
'factor_length_oseries': 0.5},
{'func': <function pastas.check.acf_runs_test(ml: pastas.model.Model, p_threshold: float = 0.05)>},
{'func': <function pastas.check.uncertainty_gain(ml: pastas.model.Model, n_std: float = 1.96, names: list[str] | str | None = None)>,
'n_std': 1.96},
{'func': <function pastas.check.parameter_bounds(ml: pastas.model.Model, parameters: list[str] | str | None = None)>}]
checks_no_noise = ps.check.checklist(ml, checklist_literature)
Noise cannot be calculated if there is no noisemodel present or is not used during parameter estimation.
No noise model found in model. Using residuals instead.
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| rsq>=0.7 | 0.932792 | >= | 0.700000 | - | True | |
| t95_recharge < 0.5 Δt_calib | 451.811512 | < | 5409.000000 | [T] | True | |
| runs_test (p > α) | 0.000000 | > | 0.050000 | - | False | |
| |recharge_A| > 1.96σ | 618.958765 | > | 33.851962 | [L] / [unit stress] | True | Assumes estimate of σ is reliable. |
| Bounds: recharge_A | 618.958765 | within | (np.float64(1e-05), np.float64(21567.452806178622)) | [L] / [unit 'recharge' stress] | True | |
| Bounds: recharge_n | 1.049253 | within | (np.float64(0.1), np.float64(5.0)) | True | ||
| Bounds: recharge_a | 146.196518 | within | (np.float64(0.01), np.float64(10000.0)) | True | ||
| Bounds: recharge_f | -1.408118 | within | (np.float64(-2.0), np.float64(0.0)) | [-] | True | |
| Bounds: constant_d | 28.019661 | within | (np.float64(nan), np.float64(nan)) | [L] | True |
Now let’s add a noisemodel, solve the model again and run the checks again.
ml.add_noisemodel(ps.ArNoiseModel())
ml.solve(initial=False)
Fit report groundwater_head Fit Statistics
==================================================
nfev 22 EVP 92.92
nobs 644 R2 0.93
noise True RMSE 0.11
tmin 1985-11-14 00:00:00 AICc -3256.75
tmax 2015-06-28 00:00:00 BIC -3230.07
freq D Obj 2.01
freq_obs None ___
warmup 3650 days 00:00:00 Interp. No
solver LeastSquares weights No
Parameters (6 optimized)
==================================================
optimal initial vary
recharge_A 680.960195 618.958765 True
recharge_n 1.017608 1.049253 True
recharge_a 150.847155 146.196518 True
recharge_f -1.278594 -1.408118 True
constant_d 27.890750 28.019661 True
noise_alpha 49.257301 15.000000 True
Note that the autocorrelation check now passes, which means we are more confident that the estimated parameter uncertainties are reliable.
Note: No significant autocorrelation is just one of the criteria the noise has to meet in order to meet the requirements of white noise, but we will not discuss that here. For more information refer to the notebooks diagnostic_checking.ipynb and/or stowa_calibration.ipynb.
check_w_noise = ps.check.checklist(ml, checklist_literature)
| statistic | operator | threshold | dimensions | pass | comment | |
|---|---|---|---|---|---|---|
| check | ||||||
| rsq>=0.7 | 0.929243 | >= | 0.700000 | - | True | |
| t95_recharge < 0.5 Δt_calib | 457.028689 | < | 5409.000000 | [T] | True | |
| runs_test (p > α) | 0.323627 | > | 0.050000 | - | True | |
| |recharge_A| > 1.96σ | 680.960195 | > | 69.371036 | [L] / [unit stress] | True | Assumes estimate of σ is reliable. |
| Bounds: recharge_A | 680.960195 | within | (np.float64(1e-05), np.float64(21567.452806178622)) | [L] / [unit 'recharge' stress] | True | |
| Bounds: recharge_n | 1.017608 | within | (np.float64(0.1), np.float64(5.0)) | True | ||
| Bounds: recharge_a | 150.847155 | within | (np.float64(0.01), np.float64(10000.0)) | True | ||
| Bounds: recharge_f | -1.278594 | within | (np.float64(-2.0), np.float64(0.0)) | [-] | True | |
| Bounds: constant_d | 27.890750 | within | (np.float64(nan), np.float64(nan)) | [L] | True | |
| Bounds: noise_alpha | 49.257301 | within | (np.float64(1e-05), np.float64(5000.0)) | [T] | True |
References#
Brakenhoff DA, Vonk MA, Collenteur RA, Van Baar M and Bakker M (2022) Application of Time Series Analysis to Estimate Drawdown From Multiple Well Fields. Front. Earth Sci. 10:907609. doi: 10.3389/feart.2022.907609
Zaadnoordijk, W.J., Bus, S.A.R., Lourens, A. and Berendrecht, W.L. (2019), Automated Time Series Modeling for Piezometers in the National Database of the Netherlands. Groundwater, 57: 834-843. https://doi.org/10.1111/gwat.12819