Checking Pastas models#

Developed by D.A. Brakenhoff, Artesia

This notebooks showcases the pastas.check submodule. This module can be used to check your time series models with pre-defined checks, or users can define their own checks.

For more elaborate discussions on why checking different aspects of your model is a good idea, we refer you to some of the other notebooks in the documentation, e.g. diagnostic_checking.ipynb and/or stowa_calibration.ipynb. In this notebook we focus on the pastas.check tools.

import pandas as pd

import pastas as ps

Build a simple model#

First let’s load some head data and build a simple time series model.

obs = pd.read_csv("data/head_nb1.csv", index_col=0, parse_dates=True).squeeze("columns")
rain = pd.read_csv("data/rain_nb1.csv", index_col=0, parse_dates=True).squeeze(
    "columns"
)
evap = pd.read_csv("data/evap_nb1.csv", index_col=0, parse_dates=True).squeeze(
    "columns"
)

We use a linear recharge model, so \(R = P - f \cdot E\). Build the model and solve it.

ml = ps.Model(obs, name="groundwater_head")
sm = ps.RechargeModel(prec=rain, evap=evap, rfunc=ps.Gamma(), name="recharge")
ml.add_stressmodel(sm)
ml.solve()
Fit report groundwater_head         Fit Statistics
==================================================
nfev     14                     EVP          93.28
nobs     644                    R2            0.93
noise    False                  RMSE          0.11
tmin     1985-11-14 00:00:00    AICc      -2816.26
tmax     2015-06-28 00:00:00    BIC       -2794.01
freq     D                      Obj           4.00
freq_obs None                   ___               
warmup   3650 days 00:00:00     Interp.         No
solver   LeastSquares           weights        Yes

Parameters (5 optimized)
==================================================
               optimal     initial  vary
recharge_A  618.958233  215.674528  True
recharge_n    1.049254    1.000000  True
recharge_a  146.196089   10.000000  True
recharge_f   -1.408117   -1.000000  True
constant_d   28.019659   27.900078  True
ax = ml.plot(figsize=(10, 3))
../_images/96c9cdcc34e4aa3b61c12139846ae7d5a3166411fb546330bac190d7a1bff474.png

Checking the model#

The pastas check module contains check functions and a convenience function if you want to do multiple checks on a single model.

Let’s inspect the module:

ps.check?

What is a check function?#

Check functions are the methods that determine whether a pastas Model meets certain criteria (passes the check) or fails to meet those criteria (fails the check). The criteria can be anything, from goodness-of-fit statistics to the magnitude of the variation of some contribution to the head. It is up to the modeller to decide which criteria their model should be subjected to.

Check functions follow a specific template:

  • A pastas.Model as the first argument

  • Any number of keyword arguments after that

  • Return a DataFrame containing information about the performed check. The following columns are defined:

    • statistic: the test statistic

    • operator: the operator, e.g. >, ==, within, etc.

    • threshold: the user-specified threshold or comparison values

    • dimensions: units or dimensions of statistic (useful for interpretation)

    • pass: whether the check is passed or not

    • comment: any additional comments relevant to the check

Pre-defined checks#

The pastas.check module contains several pre-defined checks that are frequently applied to time series models. These can be listed with ps.check.checks which is a dictionary containing the names and functions.

list(ps.check.checks.keys())
['stat_ufunc_threshold',
 'rsq_geq_threshold',
 'parameter_ufunc_threshold',
 'parameters_leq_threshold',
 'response_memory',
 'response_memory_vs_warmup',
 'uncertainty_gain',
 'parameter_bounds',
 'uncertainty_parameters',
 'acf_runs_test',
 'acf_stoffer_toloi_test',
 'acf_ljung_box_test',
 'correlation_sim_vs_res']

Check: \(R^2 \geq\) threshold#

Let’s try applying the rsq_geq_threshold check to our model, which tests whether the \(R^2 \geq s\), where \(s\) is some user-defined threshold. We can already see the \(R^2\) fit in the plot of the model simulation above, so let’s say we want our fit to be greater or equal to 0.9. The model should obviously pass this check.

ps.check.rsq_geq_threshold(ml, threshold=0.9)
statistic operator threshold dimensions pass comment
check
rsq>=0.9 0.932792 >= 0.9 - True

This example isn’t very interesting, but you can already see that when we want to apply multiple checks to all of our models, having each of these checks contained within a function can be useful.

Writing your own checks#

Now as an extra example let’s write our own check function. This time we want to check whether our model is sufficiently good at simulating the low groundwater levels in summer.

We’ll do this by checking the goodness-of-fit for the summer periods. We can write a function that accepts a list of months in which to consider the residuals in order to calculate the fit statistics.

def rsq_geq_threshold_in_months(
    ml: ps.Model, threshold: float, months: list[int]
) -> pd.DataFrame:
    """Check if the R² of the model is >= to a threshold in specific months.

    Parameters
    ----------
    ml : pastas.Model
        The Pastas model to check.
    threshold : float
        The R² threshold value.
    months : list of ints
        The month numbers to consider in each year.

    Returns
    -------
    pd.DataFrame
        A DataFrame showing the check results
    """
    res = ml.residuals()  # get the model residuals
    mask = res.index.month.isin(months)  # define a mask for the selected months
    rsq = ps.stats.rsq(obs=ml.observations().loc[mask], res=res.loc[mask])  # compute R²
    # store results and context
    df = ps.check.get_empty_check_dataframe()
    df.loc["rsq_geq_threshold_in_months"] = (
        rsq,
        ">=",
        threshold,
        "-",
        rsq >= threshold,
        f"in months: {months}",
    )
    return df
rsq_geq_threshold_in_months(ml, threshold=0.8, months=[6, 7, 8])
statistic operator threshold dimensions pass comment
check
rsq_geq_threshold_in_months 0.839281 >= 0.8 - True in months: [6, 7, 8]

Applying multiple checks to a model#

Often we want to check multiple criteria when deciding whether our model is fit for purpose.

Pre-defined checks#

Some of the pre-defined checks already check multiple criteria, for example whether the parameters do not lie on the parameter bounds. These methods return a DataFrame with multiple rows, with the results of each check on a separate row.

Check: Parameter bounds#

This check is automatically performed when solving a time series model, and any warnings are logged and reported at the bottom of the fit report. But it can be useful to include it in your list of checks. When an optimal parameter lies on a boundary, the check fails.

ps.check.parameter_bounds(ml)
statistic operator threshold dimensions pass comment
check
Bounds: recharge_A 618.958233 within (1e-05, 21567.452806178622) [L] / [unit 'recharge' stress] True
Bounds: recharge_n 1.049254 within (0.1, 5.0) True
Bounds: recharge_a 146.196089 within (0.01, 10000.0) True
Bounds: recharge_f -1.408117 within (-2.0, 0.0) [-] True
Bounds: constant_d 28.019659 within (nan, nan) [L] True

Check: Uncertainty parameters#

This check tests whether the absolute value of the optimal parameters is larger than some factor times the estimated standard deviation of the parameter. Basically, it checks whether the parameter can be estimated with sufficient certainty. This check requires that the estimate of \(\sigma\) is reliable, which is the case when the noise meets certain requirements. This check always includes this comment, to remind users to check the noise prior to trusting the estimated parameter uncertainties.

ps.check.uncertainty_parameters(ml, n_std=1.96)
statistic operator threshold dimensions pass comment
check
|recharge_A| > 1.96σ 618.958233 > 33.851903 [L] / [unit 'recharge' stress] True Assumes estimate of σ is reliable.
|recharge_n| > 1.96σ 1.049254 > 0.042542 True Assumes estimate of σ is reliable.
|recharge_a| > 1.96σ 146.196089 > 16.967426 True Assumes estimate of σ is reliable.
|recharge_f| > 1.96σ -1.408117 > 0.087412 [-] True Assumes estimate of σ is reliable.
|constant_d| > 1.96σ 28.019659 > 0.085939 [L] True Assumes estimate of σ is reliable.

Check: Length response relative to calibratiod period or warmup#

The memory of the response function indicates the time it takes until the effect of some change has taken place. For example, the time it takes for the groundwater level to rise after 1 mm of rain today. This time is commonly expressed as a percentage e.g. \(t_{95}\), the time it takes until 95% of the rise has taken place. (Since the response is asymptotic, the 100% lies at time infinitiy.)

This memory should never be longer than the calibration period, and preferably be significantly shorter. By default the \(t_{95}\) is used and it is compared to half the length of the calibration period.

ps.check.response_memory(ml, cutoff=0.95, factor_length_oseries=0.5)
statistic operator threshold dimensions pass comment
check
t95_recharge < 0.5 Δt_calib 451.810437 < 5409.0 [T] True

The memory can also compared to the warmup period. When the memory exceeds the warmup period, the model is not yet done warming up by the time the simulation starts, which can cause issues when calibrating or simulating the model for different time periods.

ps.check.response_memory_vs_warmup(ml, cutoff=0.95)
statistic operator threshold dimensions pass comment
check
t95_recharge < warmup 451.810437 < 3650 [T] True

The checklist function for performing multiple checks#

The function ps.check.checklist is a convenience function for applying multiple checks to a model. This function accepts a list of checks. This list can consist of the following items:

  • the name of a built-in check, e.g. “rsq_geq_threshold”

  • any function that requires only a model as its input

  • a dictionary containing any function and any additional arguments to be passed to the check function.

Note that in the first two cases it is not possible to alter any additional arguments to the functions. If relevant it will use default values. Additionally, each check function should return a DataFrame with the expected columns, so the results can be combined.

The example below show-cases the application of 3 checks in each of the ways described above. Optionally, a report is shown (basically the resulting DataFrame) in which the checks (pass/fail) are colored for quick visual inspection.

# add checks to a list
checklist = [
    # name of a built-in check
    "rsq_geq_threshold",
    # any check function that only requires the model as input
    ps.check.parameter_bounds,
    # check function, with additional arguments
    {
        "func": rsq_geq_threshold_in_months,
        "threshold": 0.8,
        "months": [6, 7, 8],
    },
]

# perform all checks on a model, and optionally display a report
checks = ps.check.checklist(ml, checklist, report=True)
  statistic operator threshold dimensions pass comment
check            
rsq>=0.7 0.932792 >= 0.700000 - True
Bounds: recharge_A 618.958233 within (np.float64(1e-05), np.float64(21567.452806178622)) [L] / [unit 'recharge' stress] True
Bounds: recharge_n 1.049254 within (np.float64(0.1), np.float64(5.0)) True
Bounds: recharge_a 146.196089 within (np.float64(0.01), np.float64(10000.0)) True
Bounds: recharge_f -1.408117 within (np.float64(-2.0), np.float64(0.0)) [-] True
Bounds: constant_d 28.019659 within (np.float64(nan), np.float64(nan)) [L] True
rsq_geq_threshold_in_months 0.839281 >= 0.800000 - True in months: [6, 7, 8]

Checks in literature#

Currently, the ps.check module contains two pre-defined checklists taken from two articles on time series analysis:

  • the checks used in Brakenhoff et al. (2022).

  • the checks used in Zaadnoordijk et al. (2019).

These articles used certain checks to determine whether a time series model was reliable or not. Note that these checklists were developed to address specific research questions and hydrogeological contexts. They are provided as reference implementations but may not be universally applicable across all modeling scenarios.

These checklists can be passed to the ps.check.checklist() function. Note that some of these checks require the noise to meet the requirement of white noise. Our model currently does not have a noise model, so let’s see what that does to the checks.

checklist_literature = ps.check.get_checks_literature("brakenhoff_2022")
checklist_literature
[{'func': <function pastas.check.rsq_geq_threshold(ml: pastas.model.Model, threshold: float = 0.7)>,
  'threshold': 0.7},
 {'func': <function pastas.check.response_memory(ml: pastas.model.Model, cutoff: float = 0.95, factor_length_oseries: float = 0.5, names: list[str] | str | None = None)>,
  'cutoff': 0.95,
  'factor_length_oseries': 0.5},
 {'func': <function pastas.check.acf_runs_test(ml: pastas.model.Model, p_threshold: float = 0.05)>},
 {'func': <function pastas.check.uncertainty_gain(ml: pastas.model.Model, n_std: float = 1.96, names: list[str] | str | None = None)>,
  'n_std': 1.96},
 {'func': <function pastas.check.parameter_bounds(ml: pastas.model.Model, parameters: list[str] | str | None = None)>}]
checks_no_noise = ps.check.checklist(ml, checklist_literature)
Noise cannot be calculated if there is no noisemodel present or is not used during parameter estimation.
No noise model found in model. Using residuals instead.
  statistic operator threshold dimensions pass comment
check            
rsq>=0.7 0.932792 >= 0.700000 - True
t95_recharge < 0.5 Δt_calib 451.810437 < 5409.000000 [T] True
runs_test (p > α) 0.000000 > 0.050000 - False
|recharge_A| > 1.96σ 618.958233 > 33.851903 [L] / [unit stress] True Assumes estimate of σ is reliable.
Bounds: recharge_A 618.958233 within (np.float64(1e-05), np.float64(21567.452806178622)) [L] / [unit 'recharge' stress] True
Bounds: recharge_n 1.049254 within (np.float64(0.1), np.float64(5.0)) True
Bounds: recharge_a 146.196089 within (np.float64(0.01), np.float64(10000.0)) True
Bounds: recharge_f -1.408117 within (np.float64(-2.0), np.float64(0.0)) [-] True
Bounds: constant_d 28.019659 within (np.float64(nan), np.float64(nan)) [L] True

Now let’s add a noisemodel, solve the model again and run the checks again.

ml.add_noisemodel(ps.ArNoiseModel())
ml.solve(initial=False)
Fit report groundwater_head         Fit Statistics
==================================================
nfev     21                     EVP          92.92
nobs     644                    R2            0.93
noise    True                   RMSE          0.11
tmin     1985-11-14 00:00:00    AICc      -3256.75
tmax     2015-06-28 00:00:00    BIC       -3230.07
freq     D                      Obj           2.01
freq_obs None                   ___               
warmup   3650 days 00:00:00     Interp.         No
solver   LeastSquares           weights        Yes

Parameters (6 optimized)
==================================================
                optimal     initial  vary
recharge_A   680.957230  618.958233  True
recharge_n     1.017633    1.049254  True
recharge_a   150.845807  146.196089  True
recharge_f    -1.278596   -1.408117  True
constant_d    27.890761   28.019659  True
noise_alpha   49.256443   15.000000  True

Note that the autocorrelation check now passes, which means we are more confident that the estimated parameter uncertainties are reliable.

Note: No significant autocorrelation is just one of the criteria the noise has to meet in order to meet the requirements of white noise, but we will not discuss that here. For more information refer to the notebooks diagnostic_checking.ipynb and/or stowa_calibration.ipynb.

check_w_noise = ps.check.checklist(ml, checklist_literature)
  statistic operator threshold dimensions pass comment
check            
rsq>=0.7 0.929246 >= 0.700000 - True
t95_recharge < 0.5 Δt_calib 457.031840 < 5409.000000 [T] True
runs_test (p > α) 0.323627 > 0.050000 - True
|recharge_A| > 1.96σ 680.957230 > 69.370998 [L] / [unit stress] True Assumes estimate of σ is reliable.
Bounds: recharge_A 680.957230 within (np.float64(1e-05), np.float64(21567.452806178622)) [L] / [unit 'recharge' stress] True
Bounds: recharge_n 1.017633 within (np.float64(0.1), np.float64(5.0)) True
Bounds: recharge_a 150.845807 within (np.float64(0.01), np.float64(10000.0)) True
Bounds: recharge_f -1.278596 within (np.float64(-2.0), np.float64(0.0)) [-] True
Bounds: constant_d 27.890761 within (np.float64(nan), np.float64(nan)) [L] True
Bounds: noise_alpha 49.256443 within (np.float64(1e-05), np.float64(5000.0)) [T] True

References#

  • Brakenhoff DA, Vonk MA, Collenteur RA, Van Baar M and Bakker M (2022) Application of Time Series Analysis to Estimate Drawdown From Multiple Well Fields. Front. Earth Sci. 10:907609. doi: 10.3389/feart.2022.907609

  • Zaadnoordijk, W.J., Bus, S.A.R., Lourens, A. and Berendrecht, W.L. (2019), Automated Time Series Modeling for Piezometers in the National Database of the Netherlands. Groundwater, 57: 834-843. https://doi.org/10.1111/gwat.12819