{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "\n", "import pastas as ps\n", "\n", "# stel de logger in om alleen foutmeldingen naar het scherm te printen\n", "ps.set_log_level(\"ERROR\")\n", "\n", "# print de versies van belangrijke Python packages voor pastas\n", "ps.show_versions()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model calibration\n", "\n", "**Table of contents**\n", "\n", "- [Introduction](#introduction)\n", "- [Effect of the length of the time series (or calibration period)](#effect-of-the-length-of-the-time-series-(or-calibration-period))\n", "- [The noise model](#the-noise-model)\n", "- [Effect of a noise model](#effect-of-a-noise-model)\n", "- [Effect of the noise model on the estimation and reliability of parameter values](#effect-of-the-noise-model-on-the-estimation-and-reliability-of-parameter-values)\n", "- [Effect of an error in the measured groundwater recharge](#effect-of-an-error-in-the-measured-groundwater-recharge)\n", "- [Effect of the measurement frequency on the performance of the AR1 noise model](#effect-of-the-measurement-frequency-on-the-performance-of-the-ar1-noise-model)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "Calibration of a time series model refers to finding the model parameters such that the groundwater heads simulated by the model match the observed heads as closely as possible. The parameters found through this process are called the optimal parameters. The most commonly used method to find these optimal parameters is the minimization of the sum of squared differences between the observed and modeled heads, also known as the least squares method. The sum of squares is a non-linear function of the model parameters. To find the optimal parameters, a search method is used.\n", "\n", "There are several different search methods available to find the optimal model parameters. Each method is based on a different search algorithm, but the core idea is the same: the model is run multiple times with different values for the parameters. Based on the results (and the corresponding sum of squares), the algorithm determines the next set of parameter values that might result in a lower sum of squares. This continues until the algorithm decides that the best parameter set has been found. The efficiency of the search algorithm depends on how many model runs are needed to find the optimal parameters. Naturally, it is recommended to choose initial parameter values that are as close as possible to the optimal values. Most software packages include strategies to make smart initial guesses for the model parameters, sometimes using characteristics of the input series. In some cases, the modeler may have system-specific knowledge that helps choose better initial values.\n", "\n", "Most commonly used algorithms search for the optimal parameters by gradually adjusting them so that the sum of squares decreases with each step. This creates a direct path from the initial values to the optimal values, where the sum of squares gets smaller at every step. However, this can lead the algorithm to get stuck in a local minimum. It's possible that along the path from the initial to the optimal parameters, the sum of squares temporarily increases before decreasing again. Advanced search algorithms exist that also explore parameter sets that initially result in a higher sum of squares, with the goal of eventually finding the global minimum. Of course, this requires significantly more computation time.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To illustrate, the figure below shows the progression of the six parameters and the objective function (in this case, the root mean squared error for the noise) for each run during the search for the optimal parameters. The same well is used in the following example, where it will be described in more detail. For each iteration in the search process, the model is run multiple times—at least once per parameter. This explains why the parameter values change step by step, either increasing or decreasing. In total, the model was run nearly 100 times to find the optimal parameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "ho = (\n", " pd.read_csv(\n", " \"data_stowa/B58C0698001_1.csv\",\n", " skiprows=15,\n", " index_col=2,\n", " parse_dates=True,\n", " date_format=\"%d-%m-%Y\",\n", " )\n", " .iloc[:, 4]\n", " .loc[\"1990\":\"2009\"]\n", " .divide(100)\n", ")\n", "\n", "rain = (\n", " pd.read_csv(\n", " \"data_stowa/neerslaggeg_HEIBLOEM-L_967.txt\",\n", " skiprows=23,\n", " parse_dates=[\"YYYYMMDD\"],\n", " date_format=\"%Y%m%d\",\n", " )\n", " .set_index(\"YYYYMMDD\")[\" RD\"]\n", " .loc[\"1980\":\"2009\"]\n", " .astype(float)\n", " .divide(10000)\n", ")\n", "\n", "evap = (\n", " pd.read_csv(\n", " \"data_stowa/etmgeg_380.txt\",\n", " skiprows=46,\n", " parse_dates=[\"YYYYMMDD\"],\n", " date_format=\"%Y%m%d\",\n", " low_memory=False,\n", " )\n", " .set_index(\"YYYYMMDD\")[\" EV24\"]\n", " .loc[\"1980\":\"2009\"]\n", " .astype(float)\n", " .divide(10000)\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(ho.loc[\"1990\":])\n", "rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "track = ps.TrackSolve(ml)\n", "ml.solve(report=False, callback=track.track_solve)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=(10, 3), layout=\"tight\")\n", "track.parameters.plot(ax=ax)\n", "ax.set_xlabel(\"Number of runs\")\n", "ax.set_ylabel(\"Parameter value\")\n", "ax2 = ax.twinx()\n", "ax2.plot(track.rmse_res, color=\"k\", ls=\"-\", label=\"RMSE of residuals\")\n", "ax2.set_ylabel(\"RMSE of residuals\")\n", "ax.legend(loc=(0, 1), frameon=False, ncol=6, handlelength=1)\n", "ax2.legend(loc=\"right\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Effect of the length of the time series (or calibration period)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The effect of the length of the calibration period on the results of a time series model is illustrated using measurements from well B58C0698 in the town of Swartbroek near Weert. Data from the period 1990–2010 is used, during which measurements were taken approximately twice per month. The rainfall time series comes from the Heibloem rain station, and the potential evaporation data is from the Maastricht weather station.\n", "\n", "A time series model was built using rainfall and potential evaporation as explanatory series, with a scaled Gamma function used as the response function. The entire dataset is used to calibrate the model. The model fits the data well, as shown in the figure below. The block response and step response functions are also shown. The memory of the block response function is slightly more than 2.5 years. After a rainfall event on the first day, the groundwater level rises fairly quickly, but it takes about 2.5 years to return to the pre-rainfall level. The peak of the block response function is just above 4, which means that if 1 mm of rain falls in a single day, the groundwater head will eventually rise by 4 mm. This corresponds to an effective porosity (phreatic storage) of 1 / 4 = 0.25, which is a reasonable value. The step response eventually reaches a value slightly above 650. This means that if it rains continuously at 1 mm/day starting from the first day, the groundwater head will eventually rise by a little more than 650 mm. This corresponds to parameter $A$ of the Gamma response function (see [Notebook on model structure](%%)).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(ho.loc[\"1990\":])\n", "rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve(report=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(ho.loc[\"1990\":])\n", "rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.solve(report=False)\n", "\n", "ax = ml.plot(figsize=(10, 3), layout=\"tight\")\n", "ho.name = \"B58C0698001\"\n", "ax.set_xlabel(\"Year\")\n", "ax.set_ylabel(\"Groundwater head [m NAP]\")\n", "ax.grid()\n", "ax.set_xlim(pd.Timestamp(\"1990\"), pd.Timestamp(\"2010\"))\n", "\n", "block = ml.get_block_response(\"recharge\")\n", "step = ml.get_step_response(\"recharge\")\n", "step.index /= 365\n", "block.index /= 365\n", "fig, ax = plt.subplots(1, 2, figsize=(10, 3))\n", "block.plot(\n", " ax=ax[0], ylabel=\"Head response\", title=\"Block response\", xlabel=\"Year\", xlim=(0, 3)\n", ")\n", "ax[0].grid()\n", "step.plot(\n", " ax=ax[1], ylabel=\"Head response\", title=\"Step response\", xlabel=\"Year\", xlim=(0, 3)\n", ")\n", "ax[1].grid()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the model above, 20 years of data are used to calibrate the model. But what happens if a shorter time series is used? Will the same response function still be found? To investigate this, the model is calibrated using 1 year of data (1990–1991), then 2 years (1990–1992), and so on up to 20 years (1990–2010). For each calibration, the value of parameter $A$ (the final value of the step response) is stored. \n", "\n", "In the figure below, the calibrated value of $A$ is plotted against the length of the calibration period, including the 95% confidence interval. The figure shows that if the calibration period is long enough, the results converge to approximately the same value. In this case, a calibration period of 5 years (1990–1995) is sufficient. That’s about twice the memory of the block response function, which is a reasonable rule of thumb for this case." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Alist = []\n", "years = np.arange(1991, 2011, 1)\n", "for endyear in years:\n", " ml = ps.Model(ho.loc[\"1990\":])\n", " rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", " ml.add_stressmodel(rm)\n", " ml.add_noisemodel(ps.ArNoiseModel())\n", " ml.solve(tmin=\"1990\", tmax=str(endyear), report=False)\n", " Alist.append(ml.parameters.loc[\"recharge_A\", [\"optimal\", \"stderr\"]])\n", "Alist = np.array(Alist)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=(10, 3), layout=\"tight\")\n", "ax.errorbar(years, Alist[:, 0], marker=\".\", yerr=2 * Alist[:, 1], capsize=3)\n", "ax.set_xticks(years)\n", "ax.set_xticklabels(years, rotation=45)\n", "ax.grid()\n", "ax.set_xlabel(\"End year of the calibration period\")\n", "ax.set_ylabel(\"Parameter A of the\\nresponse function\")\n", "ax.set_title(\n", " \"Parameter A of the response function (with confidence interval); calibration period starts in 1990\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With a shorter calibration period, a value of $A$ that is too high is estimated in this example, and the confidence interval is also too narrow. The value of $A$ found using a longer calibration period ($A \\approx 650$) lies far outside the confidence interval of the value found with a calibration period of only 1 or a few years. \n", "\n", "The value obtained from, for example, a 3-year calibration period may provide a good fit for those three years, but beyond that, the simulated groundwater heads will mostly lie above the measured heads, as shown below. \n", "\n", "If a different 3-year period is chosen for calibration, the model can still perform reasonably well — as shown below for the period 1993–1996 — but that can only be assessed if data is also available for the validation period.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ho.name = \"B58C0698001\"\n", "ml = ps.Model(ho.loc[\"1990\":])\n", "rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve(tmin=\"1990\", tmax=\"1993\", report=False)\n", "ax = ml.plot(tmin=\"1990\", tmax=\"2010\", figsize=(10, 3))\n", "ax.set_xlabel(\"Year\")\n", "ax.set_ylabel(\"Groundwater head [m NAP]\")\n", "ax.grid()\n", "ax.set_title(\n", " \"Model calibrated on period 1990-1993; poor results outside calibration period\"\n", ")\n", "ax.axvspan(\"1990\", \"1993\", alpha=0.8, color=\"skyblue\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ho.name = \"B58C0698001\"\n", "\n", "ml = ps.Model(ho.loc[\"1990\":])\n", "rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve(tmin=\"1993\", tmax=\"1996\", report=False)\n", "\n", "ax = ml.plot(tmin=\"1990\", tmax=\"2010\", figsize=(10, 3))\n", "ax.set_xlabel(\"Year\")\n", "ax.set_ylabel(\"Groundwater head [m NAP]\")\n", "ax.grid()\n", "ax.set_title(\n", " \"Model calibrated on period 1993-1996; good results outside calibration period\"\n", ")\n", "ax.axvspan(\"1993\", \"1996\", alpha=0.8, color=\"skyblue\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Intuitively, it makes sense that the calibration period should be at least several times the memory of the system. However, it is difficult to define a fixed rule for the required length of the calibration period. As illustrated above, 3 years may be sufficient for this series, but the outcome depends on which 3 years are chosen — and that’s clearly not ideal in practice.\n", "\n", "[Van der Spek and Bakker (2017)](#references) investigated the minimum length of a calibration period after which the uncertainty in predicted groundwater heads no longer decreases. They analyzed 18 time series with approximately two measurements per month and response times ranging from 60 to 1200 days, but could not find a clear relationship between the system's memory (response time) and the required calibration period length.\n", "\n", "With a 20-year calibration period, they found that the average 95% prediction interval for groundwater heads was about 50% of the total variation in groundwater head (i.e., the difference between the highest and lowest observed levels), while the model's confidence interval was about 10% of the total variation. With a calibration period of only 10 years, the results depended strongly on which 10-year period was used — for some periods, the confidence interval was much larger than for others. With a calibration period of only 5 years, the confidence interval was almost always large.\n", "\n", "The conclusion from [Van der Spek and Bakker (2017)](#references) is that the model's reliability increases with a longer calibration period, but does not significantly improve once the calibration period exceeds 20 years.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The noise model\n", "\n", "A noise model can be used to ensure that the remaining differences between the model and the observations meet certain statistical assumptions, allowing statistical statements to be made using the model (for example: what is the probability that the groundwater head exceeds a certain threshold?). In the previous model, a noise model was also used—although not discussed in detail—so that the residuals approximately satisfied statistical tests and confidence intervals for the parameters could be estimated.\n", "\n", "The differences between the measured and modeled groundwater heads (the residuals) are almost always temporally correlated. Simply put: if the model overestimates the groundwater head today, there's a good chance it will overestimate it again next week. This is easy to explain. For instance, the rainfall data may come from a weather station located 20 km away from the observation well. If it rained at the station but not at the well, the model may simulate a groundwater head that is too high. And since it takes some time for the water level to drop after a rain event, the model may continue to overestimate the head for several days, weeks, or even months. Fortunately, the opposite can also occur: it might rain at the well but not at the station, balancing things out over time. An error in the rainfall input is just one possible cause of discrepancies between the model and measurements. Other causes include measurement errors, unobserved changes, or non-linearity in the system.\n", "\n", "Autocorrelation in the residuals becomes a problem when the model is used to make statistical inferences. To do so reliably, the residuals must satisfy certain statistical assumptions—one of the most important being that they are not autocorrelated. A noise model can be added to a time series model to transform autocorrelated residuals into uncorrelated noise.\n", "\n", "The simplest noise model is the so-called first-order autoregressive model, or AR(1) model. In this model, the residual $\\varepsilon(t)$ at time $t$ is equal to a factor $\\rho$ times the residual at time $t - \\Delta t$ plus a random (uncorrelated) error $n(t)$:\n", "\n", "$$\n", "\\varepsilon(t) = \\rho \\varepsilon(t - \\Delta t) + n(t)\n", "$$\n", "\n", "If the time interval $\\Delta t$ between two residuals varies, then the factor $\\rho$ can be expressed as a function of $\\Delta t$, decreasing as the time step increases:\n", "\n", "$$\n", "\\varepsilon(t) = \\text{e}^{-\\Delta t / \\alpha} \\varepsilon(t - \\Delta t) + n(t)\n", "$$\n", "\n", "Here, $\\alpha$ is a parameter that describes how quickly the correlation between residuals decays with increasing $\\Delta t$. The correlation becomes negligible when $\\Delta t > 3\\alpha$ (since $e^{-3} \\approx 0.05$).\n", "\n", "In the example below, synthetic time series are generated and analyzed to demonstrate that in time series models without a noise model, parameter estimates may still be accurate, but the confidence intervals are underestimated.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Effect of a noise model\n", "\n", "To generate the synthetic series, daily groundwater recharge is calculated as the measured rainfall minus the measured potential evaporation from the previous example (so this is actual rainfall and evaporation data). The response function for groundwater recharge is an exponential function with parameters $A = 600$ and $a = 150$ days. The base level is set to $d = 25$ meters.\n", "\n", "A synthetic groundwater head time series is then simulated using the same observation timestamps as in the previous example. This generated synthetic time series contains no noise or errors. A time series model applied to this synthetic groundwater head series, using an exponential response function, is able to recover the parameters $A$, $a$, and $d$ almost exactly, as shown below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "recharge = rain - evap\n", "A = 600\n", "a = 150\n", "d = 25" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(ho) # only the timestamps at which measurements were taken are used\n", "rm = ps.StressModel(recharge, ps.Exponential(), name=\"recharge\", settings=\"prec\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.set_parameter(\"recharge_A\", initial=A)\n", "ml.set_parameter(\"recharge_a\", initial=a)\n", "ml.set_parameter(\"constant_d\", initial=d)\n", "hsynthetic_no_error = ml.simulate().loc[ho.index]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(hsynthetic_no_error)\n", "rm = ps.StressModel(recharge, ps.Exponential(), name=\"recharge\", settings=\"prec\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To illustrate how the noise model works, a residual (an error) is added to the synthetic groundwater head series. The residual at time $t$, $\\epsilon(t)$, is correlated with the residual at the previous measurement time $t - \\Delta t$, according to the formula given earlier.\n", "\n", "The noise term $n(t)$ is drawn from a normal distribution with a mean of zero and a standard deviation $\\sigma_n$. The stronger the correlation between residuals, the larger the standard deviation of the residuals becomes. The standard deviation $\\sigma_r$ of the residuals can be calculated from the standard deviation $\\sigma_n$ of the noise as follows:\n", "\n", "$$\\sigma_r = \\sigma_n / \\sqrt{1 - \\text{e}^{-2\\Delta t / \\alpha}}$$\n", "\n", "The residuals are generated using $\\sigma_n = 0.1$ m and $\\alpha = 50$ days. The standard deviation of the residuals, $\\sigma_r$, for a time step of 14 days is then equal to:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sigma_n = 0.1\n", "alpha = 50\n", "sigma_r = sigma_n / np.sqrt(1 - np.exp(-2 * 14 / alpha))\n", "print(f\"sigma_r = {sigma_r:.2f} m\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The synthetic groundwater head series now contains a correlated error, just like in reality. We will now first create a time series model of the synthetic series *without* a noise model.\n", "\n", "The time series model appears to provide a good fit, and the estimated parameters reasonably match the parameters used to generate the synthetic series. However, note that the value of $a$ used to generate the synthetic series ($a = 150$ days) does not fall within the estimated confidence interval of the estimated value of $a$.\n", "\n", "The diagnostic test, however, shows that there is autocorrelation in the residuals, so formally, the estimated confidence interval is not valid. We will investigate this further.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "delt = (ho.index[1:] - ho.index[:-1]).values / pd.Timedelta(\"1d\")\n", "np.random.seed(1)\n", "noise = sigma_n * np.random.randn(len(ho))\n", "residuals = np.zeros_like(noise)\n", "residuals[0] = noise[0]\n", "for i in range(1, len(ho)):\n", " residuals[i] = np.exp(-delt[i - 1] / alpha) * residuals[i - 1] + noise[i]\n", "hsynthetic = hsynthetic_no_error + residuals\n", "\n", "ml = ps.Model(hsynthetic)\n", "rm = ps.StressModel(recharge, ps.Exponential(), name=\"recharge\", settings=\"prec\")\n", "ml.add_stressmodel(rm)\n", "ml.solve(report=False)\n", "\n", "axes = ml.plots.results(figsize=(10, 6), layout=\"tight\")\n", "axes[0].set_title(\n", " \"Results for time series model without noise model\", fontsize=14, y=1.1\n", ")\n", "\n", "a_min = (\n", " ml.parameters.at[\"recharge_a\", \"optimal\"]\n", " - 1.96 * ml.parameters.at[\"recharge_a\", \"stderr\"]\n", ")\n", "a_max = (\n", " ml.parameters.at[\"recharge_a\", \"optimal\"]\n", " + 1.96 * ml.parameters.at[\"recharge_a\", \"stderr\"]\n", ")\n", "print(f\"Estimated 95% confidence interval for a: {a_min:.2f} - {a_max:.2f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "axes = ml.plots.diagnostics(figsize=(10, 6))\n", "axes[0].get_figure().suptitle(\n", " \"Diagnostic checks for time series model without noise model\", fontsize=16\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we now once again create a time series model of the synthetic groundwater head series, but this time *include* a noise model, the fit is again good and the estimated parameters still closely match the specified parameters. However, now the autocorrelation of the noise is within the acceptable level — the noise model has done its job well.\n", "\n", "The fit of the time series model with a noise model is always lower than that of a model without a noise model. However, in this case, the difference is small.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(hsynthetic)\n", "rm = ps.StressModel(recharge, ps.Exponential(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve(report=False)\n", "\n", "axes = ml.plots.results(figsize=(10, 6), layout=\"tight\")\n", "axes[0].set_title(\"Resultaten voor tijdreeksmodel met ruismodel\", y=1.1)\n", "\n", "a_min = (\n", " ml.parameters.at[\"recharge_a\", \"optimal\"]\n", " - 1.96 * ml.parameters.at[\"recharge_a\", \"stderr\"]\n", ")\n", "a_max = (\n", " ml.parameters.at[\"recharge_a\", \"optimal\"]\n", " + 1.96 * ml.parameters.at[\"recharge_a\", \"stderr\"]\n", ")\n", "print(f\"Estimated 95% confidence interval for a: {a_min:.2f} - {a_max:.2f}\")\n", "\n", "axes = ml.plots.diagnostics(figsize=(10, 6))\n", "axes[0].get_figure().suptitle(\n", " \"Diagnostic checks for time series model with noise model\", fontsize=16\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Effect of the noise model on the estimation and reliability of the parameters\n", "\n", "If the noise model is not applied, the true value of $a$ does not fall within the estimated confidence interval. However, when the noise model *is* applied, the true value of $a$ *does* fall within the estimated confidence interval. \n", "\n", "Of course, this could be due to chance (after all, it's a 95% confidence interval, so there is a 5% chance that the true value falls outside it). That’s why we will perform an experiment.\n", "\n", "In the experiment, we generate a synthetic series 100 times, each time with different (correlated) noise. For all 100 synthetic series, we create a time series model. Since the synthetic series contains different (correlated) noise each time, the optimal parameters vary slightly across the runs.\n", "\n", "The experiment is conducted twice: the first time, 100 time series models are created *without* a noise model; the second time, 100 models are created *with* a noise model.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def experiment(nexp, sigma_noise=0.1, noisemodel=True):\n", " Aexp = np.zeros((nexp, 2))\n", " aexp = np.zeros((nexp, 2))\n", " dexp = np.zeros((nexp, 2))\n", " alphaexp = np.zeros((nexp, 2))\n", " evpexp = np.zeros(nexp)\n", " print(\"progress of experiment:\")\n", " for iexp in range(nexp):\n", " if (iexp + 1) % 20 == 0:\n", " print(iexp + 1, end=\"%, \")\n", " np.random.seed(iexp)\n", " noise = sigma_noise * np.random.randn(len(ho))\n", " residuals = np.zeros_like(noise)\n", " residuals[0] = noise[0]\n", " for i in range(1, len(ho)):\n", " residuals[i] = np.exp(-delt[i - 1] / alpha) * residuals[i - 1] + noise[i]\n", " hsynthetic = hsynthetic_no_error + residuals\n", " ml = ps.Model(hsynthetic)\n", " rm = ps.StressModel(recharge, ps.Exponential(), name=\"recharge\")\n", " ml.add_stressmodel(rm)\n", " if noisemodel:\n", " ml.add_noisemodel(ps.ArNoiseModel())\n", " else:\n", " ml.del_noisemodel()\n", " ml.solve(report=False)\n", " Aexp[iexp] = ml.parameters.loc[\"recharge_A\", [\"optimal\", \"stderr\"]].values\n", " aexp[iexp] = ml.parameters.loc[\"recharge_a\", [\"optimal\", \"stderr\"]].values\n", " dexp[iexp] = ml.parameters.loc[\"constant_d\", [\"optimal\", \"stderr\"]].values\n", " if noisemodel:\n", " alphaexp[iexp] = ml.parameters.loc[\"noise_alpha\", [\"optimal\", \"stderr\"]]\n", " evpexp[iexp] = ml.stats.evp()\n", " return Aexp, aexp, dexp, alphaexp, evpexp" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "result_no_noisemodel = experiment(100, noisemodel=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Results without a noise model\")\n", "print(f\"Mean value of parameter A: {np.mean(result_no_noisemodel[0][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter a: {np.mean(result_no_noisemodel[1][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter d: {np.mean(result_no_noisemodel[2][:, 0]):.2f}\")\n", "print(f\"Mean value of evp: {np.mean(result_no_noisemodel[4]):.2f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "result_with_noisemodel = experiment(100, noisemodel=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Results with a noise model\")\n", "print(f\"Mean value of parameter A: {np.mean(result_with_noisemodel[0][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter a: {np.mean(result_with_noisemodel[1][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter d: {np.mean(result_with_noisemodel[2][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter alpha: {np.mean(result_with_noisemodel[3][:, 0]):.2f}\")\n", "print(f\"Mean value of evp: {np.mean(result_with_noisemodel[4]):.2f}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It turns out that both the time series model *without* the noise model and the model *with* the noise model give, on average, a good estimate of the parameters. So why should we use a noise model?\n", "\n", "The reason is that the estimated uncertainty of the parameters is too small when no noise model is used. The statistical method used to estimate uncertainty assumes that the remaining noise (the residuals) is uncorrelated. However, if the noise is actually correlated, the estimated uncertainty is too small.\n", "\n", "This can be easily verified with the experiment performed above. During the experiment, the parameters were estimated 100 times, each time including an estimate of the standard error. Under certain statistical assumptions (including normality and uncorrelated noise), the 95% confidence interval is calculated as the estimated value ± 2 times the standard error (technically 1.96, but we round to 2 here).\n", "\n", "So it's easy to check how often the true parameter values (used to generate the synthetic series) fall within the 95% confidence intervals of the estimated parameters. Since we repeated the experiment 100 times, this should happen about 95 times.\n", "\n", "As shown below, the time series model *without* a noise model only provides a confidence interval that contains the true value about 60–65 times. The model *with* a noise model performs much better: in more than 95 out of 100 cases, the true parameter values of the response functions lie within the 95% confidence interval.\n", "\n", "In summary: without a noise model, the parameter estimates are on average correct, but the confidence intervals are too narrow (i.e., uncertainty is underestimated).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Aexp, aexp, dexp, alphaexp, evp = result_no_noisemodel\n", "print(\"results without noise model\")\n", "print(\n", " \"number of times true A within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(Aexp[:, 0] - A) < 1.96 * Aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true a within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(aexp[:, 0] - a) < 1.96 * aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true d within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(dexp[:, 0] - d) < 1.96 * dexp[:, 1]),\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Aexp, aexp, dexp, alphaexp, evp = result_with_noisemodel\n", "print(\"results without noise model\")\n", "print(\n", " \"number of times true A within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(Aexp[:, 0] - A) < 1.96 * Aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true a within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(aexp[:, 0] - a) < 1.96 * aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true d within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(dexp[:, 0] - d) < 1.96 * dexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true alpha within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(alphaexp[:, 0] - alpha) < 1.96 * alphaexp[:, 1]),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Effect of an error in the measured groundwater recharge\n", "\n", "The above experiment explained the advantage of using a noise model. When no noise model is used, the parameters are estimated correctly, but the confidence intervals are too narrow. When a noise model *is* included, the confidence intervals are estimated much more accurately.\n", "\n", "In the previous example, the noise in the groundwater head was generated using a model that was exactly the same as the noise model used in the time series model. As previously explained, the correlation in the residuals in reality is the result of physical processes — for example, a difference between the rainfall measured at the weather station and the rainfall that actually occurred at the observation well.\n", "\n", "In the following test, the experiment is repeated, but this time a correlated error is *not* added to the groundwater head. Instead, a groundwater recharge series (rainfall minus evaporation) is used that contains an error.\n", "\n", "There are many ways to add a realistic error to rainfall or evaporation series. Here, we choose to multiply the measured groundwater recharge by a factor $(1 + \\varepsilon_g)$, where $\\varepsilon_g$ is normally distributed with a mean of zero and a standard deviation of $\\sigma_g$. This means that the groundwater recharge is slightly higher or lower than measured on each day. The chosen value for $\\sigma_g$ is 0.2.\n", "\n", "Below, a time series model is created in which an AR(1) noise model is again included in the fit. As can be seen, the fit is very good and there is virtually no autocorrelation in the noise. However, the noise is not very normally distributed. This is concerning, because the estimation of the confidence intervals for the parameters assumes that the noise is normally distributed.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(100)\n", "factor = 1 + 0.2 * np.random.randn(len(recharge))\n", "ml = ps.Model(hsynthetic_no_error)\n", "rm = ps.StressModel(factor * recharge, ps.Exponential(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve(report=False)\n", "\n", "axes = ml.plots.results(figsize=(10, 6), layout=\"tight\")\n", "axes[0].set_title(\n", " \"Results for time series model with error in groundwater recharge\",\n", " fontsize=12,\n", " y=1.1,\n", ")\n", "\n", "axes = ml.plots.diagnostics(figsize=(10, 6))\n", "axes[0].get_figure().suptitle(\n", " \"Diagnostic checks for time series model with error in groundwater recharge\",\n", " fontsize=16,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will once again perform the experiment described above. The time series models *without* a noise model again estimate the parameters well on average, but the confidence intervals are far too narrow: in only 30–40% of the models do the true parameters fall within the confidence intervals of the estimated parameters.\n", "\n", "The time series models *with* a noise model also produce good results for the estimated parameters. In 80–85% of the models, the true parameters fall within the confidence intervals of the estimated parameters. This is less than the expected 95% and is most likely due to the noise not being normally distributed.\n", "\n", "To improve this, a time series model should ideally be used that is based on, for example, a maximum likelihood estimation of the parameters, where the expected distribution of the noise can be specified (see for example van der Spek and Bakker, 2017). However, in most current versions of time series analysis software, it is not possible to define a custom maximum likelihood function.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def experiment(nexp, noisemodel=True):\n", " Aexp = np.zeros((nexp, 2))\n", " aexp = np.zeros((nexp, 2))\n", " dexp = np.zeros((nexp, 2))\n", " alphaexp = np.zeros((nexp, 2))\n", " evpexp = np.zeros(nexp)\n", " print(\"progress of experiment:\")\n", " for iexp in range(nexp):\n", " if (iexp + 1) % 20 == 0:\n", " print(iexp + 1, end=\"%, \")\n", " np.random.seed(iexp)\n", " factor = 1 + 0.2 * np.random.randn(len(recharge))\n", " ml = ps.Model(hsynthetic_no_error)\n", " rm = ps.StressModel(factor * recharge, ps.Exponential(), name=\"recharge\")\n", " ml.add_stressmodel(rm)\n", " if noisemodel:\n", " ml.add_noisemodel(ps.ArNoiseModel())\n", " else:\n", " ml.del_noisemodel()\n", " ml.solve(report=False)\n", " Aexp[iexp] = ml.parameters.loc[\"recharge_A\", [\"optimal\", \"stderr\"]]\n", " aexp[iexp] = ml.parameters.loc[\"recharge_a\", [\"optimal\", \"stderr\"]]\n", " dexp[iexp] = ml.parameters.loc[\"constant_d\", [\"optimal\", \"stderr\"]]\n", " if noisemodel:\n", " alphaexp[iexp] = ml.parameters.loc[\"noise_alpha\", [\"optimal\", \"stderr\"]]\n", " evpexp[iexp] = ml.stats.evp()\n", " return Aexp, aexp, dexp, alphaexp, evpexp" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "result_no_noisemodel = experiment(100, noisemodel=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "result_with_noisemodel = experiment(100, noisemodel=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Results of models with error in recharge, without a noise model\")\n", "print(f\"Mean value of parameter A: {np.mean(result_no_noisemodel[0][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter a: {np.mean(result_no_noisemodel[1][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter d: {np.mean(result_no_noisemodel[2][:, 0]):.2f}\")\n", "print(f\"Mean value of evp: {np.mean(result_no_noisemodel[4]):.2f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Aexp, aexp, dexp, alphaexp, evp = result_no_noisemodel\n", "print(\n", " \"number of times true A within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(Aexp[:, 0] - A) < 1.96 * Aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true a within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(aexp[:, 0] - a) < 1.96 * aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true d within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(dexp[:, 0] - d) < 1.96 * dexp[:, 1]),\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Results of models with error in recharge, with a noise model\")\n", "print(f\"Mean value of parameter A: {np.mean(result_with_noisemodel[0][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter a: {np.mean(result_with_noisemodel[1][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter d: {np.mean(result_with_noisemodel[2][:, 0]):.2f}\")\n", "print(f\"Mean value of parameter alpha: {np.mean(result_with_noisemodel[3][:, 0]):.2f}\")\n", "print(f\"Mean value of evp: {np.mean(result_with_noisemodel[4]):.2f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Aexp, aexp, dexp, alphaexp, evp = result_with_noisemodel\n", "print(\n", " \"number of times true A within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(Aexp[:, 0] - A) < 1.96 * Aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true a within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(aexp[:, 0] - a) < 1.96 * aexp[:, 1]),\n", ")\n", "print(\n", " \"number of times true d within estimated confidence interval:\",\n", " np.count_nonzero(np.abs(dexp[:, 0] - d) < 1.96 * dexp[:, 1]),\n", ")\n", "# print('number of times true alpha within estimated confidence interval:', np.count_nonzero(np.abs(alphaexp[:, 0] - alpha) < 1.96 * alphaexp[:, 1]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Effect of measurement frequency on the performance of the AR(1) noise model\n", "\n", "It is often difficult, with current algorithms, to transform the residuals into noise with negligible autocorrelation for time series models of groundwater heads that are measured daily. A practical solution can be to reduce the measurement frequency — for example, by using only one measurement every week or every two weeks.\n", "\n", "Consider, for example, observation well B16G0187_2 in the village of Witte Paarden in Steenwijkerland. The groundwater head was measured daily between 2005 and 2018. A time series model using rainfall and potential evaporation provides a good fit (see below), but there is still a clear autocorrelation in the residuals.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ho_df = pd.read_csv(\"data_stowa/B16G0187_2.csv\", parse_dates=[0], index_col=0)\n", "rain_df = pd.read_csv(\"data_stowa/Frederiksoord.csv\", parse_dates=[0], index_col=0)\n", "evap_df = pd.read_csv(\"data_stowa/Marknesse.csv\", parse_dates=[0], index_col=0)\n", "\n", "\n", "ho = ho_df.iloc[:, 0]\n", "rain = rain_df.iloc[:, 0]\n", "evap = evap_df.iloc[:, 0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(ho)\n", "rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve()\n", "\n", "ax = ml.plot(figsize=(10, 3))\n", "ax.grid()\n", "\n", "f, ax = plt.subplots(figsize=(10, 3), layout=\"tight\")\n", "ax = ps.plots.acf(ml.noise(), alpha=0.05, ax=ax)\n", "ax.set_xlim(0, 150)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The time series model was then fitted again, but this time using only 14-day interval measurements. As a result, the number of observations used for fitting was reduced from more than 4,500 to fewer than 400. The fit is approximately equally good, and all model parameters are similar — except for the parameter $\\alpha$ of the noise model, which is now five times larger. \n", "\n", "However — and this was the goal — the autocorrelation in the noise is now negligible. This means that the standard errors (`stderr`) of the parameters can be used to estimate the confidence intervals of the parameters (provided that the other statistical assumptions are also met).\n", "\n", "Note also that for this well, the standard error of the parameters in the model fitted to 14-day measurements is about twice as large for most parameters compared to the model fitted to daily measurements.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ml = ps.Model(ho)\n", "rm = ps.RechargeModel(rain, evap, ps.Gamma(), name=\"recharge\")\n", "ml.add_stressmodel(rm)\n", "ml.add_noisemodel(ps.ArNoiseModel())\n", "ml.solve(freq_obs=\"14D\")\n", "\n", "ax = ml.plot(figsize=(10, 3))\n", "ax.grid()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "- J.E. van der Spek and M. Bakker, 2017. *The influence of the length of the calibration period and observation frequency on predictive uncertainty in time series modeling of groundwater dynamics*. Water Resources Research, 53(3), pp.2294–2311." ] } ], "metadata": { "authors": [ { "name": "M. Bakker" } ], "kernelspec": { "display_name": "pastas (3.13.5)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.5" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "title": "Kalibratie van een tijdreeksmodel" }, "nbformat": 4, "nbformat_minor": 4 }