{ "cells": [ { "cell_type": "markdown", "id": "fc4bcd17", "metadata": {}, "source": [ "# Estimating recharge\n", "*R.A. Collenteur, University of Graz*\n", "\n", "This example notebook shows how to obtain groundwater recharge estimates using the linear and non-linear TFN models available in Pastas. To illustrate the methods, we look at example models for a groundwater level time series observed near the town of Wagna in southeastern Austria. This analysis is based on the study by [Collenteur et al. (2021)](#References).\n", "\n",
"Note:\n", "While the groundwater level data is the same as that used for the publication, the precipitation and potential evaporation data is not the same (due to open data issues). The results from this notebook are therefore not the same as in the manuscript.\n"
] }, { "cell_type": "code", "execution_count": null, "id": "8be9197e", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import pastas as ps\n", "import matplotlib.pyplot as plt\n", "\n", "ps.set_log_level(\"ERROR\")\n", "ps.show_versions(lmfit=True)" ] }, { "cell_type": "markdown", "id": "e2da887d", "metadata": {}, "source": [ "## 1. Load and visualize the input data\n", "Below we load and visualize the data used in this example. This is the input data necessary for the estimation of groundwater recharge: precipitation, potential evaporation, and groundwater levels. The precipitation and evaporation should be available at daily time intervals and provided in mm/day." ] }, { "cell_type": "code", "execution_count": null, "id": "a0d74148", "metadata": {}, "outputs": [], "source": [ "head = (\n", "    pd.read_csv(\"data_wagna/head_wagna.csv\", index_col=0, parse_dates=True, skiprows=2)\n", "    .squeeze()\n", "    .loc[\"2006\":]\n", ")\n", "evap = pd.read_csv(\n", "    \"data_wagna/evap_wagna.csv\", index_col=0, parse_dates=True, skiprows=2\n", ").squeeze()\n", "rain = pd.read_csv(\n", "    \"data_wagna/rain_wagna.csv\", index_col=0, parse_dates=True, skiprows=2\n", ").squeeze()\n", "\n", "# Create a plot of all the input data\n", "fig, [ax1, ax2, ax3] = plt.subplots(3, 1, sharex=True, figsize=(6, 6))\n", "\n", "ax1.vlines(rain.index, [0], rain.values, color=\"k\", lw=1)\n", "ax1.set_ylabel(\"P [mm d$^{-1}$]\")\n", "ax1.set_ylim(0, 100)\n", "ax2.vlines(evap.index, [0], evap.values, color=\"k\", lw=1)\n", "ax2.set_ylabel(\"E$_p$ [mm d$^{-1}$]\\n\")\n", "ax2.set_ylim(0, 6)\n", "ax3.plot(head, marker=\".\", markersize=2, color=\"k\", linestyle=\" \")\n", "ax3.set_ylabel(\"h [m]\")\n", "ax3.set_ylim(262.2, 264.5)\n", "for ax in [ax1, ax2, ax3]:\n", "    ax.grid()\n", "\n", "plt.xlim(pd.Timestamp(\"2006-01-01\"), pd.Timestamp(\"2020-01-01\"));" ] }, { "cell_type": "markdown", "id": "7ed59985", "metadata": {}, "source": [ "## 2. 
Creating and calibrating Pastas models\n", "The next step is to create the TFN models in Pastas and calibrate them. Here we create two types of TFN models: one with a linear approximation for the recharge and one with a non-linear recharge model. Based on the time step analysis, we calibrate the models on groundwater level time series with a 10-day time interval between observations. \n", "\n", "It is important to use the correct settings for the model calibration. In particular, the non-linear model needs a warmup period for which both precipitation and evaporation are available. Here we use a 10-year period for calibration and a three-year period for validation." ] }, { "cell_type": "code", "execution_count": null, "id": "27f2d483", "metadata": {}, "outputs": [], "source": [ "# Model settings\n", "tmin = pd.Timestamp(\"2007-01-01\")  # Needs warmup\n", "tmax = pd.Timestamp(\"2016-12-31\")\n", "tmax_val = pd.Timestamp(\"2020-01-01\")\n", "noise = True\n", "dt = 10\n", "freq = \"10D\"\n", "h = head.iloc[0::dt]\n", "\n", "mls = {\n", "    \"Linear\": [ps.FourParam(), ps.rch.Linear()],\n", "    \"Non-linear\": [ps.Exponential(), ps.rch.FlexModel()],\n", "}\n", "\n", "for name, [rfunc, rch] in mls.items():\n", "    # Create a Pastas model and add the recharge model\n", "    ml = ps.Model(h, name=name)\n", "    sm = ps.RechargeModel(rain, evap, recharge=rch, rfunc=rfunc, name=\"rch\")\n", "    ml.add_stressmodel(sm)\n", "\n", "    # In case of the non-linear model, change some parameter settings\n", "    if name == \"Non-linear\":\n", "        ml.set_parameter(\"rch_srmax\", vary=False)\n", "        ml.set_parameter(\"rch_kv\", vary=True)\n", "        ml.set_parameter(\"rch_lp\", vary=False, initial=0.25)\n", "        ml.set_parameter(\"constant_d\", vary=True, initial=262, pmax=head.min())\n", "\n", "    # Add the ARMA(1,1) noise model and solve the Pastas model\n", "    ml.add_noisemodel(ps.ArmaModel())\n", "    solver = ps.LmfitSolve()\n", "\n", "    ml.solve(\n", "        tmin=tmin,\n", "        tmax=tmax,\n", "        noise=True,\n",
"        solver=solver,\n", "        method=\"least_squares\",\n", "        report=\"basic\",\n", "    )\n", "    mls[name] = ml" ] }, { "cell_type": "markdown", "id": "b2dccb37", "metadata": {}, "source": [ "## 3. Check for autocorrelation\n", "After the models are calibrated, the noise time series may be checked for autocorrelation. This is especially important here because in the next step we use the estimated standard errors of the parameters to compute the 95% confidence intervals of the recharge estimates. The plots below show that there is no significant autocorrelation (with $\\alpha$=0.05)." ] }, { "cell_type": "code", "execution_count": null, "id": "9cfe834d", "metadata": {}, "outputs": [], "source": [ "fig, axes = plt.subplots(2, 1, figsize=(6, 5), sharex=True, sharey=True)\n", "\n", "for i, ml in enumerate(mls.values()):\n", "    noise = (\n", "        ml.noise().asfreq(freq).fillna(0.0)\n", "    )  # Fill up two nan-values so it is regular\n", "    ps.plots.acf(noise, acf_options=dict(bin_method=\"regular\"), alpha=0.05, ax=axes[i])\n", "    axes[i].set_title(ml.name)\n", "    axes[i].set_ylim(-0.2, 0.2)\n", "    axes[i].set_ylabel(\"ACF [-]\")\n", "    axes[i].set_xlabel(\"\")\n", "\n", "plt.xlim(10)\n", "plt.xlabel(\"Time lag [days]\")\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "id": "ffe49f2d", "metadata": {}, "source": [ "## 4. Compute confidence intervals\n", "In the following code block, the 95% confidence intervals of the recharge estimates are computed using a Monte Carlo analysis. This may take some time, depending on the number of Monte Carlo runs performed (set in the first line)." 
] }, { "cell_type": "code", "execution_count": null, "id": "7bbddb8f", "metadata": {}, "outputs": [], "source": [ "n = int(1e2)  # number of Monte Carlo runs, set very low here for ReadTheDocs\n", "alpha = 0.05  # 95% confidence interval\n", "q = [alpha / 2, 1 - alpha / 2]\n", "\n", "# Store the upper and lower boundaries of the 95% interval\n", "yerru = []\n", "yerrl = []\n", "yerru_A = []\n", "yerrl_A = []\n", "\n", "# Recharge data\n", "data_r = pd.DataFrame(\n", "    {\n", "        \"Linear\": mls[\"Linear\"].get_stress(\"rch\"),\n", "        \"Non-linear\": mls[\"Non-linear\"].get_stress(\"rch\"),\n", "    }\n", ")\n", "\n", "for ml in mls.values():\n", "    func = ml.stressmodels[\"rch\"].get_stress\n", "    params = ml.fit.get_parameter_sample(n=n, name=\"rch\")\n", "    data = {}\n", "\n", "    # Here we run the model n times with different parameter samples\n", "    for i, param in enumerate(params):\n", "        data[i] = func(p=param)\n", "\n", "    df = pd.DataFrame.from_dict(data, orient=\"columns\").loc[tmin:tmax_val]\n", "\n", "    # store recharge estimates at 10-day intervals\n", "    df = df.resample(freq).sum()\n", "    yerrl.append(df.quantile(q=q, axis=1).transpose().iloc[:, 0])\n", "    yerru.append(df.quantile(q=q, axis=1).transpose().iloc[:, 1])\n", "\n", "    # store recharge estimates as annual sums\n", "    df = df.resample(\"A\").sum()\n", "    rch = data_r[ml.name].resample(\"A\").sum()\n", "    yerrl_A.append(\n", "        df.quantile(q=q, axis=1).transpose().subtract(rch, axis=0).iloc[:, 0].dropna()\n", "    )\n", "    yerru_A.append(\n", "        df.quantile(q=q, axis=1).transpose().subtract(rch, axis=0).iloc[:, 1].dropna()\n", "    )" ] }, { "cell_type": "markdown", "id": "7ef40ff0", "metadata": {}, "source": [ "## 5. Visualize the results\n", "We can now plot the simulated groundwater levels and the estimated groundwater recharge for both models. For the recharge estimates, here the sums over 10-day intervals, we also plot the 95% confidence intervals."
] }, { "cell_type": "code", "execution_count": null, "id": "0116dab5", "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(12, 10))\n", "ax = plt.subplot2grid((4, 1), (0, 0), rowspan=2)\n", "\n", "head.plot(ax=ax, linestyle=\" \", c=\"gray\", marker=\".\", x_compat=True, label=\"observed\")\n", "ml.oseries.series.plot(ax=ax, marker=\".\", c=\"k\", linestyle=\" \", label=\"calibration\")\n", "\n", "# Create the arrows indicating the calibration and validation period\n", "kwargs = dict(\n", "    textcoords=\"offset points\",\n", "    arrowprops=dict(arrowstyle=\"->\"),\n", "    va=\"center\",\n", "    ha=\"center\",\n", "    xycoords=\"data\",\n", ")\n", "ax.annotate(\"calibration period\", xy=(\"2007-01-01\", 262.25), xytext=(270, 0), **kwargs)\n", "ax.annotate(\"\", xy=(\"2017-01-01\", 262.25), xytext=(-200, 0), **kwargs)\n", "ax.annotate(\"validation\", xy=(\"2017-01-01\", 262.25), xytext=(80, 0), **kwargs)\n", "ax.annotate(\"\", xy=(\"2020-01-01\", 262.25), xytext=(-50, 0), **kwargs)\n", "\n", "# Plot the recharge fluxes\n", "for i, ml in enumerate(mls.values()):\n", "    axn = plt.subplot2grid((4, 1), (i + 2, 0), sharex=ax)\n", "    axn.grid(True, zorder=-10)\n", "    axn.set_axisbelow(True)\n", "\n", "    ml.simulate(tmax=tmax_val).plot(ax=ax, label=ml.name, x_compat=True)\n", "    rch = ml.get_stress(\"rch\", tmin=tmin, tmax=tmax_val).resample(freq).sum()\n", "\n", "    axn.fill_between(\n", "        yerrl[i].index,\n", "        yerrl[i].values,\n", "        yerru[i].values,\n", "        step=\"pre\",\n", "        edgecolor=\"C0\",\n", "        zorder=0,\n", "        facecolor=\"C0\",\n", "    )\n", "    axn.plot(rch.index, rch.values, c=\"k\", drawstyle=\"steps\", label=ml.name, lw=1)\n", "\n", "    axn.set_yticks([-50, 0, 50, 100, 150])\n", "\n", "    axn.set_ylabel(\"R [mm 10 d$^{-1}$]\")\n", "    axn.legend(loc=2, ncol=2)\n", "    axn.axvline(tmax, c=\"k\", linestyle=\"--\")\n", "    axn.set_xticks([])\n", "\n", "ax.legend(ncol=5, loc=2, bbox_to_anchor=(0.01, 1.15), framealpha=1, fancybox=False)\n", "ax.axvline(tmax, c=\"k\", 
linestyle=\"--\")\n", "ax.set_ylabel(\"Groundwater level [m]\")\n", "ax.set_yticks([262.5, 263.0, 263.5, 264.0])\n", "ax.grid()\n", "axn.set_xlim([tmin, tmax_val])\n", "axn.set_xlabel(\"Date\")\n", "axn.set_xticks([pd.to_datetime(str(year)) for year in range(2007, 2021, 2)]);" ] }, { "cell_type": "markdown", "id": "59982c63", "metadata": {}, "source": [ "## 6. Plot the annual recharge rates\n", "We may also be interested in the estimated annual recharge rates. These can easily be computed using the resample method from Pandas. We also show the 95% confidence interval for the estimated annual recharge rates." ] }, { "cell_type": "code", "execution_count": null, "id": "26974570", "metadata": {}, "outputs": [], "source": [ "rch = data_r.loc[tmin:\"2019\"].resample(\"A\").sum()\n", "yerr = [[-l, u] for l, u in zip(yerrl_A, yerru_A)]\n", "ax = rch.plot.bar(figsize=(12, 2), width=0.91, yerr=yerr)\n", "ax.set_xticklabels(labels=rch.index.year, rotation=0, ha=\"center\")\n", "ax.set_ylabel(\"Recharge [mm a$^{-1}$]\")\n", "ax.legend(ncol=3);" ] }, { "cell_type": "markdown", "id": "8985f7a4", "metadata": {}, "source": [ "## 7. Plot the step and block response functions\n", "We can also plot the calibrated block and step response functions." ] }, { "cell_type": "code", "execution_count": null, "id": "42fe268f", "metadata": {}, "outputs": [], "source": [ "fig, [ax, ax1] = plt.subplots(1, 2, figsize=(5, 2.5), sharex=True)\n", "\n", "for ml in mls.values():\n", "    ml.get_block_response(\"rch\").plot(ax=ax)\n", "    ml.get_step_response(\"rch\").plot(ax=ax1)\n", "\n", "ax.set_ylabel(\"Block Response [m]\")\n", "ax.legend(mls.keys())\n", "\n", "ax1.set_ylabel(\"Step Response [m]\")\n", "ax1.set_xlim(0, 700)\n", "\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "id": "4207edcc", "metadata": {}, "source": [ "## 8. 
Compute the goodness-of-fit for groundwater level simulation\n", "The following code block shows how to make a summary table of the goodness-of-fit using different metrics. Here we show the goodness-of-fit metrics for the groundwater level simulation. " ] }, { "cell_type": "code", "execution_count": null, "id": "590bf150", "metadata": {}, "outputs": [], "source": [ "# Make a table of the head performance\n", "periods = [\"Cal.\", \"Val.\"]\n", "mi = pd.MultiIndex.from_product([mls.keys(), periods])\n", "idx = [\"mae\", \"rmse\", \"nse\", \"kge_2012\"]\n", "index = [\"MAE [m]\", \"RMSE [m]\", \"NSE [-]\", \"KGE [-]\"]\n", "\n", "metrics = pd.DataFrame(index=idx, columns=mi)\n", "for name, ml in mls.items():\n", "    metrics.loc[metrics.index, (name, periods[0])] = (\n", "        ml.stats.summary(stats=metrics.index, tmin=tmin, tmax=tmax)\n", "        .to_numpy()\n", "        .reshape(4)\n", "    )\n", "    metrics.loc[metrics.index, (name, periods[1])] = (\n", "        ml.stats.summary(stats=metrics.index, tmin=tmax, tmax=tmax_val)\n", "        .to_numpy()\n", "        .reshape(4)\n", "    )\n", "\n", "metrics.index = index\n", "metrics.astype(float).round(2)" ] }, { "cell_type": "markdown", "id": "b8eb3c54", "metadata": {}, "source": [ "## Data sources:\n", "- The groundwater level time series for the hydrological research station Wagna in Austria were obtained from the government of Styria in cooperation with [JR-AquaConsol](https://www.jr-aquaconsol.at/). We acknowledge JR-AquaConsol for providing the time series and allowing its use in this example. This data may not be redistributed without explicit permission from JR-AquaConsol.\n", "\n", "- Precipitation and evaporation time series were obtained from the gridded E-OBS database [Cornes et al. (2018)](#References). We acknowledge the E-OBS dataset from the EU-FP6 project UERRA (https://www.uerra.eu) and the Copernicus Climate Change Service, and the data providers in the ECA&D project (https://www.ecad.eu). \n", "\n", "## References\n", "\n", "- Cornes, R., G. 
van der Schrier, E.J.M. van den Besselaar, and P.D. Jones. 2018: An Ensemble Version of the E-OBS Temperature and Precipitation Datasets, J. Geophys. Res. Atmos., 123. doi:10.1029/2017JD028200.\n", "- Collenteur, R. A., Bakker, M., Klammler, G., and Birk, S. (2021) Estimation of groundwater recharge from groundwater levels using nonlinear transfer function noise models and comparison to lysimeter data, Hydrol. Earth Syst. Sci., 25, 2931–2949, https://doi.org/10.5194/hess-25-2931-2021." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:41:22) [MSC v.1929 64 bit (AMD64)]" }, "vscode": { "interpreter": { "hash": "29475f8be425919747d373d827cb41e481e140756dd3c75aa328bf3399a0138e" } } }, "nbformat": 4, "nbformat_minor": 5 }