{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Synthetic Data\n",
    "\n",
    "*Developed by Stijn Klop and Mark Bakker*\n",
    "\n",
    "This Notebook contains a number of examples and tests with synthetic data. The purpose of this notebook is to demonstrate the noise model of Pastas.\n",
    "\n",
    "In this Notebook, heads are generated with a known response function. Next, Pastas is used to solve for the parameters of the model it is verified that Pastas finds the correct parameters back. Several different types of errors are introduced in the generated heads and it is tested whether the confidence intervals computed by Pastas are reasonable. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from scipy.special import gammainc, gammaincinv\n",
    "import pandas as pd\n",
    "import pastas as ps\n",
    "\n",
    "ps.show_versions()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load data and define functions\n",
    "The rainfall observations are read from file. The rainfall series is taken from KNMI station De Bilt. Heads are generated with a Gamma response function which is defined below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rain = pd.read_csv('../examples/data/rain_260.csv', index_col=0, parse_dates=[0]).squeeze() / 1000"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def gamma_tmax(A, n, a, cutoff=0.999):\n",
    "    return gammaincinv(n, cutoff) * a\n",
    "\n",
    "def gamma_step(A, n, a, cutoff=0.999):\n",
    "    tmax = gamma_tmax(A, n, a, cutoff)\n",
    "    t = np.arange(0, tmax, 1)\n",
    "    s = A * gammainc(n, t / a)\n",
    "    return s\n",
    "\n",
    "def gamma_block(A, n, a, cutoff=0.999):\n",
    "    # returns the gamma block response starting at t=0 with intervals of delt = 1\n",
    "    s = gamma_step(A, n, a, cutoff)\n",
    "    return np.append(s[0], s[1:] - s[:-1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Gamma response function requires 3 input arguments; A, n and a. The values for these parameters are defined along with the parameter d, the base groundwater level. The response function is created using the functions defined above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Atrue = 800\n",
    "ntrue = 1.1\n",
    "atrue = 200\n",
    "dtrue = 20\n",
    "h = gamma_block(Atrue, ntrue, atrue)\n",
    "tmax = gamma_tmax(Atrue, ntrue, atrue)\n",
    "plt.plot(h)\n",
    "plt.xlabel('Time (days)')\n",
    "plt.ylabel('Head response (m) due to 1 mm of rain in day 1')\n",
    "plt.title('Gamma block response with tmax=' + str(int(tmax)));"
   ]
  },
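  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check on the block response plotted above, the sketch below verifies that the block response sums to the final value of the step response, which is close to `cutoff` times the gain `A`. The response is recomputed inline so that the cell runs on its own. All `_demo` names are introduced here for illustration only and are not part of the original notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.special import gammainc, gammaincinv\n",
    "\n",
    "# Illustrative values (same as the true parameters above); _demo names are hypothetical\n",
    "A_demo, n_demo, a_demo, cutoff_demo = 800, 1.1, 200, 0.999\n",
    "\n",
    "# Step response on daily intervals, up to the time where it reaches cutoff * A\n",
    "tmax_demo = gammaincinv(n_demo, cutoff_demo) * a_demo\n",
    "t_demo = np.arange(0, tmax_demo, 1)\n",
    "step_demo = A_demo * gammainc(n_demo, t_demo / a_demo)\n",
    "\n",
    "# Block response: increments of the step response\n",
    "block_demo = np.append(step_demo[0], np.diff(step_demo))\n",
    "\n",
    "# The increments telescope, so their sum equals the last step value\n",
    "print(block_demo.sum(), step_demo[-1], cutoff_demo * A_demo)"
   ]
  },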
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create synthetic observations\n",
    "Rainfall is used as input series for this example. No errors are introduced. A Pastas model is created to test whether Pastas is able to . The generated head series is purposely not generated with convolution.\n",
    "Heads are computed for the period 1990 - 2000. Computations start in 1980 as a warm-up period. Convolution is not used so that it is clear how the head is computed. The computed head at day 1 is the head at the end of day 1 due to rainfall during day 1. No errors are introduced."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "step = gamma_block(Atrue, ntrue, atrue)[1:]\n",
    "lenstep = len(step)\n",
    "h = dtrue * np.ones(len(rain) + lenstep)\n",
    "for i in range(len(rain)):\n",
    "    h[i:i + lenstep] += rain[i] * step\n",
    "head = pd.DataFrame(index=rain.index, data=h[:len(rain)],)\n",
    "head = head['1990':'1999']\n",
    "\n",
    "plt.figure(figsize=(12,5))\n",
    "plt.plot(head,'k.', label='head')\n",
    "plt.legend(loc=0)\n",
    "plt.ylabel('Head (m)')\n",
    "plt.xlabel('Time (years)');"
   ]
  },
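  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loop above is a superposition of shifted and scaled block responses, which is what a discrete convolution computes. The sketch below illustrates that equivalence on a small made-up example. `rain_demo` and `resp_demo` are hypothetical stand-ins, not the De Bilt rainfall or the Gamma block response."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "rng_demo = np.random.default_rng(0)\n",
    "rain_demo = rng_demo.random(50)        # made-up daily rainfall\n",
    "resp_demo = 0.5 ** np.arange(1, 6)     # made-up block response of 5 days\n",
    "\n",
    "# Superposition of shifted, scaled responses, as in the loop above\n",
    "h_loop_demo = np.zeros(len(rain_demo) + len(resp_demo))\n",
    "for i in range(len(rain_demo)):\n",
    "    h_loop_demo[i:i + len(resp_demo)] += rain_demo[i] * resp_demo\n",
    "\n",
    "# The same computation written as a discrete convolution\n",
    "h_conv_demo = np.convolve(rain_demo, resp_demo)\n",
    "\n",
    "# Compare both results over the length of the rainfall series\n",
    "print(np.allclose(h_loop_demo[:len(rain_demo)], h_conv_demo[:len(rain_demo)]))"
   ]
  },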
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create Pastas model\n",
    "The next step is to create a Pastas model. The head generated using the Gamma response function is used as input for the Pastas model. \n",
    "\n",
    "A `StressModel` instance is created and added to the Pastas model. The `StressModel` intance takes the rainfall series as input aswell as the type of response function, in this case the Gamma response function ( `ps.Gamma`).\n",
    "\n",
    "The Pastas model is solved without a noise model since there is no noise present in the data. The results of the Pastas model are plotted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ml = ps.Model(head)\n",
    "sm = ps.StressModel(rain, ps.Gamma, name='recharge', settings='prec')\n",
    "ml.add_stressmodel(sm)\n",
    "ml.solve(noise=False, ftol=1e-8)\n",
    "ml.plots.results();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results of the Pastas model show the calibrated parameters for the Gamma response function. The parameters calibrated using pastas are equal to the `Atrue`, `ntrue`, `atrue` and `dtrue` parameters defined above. The Explained Variance Percentage for this example model is 100%. \n",
    "\n",
    "The results plots show that the Pastas simulation is identical to the observed groundwater. The residuals of the simulation are shown in the plot together with the response function and the contribution for each stress.\n",
    "\n",
    "Below the Pastas block response and the true Gamma response function are plotted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "plt.plot(gamma_block(Atrue, ntrue, atrue), label='Synthetic response')\n",
    "plt.plot(ml.get_block_response('recharge'), '-.', label='Pastas response')\n",
    "plt.legend(loc=0)\n",
    "plt.ylabel('Head response (m) due to 1 m of rain in day 1')\n",
    "plt.xlabel('Time (days)');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test 1: Adding noise\n",
    "In the next test example noise is added to the observations of the groundwater head. The noise is normally distributed noise with a mean of 0 and a standard deviation of 1 and is scaled with the standard deviation of the head. \n",
    "\n",
    "The noise series is added to the head series created in the previous example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "random_seed = np.random.RandomState(15892)\n",
    "\n",
    "noise = random_seed.normal(0,1,len(head)) * np.std(head.values) * 0.5\n",
    "head_noise = head[0] + noise"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create Pastas model\n",
    "\n",
    "A pastas model is created using the head with noise. A stress model is added to the Pastas model and the model is solved."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ml2 = ps.Model(head_noise)\n",
    "sm2 = ps.StressModel(rain, ps.Gamma, name='recharge', settings='prec')\n",
    "ml2.add_stressmodel(sm2)\n",
    "ml2.solve(noise=True)\n",
    "ml2.plots.results();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results of the simulation show that Pastas is able to filter the noise from the observed groundwater head. The simulated groundwater head and the generated synthetic head are plotted below. The parameters found with the Pastas optimization are similair to the original parameters of the Gamma response function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(12,5))\n",
    "plt.plot(head_noise, '.k', alpha=0.1, label='Head with noise')\n",
    "plt.plot(head, '.k', label='Head true')\n",
    "plt.plot(ml2.simulate(), label='Pastas simulation')\n",
    "plt.title('Simulated Pastas head compared with synthetic head')\n",
    "plt.legend(loc=0)\n",
    "plt.ylabel('Head (m)')\n",
    "plt.xlabel('Date (years)');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test 2: Adding correlated noise\n",
    "In this example correlated noise is added to the observed head. The correlated noise is generated using the noise series created in the previous example. The correlated noise is implemented as exponential decay using the following formula:\n",
    "\n",
    "$$ n_{c}(t) = e^{-1/\\alpha} \\cdot n_{c}(t-1) + n(t)$$\n",
    "\n",
    "where $n_{c}$ is the correlated noise, $\\alpha$ is the noise decay parameter and $n$ is the uncorrelated noise. The noise series that is created is added to the observed groundwater head. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "noise_corr = np.zeros(len(noise))\n",
    "noise_corr[0] = noise[0]\n",
    "\n",
    "alphatrue = 2\n",
    "\n",
    "for i in range(1, len(noise_corr)):\n",
    "    noise_corr[i] = np.exp(-1/alphatrue) * noise_corr[i - 1] + noise[i]\n",
    "    \n",
    "head_noise_corr = head[0] + noise_corr"
   ]
  },
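  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the recursion above, the lag-1 autocorrelation $\\rho_1$ of the correlated noise approaches $e^{-1/\\alpha}$ for a long series, so $\\alpha$ can be estimated as $-1/\\ln(\\rho_1)$. The sketch below checks this on a long, made-up white noise series. All `_demo` names are hypothetical, and the estimate is only approximate for a finite series."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "rng_demo = np.random.default_rng(1)\n",
    "alpha_demo = 2.0\n",
    "white_demo = rng_demo.normal(0, 1, 20000)  # made-up uncorrelated noise\n",
    "\n",
    "# Same recursion as above: n_c(t) = exp(-1/alpha) * n_c(t-1) + n(t)\n",
    "corr_demo = np.zeros(len(white_demo))\n",
    "corr_demo[0] = white_demo[0]\n",
    "for i in range(1, len(corr_demo)):\n",
    "    corr_demo[i] = np.exp(-1 / alpha_demo) * corr_demo[i - 1] + white_demo[i]\n",
    "\n",
    "# Lag-1 autocorrelation and the decay parameter it implies\n",
    "rho1_demo = np.corrcoef(corr_demo[:-1], corr_demo[1:])[0, 1]\n",
    "alpha_est_demo = -1 / np.log(rho1_demo)\n",
    "print(alpha_est_demo)"
   ]
  },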
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create Pastas model\n",
    "A Pastas model is created using the head with correlated noise as input. A stressmodel is added to the model and the Pastas model is solved. The results of the model are plotted. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ml3 = ps.Model(head_noise_corr)\n",
    "sm3 = ps.StressModel(rain, ps.Gamma, name='recharge', settings='prec')\n",
    "ml3.add_stressmodel(sm3)\n",
    "ml3.solve(noise=True)\n",
    "ml3.plots.results();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Pastas model is able to calibrate the model parameters fairly well. The calibrated parameters are close to the true values defined above. The `noise_alpha` parameter calibrated by Pastas is close the the `alphatrue` parameter defined for the correlated noise series.\n",
    "\n",
    "Below the head simulated with the Pastas model is plotted together with the head series and the head series with the correlated noise."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(12,5))\n",
    "plt.plot(head_noise_corr, '.k', alpha=0.1, label='Head with correlated noise')\n",
    "plt.plot(head, '.k', label='Head true')\n",
    "plt.plot(ml3.simulate(), label='Pastas simulation')\n",
    "plt.title('Simulated Pastas head compared with synthetic head')\n",
    "plt.legend(loc=0)\n",
    "plt.ylabel('Head (m)')\n",
    "plt.xlabel('Date (years)');"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}