AR(1) noise model with irregular synthetic time series#

R.A. Collenteur, University of Graz, May 2020

In this notebook the classical Autoregressive AR(1) noise model is tested for Pastas models. This noise model is tested against synthetic data generated with NumPy or Statsmodels’ ARMA model. This noise model is tested on head time series with regular and irregular time steps.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import pastas as ps

ps.set_log_level("ERROR")
ps.show_versions()
Pastas version: 1.8.0b
Python version: 3.11.10
NumPy version: 2.0.2
Pandas version: 2.2.3
SciPy version: 1.15.0
Matplotlib version: 3.10.0
Numba version: 0.60.0
DeprecationWarning: As of Pastas 1.5, no noisemodel is added to the pastas Model class by default anymore. To solve your model using a noisemodel, you have to explicitly add a noisemodel to your model before solving. For more information, and how to adapt your code, please see this issue on GitHub: https://github.com/pastas/pastas/issues/735

1. Develop the AR(1) Noise Model for Pastas#

The following formula is used to calculate the noise according to the ARMA(1,1) process:

\[\upsilon(t_i) = r(t_i) - r(t_{i-1}) \text{e}^{-\Delta t_i / \alpha}\]

where \(\upsilon\) is the noise, \(\Delta t_i\) the time step between the residuals (\(r\)), and \(\alpha\) [days] the AR parameter of the model. The model is named NoiseModel and can be found in noisemodel.py. It can be added to a Pastas model as follows: ml.add_noisemodel(ps.ArNoiseModel())

2. Generate synthetic head time series#

# Read in some data
rain = (
    pd.read_csv("../examples/data/rain_260.csv", index_col=0, parse_dates=[0]).squeeze()
    / 1000
)

# Set the True parameters
Atrue = 800
ntrue = 1.4
atrue = 200
dtrue = 20

# Generate the head
step = ps.Gamma().block([Atrue, ntrue, atrue], cutoff=0.9999)
h = dtrue * np.ones(len(rain) + step.size)
for i in range(len(rain)):
    h[i : i + step.size] += rain[i] * step
head = pd.DataFrame(
    index=rain.index,
    data=h[: len(rain)],
)
head = head["1990":"2015"]

# Plot the head without noise
plt.figure(figsize=(10, 2))
plt.plot(head, "k.", label="head")
plt.legend(loc=0)
plt.ylabel("Head (m)")
plt.xlabel("Time (years)");
Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
../_images/e1f161bb7b07bc91fd65d302aa894abc1bb3a10a21ef6b17389858bd03a63d72.png

3. Generate AR(1) noise and add it to the synthetic heads#

In the following code-block, noise is generated using an AR(1) process using Numpy.

# reproduction of random numbers
np.random.seed(1234)
alpha = 0.8

# generate samples using NumPy
random_seed = np.random.RandomState(1234)
noise = random_seed.normal(0, 1, len(head)) * np.std(head.values) * 0.2
a = np.zeros_like(head[0])

for i in range(1, noise.size):
    a[i] = noise[i] + a[i - 1] * alpha

head_noise = head[0] + a

4. Create and solve a Pastas Model#

ml = ps.Model(head_noise)
sm = ps.StressModel(rain, ps.Gamma(), name="recharge", settings="prec")
ml.add_stressmodel(sm)
ml.add_noisemodel(ps.ArNoiseModel())

ml.solve(tmin="1991", tmax="2015-06-29", report=False)

# Plot the results
axes = ml.plots.results(figsize=(10, 5))
axes[-2].plot(ps.Gamma().step([Atrue, ntrue, atrue], cutoff=0.999));
../_images/4f177f7082e56fefd263e09a7a1246f6f839b852d9095dad8c44ce585290c0cb.png

5. Did we find back the original AR parameter?#

print(np.exp(-1.0 / ml.parameters.loc["noise_alpha", "optimal"]).round(2), "vs", alpha)
0.8 vs 0.8

The estimated parameters for the noise model are almost the same as the true parameters, showing that the model works for regular time steps.

6. So is the autocorrelation removed correctly?#

ml.plots.diagnostics(figsize=(10, 4));
../_images/f53c3cf1718ded330fd1141746f6855432b9e198c126d1d8d0ca77e68b3dfcdb.png

That seems okay. It is important to understand that this noisemodel will only help in removing autocorrelations at the first time lag, but not at larger time lags.

7. Test the NoiseModel for irregular time steps#

In this final step the Ar(1) noisemodel is tested for irregular timesteps, using the indices from a real groundwater level time series.

index = (
    pd.read_csv("../examples/data/test_index.csv", parse_dates=True, index_col=0)
    .index.round("D")
    .drop_duplicates()
)
head_irregular = head_noise.reindex(index)
ml = ps.Model(head_irregular)
sm = ps.StressModel(rain, ps.Gamma(cutoff=0.99), name="recharge", settings="prec")
ml.add_stressmodel(sm)
ml.add_noisemodel(ps.ArNoiseModel())

# ml.set_parameter("noise_alpha", 9.5, vary=False)
# ml.set_parameter("recharge_A", 800, vary=False)
# ml.set_parameter("recharge_n", 1.2, vary=False)
# ml.set_parameter("recharge_a", 200, vary=False)

ml.solve(tmin="1991", tmax="2015-06-29", report=False)
axes = ml.plots.results(figsize=(10, 5))
axes[-2].plot(ps.Gamma().step([Atrue, ntrue, atrue]))

print(np.exp(-1.0 / ml.parameters.loc["noise_alpha", "optimal"]).round(2), "vs", alpha)
0.78 vs 0.8
../_images/fd2f2ce37948593695e509a07c8b0ab1518abcc6bd05149d5d44a7f74c1fadbd.png
ml.plots.diagnostics(figsize=(10, 4), acf_options={"bin_width": 0.5});
../_images/f3295020ac7a7f49cd9910e803e6a1f08613084aefda38f30bb023f249c53159.png

This autocorrelation plot looks good too.

v = ml.noise()
print("", v.loc[:"2010-01-01"].std())
print("", v.loc["2010-01-01":].std())
0.08935765978220056
 0.06053755434982547