pastas.stats.tests.ljung_box

ljung_box(series, lags=15, nparam=0, full_output=False)

Ljung-Box test for autocorrelation.

Parameters
  • series (pandas.Series) – Series to calculate the autocorrelation for, which is then used in the Ljung-Box test.

  • lags (int, optional) – The maximum lag to compute the Ljung-Box test statistic for.

  • nparam (int, optional) – Number of calibrated parameters in the model.

  • full_output (bool, optional) – Return the result of the test as a boolean (True) or not (False).

Returns

  • q_stat (float) – The computed Q test statistic.

  • pval (float) – The p-value of the computed Q test statistic, i.e. the probability of obtaining a value at least this large under the null hypothesis.

Return type

Tuple[float, float]

Notes

The Ljung-Box test [Ljung_1978] tests the null hypothesis that the values of a time series are independently distributed up to a desired time lag $h$. The test statistic is computed as follows:

\[Q(h) = n (n + 2) \sum_{k=1}^{h} \frac{\rho_k^2}{n - k}\]

where \(\rho_k\) is the autocorrelation at lag $k$, $h$ is the maximum lag used for the calculation, and $n$ is the number of values in the noise series. The computed $Q$-statistic is then compared to a critical value from a \(\chi^2_{\alpha, h-p}\) distribution with significance level \(\alpha\) and $h-p$ degrees of freedom, where $h$ is the number of lags and $p$ is the number of noise model parameters.
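The formula above can be sketched in a few lines of NumPy and SciPy. This is an illustration only, not the pastas implementation: the function name ljung_box_sketch is hypothetical, the autocorrelation uses the standard sample estimator (which may differ from what pastas.stats.acf computes), and clipping the degrees of freedom at one is an assumption made here to keep the example well defined.

```python
import numpy as np
from scipy.stats import chi2


def ljung_box_sketch(x, lags=15, nparam=0):
    """Illustrative Q statistic and p-value for an equidistant series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dev = x - x.mean()
    denom = np.sum(dev ** 2)
    k = np.arange(1, lags + 1)
    # Sample autocorrelation rho_k for k = 1..h (standard estimator,
    # assumed here; requires lags < n).
    rho = np.array([np.sum(dev[i:] * dev[:-i]) / denom for i in k])
    # Q(h) = n (n + 2) * sum_k rho_k^2 / (n - k)
    q_stat = n * (n + 2) * np.sum(rho ** 2 / (n - k))
    # Degrees of freedom h - p; the lower bound of 1 is an assumption.
    dof = max(lags - nparam, 1)
    # p-value: upper tail of the chi-squared distribution at Q.
    pval = chi2.sf(q_stat, df=dof)
    return q_stat, pval
```

A small p-value (below the chosen \(\alpha\)) then suggests rejecting the null hypothesis of no autocorrelation up to lag $h$.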

Considerations for this test:

  • The time series should have equidistant time steps. An adapted version of the Ljung-Box test is available through ps.stats.stoffer_toloi.

  • A potential problem of the Ljung-Box test is its low power when testing for a large number of lags with a small sample size $n$. Suggested upper limits for the number of lags range from \(k \leq n/4\) down to \(k \leq n/20\). If we are using daily groundwater level observations and want to test for autocorrelation at lags up to one year (365 days), this means that we need between 4 and 20 years of data.
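The arithmetic behind the sample-size rule of thumb can be checked with a short sketch. The helper name required_observations is hypothetical and simply inverts \(k \leq n/d\) to \(n \geq k \cdot d\) for a divisor $d$; it assumes one observation per day.

```python
def required_observations(max_lag, divisor):
    """Smallest n that satisfies max_lag <= n / divisor."""
    return max_lag * divisor


# Daily data, testing lags up to one year (365 days).
for divisor in (4, 20):
    n = required_observations(365, divisor)
    print(f"k <= n/{divisor}: n >= {n} daily values, about {n // 365} years")
```

For the two limits this gives 1460 and 7300 daily values, i.e. roughly 4 and 20 years of observations.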

References

Ljung_1978

Ljung, G. and Box, G. (1978). On a Measure of Lack of Fit in Time Series Models, Biometrika, 65, 297-303.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import pastas as ps
>>> alpha = 0.05  # significance level
>>> res = pd.Series(index=pd.date_range(start=0, periods=1000, freq="D"),
...                 data=np.random.rand(1000))
>>> stat, p = ps.stats.ljung_box(res, lags=15)
>>> if p > alpha:
...     print("Failed to reject the Null-hypothesis, no significant "
...           "autocorrelation. p =", p.round(2))
... else:
...     print("Reject the Null-hypothesis. p =", p.round(2))

See also

pastas.stats.acf

This method is called to compute the autocorrelation function.

pastas.stats.stoffer_toloi

Similar method but adapted for time series with missing data.