pastas.stats.tests.ljung_box
- ljung_box(series, lags=15, nparam=0, full_output=False)
Ljung-Box test for autocorrelation.
- Parameters
series (pandas.Series) – Series for which the autocorrelation used in the Ljung-Box test is calculated.
lags (int, optional) – The maximum lag to compute the Ljung-Box test statistic for.
nparam (int, optional) – Number of calibrated parameters in the model.
full_output (bool, optional) – Return the result of the test as a boolean (True) or not (False).
- Returns
q_stat (float) – The computed Q test statistic.
pval (float) – The p-value associated with the computed Q test statistic.
- Return type
Notes
The Ljung-Box test [Ljung_1978] tests the null hypothesis that the values of a time series are independently distributed up to a desired maximum time lag $h$. The test statistic is computed as follows:
\[Q(h) = n (n + 2) \sum_{k=1}^{h} \frac{\rho_k^2}{n - k}\]where \(\rho_k\) is the autocorrelation at lag $k$, $h$ is the maximum lag used for the calculation, and $n$ is the number of values in the noise series. The computed $Q$-statistic is then compared to a critical value \(\chi^2_{\alpha, h-p}\) taken from a \(\chi^2\) distribution with $h-p$ degrees of freedom at a significance level \(\alpha\), where $p$ is the number of noise model parameters.
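The statistic and its p-value can be sketched with the standard library alone, which keeps each term of the formula visible. This is a minimal illustration under stated assumptions, not the pastas implementation: the names `autocorrelation`, `chi2_sf` and `ljung_box_sketch` are hypothetical, the autocorrelation is a plain sample estimate (pastas computes it with `pastas.stats.acf`, so values may differ), and the input is assumed to be equidistant without gaps.

```python
import math
import random


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at a positive integer `lag`."""
    n = len(values)
    mean = sum(values) / n
    variance_sum = sum((v - mean) ** 2 for v in values)
    covariance_sum = sum(
        (values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum


def chi2_sf(x, dof):
    """Survival function of a chi-square distribution with integer `dof`.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with z = x / 2.
    """
    z = x / 2.0
    if dof % 2 == 0:
        a, sf = 1.0, math.exp(-z)  # dof = 2 is the starting point
    else:
        a, sf = 0.5, math.erfc(math.sqrt(z))  # dof = 1 is the starting point
    while a < dof / 2.0:
        sf += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return sf


def ljung_box_sketch(values, lags=15, nparam=0):
    """Return (Q, p-value) for the Ljung-Box test up to lag `lags`."""
    if lags <= nparam:
        raise ValueError("lags must exceed nparam so that h - p >= 1")
    n = len(values)
    q_stat = n * (n + 2) * sum(
        autocorrelation(values, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )
    # Degrees of freedom are h - p, as in the formula above.
    return q_stat, chi2_sf(q_stat, lags - nparam)


# Hypothetical white-noise residuals; a fixed seed keeps the run repeatable.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]
q_stat, pval = ljung_box_sketch(noise, lags=15)
print(f"Q = {q_stat:.2f}, p = {pval:.3f}")
```

Comparing the returned p-value with \(\alpha\) is equivalent to comparing $Q$ with the critical value: the null hypothesis is rejected when the p-value is smaller than \(\alpha\).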
Considerations for this test:
The time series should have equidistant time steps. For time series with missing data, an adapted version of the Ljung-Box test is available through ps.stats.stoffer_toloi.
A potential problem of the Ljung-Box test is its low power when testing a large number of lags with a small sample size $n$. It has been suggested that the maximum lag should satisfy \(h \leq n/4\), but bounds as low as \(h \leq n/20\) have also been proposed. For daily groundwater level observations and lags of up to one year (365 days), this means that between 4 and 20 years of data are needed.
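Both considerations can be checked before running the test. The sketch below is illustrative only: the helper names `is_equidistant`, `lag_bounds` and `required_observations` are hypothetical and not part of pastas, and the daily timestamps are made-up example data.

```python
from datetime import datetime, timedelta


def is_equidistant(timestamps):
    """Return True if all consecutive timestamps are separated by the same step."""
    steps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(steps) <= 1


def lag_bounds(n):
    """Largest maximum lag allowed by the n/20 rule and by the n/4 rule."""
    return n // 20, n // 4


def required_observations(max_lag, ratio):
    """Smallest n that satisfies max_lag <= n / ratio."""
    return max_lag * ratio


# Ten years of daily timestamps (hypothetical data).
start = datetime(2000, 1, 1)
daily = [start + timedelta(days=i) for i in range(3650)]

print(is_equidistant(daily))                    # True
print(is_equidistant(daily[:10] + daily[12:]))  # False, two days are missing
print(lag_bounds(len(daily)))                   # (182, 912)

# Lags up to 365 days: 1460 values (4 years) versus 7300 values (20 years).
print(required_observations(365, 4), required_observations(365, 20))
```

With ten years of daily data, the stricter rule allows lags up to about half a year, so testing a full year of lags is only supported by the more lenient rule.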
References
- Ljung_1978
Ljung, G. and Box, G. (1978). On a Measure of Lack of Fit in Time Series Models, Biometrika, 65, 297-303.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> import pastas as ps
>>> alpha = 0.05  # significance level (assumed value)
>>> res = pd.Series(index=pd.date_range(start=0, periods=1000, freq="D"),
...                 data=np.random.rand(1000))
>>> stat, p = ps.stats.ljung_box(res, lags=15)
>>> if p > alpha:
...     print("Failed to reject the null hypothesis, no significant "
...           "autocorrelation. p =", p.round(2))
... else:
...     print("Reject the null hypothesis. p =", p.round(2))
See also
pastas.stats.acf
This method is called to compute the autocorrelation function.
pastas.stats.stoffer_toloi
Similar method but adapted for time series with missing data.