pastas.stats.core.ccf
- ccf(x, y, lags=365, bin_method='regular', bin_width=0.5, max_gap=inf, min_obs=50, full_output=False, alpha=0.05, fallback_bin_method='gaussian')
Method to compute the cross-correlation for irregular time series.
- Parameters
x (pandas.Series) – Pandas Series containing the values to calculate the cross-correlation on. The index has to be a Pandas.DatetimeIndex.
y (pandas.Series) – Pandas Series containing the values to calculate the cross-correlation on. The index has to be a Pandas.DatetimeIndex.
lags (array_like, optional) – Lags in days for which the cross-correlation is calculated. Default is all lags from 1 to 365 days.
bin_method (str, optional) – method to determine the type of bin. Options are “regular” for regular data (default), and “gaussian” and “rectangle” for irregular data.
bin_width (float, optional) – number of days used as the width for the bin to calculate the correlation.
max_gap (float, optional) – Maximum time step gap in the data. Time steps larger than this value are not used for calculating the average time step. This can be helpful when a large gap in the data would otherwise distort the average time step.
min_obs (int, optional) – Minimum number of observations in a bin to determine the correlation.
full_output (bool, optional) – If True, the estimated uncertainties are also returned. Default is False.
alpha (float, optional) – Significance level used to compute the confidence interval (i.e., the confidence level is 1-alpha).
fallback_bin_method (str, optional) – method to determine the type of bin used to compute the correlations if the data has irregular time steps between the measurements. Options are “gaussian” (default) and “rectangle”.
- Returns
result – If full_output=True, a DataFrame with columns “ccf”, “conf”, and “n”, containing the cross-correlation function, confidence intervals (depends on alpha), and the number of samples n used to compute these, respectively. If full_output=False, only the CCF is returned.
- Return type
Examples
>>> import pastas as ps
>>> ccf = ps.stats.ccf(x, y, bin_method="gaussian")
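The effect of the max_gap parameter on the average time step can be illustrated with a small standalone sketch. The helper `mean_time_step` below is hypothetical and is not part of pastas; it only assumes that time steps strictly larger than max_gap are dropped before averaging, as described for that parameter.

```python
from statistics import fmean


def mean_time_step(times, max_gap=float("inf")):
    """Average step between sorted observation times (in days).

    Hypothetical helper, not the pastas implementation: steps larger
    than max_gap are left out of the average.
    """
    steps = [b - a for a, b in zip(times, times[1:])]
    kept = [s for s in steps if s <= max_gap]
    return fmean(kept)


# Daily observations interrupted by a single 30-day gap.
times = [0.0, 1.0, 2.0, 3.0, 33.0, 34.0, 35.0]

step_all = mean_time_step(times)               # the gap inflates the average
step_capped = mean_time_step(times, max_gap=7)  # the gap is ignored
print(step_all, step_capped)
```

With the gap included, the average step is almost six days even though the data are essentially daily; capping the gap recovers a one-day step.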
Notes
The CCF method first tries to estimate the correlation using common techniques, which apply when the time step between the measurements is regular. If the time step is irregular, the method falls back to an alternative approach for irregular time steps based on the slotting technique (Rehfeld et al. [2011]). Different methods (kernels) to bin the data are available.
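The idea behind kernel-based slotting can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pastas implementation: the function name `gaussian_kernel_ccf` is hypothetical, times are given as floats in days, bin_width is treated as the standard deviation of the Gaussian kernel in days, and a positive lag means that y is observed later than x.

```python
import math
from statistics import fmean, pstdev


def gaussian_kernel_ccf(tx, x, ty, y, lags, bin_width=0.5):
    """Kernel-weighted cross-correlation for irregularly sampled series.

    Hypothetical sketch: tx, ty are observation times in days, x, y the
    observed values, lags the lags in days, and bin_width the standard
    deviation of the Gaussian kernel in days (an assumption).
    """
    # Standardize both series to zero mean and unit variance.
    mx, sx = fmean(x), pstdev(x)
    my, sy = fmean(y), pstdev(y)
    xs = [(v - mx) / sx for v in x]
    ys = [(v - my) / sy for v in y]

    result = []
    for lag in lags:
        num = 0.0
        den = 0.0
        for ti, xi in zip(tx, xs):
            for tj, yj in zip(ty, ys):
                # Weight each pair by how close its time difference is to the lag.
                d = (tj - ti) - lag
                w = math.exp(-0.5 * (d / bin_width) ** 2)
                num += w * xi * yj
                den += w
        result.append(num / den if den > 0 else float("nan"))
    return result


# Irregular sampling times (days) and a 20-day sine; y is x delayed by 3 days.
t = [i + 0.3 * math.sin(i) for i in range(200)]
x = [math.sin(2 * math.pi * ti / 20) for ti in t]
y = [math.sin(2 * math.pi * (ti - 3) / 20) for ti in t]

lags = [3, 8, 13]
ccf_values = gaussian_kernel_ccf(t, x, t, y, lags)
for lag, r in zip(lags, ccf_values):
    print(lag, round(r, 2))
```

In this synthetic case the estimate should be strongly positive at the true delay of 3 days, near zero a quarter period later, and strongly negative half a period later. A rectangular kernel would replace the Gaussian weight by 1 inside a fixed window around the lag and 0 outside it.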
Estimating the correlation for irregular time steps can be challenging. Depending on the data and the binning method and settings used, the correlation can be above 1 or below -1. If this occurs, a warning is raised.
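A caller who post-processes correlation estimates can apply the same kind of check. The sketch below is illustrative only: `warn_if_out_of_bounds` is a hypothetical helper, not a pastas function, and the example values are made up for demonstration.

```python
import warnings


def warn_if_out_of_bounds(values):
    """Warn and return True if any estimate lies outside [-1, 1].

    Hypothetical helper for illustration; NaN values are ignored.
    """
    out = [v for v in values if abs(v) > 1.0]
    if out:
        warnings.warn(
            f"{len(out)} correlation value(s) fall outside [-1, 1]; "
            "check the binning method and settings.",
            RuntimeWarning,
        )
        return True
    return False


# Made-up estimates, one of which exceeds 1.
estimates = [0.93, 1.04, -0.20]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = warn_if_out_of_bounds(estimates)
print(flagged, len(caught))
```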