pysteps.utils.cleansing.detect_outliers

pysteps.utils.cleansing.detect_outliers(input_array, thr, coord=None, k=None, verbose=False)

Detect outliers in a (multivariate and georeferenced) dataset.

Assume a (multivariate) Gaussian distribution and detect outliers based on the number of standard deviations from the mean.
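
The global, univariate case described above can be sketched with the standard library alone. This is a minimal illustration of the idea and not the pysteps implementation: the helper name zscore_outliers, the use of the population standard deviation, and the strict "greater than" comparison are all assumptions.

```python
from statistics import fmean, pstdev

def zscore_outliers(values, thr):
    """Flag values lying more than `thr` standard deviations from the mean.

    Hypothetical helper for illustration; not part of pysteps.
    """
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (an assumption)
    if std == 0:
        # All values identical: nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - mean) > thr * std for v in values]

data = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 10.0]
flags = zscore_outliers(data, thr=2.0)
print(flags)  # only the last value (10.0) is flagged
```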

If spatial information is provided through coordinates, the outlier detection can be localized by considering only the k-nearest neighbours when computing the local mean and standard deviation.
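
The effect of localization can be sketched as follows. This is an illustration under stated assumptions, not the pysteps code: the helper local_outliers is hypothetical, each point is excluded from its own neighbourhood, Euclidean distance is used, and a neighbourhood with zero spread never flags anything.

```python
from math import dist
from statistics import fmean, pstdev

def local_outliers(values, coords, thr, k):
    """Flag values that deviate from their k nearest neighbours.

    Hypothetical helper for illustration; not part of pysteps.
    """
    flags = []
    for i, value in enumerate(values):
        # Indices of all other points, sorted by Euclidean distance to point i.
        others = [j for j in range(len(values)) if j != i]
        others.sort(key=lambda j: dist(coords[i], coords[j]))
        neighbours = [values[j] for j in others[:k]]
        local_mean = fmean(neighbours)
        local_std = pstdev(neighbours)
        flags.append(local_std > 0 and abs(value - local_mean) > thr * local_std)
    return flags

# Two well-separated regions with different typical values.
# The point at index 5 (value 3.0) is unusual only relative to its own region.
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (0.5, 0.5),
          (100, 100), (101, 100), (100, 101), (101, 101), (102, 100), (102, 101)]
values = [1.0, 1.1, 0.9, 1.0, 1.1, 3.0,
          10.0, 10.1, 9.9, 10.1, 9.9, 10.0]

thr = 5.0
local_flags = local_outliers(values, coords, thr, k=4)

# Global detection on the same data: the two regions inflate the overall
# standard deviation, so the point at index 5 goes unnoticed.
g_mean, g_std = fmean(values), pstdev(values)
global_flags = [abs(v - g_mean) > thr * g_std for v in values]

print(local_flags.index(True))  # 5
print(any(global_flags))        # False
```

In this toy dataset the global statistics are dominated by the difference between the two regions, while the local statistics pick up the anomaly within one of them.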

Parameters:
  • input_array (array_like) – Array of shape (n) or (n, m), where n is the number of samples and m is the number of variables. If m > 1, the Mahalanobis distance is used. All values in input_array must be finite.

  • thr (float) – The number of standard deviations from the mean used to define an outlier.

  • coord (array_like or None, optional) – Array of shape (n, d) containing the coordinates of the input data in a d-dimensional space. Passing coord requires that k is not None.

  • k (int or None, optional) – The number of nearest neighbours used to localize the outlier detection. If set to None (the default), all data points are used (global detection). Setting k requires that coord is not None.

  • verbose (bool, optional) – If True, print out information.
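
For input with m > 1, the Mahalanobis distance mentioned under input_array can be illustrated for two variables. This is a rough sketch under assumptions, not the pysteps implementation: the helper mahalanobis_outliers_2d is hypothetical, the covariance is the population covariance, and a sample is flagged when its distance exceeds thr.

```python
from statistics import fmean

def mahalanobis_outliers_2d(samples, thr):
    """Flag 2-D samples whose Mahalanobis distance from the mean exceeds `thr`.

    Hypothetical helper for illustration; not part of pysteps.
    """
    n = len(samples)
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = fmean(xs), fmean(ys)
    # Entries of the 2x2 population covariance matrix.
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in samples) / n
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    flags = []
    for x, y in samples:
        dx, dy = x - mx, y - my
        # Squared distance d^T C^{-1} d, using the closed-form 2x2 inverse.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        flags.append(d2 ** 0.5 > thr)
    return flags

# Eight samples lie on the line y = x; the last two break the correlation,
# although each of their coordinates is close to the mean.
samples = [(1, 1), (-1, -1), (2, 2), (-2, -2), (3, 3), (-3, -3), (4, 4), (-4, -4),
           (0.5, -0.5), (-0.5, 0.5)]
mv_flags = mahalanobis_outliers_2d(samples, thr=2.0)
print(mv_flags)  # only the last two samples are flagged
```

The last two samples would not stand out in either variable taken alone; they are flagged only because the distance accounts for the correlation between the variables.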

Returns:

out – A 1-D boolean array of shape (n) with True values indicating the outliers detected in input_array.

Return type:

array_like