xrspatial.classify.natural_breaks

xrspatial.classify.natural_breaks(agg: xarray.core.dataarray.DataArray, num_sample: Optional[int] = 20000, name: Optional[str] = 'natural_breaks', k: int = 5) → xarray.core.dataarray.DataArray

Reclassifies the values of agg into k classes using the Natural Breaks (Jenks) method, which is closely related to one-dimensional K-Means clustering. Similar values are placed in the same class, and the separation between classes is maximized.
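To make the grouping idea concrete, here is a minimal sketch of one-dimensional K-Means (Lloyd's algorithm) that derives class upper edges from a set of values. It is an illustration only, not the implementation used by xrspatial; the helper name kmeans_1d_breaks and its n_iter parameter are hypothetical.

```python
import numpy as np

def kmeans_1d_breaks(values, k=5, n_iter=100):
    """Hypothetical helper: class upper edges from a simple 1D K-Means fit."""
    data = np.sort(np.asarray(values, dtype=float).ravel())
    data = data[np.isfinite(data)]  # ignore NaN and +/-inf
    # Start the class centres at evenly spaced quantiles of the data.
    centers = np.quantile(data, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign every value to its nearest centre.
        labels = np.argmin(np.abs(data[:, None] - centers[None, :]), axis=1)
        # Move each centre to the mean of its members (keep it if empty).
        new_centers = np.array([
            data[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # The upper edge of a class is the largest value assigned to it.
    return np.array([data[labels == j].max()
                     for j in range(k) if np.any(labels == j)])

breaks = kmeans_1d_breaks([1, 2, 3, 10, 11, 12, 50, 51, 52], k=3)
# Values up to breaks[0] fall in class 0, up to breaks[1] in class 1, and so on.
classes = np.digitize([1, 10, 52], breaks, right=True)
```

Once the break points are known, np.digitize with right=True maps each value to the index of the first break that is greater than or equal to it, which is the class index.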

Parameters
  • agg (xarray.DataArray) – 2D DataArray of values to be reclassified, backed by a NumPy or CuPy array.

  • num_sample (int, default=20000) – Number of sample data points used to fit the model. Natural Breaks (Jenks) classification has O(n²) complexity, where n is the total number of data points, i.e. agg.size. When n is large, the model should be fit on a small sub-sample of the data instead of the whole dataset.

  • k (int, default=5) – Number of classes to be produced.

  • name (str, default='natural_breaks') – Name of output aggregate.
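The sub-sampling idea behind num_sample can be sketched as follows: fit the break points on a random sample of the finite values, then bin every value against those breaks. This is a sketch under stated assumptions, not the xrspatial internals. The names classify_with_subsample, fit_breaks, seed, and quantile_breaks are hypothetical, and quantile_breaks is only a cheap stand-in for a real natural-breaks fit.

```python
import numpy as np

def classify_with_subsample(values, fit_breaks, num_sample=20000, seed=0):
    """Hypothetical helper: fit breaks on a sub-sample, then bin all values."""
    arr = np.asarray(values, dtype=float)
    mask = np.isfinite(arr)  # NaN and +/-inf are left unclassified
    finite = arr[mask]
    if num_sample is not None and finite.size > num_sample:
        # Fit on a random sub-sample to keep the cost of the fit bounded.
        rng = np.random.default_rng(seed)
        sample = rng.choice(finite, size=num_sample, replace=False)
    else:
        sample = finite
    breaks = np.asarray(fit_breaks(sample), dtype=float)
    # The sample may miss the true maximum, so stretch the top edge to cover it.
    breaks[-1] = max(breaks[-1], finite.max())
    out = np.full(arr.shape, np.nan, dtype=np.float32)
    out[mask] = np.digitize(arr[mask], breaks, right=True)
    return out

def quantile_breaks(sample, k=5):
    # Stand-in for a natural-breaks fit, used only to keep the sketch short.
    return np.quantile(sample, np.arange(1, k + 1) / k)

data = np.arange(100000, dtype=float).reshape(250, 400)
data[0, 0] = np.nan
out = classify_with_subsample(data, quantile_breaks, num_sample=1000)
```

Only the fit sees the sample; the final binning is a single vectorized pass over the full array, so its cost grows linearly with agg.size.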

Returns

natural_breaks_agg – 2D aggregate array of natural break allocations. All other input attributes are preserved.

Return type

xarray.DataArray of the same type as agg

Examples

natural_breaks() works with a NumPy-backed xarray DataArray:

>>> import numpy as np
>>> import xarray as xr
>>> from xrspatial.classify import natural_breaks
>>> elevation = np.array([
...     [np.nan,  1.,  2.,  3.,  4.],
...     [ 5.,  6.,  7.,  8.,  9.],
...     [10., 11., 12., 13., 14.],
...     [15., 16., 17., 18., 19.],
...     [20., 21., 22., 23., np.inf]
... ])
>>> agg_numpy = xr.DataArray(elevation, attrs={'res': (10.0, 10.0)})
>>> numpy_natural_breaks = natural_breaks(agg_numpy, k=5)
>>> print(numpy_natural_breaks)
<xarray.DataArray 'natural_breaks' (dim_0: 5, dim_1: 5)>
array([[nan,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  2.],
       [ 2.,  2.,  2.,  2.,  3.],
       [ 3.,  3.,  3.,  3.,  4.],
       [ 4.,  4.,  4.,  4., nan]], dtype=float32)
Dimensions without coordinates: dim_0, dim_1
Attributes:
    res:      (10.0, 10.0)

natural_breaks() works with a CuPy-backed xarray DataArray:

>>> import cupy
>>> agg_cupy = xr.DataArray(cupy.asarray(elevation))
>>> cupy_natural_breaks = natural_breaks(agg_cupy)
>>> print(type(cupy_natural_breaks))
<class 'xarray.core.dataarray.DataArray'>
>>> print(cupy_natural_breaks)
<xarray.DataArray 'natural_breaks' (dim_0: 5, dim_1: 5)>
array([[nan,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  2.],
       [ 2.,  2.,  2.,  2.,  3.],
       [ 3.,  3.,  3.,  3.,  4.],
       [ 4.,  4.,  4.,  4., nan]], dtype=float32)
Dimensions without coordinates: dim_0, dim_1