xrspatial.classify.natural_breaks

xrspatial.classify.natural_breaks(agg: xarray.core.dataarray.DataArray, num_sample: Optional[int] = None, name: Optional[str] = 'natural_breaks', k: int = 5) → xarray.core.dataarray.DataArray

Groups the values of the array (agg) into k classes using the Jenks Natural Breaks or k-means clustering method. Similar values are placed in the same class, and the separation between classes is maximized. Returns an xarray.DataArray.
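The grouping idea can be illustrated with a small dynamic-programming sketch that minimizes the within-class sum of squared deviations over sorted values. This is a plain-Python illustration only, not the code xrspatial runs; the helper name jenks_breaks and its return convention (one upper bound per class) are assumptions made for this example.

```python
import bisect


def jenks_breaks(values, k):
    """Hypothetical helper: return k upper class bounds that minimise
    the total within-class sum of squared deviations."""
    data = sorted(values)
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(data):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[c][j]: best total deviation when data[:j] is split into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # the last class is data[i:j]
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    split[c][j] = i

    # Walk the split table backwards to recover each class's upper bound.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(data[j - 1])
        j = split[c][j]
    return bounds[::-1]


values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
bounds = jenks_breaks(values, k=3)
# Class index = position of the first upper bound that is >= the value.
classes = [bisect.bisect_left(bounds, v) for v in values]
print(bounds)   # [3, 12, 22]
print(classes)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

The two inner loops visit every pair of positions for each class, so the cost grows quadratically with the number of values.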

agg: xarray.DataArray

2D array of values to bin. NumPy, CuPy, NumPy-backed Dask, or CuPy-backed Dask array.

num_sample: int (optional)

Number of sample data points used to fit the model. Natural Breaks (Jenks) classification has O(n²) complexity, where n is the total number of data points, i.e. agg.size. When n is large, the model should be fit on a small sub-sample of the data instead of the whole dataset.
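A minimal sketch of the sub-sampling idea, assuming NumPy and finite input values: the class bounds are fit on a random sub-sample, then the full array is binned against those bounds. The names classify_with_subsample, fit_breaks, and quantile_breaks are hypothetical, and the quantile-based fitter is only a cheap stand-in for a Jenks fitter so that the block stays short; none of this is xrspatial's implementation.

```python
import numpy as np


def classify_with_subsample(data, k, num_sample, fit_breaks, seed=0):
    """Hypothetical helper: fit k upper class bounds on a sub-sample,
    then bin every value of `data` (assumed finite) against them."""
    flat = data.ravel()
    if num_sample is not None and num_sample < flat.size:
        rng = np.random.default_rng(seed)
        # Draw without replacement so no point is counted twice.
        flat = rng.choice(flat, size=num_sample, replace=False)

    bounds = np.asarray(fit_breaks(flat, k), dtype=float)
    # The sample may miss the global maximum; widen the top class
    # so that every value in the full array still falls in a class.
    bounds[-1] = max(bounds[-1], data.max())

    # right=True: class i holds values with bounds[i-1] < x <= bounds[i].
    classes = np.digitize(data, bounds, right=True)
    return classes, bounds


def quantile_breaks(sample, k):
    """Stand-in fitter (NOT Jenks): k equal-count upper bounds."""
    return np.quantile(sample, np.linspace(0, 1, k + 1)[1:])


data = np.random.default_rng(1).random((200, 200))  # 40,000 values
classes, bounds = classify_with_subsample(
    data, k=5, num_sample=1000, fit_breaks=quantile_breaks
)
print(classes.shape, bounds.shape)
```

Only the fitting step sees the 1,000 sampled values; the binning step is a single pass over all 40,000 values, which is why the expensive part no longer scales with agg.size.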

k: int (default = 5)

Number of classes to be produced.

name: str, optional (default = "natural_breaks")

Name of output aggregate.

natural_breaks_agg: xarray.DataArray

2D array, of the same type as the input, of class allocations.

- Map Classify: PySAL, Source code for mapclassify.classifiers, https://pysal.org/mapclassify/_modules/mapclassify/classifiers.html#NaturalBreaks, Accessed Apr. 21, 2021.
- perrygeo: perrygeo/jenks, https://github.com/perrygeo/jenks/blob/master/jenks.pyx, Apr. 21, 2021.

Imports
>>> import numpy as np
>>> import xarray as xr
>>> from xrspatial.classify import natural_breaks

Create DataArray
>>> np.random.seed(0)
>>> agg = xr.DataArray(np.random.rand(4,4),
...                    dims = ["lat", "lon"])

>>> height, width = agg.shape
>>> _lat = np.linspace(0, height - 1, height)
>>> _lon = np.linspace(0, width - 1, width)
>>> agg["lat"] = _lat
>>> agg["lon"] = _lon
>>> print(agg)
<xarray.DataArray (lat: 4, lon: 4)>
array([[0.5488135 , 0.71518937, 0.60276338, 0.54488318],
       [0.4236548 , 0.64589411, 0.43758721, 0.891773  ],
       [0.96366276, 0.38344152, 0.79172504, 0.52889492],
       [0.56804456, 0.92559664, 0.07103606, 0.0871293 ]])
Coordinates:
* lon      (lon) float64 0.0 1.0 2.0 3.0
* lat      (lat) float64 0.0 1.0 2.0 3.0

Create Natural Breaks Aggregate
>>> natural_breaks_agg = natural_breaks(agg, k = 5)
>>> print(natural_breaks_agg)
<xarray.DataArray 'natural_breaks' (lat: 4, lon: 4)>
array([[2., 3., 2., 2.],
       [1., 2., 1., 4.],
       [4., 1., 3., 2.],
       [2., 4., 0., 0.]], dtype=float32)
Coordinates:
* lat      (lat) float64 0.0 1.0 2.0 3.0
* lon      (lon) float64 0.0 1.0 2.0 3.0