ST_ClusterDBSCAN
Name
ST_ClusterDBSCAN — Window function that returns an integer cluster id for each input geometry, using a 2D implementation of the Density-based spatial clustering of applications with noise (DBSCAN) algorithm.
Synopsis
integer ST_ClusterDBSCAN(geometry winset geom, float8 eps, integer minpoints);
Description
Returns a cluster number for each input geometry, based on a 2D implementation of the Density-based spatial clustering of applications with noise (DBSCAN) algorithm. Unlike ST_ClusterKMeans, it does not require the number of clusters to be specified, but instead uses the desired distance (eps) and density (minpoints) parameters to construct each cluster.
An input geometry will be added to a cluster if it is either:
- A "core" geometry, that is within eps distance (Cartesian) of at least minpoints input geometries (including itself), or
- A "border" geometry, that is within eps distance of a core geometry.
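The core and border rules can be checked on a small hand-made point set. The following is a minimal sketch: the pts CTE, its id values, and the chosen eps and minpoints values are illustrative assumptions, not part of any real dataset. With eps := 1.1 and minpoints := 3, points 2 and 3 each have three input points within eps (counting themselves) and so are core geometries. Points 1 and 4 have only two, but each lies within eps of a core geometry, which makes them border geometries. Point 5 is near nothing.

```sql
-- Hypothetical sample data: four points spaced 1 unit apart on a line,
-- plus one isolated point far away.
WITH pts(id, geom) AS (
    VALUES
        (1, ST_MakePoint(0, 0)),   -- border: only 2 points within eps
        (2, ST_MakePoint(1, 0)),   -- core: 3 points within eps
        (3, ST_MakePoint(2, 0)),   -- core: 3 points within eps
        (4, ST_MakePoint(3, 0)),   -- border: only 2 points within eps
        (5, ST_MakePoint(10, 0))   -- neither core nor border
)
SELECT id,
       ST_AsText(geom) AS wkt,
       ST_ClusterDBSCAN(geom, eps := 1.1, minpoints := 3) OVER () AS cid
FROM pts
ORDER BY id;
```

Under these assumptions, points 1 through 4 should share one cluster number and point 5 should get a NULL cid.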
Note that border geometries may be within eps distance of core geometries in more than one cluster; in this case, either assignment would be correct, and the border geometry will be arbitrarily assigned to one of the available clusters. In these cases, it is possible for a correct cluster to be generated with fewer than minpoints geometries.
When assignment of a border geometry is ambiguous, repeated calls to ST_ClusterDBSCAN will produce identical results if an ORDER BY
clause is included in the window definition, but cluster assignments may differ from other implementations of the same algorithm.
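A minimal sketch of such a call, reusing the parcels table from the examples on this page. It assumes parcel_id is unique, so that the ordering of the input rows is fully determined:

```sql
-- ORDER BY inside the window definition fixes the order in which
-- geometries are fed to the clustering, so repeated runs agree.
-- Assumption: parcel_id is unique in parcels.
SELECT parcel_id,
       ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5)
           OVER (ORDER BY parcel_id) AS cid
FROM parcels;
```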
Input geometries that do not meet the criteria to join any other cluster will be assigned a cluster number of NULL.
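To keep only geometries that were placed in a cluster, the NULL rows can be filtered out. Because a window function cannot be referenced directly in a WHERE clause, the sketch below wraps the call in a subquery. It reuses the parcels table and parameter values from the examples on this page, and the alias sq is arbitrary:

```sql
-- Compute the cluster number in a subquery, then drop the rows
-- that joined no cluster (cid IS NULL).
SELECT parcel_id, cid
FROM (
    SELECT parcel_id,
           ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) OVER () AS cid
    FROM parcels
) AS sq
WHERE cid IS NOT NULL;
```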
Availability: 2.3.0 - requires GEOS
Examples
Assigning a cluster number to each parcel point:
SELECT parcel_id,
       ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) OVER () AS cid
FROM parcels;
Combining parcels with the same cluster number into a single geometry. This uses named argument calling. Parcels that joined no cluster all have a NULL cid and are therefore grouped together into one row:
SELECT cid,
       ST_Collect(geom) AS cluster_geom,
       array_agg(parcel_id) AS ids_in_cluster
FROM (
    SELECT parcel_id,
           ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) OVER () AS cid,
           geom
    FROM parcels
) sq
GROUP BY cid;