Python – Draw boundary lines based on kmeans cluster centers

I’m new to scikit-learn but want to try an interesting project.

I have the longitude and latitude of points in the UK, and I used scikit-learn’s KMeans class to create cluster centers. To visualize this data, rather than showing each cluster as a single point, I want to draw boundaries around each cluster. For example, if one cluster is London and another is Oxford, I currently have a point at the center of each city, but I wonder whether there is a way to use this data to draw boundary lines between my clusters.

So far, here’s the code I used to create the clusters:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

location1 = "XXX"
df = pd.read_csv(location1, encoding="ISO-8859-1")

# Run k-means clustering
X = df[['long', 'lat']].values  # ~2k locations in the UK
y = df['label'].values          # Label is a 0 or 1
# KMeans is unsupervised: fit() accepts y only for API consistency and ignores it
kmeans = KMeans(n_clusters=30, random_state=0).fit(X)
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], marker='s', s=100)
plt.show()

So I’d like to turn the centers in the example above into lines dividing each area. Is this possible?

Thanks,

Anant

Solution

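I’m guessing you’re talking about spatial boundaries, in which case you should follow Bunyk’s advice and use a Voronoi diagram [1]: http://nbviewer.jupyter.org/gist/pv/8037100. K-means assigns every point to its nearest center, so the edges of the Voronoi cells built on the centers are exactly the lines where the cluster assignment changes.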
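As a rough sketch of that idea, the snippet below builds the Voronoi diagram of the cluster centers with scipy.spatial.Voronoi and draws its edges with voronoi_plot_2d. It is not taken from the linked notebook. The random points stand in for the question's CSV (the coordinate ranges are only an approximate UK bounding box), n_init=10 and the output file name are arbitrary choices, and longitude/latitude are treated as flat x/y coordinates, which is an approximation.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
from sklearn.cluster import KMeans

# Synthetic stand-in for the question's data (assumption);
# in the real project use X = df[['long', 'lat']].values instead.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(-5.0, 1.5, size=2000),   # approximate longitude range
    rng.uniform(50.0, 58.0, size=2000),  # approximate latitude range
])

kmeans = KMeans(n_clusters=30, n_init=10, random_state=0).fit(X)
centers = kmeans.cluster_centers_

# Each Voronoi cell is the region closer to one center than to any other,
# which matches how k-means assigns points to clusters.
vor = Voronoi(centers)

fig, ax = plt.subplots(figsize=(6, 8))
ax.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, s=4, cmap="tab20")
voronoi_plot_2d(vor, ax=ax, show_points=False, show_vertices=False,
                line_colors="black", line_width=1.0)
ax.scatter(centers[:, 0], centers[:, 1], marker="s", s=40, c="red")

# voronoi_plot_2d rescales the axes around the centers; restore the data extent.
ax.set_xlim(X[:, 0].min(), X[:, 0].max())
ax.set_ylim(X[:, 1].min(), X[:, 1].max())
ax.set_xlabel("long")
ax.set_ylabel("lat")
fig.savefig("kmeans_voronoi.png", dpi=150)  # hypothetical output file name
```

In an interactive session, plt.show() can replace the savefig call. The outermost cells are unbounded, so their edges run off toward the plot border; clipping them to a coastline would need a separate polygon library and is not covered here.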
