k points closest to the origin… here is a solution to the problem.
k points closest to the origin
I have a list of points: [(x, y), (x, y), (x,y) ... (x, y)]
。
I want k
nearest point (0, 0).
I’m trying to implement content associations like this. However, I implemented the algorithm incorrectly, and I’m not sure what went wrong. I guess maybe heapify
doesn’t know how to maintain the order between points. How do I fix this?
Code
import matplotlib.pyplot as plt
from random import randint
from heapq import heappush, heappop, heapify
from math import sqrt
def distance(pointA, pointB):
return sqrt((pointB[0] - pointA[0]) ** 2 + (pointB[1] - pointA[1]) ** 2)
def closest(points, k, origin):
heap = []
for point in points[:k]:
heappush(heap, point)
for point in points[k:]:
if distance(point, origin) < distance(heap[0], origin):
heappop(heap)
heappush(heap, point)
return heap
def naive(points, k, origin):
sortedPoints = sorted(points, key=lambda p: distance(p, origin))
return sortedPoints[:k]
points = [(randint(0, 100), randint(0, 100)) for i in range(100)]
k = 4
resA = closest(points, k, (0, 0))
resB = naive(points, k, (0, 0))
plt.scatter(*zip(*points))
plt.scatter(*zip(*resA))
plt.scatter(*zip(*resB))
plt.show()
Result
The green dots are given
by the naïve method, and the orange dots are given using the heap method.
Solution
The heap invariants in your solution use the first element of the point. You want to use the distance from point to origin:
def closest(points, k, origin):
heap = [(-distance(p, origin), p) for p in points[:k]]
heapify(heap)
for p in points[k:]:
d = distance(p, origin)
heappushpop(heap, (-d, p))
return [p for nd, p in heap]
Note: I also imported >heappushpop From heapq
, as it is more efficient than calling it individually.
EDIT: Removed conditional enclosing for heappushpop
calls, as conditions also exist in the function.