Python – Tweepy location filter does not work

The issue has been resolved; see the solution in the accepted answer.

I’m trying to collect 50 tweets from a specific geographic region. My code below prints 50 tweets, but many of them have coordinates of None. Does this mean that the tweets with None were not sent from the designated region? What is going on here, and how can I collect 50 tweets from the designated area? Thanks in advance.

# Import Tweepy, sys, sleep, credentials.py
try:
    import json
except ImportError:
    import simplejson as json
import tweepy, sys
from time import sleep
from credentials import *

# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Bounding box as [west longitude, south latitude, east longitude, north latitude]
box = [-74.0, 40.73, -73.0, 41.73]

# Override tweepy.StreamListener to add logic to on_status
class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        super(MyStreamListener, self).__init__()
        self.counter = 0

    def on_status(self, status):
        record = {'Text': status.text, 'Coordinates': status.coordinates, 'Created At': status.created_at}
        self.counter += 1
        if self.counter <= 50:
            print(record)
            return True
        else:
            # Returning False stops the stream after 50 tweets
            return False

    def on_error(self, status_code):
        if status_code == 420:
            # Returning False in on_error disconnects the stream
            return False

myStreamListener = MyStreamListener()
myStream = tweepy.Stream(api.auth, listener=myStreamListener)
# Tweepy 3.7+ renamed this parameter to is_async, because async is a reserved word in Python 3.7+
myStream.filter(locations=box, async=True)
print(myStream)

The results are as follows:

{'Text': u"What?...", 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 6), 'Coordinates': {u'type': u'Point', u'coordinates': [-74.1234567, 40.1234567]}}
{'Text': u'WHEN?...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 8), 'Coordinates': None}
{'Text': u'Wooo...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 9), 'Coordinates': None}
{'Text': u'Man...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 9), 'Coordinates': None}
{'Text': u'The...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 10), 'Coordinates': None}

Solution

From the documentation:

Only geolocated Tweets falling
within the requested bounding boxes will be included—unlike the Search
API, the user’s location field is not used to filter Tweets.

This guarantees that the tweets in the response come from the bounding box provided.

How do bounding box filters work?

The streaming API uses the following heuristic to determine whether a
given Tweet falls within a bounding box:

  • If the coordinates field is populated, the values there will be tested against the bounding box. Note that this field uses geoJSON
    order (longitude, latitude).

  • If coordinates is empty but place is populated, the region defined in place is checked for intersection against the locations bounding box.
    Any overlap will match.

  • If none of the rules listed above match, the Tweet does not match the location query.
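The heuristic above can be sketched in plain Python. This is only an illustration of the documented rules, not Twitter's actual implementation. It assumes each tweet is a dict shaped like the streaming JSON, with coordinates as a GeoJSON point and place carrying a bounding_box polygon. The function names (point_in_box, place_box_overlaps, matches_locations) and the sample tweets are made up for this example.

```python
def point_in_box(lon, lat, box):
    """Return True if (lon, lat) lies inside box = [west, south, east, north]."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def place_box_overlaps(place, box):
    """Return True if the place's bounding box overlaps the query box at all."""
    west, south, east, north = box
    # bounding_box["coordinates"] holds one ring of [lon, lat] corners.
    ring = place["bounding_box"]["coordinates"][0]
    lons = [corner[0] for corner in ring]
    lats = [corner[1] for corner in ring]
    return (min(lons) <= east and max(lons) >= west and
            min(lats) <= north and max(lats) >= south)


def matches_locations(tweet, box):
    """Apply the documented rules in order: coordinates first, then place."""
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # GeoJSON order: longitude, latitude
        return point_in_box(lon, lat, box)
    place = tweet.get("place")
    if place and place.get("bounding_box"):
        return place_box_overlaps(place, box)
    return False


box = [-74.0, 40.73, -73.0, 41.73]

# Hypothetical sample tweets, for illustration only.
exact = {"coordinates": {"type": "Point", "coordinates": [-73.5, 41.0]},
         "place": None}
place_only = {"coordinates": None,
              "place": {"bounding_box": {"type": "Polygon", "coordinates": [[
                  [-74.3, 40.5], [-74.3, 40.9], [-73.7, 40.9], [-73.7, 40.5]]]}}}
untagged = {"coordinates": None, "place": None}

for name, tweet in [("exact", exact), ("place_only", place_only), ("untagged", untagged)]:
    print(name, matches_locations(tweet, box))
```

The place_only tweet shows the case from the question: its coordinates field is None, yet it still matches because the bounding box of its place overlaps the requested box.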

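Again, this means that the coordinates field can be None for a tweet that matched: in that case the filter matched on the tweet's place instead, so the tweets returned by the bounding box filter are still guaranteed to come from the bounding box area.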

Source: https://dev.twitter.com/streaming/overview/request-parameters#locations

Edit: place is a separate field in the response, similar to coordinates. It describes a named location together with its own bounding box.
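To get some location for every tweet, one option is to fall back to the place field when coordinates is None. The sketch below is an assumption about how one might do this, not part of the original answer. It works on the raw tweet dict (with Tweepy 3.x this should be available as status._json) and uses the center of the place's bounding box as an approximate position. The helper name tweet_location and the sample data are hypothetical.

```python
def tweet_location(tweet):
    """Return an approximate (lon, lat) for a tweet dict, or None if it has no geo data."""
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # exact point, GeoJSON order
        return (lon, lat)
    place = tweet.get("place")
    if place and place.get("bounding_box"):
        # Use the center of the place's bounding box as a rough position.
        ring = place["bounding_box"]["coordinates"][0]
        lons = [corner[0] for corner in ring]
        lats = [corner[1] for corner in ring]
        return ((min(lons) + max(lons)) / 2.0, (min(lats) + max(lats)) / 2.0)
    return None


# Hypothetical sample tweet with a place but no exact coordinates.
sample = {"coordinates": None,
          "place": {"bounding_box": {"type": "Polygon", "coordinates": [[
              [-74.0, 40.0], [-74.0, 41.0], [-73.0, 41.0], [-73.0, 40.0]]]}}}

print(tweet_location(sample))
```

A place can be as large as a city or a country, so the center of its bounding box is only a coarse estimate. If exact positions are required, count only the tweets whose coordinates field is not None.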
