Add API.search_30_day and API.search_full_archive #1294

Merged
merged 4 commits on Jul 25, 2020
99 changes: 97 additions & 2 deletions docs/api.rst
@@ -713,8 +713,8 @@ Saved Searches Methods
:rtype: :class:`SavedSearch` object


Help Methods
------------
Search Methods
--------------

.. method:: API.search(q, [geocode], [lang], [locale], [result_type], \
[count], [until], [since_id], [max_id], \
@@ -768,6 +768,101 @@ Help Methods
:rtype: :class:`SearchResults` object


.. method:: API.search_30_day(environment_name, query, [tag], [fromDate], \
[toDate], [maxResults], [next])

Premium search that provides Tweets posted within the last 30 days.

:param environment_name: The (case-sensitive) label associated with your
search developer environment, as displayed at
https://developer.twitter.com/en/account/environments.
:param query: The equivalent of one premium rule/filter, with up to 1,024
characters (256 with Sandbox dev environments).
This parameter should include ALL portions of the rule/filter, including
all operators, and portions of the rule should not be separated into
other parameters of the query.
:param tag: Tags can be used to segregate rules and their matching data into
different logical groups. If a rule tag is provided, the rule tag is
included in the 'matching_rules' attribute.
It is recommended to assign rule-specific UUIDs to rule tags and maintain
desired mappings on the client side.
:param fromDate: The oldest UTC timestamp (from most recent 30 days) from
which the Tweets will be provided. Timestamp is in minute granularity and
is inclusive (i.e. 12:00 includes the 00 minute).
Specified: Using only the fromDate with no toDate parameter will deliver
results for the query going back in time from now() until the fromDate.
Not Specified: If a fromDate is not specified, the API will deliver all
of the results for 30 days prior to now() or the toDate (if specified).
If neither the fromDate nor the toDate parameter is used, the API will deliver
all results for the most recent 30 days, starting at the time of the
request, going backwards.
:param toDate: The latest, most recent UTC timestamp to which the Tweets
will be provided. Timestamp is in minute granularity and is not inclusive
(i.e. 11:59 does not include the 59th minute of the hour).
Specified: Using only the toDate with no fromDate parameter will deliver
the most recent 30 days of data prior to the toDate.
Not Specified: If a toDate is not specified, the API will deliver all of
the results from now() for the query going back in time to the fromDate.
If neither the fromDate nor the toDate parameter is used, the API will deliver
all results for the entire 30-day index, starting at the time of the
request, going backwards.
:param maxResults: The maximum number of search results to be returned by a
request. A number between 10 and the system limit (currently 500, 100 for
Sandbox environments). By default, a request response will return 100
results.
:param next: This parameter is used to get the next 'page' of results. The
value used with the parameter is pulled directly from the response
provided by the API, and should not be modified.
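
A minimal usage sketch (assuming an authenticated :class:`API` instance named
``api``; the ``'dev'`` environment label and the query are placeholders):

.. code-block:: python

    # Fetch up to 100 Tweets from the last 30 days matching the query.
    # 'dev' stands in for your own premium environment label.
    tweets = api.search_30_day(environment_name='dev',
                               query='tweepy lang:en',
                               maxResults=100)
    for tweet in tweets:
        print(tweet.id, tweet.text)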


.. method:: API.search_full_archive(environment_name, query, [tag], \
[fromDate], [toDate], [maxResults], [next])

Premium search that provides Tweets from as early as 2006, starting with the
first Tweet posted in March 2006.

:param environment_name: The (case-sensitive) label associated with your
search developer environment, as displayed at
https://developer.twitter.com/en/account/environments.
:param query: The equivalent of one premium rule/filter, with up to 1,024
characters (256 with Sandbox dev environments).
This parameter should include ALL portions of the rule/filter, including
all operators, and portions of the rule should not be separated into
other parameters of the query.
:param tag: Tags can be used to segregate rules and their matching data into
different logical groups. If a rule tag is provided, the rule tag is
included in the 'matching_rules' attribute.
It is recommended to assign rule-specific UUIDs to rule tags and maintain
desired mappings on the client side.
:param fromDate: The oldest UTC timestamp (from 2006-03-21T00:00:00Z) from
which the Tweets will be provided. Timestamp is in minute granularity and
is inclusive (i.e. 12:00 includes the 00 minute).
Specified: Using only the fromDate with no toDate parameter will deliver
results for the query going back in time from now() until the fromDate.
Not Specified: If a fromDate is not specified, the API will deliver all
of the results for 30 days prior to now() or the toDate (if specified).
If neither the fromDate nor the toDate parameter is used, the API will deliver
all results for the most recent 30 days, starting at the time of the
request, going backwards.
:param toDate: The latest, most recent UTC timestamp to which the Tweets
will be provided. Timestamp is in minute granularity and is not inclusive
(i.e. 11:59 does not include the 59th minute of the hour).
Specified: Using only the toDate with no fromDate parameter will deliver
the most recent 30 days of data prior to the toDate.
Not Specified: If a toDate is not specified, the API will deliver all of
the results from now() for the query going back in time to the fromDate.
If neither the fromDate nor the toDate parameter is used, the API will deliver
all results for the entire full-archive index, starting at the time of the
request, going backwards.
:param maxResults: The maximum number of search results to be returned by a
request. A number between 10 and the system limit (currently 500, 100 for
Sandbox environments). By default, a request response will return 100
results.
:param next: This parameter is used to get the next 'page' of results. The
value used with the parameter is pulled directly from the response
provided by the API, and should not be modified.
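
Because both premium methods are registered with the ``next`` pagination mode,
they can also be driven through ``tweepy.Cursor``. A sketch (environment label,
query, and dates are placeholders; premium timestamps use the ``YYYYMMDDHHmm``
format):

.. code-block:: python

    import tweepy

    # Page through full-archive results for July 2019, at most five pages.
    # 'dev' stands in for your own premium environment label.
    for page in tweepy.Cursor(api.search_full_archive,
                              environment_name='dev',
                              query='tweepy',
                              fromDate='201907010000',
                              toDate='201907310000').pages(5):
        for tweet in page:
            print(tweet.created_at, tweet.text)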


List Methods
------------

32 changes: 31 additions & 1 deletion tweepy/api.py
@@ -7,7 +7,7 @@

import six

from tweepy.binder import bind_api
from tweepy.binder import bind_api, pagination
from tweepy.error import TweepError
from tweepy.parsers import ModelParser, Parser
from tweepy.utils import list_to_csv
@@ -1279,6 +1279,36 @@ def search(self):
'max_id', 'until', 'result_type', 'count',
'include_entities']
)

@pagination(mode='next')
def search_30_day(self, environment_name, *args, **kwargs):
""" :reference: https://developer.twitter.com/en/docs/tweets/search/api-reference/premium-search
:allowed_param: 'query', 'tag', 'fromDate', 'toDate', 'maxResults',
'next'
"""
return bind_api(
api=self,
path='/tweets/search/30day/{}.json'.format(environment_name),
payload_type='status', payload_list=True,
allowed_param=['query', 'tag', 'fromDate', 'toDate', 'maxResults',
'next'],
require_auth=True
)(*args, **kwargs)

@pagination(mode='next')
def search_full_archive(self, environment_name, *args, **kwargs):
""" :reference: https://developer.twitter.com/en/docs/tweets/search/api-reference/premium-search
:allowed_param: 'query', 'tag', 'fromDate', 'toDate', 'maxResults',
'next'
"""
return bind_api(
api=self,
path='/tweets/search/fullarchive/{}.json'.format(environment_name),
payload_type='status', payload_list=True,
allowed_param=['query', 'tag', 'fromDate', 'toDate', 'maxResults',
'next'],
require_auth=True
)(*args, **kwargs)

@property
def reverse_geocode(self):
10 changes: 9 additions & 1 deletion tweepy/binder.py
@@ -234,7 +234,8 @@ def execute(self):
raise TweepError(error_msg, resp, api_code=api_error_code)

# Parse the response payload
self.return_cursors = self.return_cursors or 'cursor' in self.session.params
self.return_cursors = (self.return_cursors or
'cursor' in self.session.params or 'next' in self.session.params)
result = self.parser.parse(self, resp.text, return_cursors=self.return_cursors)

# Store result into cache if one is available.
@@ -266,3 +267,10 @@ def _call(*args, **kwargs):
_call.pagination_mode = 'page'

return _call


def pagination(mode):
def decorator(method):
method.pagination_mode = mode
return method
return decorator
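
The decorator is only attribute bookkeeping: it stamps ``pagination_mode`` on
the wrapped method so that ``Cursor`` can pick the matching iterator. A rough
equivalent without the decorator (illustrative only):

# Setting the attribute by hand has the same effect as @pagination(mode='next').
def search_30_day(self, environment_name, *args, **kwargs):
    ...

search_30_day.pagination_mode = 'next'  # the attribute Cursor inspects
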
24 changes: 24 additions & 0 deletions tweepy/cursor.py
@@ -17,6 +17,8 @@ def __init__(self, method, *args, **kwargs):
self.iterator = DMCursorIterator(method, *args, **kwargs)
elif method.pagination_mode == 'id':
self.iterator = IdIterator(method, *args, **kwargs)
elif method.pagination_mode == 'next':
self.iterator = NextIterator(method, *args, **kwargs)
elif method.pagination_mode == 'page':
self.iterator = PageIterator(method, *args, **kwargs)
else:
@@ -201,6 +203,28 @@ def prev(self):
return self.method(page=self.current_page, *self.args, **self.kwargs)


class NextIterator(BaseIterator):

def __init__(self, method, *args, **kwargs):
BaseIterator.__init__(self, method, *args, **kwargs)
self.next_token = self.kwargs.pop('next', None)
self.page_count = 0

def next(self):
if self.next_token == -1 or (self.limit and self.page_count == self.limit):
raise StopIteration
data = self.method(next=self.next_token, return_cursors=True, *self.args, **self.kwargs)
self.page_count += 1
if isinstance(data, tuple):
data, self.next_token = data
else:
self.next_token = -1
return data

def prev(self):
raise TweepError('This method does not allow backwards pagination')
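
The iterator above automates what a caller would otherwise do by hand with the
``next`` token. A sketch of the manual equivalent (assuming an authenticated
``api`` and a premium environment labelled 'dev', both placeholders):

# Manual 'next'-token pagination, mirroring NextIterator (illustrative only).
next_token = None
while True:
    data = api.search_30_day(environment_name='dev', query='tweepy',
                             next=next_token, return_cursors=True)
    if isinstance(data, tuple):   # a 'next' token came back: more pages exist
        statuses, next_token = data
    else:                         # no token: this was the last page
        statuses, next_token = data, None
    for status in statuses:
        print(status.id)
    if next_token is None:
        break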


class ItemIterator(BaseIterator):

def __init__(self, page_iterator):
20 changes: 12 additions & 8 deletions tweepy/models.py
@@ -61,14 +61,18 @@ def parse_list(cls, api, json_list):
"""
results = ResultSet()

# Handle map parameter for statuses/lookup
if isinstance(json_list, dict) and 'id' in json_list:
for _id, obj in json_list['id'].items():
if obj:
results.append(cls.parse(api, obj))
else:
results.append(cls.parse(api, {'id': int(_id)}))
return results
if isinstance(json_list, dict):
# Handle map parameter for statuses/lookup
if 'id' in json_list:
for _id, obj in json_list['id'].items():
if obj:
results.append(cls.parse(api, obj))
else:
results.append(cls.parse(api, {'id': int(_id)}))
return results
# Handle premium search
if 'results' in json_list:
json_list = json_list['results']

for obj in json_list:
if obj:
4 changes: 3 additions & 1 deletion tweepy/parsers.py
@@ -50,7 +50,9 @@ def parse(self, method, payload, return_cursors=False):
raise TweepError('Failed to parse JSON payload: %s' % e)

if return_cursors and isinstance(json, dict):
if 'next_cursor' in json:
if 'next' in json:
return json, json['next']
elif 'next_cursor' in json:
if 'previous_cursor' in json:
cursors = json['previous_cursor'], json['next_cursor']
return json, cursors