Releases: DocNow/twarc
v1.12.1
v1.12.0
This release adds initial support for the Gnip Full Archive premium endpoint if you are fortunate enough to have access to that. Many thanks to @epicfaace for adding the functionality and testing it.
v1.10.1
v1.10.1 includes an option to manually enter the user access token and secret during twarc configure. This is in response to several reports from users who are not given the PIN when authenticating with their app at Twitter. The user access token and secret can be generated manually at https://developer.twitter.com/en/apps
v1.10.0
v1.10.0 changes the behavior of the command line client so that it can also fetch retweets from a file of tweet_ids.
twarc retweets ids.txt > retweets.jsonl
This is in addition to the previous behavior where it accepted a tweet id to fetch the retweets for:
twarc retweets 20 > retweets.jsonl
You can also comma separate the tweet ids on the command line if you want:
twarc retweets 20,21 > retweets.jsonl
The internal interface to Twarc.retweets has changed: it now accepts an iterator of tweet ids.
from twarc import Twarc
twitter = Twarc()
for tweet in twitter.retweets([20, 21]):
    print(tweet['id_str'])
    # etc
If you have been using the retweets method in your code you will want to adjust it to pass in a list of ids rather than the bare ids.
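To illustrate the three input forms described above (a file of ids, a single id, or comma separated ids), here is a minimal sketch of turning a command line argument into an iterator of tweet ids. The helper name iter_tweet_ids is hypothetical and is not part of twarc; it only shows one way to produce the kind of iterator that Twarc.retweets now expects.

```python
import os
from typing import Iterator


def iter_tweet_ids(arg: str) -> Iterator[str]:
    """Yield tweet ids from a file path or a comma separated string.

    Hypothetical helper for illustration only; not twarc's own code.
    """
    if os.path.isfile(arg):
        # One tweet id per line; blank lines are skipped.
        with open(arg) as fh:
            for line in fh:
                line = line.strip()
                if line:
                    yield line
    else:
        # A bare id ("20") or comma separated ids ("20,21").
        for part in arg.split(","):
            part = part.strip()
            if part:
                yield part
```

Assuming configured credentials, the result could then be passed straight to the method, for example twitter.retweets(iter_tweet_ids("ids.txt")). Code that previously passed a single bare id can wrap it in a list instead.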
v1.9.1
v1.9.0
Premium Search API
v1.9.0 adds new functionality that allows you to use the Twitter Premium Search API endpoints. To use the Premium Search you will need to visit the Twitter Developer Dashboard and set up an environment that is attached to one of your apps. Then you should be able to use the label for your environment in your twarc search command.
For example, to use the docnowdev environment to search the 30 day endpoint you can run:
twarc search blacklivesmatter --30day docnowdev > tweets.jsonl
or to search the full archive endpoint:
twarc search blacklivesmatter --fullarchive docnowdev > tweets.jsonl
Warning: Depending on your query this could quickly use up your budget! So you will likely also want to use --from_date and/or --to_date to limit the time range that you are searching. You can also use --limit to limit the total number of tweets that are retrieved.
twarc search blacklivesmatter --30day docnowdev --to_date 2013-08-01 > tweets.jsonl
If your app is only authorized for the sandbox you must use the --sandbox parameter, which lowers the maximum number of tweets you can retrieve in a request to 100.
This functionality is also made available through the new Twarc.premium_search method.
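Since an unbounded premium query can use up a budget quickly, it may help to assemble the command in a script that refuses to run without a date range or limit. The following is a rough sketch that only uses the flags mentioned above; the function build_premium_search and its guard are assumptions made for illustration and are not part of twarc.

```python
from typing import List, Optional


def build_premium_search(
    query: str,
    environment: str,
    product: str = "30day",
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    limit: Optional[int] = None,
    sandbox: bool = False,
) -> List[str]:
    """Build an argument list for `twarc search` against a premium endpoint.

    Hypothetical helper: it only assembles arguments and does not call Twitter.
    """
    if product not in ("30day", "fullarchive"):
        raise ValueError("product must be '30day' or 'fullarchive'")
    # Design choice of this sketch: insist on at least one bound.
    if from_date is None and to_date is None and limit is None:
        raise ValueError("set from_date, to_date or limit to protect your budget")

    cmd = ["twarc", "search", query, f"--{product}", environment]
    if from_date is not None:
        cmd += ["--from_date", from_date]
    if to_date is not None:
        cmd += ["--to_date", to_date]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if sandbox:
        cmd.append("--sandbox")
    return cmd
```

The returned list can be handed to subprocess.run with stdout redirected to a file such as tweets.jsonl.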
Twitter Labs
v1.9.0 also includes some initial support for the Twitter Labs endpoints. At the moment only the sample stream is supported, but we anticipate adding more as they are requested.
twarc --app_auth labs_v1_sample > sample.jsonl
v1.8.5
v1.8.4
v1.8.3
v1.8.2
The twarc command catches SIGTERM so that users can ctrl-c to stop the process. Previously it did so quietly, so as not to clutter up the console with a stack trace. But being quiet can be confusing when users put twarc into the background and then log out, which can cause SIGTERM to be sent to the process.
To aid in diagnosing this, twarc will now log when it has received a SIGTERM before it stops. Thanks to @mielverkerken for identifying the issue and helping diagnose it.
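The general pattern of logging a signal before exiting can be sketched with the Python standard library as follows. This is not twarc's actual handler; the logger name and function names are assumptions chosen for the example.

```python
import logging
import signal
import sys

# Hypothetical logger name, used only in this sketch.
log = logging.getLogger("sigterm_demo")


def handle_sigterm(signum, frame):
    """Record that the signal arrived, then exit without a stack trace."""
    log.warning("received %s, stopping", signal.Signals(signum).name)
    sys.exit(0)


def install_handler():
    """Register the handler and return the previous one so it can be restored."""
    return signal.signal(signal.SIGTERM, handle_sigterm)
```

With the handler installed, a process that is terminated after being put into the background leaves a line in its log explaining why it stopped.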