Releases: DocNow/twarc
v2.1.5
v2.1.4
The log now includes a message about where the config file was loaded from. In addition, --verbose can be used to log more information, such as the keys that are being used to access the API. Most of the time this is probably not a good idea, since it makes your keys available in a log, but it can be useful when debugging error responses from the API.
v2.1.3
v2.1.2
This release adds two new twarc2 subcommands:
conversations
conversations reads a file of tweets, looks for any conversations they are a part of, and downloads the full conversation thread for each one.
twarc2 conversations tweets.jsonl > conversations.jsonl
You can also give it a file of conversation_ids to download instead:
twarc2 conversations ids.txt > conversations.jsonl
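One way to produce such an ids.txt from tweets you have already collected is sketched below. It is only an illustration, not part of twarc: the helper names conversation_ids and write_ids are hypothetical, and the sketch assumes each input line is either a single tweet object or a response page whose tweets sit in a "data" list, with each tweet carrying a conversation_id field.

```python
import json


def conversation_ids(lines):
    """Return unique conversation_id values in first-seen order.

    Hypothetical helper: assumes each line is either one tweet object
    or a response page with its tweets under a "data" key.
    """
    seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        obj = json.loads(line)
        # Use the "data" stanza when present, otherwise treat the
        # object itself as a tweet.
        data = obj.get("data", obj)
        tweets = data if isinstance(data, list) else [data]
        for tweet in tweets:
            cid = tweet.get("conversation_id")
            if cid is not None:
                seen.setdefault(cid, None)
    return list(seen)


def write_ids(in_path, out_path):
    """Write one conversation id per line (paths are up to the caller)."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for cid in conversation_ids(infile):
            outfile.write(f"{cid}\n")
```

Calling write_ids("tweets.jsonl", "ids.txt") would then give a file that can be passed to the command above.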
timelines
Similarly, timelines reads a file of tweets and downloads the user timeline for every user who authored them.
twarc2 timelines tweets.jsonl > timelines.jsonl
You can also give it a file of user ids or usernames:
twarc2 timelines users.txt > timelines.jsonl
This functionality was first developed in the twarc-timelines plugin, which has been renamed to twarc-timeline-archive because it does some extra things, like writing timelines to separate directories and running on a schedule without redownloading previously downloaded data.
v2.1.1
v2.1.0
v2.1.0 removes the --flatten option from many commands in the hopes of encouraging users to mostly use the original data as retrieved from the Twitter API. The subcommand twarc2 flatten remains mostly for use in data processing pipelines that expect line oriented json where each object is a tweet:
twarc2 search blacklivesmatter | twarc2 flatten | jq .text
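For pipelines that end in Python rather than jq, the last stage can be approximated with the standard library. This is a minimal sketch, assuming the input is the line oriented json described above (one tweet object per line); the function names tweet_texts and main are hypothetical and not part of twarc.

```python
import json
import sys


def tweet_texts(lines):
    """Yield the "text" field of each tweet in line oriented json.

    Assumes one tweet object per line, as emitted by a flatten step.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line).get("text")


def main(stream=sys.stdin):
    """Print each text as a JSON string, roughly like `jq .text`."""
    for text in tweet_texts(stream):
        print(json.dumps(text))
```

Saved in a script that calls main(), this could take the place of jq at the end of the pipeline shown above.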
The twarc.expansions.flatten() function has been updated to always return a list of tweets, and twarc.expansions.ensure_flattened() can be used to make sure data has been flattened already when processing tweet data. Since it is designed for use in twarc plugins and other pieces of software that need to work with tweets, it is also available for import from twarc:
from twarc import ensure_flattened
In addition, this release includes twarc conversation for retrieving tweets from a particular conversation thread.
v2.0.13
This bugfix release irons out some wrinkles that have been discovered during usage:
- Fix for handling search responses that are missing the data stanza but contain a next token for another page of results. #464
- Stream diagnostics now go to stderr to not interfere with JSON being written to stdout. #456
- twarc can be run from the command line as a module now, e.g. python -m twarc2 search barackobama. #455
- twarc2 command help text indicates times are UTC
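The first two fixes can be pictured with a small pagination loop. The sketch below is not twarc's actual code: iter_tweets and the fetch_page callable are hypothetical, and it assumes each response is a dict with an optional "data" list and a "meta" object that may carry a "next_token". A page without a data stanza is skipped rather than treated as the end of the results, and the diagnostic goes to stderr so that stdout stays clean for JSON.

```python
import sys


def iter_tweets(fetch_page):
    """Yield tweets across all pages of a paginated search.

    fetch_page is a hypothetical callable: given a pagination token
    (None for the first page), it returns one response dict.
    """
    token = None
    while True:
        page = fetch_page(token)
        tweets = page.get("data")
        if tweets is None:
            # A page can lack "data" and still point at another page,
            # so report it on stderr and keep paginating.
            print("response page had no data stanza", file=sys.stderr)
        else:
            yield from tweets
        token = page.get("meta", {}).get("next_token")
        if token is None:
            break  # no further pages
```

Stopping only when next_token is absent, rather than when data is absent, is the design choice that matters here.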