Add ability to manually specify expansions and fields #493
Comments
I thought this might be coming :-) I'd really like to shield the user as much as possible from this complexity. I also don't want us to bend over backwards because of Twitter's Fail Whale. That being said, it would be nice for twarc to be able to work...
Yeah, I'm kinda leaning towards making these available but not encouraged, and maybe not inferring or helping with the settings after all, just reading them directly.
If Twitter doesn't fix their API then we won't have much choice.
I'm a hard disagree on this one right now. We're really only a few months into early access, and I think it's a little premature to be working around Twitter's API instability, especially since that has impacts on downstream plugins. Also, I live 15,000 km from most of the internet, so 503s aren't exactly rare ;)
Yeah, it's a good bit of work implementing it alright. I would still do it just to support the API "as is", but maybe as a lower priority. For these 503s, the changes here might actually be a better solution: https://github.com/DocNow/twarc/compare/503_search_all_workaround
How are we feeling about this now? Based on a bit more hands-on work with the API, the only thing I'd really want to turn off is the context annotations, so that I can collect data faster. Maybe instead of full customisability, an off-by-default --exclude-context-annotations flag to support the 500 results per page would cover most of this?
I think since the expansions code can deal with missing expansions easily, it shouldn't be a problem. I've been meaning to put these in for the same reason, trading off context annotations for bigger pages, but without anything complicated or clever: it'll assume you know what you're specifying. I'll make the PR later!
Yes, being able to turn off context annotations came up in some work I was doing recently. It would be nice to be able to selectively turn them off for the 5X turbo boost.
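The trade-off raised in these comments could be sketched roughly as follows. This is only an illustration, not twarc's code: the function name `search_page_settings` and the short default field list are made up here, and the page sizes of 100 (with context annotations) and 500 (without) are taken from the figures mentioned in the thread rather than checked against the current API.

```python
# Illustrative subset only; twarc's real default field list is much longer.
DEFAULT_TWEET_FIELDS = ["id", "text", "created_at", "context_annotations"]


def search_page_settings(exclude_context_annotations=False,
                         tweet_fields=DEFAULT_TWEET_FIELDS):
    """Return query parameters for one page of full-archive search results.

    Assumption (from the thread): pages are capped at 100 results when
    context_annotations is requested and can go up to 500 otherwise.
    """
    fields = [
        name for name in tweet_fields
        if not (exclude_context_annotations and name == "context_annotations")
    ]
    max_results = 100 if "context_annotations" in fields else 500
    return {"tweet.fields": ",".join(fields), "max_results": max_results}
```

With the flag off the settings stay as they are today. With it on, the field is dropped and the page size grows, which is where the 5X comes from.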
Normally, twarc aims to grab everything. But this seems to cause problems in the API when the requests get too big, e.g. #449
It would be good to have a manual override for the expansions and fields. To align with the API (https://developer.twitter.com/en/docs/twitter-api/data-dictionary/using-fields-and-expansions), the extra command line parameters should include:
--expansions "author_id,geo.place_id"
where the valid ones are: https://github.com/DocNow/twarc/blob/main/twarc/expansions.py#L16
Same for:
--user-fields
--tweet-fields
--media-fields
--poll-fields
--place-fields
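As a rough sketch of how these options might be read, the snippet below uses the standard library's `argparse` purely to stay self-contained. It is not how twarc's CLI is implemented, and the helper names (`csv_list`, `build_parser`, `to_query_params`) are hypothetical. It assumes each option takes a comma-separated list and maps onto the API's dotted parameter names such as `user.fields`.

```python
import argparse

# Option names as proposed in this issue.
FIELD_OPTIONS = [
    "expansions",
    "user-fields",
    "tweet-fields",
    "media-fields",
    "poll-fields",
    "place-fields",
]


def csv_list(value):
    """Split a comma-separated option value into trimmed, non-empty names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser():
    """Build a parser with one optional comma-separated option per name."""
    parser = argparse.ArgumentParser(prog="example")
    for name in FIELD_OPTIONS:
        parser.add_argument(f"--{name}", type=csv_list, default=None)
    return parser


def to_query_params(args):
    """Turn parsed options into API query parameters, skipping unset ones."""
    params = {}
    for name in FIELD_OPTIONS:
        values = getattr(args, name.replace("-", "_"))
        if values:
            # "user-fields" on the command line becomes "user.fields" in the API.
            params[name.replace("-", ".")] = ",".join(values)
    return params
```

For example, `to_query_params(build_parser().parse_args(["--expansions", "author_id,geo.place_id"]))` yields a dictionary with a single `expansions` entry, and anything left unspecified is simply omitted so the caller can fall back to its defaults.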
Ideally it should also complain with an error, or automatically set things for you, if you specify --poll-fields but fail to specify attachments.poll_ids in --expansions. It would be nice to parse these and validate them for the user, but if that's too complicated and cumbersome, just a check and a warning should be enough.
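A minimal sketch of the check described above, covering all three behaviours (warn, raise an error, or add the expansion automatically). The function name `check_expansions` and the `on_missing` parameter are hypothetical. Only the poll case comes from this issue; the media and place pairings are an assumption based on a reading of the Twitter API v2 documentation.

```python
import warnings

# Which expansion each fields parameter depends on.
REQUIRED_EXPANSION = {
    "poll.fields": "attachments.poll_ids",      # the case named in this issue
    "media.fields": "attachments.media_keys",   # assumption, not from the issue
    "place.fields": "geo.place_id",             # assumption, not from the issue
}


def check_expansions(expansions, fields, on_missing="warn"):
    """Check that every requested fields parameter has its expansion.

    expansions: list of expansion names the user asked for.
    fields: dict mapping a parameter such as "poll.fields" to a list of names.
    on_missing: "warn" (default), "error", or "add".
    Returns the (possibly extended) list of expansions.
    """
    result = list(expansions)
    for param, needed in REQUIRED_EXPANSION.items():
        if not fields.get(param) or needed in result:
            continue
        message = f"{param} was given but {needed} is missing from expansions"
        if on_missing == "error":
            raise ValueError(message)
        if on_missing == "add":
            result.append(needed)
        else:
            warnings.warn(message)
    return result
```

Defaulting to a warning matches the "just a check and a warning should be enough" fallback, while `on_missing="add"` covers the option of setting things automatically.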