max_results=500 incompatible with tweet.fields=context_annotations #504

Closed
igorbrigadir opened this issue Jul 2, 2021 · 5 comments

@igorbrigadir
Contributor

igorbrigadir commented Jul 2, 2021

Twitter now restricts max_results to 100 instead of 500 when tweet.fields=context_annotations is requested. A request with max_results=500 fails with:

{
  "errors": [
    {
      "parameters": {
        "tweet.fields": [
          "context_annotations"
        ],
        "max_results": [
          "500"
        ]
      },
      "message": "when requesting `tweet.fields=context_annotations` `max_results` must be less than or equal to `100`"
    }
  ],
  "title": "Invalid Request",
  "detail": "One or more parameters to your request was invalid.",
  "type": "https://api.twitter.com/2/problems/invalid-request"
}
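
A client could recognise this specific rejection by looking at the `parameters` object of each entry in `errors`. The following is a minimal sketch, assuming the response body has the shape shown above. `is_context_annotations_limit_error` is a hypothetical helper written for illustration; it is not part of twarc or of the Twitter API.

```python
def is_context_annotations_limit_error(body: dict) -> bool:
    """Return True if an error entry blames both max_results and
    tweet.fields=context_annotations (hypothetical helper)."""
    for error in body.get("errors", []):
        params = error.get("parameters", {})
        if "max_results" not in params:
            continue
        # Each value may be a single field or a comma-separated list of fields.
        for value in params.get("tweet.fields", []):
            if "context_annotations" in value.split(","):
                return True
    return False


sample_error = {
    "errors": [
        {
            "parameters": {
                "tweet.fields": ["context_annotations"],
                "max_results": ["500"],
            },
            "message": "when requesting `tweet.fields=context_annotations` "
                       "`max_results` must be less than or equal to `100`",
        }
    ],
    "title": "Invalid Request",
    "type": "https://api.twitter.com/2/problems/invalid-request",
}
```

Calling `is_context_annotations_limit_error(sample_error)` on the body above returns `True`, which a caller could use to retry with a smaller page size.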

Currently twarc requests all fields and expansions, so this effectively limits us to only request 100 tweets per call.

The trade-off here is: request 500 tweets without the context annotations, or request 100 with. Requesting 100 will make retrieval slower.
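
The trade-off can be expressed as a small rule for picking the page size. This is only a sketch of the idea, not how twarc implements it: `effective_max_results` and both constants are names made up for this example, and the 100 and 500 limits are the ones reported in this issue.

```python
# Limits as reported in this issue; names are illustrative assumptions.
MAX_RESULTS_WITH_CONTEXT_ANNOTATIONS = 100
MAX_RESULTS_WITHOUT_CONTEXT_ANNOTATIONS = 500


def effective_max_results(tweet_fields: str, requested: int = 500) -> int:
    """Clamp the requested page size to what the API accepts for these fields."""
    fields = {f.strip() for f in tweet_fields.split(",") if f.strip()}
    if "context_annotations" in fields:
        cap = MAX_RESULTS_WITH_CONTEXT_ANNOTATIONS
    else:
        cap = MAX_RESULTS_WITHOUT_CONTEXT_ANNOTATIONS
    return min(requested, cap)
```

With this rule, `effective_max_results("id,text,context_annotations")` yields 100, while `effective_max_results("id,text")` yields 500.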

Related to #493

As a stopgap, I committed a change to make the limit 100.

igorbrigadir added a commit that referenced this issue Jul 3, 2021
@edsu
Member

edsu commented Jul 3, 2021

This will be an interesting one for Twitter to explain in their documentation. We should have some fun with this when we implement "support" for it in twarc.

@igorbrigadir
Contributor Author

igorbrigadir commented Jul 3, 2021

Ok it's a permanent change https://twittercommunity.com/t/max-results-and-context-annotations/156427/2?u=igorbrigadir

So the decision for us is what to do by default:

  1. Keep all fields and the 100 limit (slower searches but more complete)
  2. Remove context_annotations and get 500 results by default
  3. Make context_annotations optional, which would also address #493 (Add ability to manually specify expansions and fields)
  4. Something else?

Option 2 doesn't appeal to me. Right now I'm still in favour of 1 and then eventually 3. That way, if you don't want context_annotations, you can go faster by setting --max-results manually, like this:

twarc2 search --archive --tweet-fields "id,text,etc.." --max-results 500 "query" output.jsonl

@edsu
Member

edsu commented Jul 3, 2021

Yeah, I agree: 1 now and 3 maybe later.

The silver lining here is that this really only adversely affects search/all for Academic Research product track accounts, which are constrained by the monthly quota. You could collect ~130,000,000 tweets per month at the rate of 100 every 2 seconds, which is well over the 50,000,000 limit. So being able to collect 5 times as many tweets in each request won't help too much. If they start to make higher tiers of research access available, then 3 might become more important?
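
The throughput estimate can be checked with a few lines of arithmetic. This sketch assumes a 30-day month and one request every 2 seconds, as in the comment above; the quota value is the figure quoted in that comment, and the function name is made up for this example.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds in a 30-day month
MONTHLY_QUOTA = 50_000_000             # quota figure quoted in the comment above


def tweets_per_month(page_size: int, seconds_per_request: float = 2.0) -> int:
    """Upper bound on tweets collected in a month at a fixed request rate."""
    requests_per_month = SECONDS_PER_MONTH / seconds_per_request
    return int(requests_per_month * page_size)


at_100 = tweets_per_month(100)  # 129,600,000, i.e. roughly 130 million
at_500 = tweets_per_month(500)  # 648,000,000
```

Both values exceed the quoted quota, so under these assumptions the quota is reached before the page size becomes the bottleneck.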

@igorbrigadir
Contributor Author

It's fixed in main already, so a new release will do it. And we can do #493 later

@edsu
Member

edsu commented Jul 5, 2021

OK, let's close and revisit it if/when it needs adjustment.

edsu closed this as completed Jul 5, 2021