Twitter track query #4

Open
aychedee opened this issue Apr 3, 2015 · 6 comments

Comments

@aychedee
Collaborator

aychedee commented Apr 3, 2015

Using the documentation for track (https://dev.twitter.com/streaming/overview/request-parameters#track) we should be able to construct a better Twitter query. The current one is just "trans", which is pretty useless: it doesn't actually capture the tweets we want.

@JessFairbairn
Collaborator

Is there a size limit? We should also include terms from different cultures (Kathoey, hijra etc.). We'd probably have to make sure it was in English if we're doing sentiment analysis ourselves?

@oluoluoxenfree
Collaborator

We could definitely crowdsource or find a master list of different terms somewhere, I'm sure.

Different cultures is genius @JessFairbairn, terrible I hadn't thought of that at all!

@RubyOffThe
Owner

Yeah I missed that too...

@aychedee
Collaborator Author

aychedee commented Apr 7, 2015

Cool. @JessFairbairn, each 'word/phrase' is limited to 60 characters. It's not clear from the documentation how many you're allowed to use, but I think it's quite a lot. So maybe setting an artificial limit of 100 would be a good start?

I've created a constant called TRACK_WORDS in hack/app.py, which is joined into a comma-separated list and then sent to Twitter. So that would be a good place to add words or phrases. Don't worry about breaking anything.
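For anyone adding terms, here is a rough sketch of how a list like TRACK_WORDS could be checked and joined before being sent. It is not the code in hack/app.py: the function name `build_track_param`, the placeholder terms, and the two limit constants are assumptions made for illustration. The 60 limit and the cap of 100 come from the comment above; measuring the limit in UTF-8 bytes is the stricter reading and is also an assumption.

```python
# Hypothetical sketch only; the real TRACK_WORDS lives in hack/app.py.
MAX_PHRASE_LENGTH = 60  # per-phrase limit mentioned above (measured in UTF-8 bytes here)
MAX_TRACK_TERMS = 100   # the artificial cap suggested above

TRACK_WORDS = ["transgender", "trans woman", "trans man"]  # placeholder terms


def build_track_param(words, max_terms=MAX_TRACK_TERMS, max_len=MAX_PHRASE_LENGTH):
    """Return a comma-separated string suitable for the `track` parameter."""
    cleaned = []
    for word in words:
        phrase = word.strip().lower()
        if not phrase or phrase in cleaned:
            continue  # skip blank entries and duplicates
        if "," in phrase:
            # A comma inside a phrase would be read as a separator between terms.
            raise ValueError("phrase contains a comma: %r" % phrase)
        if len(phrase.encode("utf-8")) > max_len:
            raise ValueError("phrase too long: %r" % phrase)
        cleaned.append(phrase)
    if len(cleaned) > max_terms:
        raise ValueError("too many track terms: %d" % len(cleaned))
    return ",".join(cleaned)


if __name__ == "__main__":
    print(build_track_param(TRACK_WORDS))
```

In the track documentation, commas act as a logical OR between phrases and spaces inside a phrase act as a logical AND, so "trans woman" matches tweets containing both words in any order. Failing loudly on an over-long phrase, rather than silently dropping it, makes a bad entry easy to spot when someone adds terms.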

@JessFairbairn
Collaborator

Cool beans, I've added some words. Ultimately Olu's crowdsourcing idea sounds good.

I assume we're going to have old-fashioned and "politically incorrect" words in, but what about out-and-out slurs? I'd imagine we should probably include everything, as people can use them out of ignorance while still being positive, and different people find different things offensive.

@aychedee
Collaborator Author

It's up for debate, isn't it? My thought is that if we want it to be accurate then it's going to need the slurs, as long as they're not overly broad: things that are specifically transgender related rather than general genderqueer. Ultimately the choice of track terms is going to heavily influence the outcome... But as long as we don't change the track terms, the analysis might be interesting over time.


4 participants