Twitter Tools

This repo holds a collection of tools for the TREC Microblog tracks, which officially ended in 2015. The track mailing list can be found at trec-microblog@googlegroups.com.

Archival Documents

API Access

The Microblog tracks in 2013 and 2014 used the "evaluation as a service" (EaaS) model, where teams interact with the official corpus via a common API. Although the evaluation has ended, the API is still available for researcher use.

To request access to the API, follow these steps:

1. Fill out the API usage agreement.
2. Email the usage agreement to microblog-request@nist.gov.
3. After NIST receives your request, it will send you an access token.
The code for accessing the API can be found in this repository. The endpoint of the API itself (i.e., hostname and port) will be provided by NIST.

Getting Started

The main Maven artifact for the TREC Microblog API is twitter-tools-core. The latest releases of Maven artifacts are available at Maven Central.

You can clone the repo with the following command:

$ git clone https://github.com/lintool/twitter-tools.git

Once you've cloned the repository, change directory into twitter-tools-core and build the package with Maven:

$ cd twitter-tools-core
$ mvn clean package appassembler:assemble

For more information, see the project wiki.

Replicating TREC Baselines

One advantage of the TREC Microblog API is that it is possible to deploy a community baseline whose results are replicable by anyone. The raw results are simply the output of the API unmodified. The baseline results are the raw results that have been post-processed to remove retweets and break score ties by reverse chronological order (earliest first).

To run the raw results for TREC 2011, issue the following command:

sh target/appassembler/bin/RunQueriesThrift \
 -host [host] -port [port] -group [group] -token [token] \
 -queries ../data/topics.microblog2011.txt > run.microblog2011.raw.txt

And to run the baseline results for TREC 2011, issue the following command:

sh target/appassembler/bin/RunQueriesBaselineThrift \
 -host [host] -port [port] -group [group] -token [token] \
 -queries ../data/topics.microblog2011.txt > run.microblog2011.baseline.txt

Note that trec_eval is included in twitter-tools/etc (it just needs to be compiled), and the qrels are stored in twitter-tools/data (they just need to be uncompressed), so you can evaluate a run as follows:

../etc/trec_eval.9.0/trec_eval ../data/qrels.microblog2011.txt run.microblog2011.raw.txt
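
The result tables in this section report only MAP and P30. As a minimal sketch, trec_eval's -m option can limit the output to those two measures. The eval_run function name and the TREC_EVAL variable are illustrative and not part of this repository; the default path is the one used in the command above.

```shell
# Path to the compiled trec_eval binary; override TREC_EVAL if yours lives elsewhere.
TREC_EVAL=${TREC_EVAL:-../etc/trec_eval.9.0/trec_eval}

# eval_run QRELS RUN: print only MAP and P30 for one run file.
eval_run() {
  "$TREC_EVAL" -m map -m P.30 "$1" "$2"
}
```

For example, `eval_run ../data/qrels.microblog2011.txt run.microblog2011.raw.txt` should print the map and P_30 lines for the raw 2011 run.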

Similar commands will allow you to replicate runs for TREC 2012 and TREC 2013. With trec_eval, you should get exactly the following results:

MAP	raw	baseline
TREC 2011	0.3050	0.3576
TREC 2012	0.1751	0.2091
TREC 2013	0.2044	0.2532
TREC 2014	0.3090	0.3924

P30	raw	baseline
TREC 2011	0.3483	0.4000
TREC 2012	0.2831	0.3311
TREC 2013	0.3761	0.4450
TREC 2014	0.5145	0.6182
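
The runs for the other years can be scripted along the same lines as the 2011 commands. The following is a minimal sketch and not part of the repository: run_year, BIN_DIR, and the MB_* variables are illustrative names, the MB_* values stand for the host, port, group, and token you receive from NIST, and the topic file names for years other than 2011 are assumed to follow the 2011 pattern.

```shell
# Directory holding the launcher scripts produced by the Maven build.
BIN_DIR=${BIN_DIR:-target/appassembler/bin}

# run_year YEAR: write the raw and the baseline run for one year.
# Expects MB_HOST, MB_PORT, MB_GROUP, and MB_TOKEN to be set beforehand.
run_year() {
  year=$1
  for kind in raw baseline; do
    case $kind in
      raw)      tool=RunQueriesThrift ;;
      baseline) tool=RunQueriesBaselineThrift ;;
    esac
    # Assumed file naming: topics.microblog<YEAR>.txt in ../data.
    sh "$BIN_DIR/$tool" \
      -host "$MB_HOST" -port "$MB_PORT" -group "$MB_GROUP" -token "$MB_TOKEN" \
      -queries "../data/topics.microblog$year.txt" \
      > "run.microblog$year.$kind.txt" || return 1
  done
}
```

After setting the four MB_* variables, `for y in 2011 2012 2013; do run_year "$y"; done` would write a raw and a baseline run file per year, provided the topic files exist under the assumed names.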

License

Licensed under the Apache License, Version 2.0.

Acknowledgments

This work is supported in part by the National Science Foundation under award IIS-1218043. Any opinions, findings, and conclusions or recommendations expressed are those of the researchers and do not necessarily reflect the views of the National Science Foundation.

