Short Description

This is a Rails 2.3.7 application that helps you collect Twitter data.

How to Install

Check out the repository with:


  git clone git@github.com:plotti/twitterlyzer.git

Install RVM

Install RVM if you are not already using it (http://beginrescueend.com/rvm/install/):


 $ bash -s stable < <(curl -s https://raw.github.com/wayneeseguin/rvm/master/binscripts/rvm-installer)
 $ source ~/.bash_profile
 $ rvm requirements

The RVM configuration file (.rvmrc) is already in the repo.
Install Ruby 1.8.7:
```
  $ rvm install 1.8.7
```

Create your gemset


 rvm gemset create 'socializer'

If you run into permission problems, try creating the gem directory:


sudo mkdir -p ~/.gem/specs
sudo chmod 777 ~/.gem/specs

Setup Files

To get it running you will need to create:
- twitter.yml, which contains your Twitter credentials
- bitly.yml, which contains your bitly credentials
- database.yml, which contains the database credentials
- See twitter.example.yml or bitly.example.yml for details
- If your Twitter account is NOT whitelisted, make sure you do not exhaust your API rate limit by running too many workers
- Create the directory "/data" under your Rails root to store the lists
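
The setup steps above can be checked with a few lines of Ruby. This is only a sketch: the helper name `check_setup` is made up for illustration, and it assumes the three YAML files live in the `config/` directory under the Rails root, which is the usual Rails location but is not stated in this README.

```ruby
require "fileutils"

# File names taken from the list above; the config/ location is an assumption.
REQUIRED_CONFIGS = %w[twitter.yml bitly.yml database.yml].freeze

# Hypothetical helper, not part of the application.
# Creates <rails_root>/data if needed and returns the names of the
# config files that are still missing under <rails_root>/config.
def check_setup(rails_root)
  FileUtils.mkdir_p(File.join(rails_root, "data"))
  REQUIRED_CONFIGS.reject do |name|
    File.exist?(File.join(rails_root, "config", name))
  end
end
```

Calling `check_setup(Dir.pwd)` from the Rails root returns an empty array once all three files exist.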

Install Gems Dependencies

The app uses Rails 2.3.7, so all gems are chosen to match that framework version.
```
gem install rails -v =2.3.7
gem update --system 1.5.3
```

To install the remaining gems, first install rails_gem_install and then run it:


gem install rails_gem_install
RAILS_ENV=development rails_gem_install

Test

Run the specs to check that the application works correctly. You will need rspec/rspec-rails and Factory Girl.
Start Solr in test mode first, then run the specs:


RAILS_ENV=test rake sunspot:solr:start
spec spec

Get Delayed Jobs working

Create the necessary files with: script/generate delayed_job
To start collecting persons or feeds, you need to start a couple of Delayed Job workers. To do so, use the script:
```
./script/delayed_job -n 4 start
```
- The benchmarks I measured for collecting tweets depend on the number of workers (n):
  - n 4: 40,000 tweets in 10 min
  - n 8: 90,000 tweets in 10 min
  - n 16: 180,000 tweets in 10 min (70% CPU usage)
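
The benchmark figures above imply a per-worker throughput that can be worked out directly. The short Ruby sketch below does that arithmetic; the constant and method names are made up for illustration and are not part of the application.

```ruby
# Benchmark figures copied from the list above: workers => tweets per 10 minutes.
BENCHMARKS = { 4 => 40_000, 8 => 90_000, 16 => 180_000 }.freeze
WINDOW_MINUTES = 10

# Hypothetical helper: tweets collected per worker per minute for each run.
def per_worker_rate(benchmarks = BENCHMARKS, minutes = WINDOW_MINUTES)
  benchmarks.map do |workers, tweets|
    [workers, tweets.to_f / minutes / workers]
  end.to_h
end

per_worker_rate.each do |workers, rate|
  puts "n=#{workers}: #{rate.round} tweets per worker per minute"
end
```

With these figures the rate is 1000 tweets per worker per minute for n=4 and 1125 for n=8 and n=16, so throughput scaled roughly linearly with the number of workers in these runs.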

Start Solr and Webserver

All tweets are indexed in the background by a Lucene-based Solr server, using the sunspot and solr gems.
Make sure to start Solr before starting the web server.
```
rake sunspot:solr:start 
./script/server
```

Dumping the DB and restoring it

To exchange results, the app includes a rake task that dumps the existing DB into /dump.
It uses the dump plugin for Rails 2.3 (https://github.com/toy/dump).
The dump directory contains a small example DB with 57 persons in one project and ~100K tweets, including retweets.

You can use it to experiment with the data.


rake dump
rake dump:restore # to restore a db

FEATURES

Collection runs in the background via Delayed Job.
The Twitter API is wrapped using the grackle and twitter gems.

The application provides the following features:

Projects
Persons are organized in projects; each project contains a set of people.

Persons
- collect one person
- collect multiple persons based on a CSV import
- collect the ego network of a given person
- show all people
- show statistics of the people collected (friend and follower distributions, origin, etc.)

Connections between persons
Connections between persons are stored not in the DB but on disk in a PStore.
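
For readers unfamiliar with PStore, the sketch below shows how connections could be written to and read from such a file using Ruby's standard library. The layout (one key per username, holding an array of the usernames that person follows) and the method names are assumptions for illustration; the application's actual storage format is not documented here.

```ruby
require "pstore"

# Hypothetical layout: one PStore key per username, whose value is an
# array of the usernames that person follows. Not the app's real schema.
def store_connections(path, username, friends)
  store = PStore.new(path)
  store.transaction { store[username] = friends }
end

# Reads the stored connections in a read-only transaction.
# Returns an empty array when nothing is stored for the username.
def load_connections(path, username)
  store = PStore.new(path)
  store.transaction(true) { store[username] } || []
end
```

For example, after `store_connections("connections.pstore", "alice", ["bob"])`, a later call to `load_connections("connections.pstore", "alice")` reads the same array back from disk.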

Tweets
- collect the tweets of a person
- collect the tweets of all persons
- collect tweets based on a CSV list
- collect all retweets of all collected tweets
- export all tweets into a CSV
- show statistics on the tweets (links used, keywords, timeline)

Networks
- export the friendship network of the collected persons in a project in the formats:
  - UCINET
  - Gephi
- export the retweet networks of persons
- export the @ networks between persons
- export the person stats
- export the Twitter links of persons

Tasks
It has some onboard scrapers under tasks that scrape the following websites:
- Murack.com
- Google
- Twellow
- Wefollow

It can also compute sentiment for German tweets.
