Don't tokenize empty tweets
Jacob Harris committed Dec 29, 2013
1 parent 3027681 commit 31aa842
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions ebook.rb
@@ -82,6 +82,7 @@ def filtered_tweets(tweets)
   tokenizer = Punkt::SentenceTokenizer.new(source_tweets.join(" ")) # init with corpus of all sentences

   source_tweets.each do |twt|
+    next if twt.nil? || twt == ''
     sentences = tokenizer.sentences_from_text(twt, :output => :sentences_text)

     # sentences = text.split(/[.:;?!]/)
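The guard added in this commit can be tried in isolation with the minimal sketch below. It assumes a hypothetical `split_sentences` helper standing in for `tokenizer.sentences_from_text`, so it does not use the real Punkt API, and the `sentences_from_tweets` wrapper is likewise illustrative rather than a method from `ebook.rb`. The point is only that `nil` and empty tweets are skipped before any tokenizing happens.

```ruby
# Hypothetical stand-in for tokenizer.sentences_from_text; it splits after
# sentence-ending punctuation and is NOT the Punkt implementation.
def split_sentences(text)
  text.split(/(?<=[.?!])\s+/)
end

# Illustrative wrapper (not part of ebook.rb): collect the sentences of every
# tweet, skipping nil or empty entries with the same guard the commit adds.
def sentences_from_tweets(source_tweets)
  source_tweets.each_with_object([]) do |twt, sentences|
    next if twt.nil? || twt == ''
    sentences.concat(split_sentences(twt))
  end
end

tweets = ["Hello there. How are you?", nil, "", "One more!"]
p sentences_from_tweets(tweets)
```

Because `next` only ends the current block iteration, the remaining tweets are still processed and the accumulated array is returned unchanged for the skipped entries.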

1 comment on commit 31aa842

@peteyreplies