replace html tags with single space
WardCunningham committed Feb 15, 2024
1 parent 4a027a8 commit 2b7124b
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion scrape.rb
@@ -63,7 +63,7 @@ def update site, pageinfo
 story.each do |item|
   next unless text = item['text']
   text.gsub! /[a-zA-Z0-9\+\/]{50,}/,''
-  text.gsub! /<(.|\n)*?>/, '' if item['type'] == 'html'
+  text.gsub! /<(.|\n)*?>/, ' ' if item['type'] == 'html'
   text.gsub! /\[((http|https|ftp):.*?) (.*?)\]/, '\3'
   text.scan /[A-Za-z]+/ do |word|
     word = word.downcase
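
The commit does not say why it makes this change, but one visible effect follows from the hunk itself: the text is scanned for words right after the tags are stripped, so deleting a tag outright can fuse the words on either side of it. The sketch below illustrates that under stated assumptions. `words_after_strip`, `TAG`, and the sample string are hypothetical and not part of scrape.rb; only the two regular expressions are taken from the diff.

```ruby
# Hypothetical illustration, not code from scrape.rb.
# TAG is the tag-stripping pattern that appears in the diff.
TAG = /<(.|\n)*?>/

# Strip tags using the given replacement, then collect lowercase words
# with the same /[A-Za-z]+/ scan the hunk uses.
def words_after_strip(html, replacement)
  text = html.gsub(TAG, replacement)
  text.scan(/[A-Za-z]+/).map(&:downcase)
end

sample = "<p>Hello</p><p>World</p>"

p words_after_strip(sample, '')   # => ["helloworld"]
p words_after_strip(sample, ' ')  # => ["hello", "world"]
```

With the empty replacement, adjacent elements collapse into a single token; with a single space, each element's text remains a separate word.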
