Skip to content

Commit 0cc9ce9

Browse files
committed
initial commit
0 parents  commit 0cc9ce9

File tree

5 files changed

+328
-0
lines changed

5 files changed

+328
-0
lines changed

.gitignore

+10
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
pom.xml
2+
*.jar
3+
*.class
4+
/lib/
5+
/classes/
6+
/target/
7+
.lein-deps-sum
8+
.lein-failures
9+
.lein-plugins/
10+
.lein-repl-history

LICENSE

+7
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
Copyright 2018 zsau
2+
3+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4+
5+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6+
7+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

README.md

+60
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
# Feedmail
2+
3+
Feedmail fetches RSS/Atom feeds, then converts their entries to emails and uploads them to an IMAP account. This lets you check your feeds with any email client, and you get synchronization across devices for free. You'd normally run feedmail as a cron job.
4+
5+
## Config file
6+
7+
To use feedmail, give it your IMAP credentials and a list of feeds to fetch ("subscriptions") by modifying `config.clj`. The file lives in `~/.config/feedmail/` by default, but you can put it elsewhere with the `-c` command-line argument.
8+
9+
Feedmail is written in Clojure, and its config file uses Clojure syntax. If you don't know Clojure, the basic syntax is simple: `{:key1 "value", :key2 1234}` is a map (commas optional), and `[thing1 thing2]` is a list. This minimal config should hopefully be self-explanatory:
10+
11+
```clojure
12+
{:imap {
13+
:host "imap.gmail.com"
14+
:user "nobody"
15+
:password "hunter2"}
16+
:subscriptions [
17+
{:url "http://example.com/rss.xml"
18+
:folder "Example Feed"}
19+
{:url "https://www.theonion.com/rss"
20+
:folder "Humor/The Onion"}]}
21+
```
22+
23+
Here's a more comprehensive example, which assumes more Clojure knowledge. Keys not included in the above config are optional.
24+
25+
```clojure
26+
{:imap {
27+
:host "imap.gmail.com"
28+
:user "nobody"
29+
:password "hunter2"
30+
:port 993}
31+
;Feedmail uses a cache to remember which items it has seen before. You probably won't need to change these values.
32+
:cache {
33+
:path "/home/nobody/.cache/feedmail"
34+
:size 100}
35+
:email {
36+
;This template should include at least one <a class='link'>, plus elements with classes `title` and `content`.
37+
:template "<h1><a class='link title'></a></h1><p class='content'></p>"}
38+
:subscriptions [
39+
{:url "https://news.ycombinator.com/rss"
40+
:folder "Job Hunting"
41+
;include only items whose title includes the text "Who is hiring?"
42+
:filter #(.contains (:title %) "Who is hiring?")}
43+
{:url "http://nedroid.com/feed/"
44+
:folder "Comics/Nedroid"
45+
;Override the author of each entry to be "Beartato"
46+
:transform #(assoc % :author "Beartato")
47+
;Squelches output from any errors encountered while fetching this feed. Useful for flaky feeds that generate too many cron emails.
48+
:suppress-errors true}
49+
;URLs like `script:/path/to/script` tell feedparser to execute an external command, which should output RSS or Atom content. Useful for HTML-scraping wrappers that generate RSS feeds for sites that don't provide one, or even for generating feeds from sources other than websites.
50+
{:url "script:/home/nobody/my-rss-generator.py"
51+
:folder "Special"}]}
52+
```
53+
54+
## Important note about config.clj
55+
56+
Feedparser *evals* the config file, allowing you to use Clojure code anywhere. This is powerful but dangerous: it's not sandboxed in any way, allowing arbitrary code execution. Because of that and the `script:` URL feature, **don't run feedmail with an untrusted config file!** Since your config will also contain your IMAP credentials, it's best to `chmod 600` your `config.clj`.
57+
58+
## License
59+
60+
Released under the [MIT License](https://opensource.org/licenses/MIT).

project.clj

+18
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
(defproject zsau/feedmail "0.1.0"
2+
:description "RSS/Atom feed client that converts items to emails"
3+
:url "https://github.com/zsau/feedmail"
4+
:license {
5+
:name "MIT License"
6+
:url "https://opensource.org/licenses/MIT"}
7+
:aot [feedmail]
8+
:main feedmail
9+
:dependencies [
10+
[ch.qos.logback/logback-classic "1.2.3"]
11+
[clj-http "3.8.0"]
12+
[clj-time "0.14.2"]
13+
[enlive "1.1.6"]
14+
[org.clojure/clojure "1.9.0"]
15+
[org.clojure/core.memoize "0.7.1"]
16+
[org.clojure/tools.cli "0.3.5"]
17+
[zsau/feedparser-clj "0.6.0"]
18+
[zsau/mail "0.1.0"]])

src/feedmail.clj

+233
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,233 @@
1+
(ns feedmail
2+
(:gen-class)
3+
(:import [java.io PushbackReader ByteArrayInputStream])
4+
(:require
5+
[clj-http.client :as http]
6+
[clj-time.core :as time]
7+
[clj-time.format :as time.fmt]
8+
[clojure.core.memoize :as memo]
9+
[clojure.java.shell :as shell]
10+
[clojure.string :as str]
11+
[clojure.tools.cli :as cli]
12+
[feedparser-clj.core :as feed]
13+
[mail]
14+
[net.cgrand.enlive-html :as html]))
15+
16+
; based on https://gist.github.com/Gonzih/5814945
17+
(defmacro try+ "Like try, but can catch multiple exception types with (catch+ [classname*] name expr)."
18+
[& body]
19+
(letfn [
20+
(catch+? [form]
21+
(and (seq form)
22+
(= (first form) 'catch+)))
23+
(expand [[_catch* classes & catch-tail]]
24+
(map #(list* 'catch % catch-tail) classes))
25+
(transform [form]
26+
(if (catch+? form)
27+
(expand form)
28+
[form]))]
29+
(cons 'try (mapcat transform body))))
30+
31+
(defn sha1-bytes [s]
32+
(.digest (java.security.MessageDigest/getInstance "SHA1")
33+
(.getBytes s java.nio.charset.StandardCharsets/UTF_8)))
34+
35+
(defn sha1 [s]
36+
(.toLowerCase
37+
(javax.xml.bind.DatatypeConverter/printHexBinary
38+
(sha1-bytes s))))
39+
40+
(defn die [status message]
41+
(binding [*out* *err*]
42+
(println message))
43+
(System/exit status))
44+
45+
(defn read-config [path]
46+
(with-open [in (PushbackReader. (clojure.java.io/reader path))]
47+
(eval (read in)))) ; !!! DANGER: evals any code in the config !!!
48+
49+
(defn cache-path [config url]
50+
(str (:path (:cache config)) "/" (sha1 url)))
51+
52+
(defn recipient [imap]
53+
(str (:user imap) "@" (:host imap)))
54+
55+
(defn read-cache [path]
56+
(try
57+
(let [[date & ids] (str/split-lines (slurp path))]
58+
{:date date :ids ids})
59+
(catch java.io.FileNotFoundException e)))
60+
61+
(defn write-cache [path cache]
62+
(spit path
63+
(str/join "\n"
64+
(cons (:date cache)
65+
(:ids cache)))))
66+
67+
(defn item-date [item]
68+
((some-fn :updated-date :published-date) item))
69+
70+
(defn item-author [feed item]
71+
(let [author (first (:authors item))] {
72+
:email (or (:email author) "feedmail@localhost")
73+
:name (or
74+
(:name author)
75+
(:author item) ; allows (assoc % :author "Bob") in config files
76+
(:title feed))}))
77+
78+
(defn item-content [item]
79+
((some-fn :content :description) item))
80+
81+
(defn email-template [s]
82+
(html/template (ByteArrayInputStream. (.getBytes s "UTF-8")) [item]
83+
[:a.link] (html/set-attr :href (:link item))
84+
[:.title] (html/content (:title item))
85+
[:.content]
86+
(let [{:keys [value type]} (item-content item)]
87+
(if (#{"html" "text/html" "xhtml"} type)
88+
(html/html-content value)
89+
(html/content value)))))
90+
91+
(defn sort-entries [entries]
92+
(if (item-date (first entries))
93+
(sort-by item-date #(compare %2 %1) entries)
94+
entries))
95+
96+
(defn entry->email [config feed item]
97+
(let [author (item-author feed item)] {
98+
:from (mail/address (:email author) (:name author))
99+
:to (mail/address (recipient (:imap config)))
100+
:subject (:title item)
101+
:date (item-date item)
102+
:body (clojure.string/join
103+
((email-template (:template (:email config)))
104+
item))}))
105+
106+
(defn parse-feed [body]
107+
(when body
108+
(feed/parse-feed
109+
(ByteArrayInputStream.
110+
(.getBytes body "UTF-8")))))
111+
112+
(defn script-get [path]
113+
(let [{:keys [exit out]} (shell/sh path)] ; !!! DANGER: running arbitrary external command from config !!!
114+
{:body out, :status (if (zero? exit) 200 500)}))
115+
116+
(defn fetch* [url date]
117+
(let [uri (java.net.URI. url)]
118+
(if (= "script" (.getScheme uri))
119+
(script-get (.getSchemeSpecificPart uri))
120+
(let [headers (if date {"If-Modified-Since" date} {})]
121+
(http/get url {
122+
:http-builder-fns [#(.disableCookieManagement %)]
123+
:headers headers
124+
:throw-exceptions false
125+
:decode-cookies false})))))
126+
127+
; Memoize fetched URLs to facilitate using the same URL in more than one subscription (e.g. a combined feed that's split via multiple filters). A FIFO cache of size 1 suffices, since we sort subscriptions by URL before fetching.
128+
(def memo-fetch (memo/fifo fetch* :fifo/threshold 1))
129+
130+
(defn fetch [url date]
131+
(let [{:keys [status body]} (memo-fetch url date)]
132+
(condp = status
133+
200 body
134+
304 nil
135+
(throw (java.io.IOException. (str "couldn't fetch feed: HTTP " status))))))
136+
137+
(defn report-feed-error [url e]
138+
(binding [*out* *err*]
139+
(println (format "Error on feed: %s\n%s"
140+
url (.toString e)))))
141+
142+
(defn check-subscription [config store date {:keys [url] :as subscription}]
143+
(when (:verbose config)
144+
(println "checking" url))
145+
(try+
146+
(let [
147+
transform-fn (or (:transform subscription) identity)
148+
filter-fn (or (:filter subscription) identity)
149+
cache-path (cache-path config url)
150+
cache (read-cache cache-path)
151+
body (fetch url (:date cache))
152+
feed (parse-feed body)
153+
new-entries (into []
154+
(comp
155+
(take (:size (:cache config)))
156+
(keep transform-fn)
157+
(filter filter-fn)
158+
(filter (comp (complement (set (:ids cache))) :uri)))
159+
(sort-entries (:entries feed)))]
160+
(when (:verbose config)
161+
(println "-> got" (count new-entries) "new items")
162+
(doseq [e new-entries]
163+
(println e)))
164+
(when-not (:dry-run config)
165+
(mail/append-messages store {:name (:folder subscription) :create true}
166+
(map (partial entry->email config feed)
167+
(reverse new-entries)))
168+
(write-cache cache-path {
169+
:date date
170+
:ids
171+
(take (:size (:cache config))
172+
(concat (map :uri new-entries)
173+
(:ids cache)))})))
174+
; ROME throws IllegalArgumentException sometimes for invalid documents
175+
(catch+ [IllegalArgumentException java.io.IOException java.net.ConnectException java.net.UnknownHostException javax.mail.MessagingException com.rometools.rome.io.FeedException org.apache.http.HttpException] e
176+
(when-not (:suppress-errors subscription) (report-feed-error url e)))
177+
(catch Exception e (report-feed-error name url e) (throw e))))
178+
179+
(defn check-subscriptions [{:keys [imap subscriptions] :as config} subscription-names-to-check]
180+
(let [
181+
filter-fn (if (seq subscription-names-to-check) (comp (set subscription-names-to-check) :name) any?)
182+
date (time.fmt/unparse (time.fmt/formatters :rfc822) (time/now))]
183+
(mail/with-store [store imap]
184+
; sort by url so our cache can work more effectively
185+
(doseq [subscription (sort-by :url (filter filter-fn subscriptions))]
186+
(check-subscription config store date subscription)))))
187+
188+
(defn usage [options-summary]
189+
(format "usage: feedmail [options] [FEED_NAME ...]\n\nOptions:\n%s"
190+
options-summary))
191+
192+
(def cli-options [
193+
["-h" "--help"]
194+
["-v" "--verbose"]
195+
["-d" "--dry-run" "Don't upload emails or update cache"]
196+
["-c" "--config FILE" "Config file path"
197+
:default (str (System/getProperty "user.home") "/.config/feedmail/config.clj")
198+
:validate [#(.exists (java.io.File. %)) "no such file"]]]) ;FIXME: real validation (spec?)
199+
200+
(def default-config {
201+
:cache {
202+
:path (str (System/getProperty "user.home") "/.cache/feedmail")
203+
:size 100}
204+
:email {
205+
:template "<h1><a class='link title'></a></h1><p class='content'></p>"}
206+
:imap {
207+
:host nil
208+
:user nil
209+
:password nil}
210+
:subscriptions
211+
(take 0 [{ ; just for documentation:
212+
:url "http://example.com/"
213+
:folder "Folder/Subfolder"
214+
:filter any?
215+
:transform identity
216+
:suppress-errors false}])})
217+
218+
(defn -main [& args]
219+
(let [{:keys [options arguments errors summary]} (cli/parse-opts args cli-options)]
220+
(cond
221+
(:help options)
222+
(println (usage summary))
223+
errors
224+
(die 1 (str/join "\n" errors))
225+
:else
226+
(let [config (merge default-config
227+
(read-config (:config options))
228+
(select-keys options [:verbose :dry-run]))]
229+
(when (:verbose config)
230+
(println "config:" config))
231+
(.mkdir (java.io.File. (:path (:cache config))))
232+
(check-subscriptions config arguments)
233+
(shutdown-agents)))))

0 commit comments

Comments
 (0)