Skip to content

Commit

Permalink
fix(parsing): fix red parse failures in block contents and refactor p…
Browse files Browse the repository at this point in the history
…arser (#88)

* fix(parsing): fix red parse failures in block contents and refactor parser

The parse failures were because lines beginning with punctuation such as `{` were neither word characters (`\w`) or space characters (`\s`). I fixed that with the `any-char` rule.

The `any-char` rule matches `\w\W` instead of `.` to ensure it also matches newlines, no matter how Clojure and ClojureScript’s regexes work. This is desired in normal text, but probably not in links and hash-tags. That’s one reason for my comment about planning to move away from the `any-chars` rule.

I could have fixed it by just changing the `c` regexp to `#'(\\w|\\W)+'`, but that would lead to other problems down the line. See https://github.com/Engelberg/instaparse#regular-expressions-a-word-of-warning. The downside of doing it repetition in the parser is that I need to combine the individually-matched characters in the transformer later, with `join` and `combine-adjacent-strings`.

Other changes:

- Use clearer names for the parts of the parser
- Use `defparser` to precompile the parser for ClojureScript (following the advice in https://github.com/Engelberg/instaparse#defparser)
- Add classes to output elements to aid in future styling and debugging
- Move the block-ref parsing rule closer to the block-link parsing rule
- Link to the Instaparse docs to make contributing to this file easier

* feat: parse simple bold formatting in blocks

This is Bardia’s code from PR #87 redone to work after my refactoring.

I say “simple” bold formatting because it doesn’t handle other formatting nested in bold. Roam supports italics but not every type of formatting in bold.

* refactor(parsing): make the parser testable and add a few tests

The tests run with Clojure, not ClojureScript, so the code to test had to be moved to a `.cljc` file to be available in both environments.

Only the parsing code, not the transforming/rendering code, was moved into the new file. That’s because the rendering code uses the re-frame and reitit libraries to return its value, so it can’t be tested without mocking them somehow. I’m not sure if testing rendered HTML is even desirable (though testing other aspects of the transformation might be nice).

* refactor(parsing): render bold text with a `class` for consistency

and make the :any-chars transformer easier to read by adding whitespace
  • Loading branch information
roryokane authored May 26, 2020
1 parent 915fcaa commit ef111fc
Show file tree
Hide file tree
Showing 8 changed files with 143 additions and 59 deletions.
16 changes: 16 additions & 0 deletions src/cljc/athens/parse_transform_helper.cljc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
(ns athens.parse-transform-helper)


(defn combine-adjacent-strings
"In a sequence of strings mixed with other values, returns the same sequence with adjacent strings concatenated.
(If the sequence contains only strings, use clojure.string/join instead.)"
[coll]
(reduce
(fn [elements-so-far elmt]
(if (and (string? elmt) (string? (peek elements-so-far)))
(let [previous-elements (pop elements-so-far)
combined-last-string (str (peek elements-so-far) elmt)]
(conj previous-elements combined-last-string))
(conj elements-so-far elmt)))
[]
coll))
32 changes: 32 additions & 0 deletions src/cljc/athens/parser.cljc
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
(ns athens.parser
(:require
#?(:cljs [instaparse.core :as insta :refer-macros [defparser]]
:clj [instaparse.core :as insta :refer [defparser]])))


(declare block-parser)


;; Instaparse docs: https://github.com/Engelberg/instaparse#readme

(defparser block-parser
"(* This first rule is the top-level one. *)
block = ( syntax-in-block / any-char )*
(* `/` ordered alternation is used to, for example, try to interpret a string beginning with '[[' as a block-link before interpreting it as raw characters. *)
<syntax-in-block> = (block-link | block-ref | hashtag | bold)
block-link = <'[['> any-chars <']]'>
block-ref = <'(('> any-chars <'))'>
hashtag = <'#'> any-chars | <'#'> <'[['> any-chars <']]'>
bold = <'**'> any-chars <'**'>
(* It’s useful to extract this rule because its transform joins the individual characters everywhere it’s used. *)
(* However, I think in many cases a more specific rule can be used. So we will migrate away from uses of this rule. *)
any-chars = any-char+
<any-char> = #'\\w|\\W'
")
2 changes: 1 addition & 1 deletion src/cljs/athens/core.cljs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
[athens.config :as config]
#_[athens.db :as db]
[athens.events]
#_[athens.parser :refer [parser]]
#_[athens.parse-renderer :refer [parse-and-render]]
[athens.router :as router]
[athens.subs]
[athens.views :as views]
Expand Down
4 changes: 2 additions & 2 deletions src/cljs/athens/page.cljs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
(ns athens.page
(:require
[athens.parser :refer [parse]]
[athens.parse-renderer :refer [parse-and-render]]
[athens.patterns :as patterns]
[athens.router :refer [navigate-page toggle-open]]
[re-frame.core :refer [subscribe dispatch]]
Expand Down Expand Up @@ -43,7 +43,7 @@
:cursor "pointer" :display "inline-block" :background-color "black"
:vertical-align "middle"}
:on-click #(navigate-page uid)}]]]
[:span (parse string)]]
[:span (parse-and-render string)]]
(when open
[:div {:style {:margin-left 20}}
[render-blocks uid]])])))])))
Expand Down
60 changes: 60 additions & 0 deletions src/cljs/athens/parse_renderer.cljs
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
(ns athens.parse-renderer
(:require
[athens.parse-transform-helper :refer [combine-adjacent-strings]]
[athens.parser :as parser]
[instaparse.core :as insta]
[re-frame.core :refer [subscribe]]
[reitit.frontend.easy :as rfee]))


(declare parse-and-render)


;; Instaparse transforming docs: https://github.com/Engelberg/instaparse#transforming-the-tree

(defn transform
"Transforms Instaparse output to Hiccup."
[tree]
(insta/transform
{:block (fn [& raw-contents]
;; use combine-adjacent-strings to collapse individual characters from any-char into one string
(let [collapsed-contents (combine-adjacent-strings raw-contents)]
(concat [:span {:class "block"}] collapsed-contents)))
:any-chars (fn [& chars]
(clojure.string/join chars))
:block-link (fn [title]
(let [id (subscribe [:block/uid [:node/title title]])]
[:span {:class "block-link"}
[:span {:style {:color "gray"}} "[["]
[:a {:href (rfee/href :page {:id (:block/uid @id)})
:style {:text-decoration "none" :color "dodgerblue"}} title]
[:span {:style {:color "gray"}} "]]"]]))
:block-ref (fn [id]
(let [string (subscribe [:block/string [:block/uid id]])]
[:span {:class "block-ref"
:style {:font-size "0.9em" :border-bottom "1px solid gray"}}
[:a {:href (rfee/href :page {:id id})} (parse-and-render (:block/string @string))]]))
:hashtag (fn [tag-name]
(let [id (subscribe [:block/uid [:node/title tag-name]])]
[:a {:class "hashtag"
:style {:color "gray" :text-decoration "none" :font-weight "bold"}
:href (rfee/href :page {:id (:block/uid @id)})}
(str "#" tag-name)]))
:bold (fn [text]
[:strong {:class "bold"} text])}
tree))


(defn parse-and-render
"Converts a string of block syntax to Hiccup, with fallback formatting if it can’t be parsed."
[string]
(let [result (parser/block-parser string)]
(if (insta/failure? result)
[:span
{:content-editable true
:title (pr-str (insta/get-failure result))
:style {:color "red"}}
string]
[:span
{:content-editable true}
(vec (transform result))])))
56 changes: 0 additions & 56 deletions src/cljs/athens/parser.cljs

This file was deleted.

18 changes: 18 additions & 0 deletions test/athens/parse_transform_helper_test.clj
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
(ns athens.parse-transform-helper-test
(:require
[athens.parse-transform-helper :refer [combine-adjacent-strings]]
[clojure.test :refer [deftest is are]]))


(deftest combine-adjacent-strings-tests
(are [x y] (= x (combine-adjacent-strings y))
[]
, []
["some text"]
, ["some" " " "text"]
["some text" [:link] "around a link"]
, ["some" " " "text" [:link] "around " "a link"]
[{:something nil} "more text" [:link] "between elements" 39]
, [{:something nil} "more" " " "text" [:link] "between" " " "elements" 39]
[{:a 1 :b 2} 3 ["leave" "intact"]]
, [{:a 1 :b 2} 3 ["leave" "intact"]]))
14 changes: 14 additions & 0 deletions test/athens/parser_test.clj
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
(ns athens.parser-test
(:require
[athens.parser :refer [block-parser]]
[clojure.test :refer [deftest is]]))


(deftest block-parser-tests
(is (= [:block] (block-parser "")))
(is (= [:block "O" "K" "?" " " "Y" "e" "s" "."] (block-parser "OK? Yes.")))
(is (= [:block [:block-link [:any-chars "l" "i" "n" "k"]]] (block-parser "[[link]]")))
(is (= [:block "[" "[" "t" "e" "x" "t"] (block-parser "[[text")))
;; Not including tests for every type of syntax because I expect the trees they are parsed to to change soon.
;; For now, additional tests would probably be more annoying than useful.
)

0 comments on commit ef111fc

Please sign in to comment.