Commonly used examples for Clojask, a parallel data processing framework for Clojure.
The example code is stored in `src/clojask_examples`.
Change the value of `:main` in `project.clj` to the namespace of the example you want to run.
Run the example using `lein run`.
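As a minimal sketch of the `:main` setting described above: the project name, version, and the namespace `clojask-examples.basic` are hypothetical placeholders, not taken from this repository. Use the namespace of an actual file under `src/clojask_examples` (Clojure maps underscores in directory names to hyphens in namespace names).

```clojure
;; project.clj (fragment; other keys such as :dependencies are omitted)
(defproject clojask-examples "0.1.0-SNAPSHOT" ; hypothetical name and version
  ;; Hypothetical namespace: replace it with the example you want to run.
  :main clojask-examples.basic)
```

With `:main` set this way, `lein run` calls the `-main` function of that namespace.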
- Covers the basic APIs of the Clojask library, including how to read from and write to different file formats.
- Group-by followed by aggregation vs. direct aggregation.
- Natural inner join, left join, and right join.
- For datasets that are smaller than memory, you can store the result in memory and reuse it faster. This function is also required for reading and writing Excel files.
- Connection with tech.ml.dataset: convert from and to the popular Clojure DataFrame library tech.ml.dataset.
- Forward and backward rolling joins with thresholds. See the definition here.
- Cbind, rbind, melt, and dcast. See their definitions in R.
- How to define parsers and formatters for zoned datetime fields.
- How to perform an outer join or Cartesian product.
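For the zoned datetime example listed above, a parser is a function from the raw string in the file to a value, and a formatter is the reverse. The following is a minimal sketch of such a pair in plain Clojure using `java.time`. The names `zoned-format`, `parse-zoned`, and `format-zoned` are hypothetical, the ISO-8601 layout is an assumption about the input data, and how the functions are attached to a Clojask column is not shown here; refer to the example in the repository for that.

```clojure
(import '(java.time ZonedDateTime)
        '(java.time.format DateTimeFormatter))

;; Assumed text layout: ISO-8601 with offset and zone id,
;; e.g. "2021-03-04T10:15:30+08:00[Asia/Hong_Kong]".
(def zoned-format DateTimeFormatter/ISO_ZONED_DATE_TIME)

(defn parse-zoned
  "Parser: turns the raw string read from a file into a ZonedDateTime."
  [^String s]
  (ZonedDateTime/parse s zoned-format))

(defn format-zoned
  "Formatter: turns a ZonedDateTime back into a string for output."
  [^ZonedDateTime zdt]
  (.format zdt ^DateTimeFormatter zoned-format))

;; Round trip: parse a field, shift it by one day, write it back out.
(def shifted
  (-> "2021-03-04T10:15:30+08:00[Asia/Hong_Kong]"
      parse-zoned
      (.plusDays 1)
      format-zoned))
```

Keeping the parser and formatter as a matched pair built on the same `DateTimeFormatter` means that values written out can be read back in without loss of the zone information.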