Hazelcast bells and whistles under the Clojure belt
- Creating a Cluster
- Working with Data Structures
- Connecting as a Client
- Distributed SQL Queries
- Continuous Query Cache
- Near Cache
- Distributed Tasks
- Distributed Reliable Topic
- Map Event Listeners
- Serialization
- Stats
- License
user=> (require '[chazel.core :refer :all])
let's start a 3 node cluster
user=> (cluster-of 3)
Jul 25, 2020 9:47:57 PM com.hazelcast.internal.cluster.ClusterService
INFO: [192.168.0.107]:5702 [dev] [4.0.2]
Members {size:3, ver:3} [
Member [192.168.0.107]:5701 - b48fb15a-ad9d-4ca7-8d8c-461920ee71d6
Member [192.168.0.107]:5702 - f050f3b1-71ea-4814-884e-52d150f3781e this
Member [192.168.0.107]:5703 - 93f7dad3-4f34-400b-9cf5-58a11c95a59c
]
("HazelcastInstance{name='confident_mahavira', node=[192.168.0.107]:5701}"]
"HazelcastInstance{name='unruffled_mahavira', node=[192.168.0.107]:5702}"]
"HazelcastInstance{name='lucid_mahavira', node=[192.168.0.107]:5703}"])
create a map (or multimap, or queue, etc):
user=> (def appl (hz-map :appl))
#'user/appl
user=> (type appl)
com.hazelcast.map.impl.proxy.MapProxyImpl
use the map:
user=> (put! appl :apple 42)
user=> appl
{:apple 42}
some other cool things:
user=> (def goog (hz-map :goog))
#'user/goog
user=> (put! goog :goog 42)
user=> (find-all-maps)
({:appl 42}
{:goog 42})
Since Hazelcast collection API implement Java collection API, Hazelcast distributed datastructures can be navigated and manipulated in the same way as Java local collections:
=> (def alpha (hz-map :alpha))
#'chazel/alpha
=> (put-all! alpha {6 :f 1 :a 3 :c 4 :d 2 :b 5 :e})
=> alpha
{3 :c, 1 :a, 2 :b, 4 :d, 5 :e, 6 :f}
and now:
=> (into (sorted-map) alpha)
{1 :a, 2 :b, 3 :c, 4 :d, 5 :e, 6 :f}
all works as it would with any other java.util.Map
, only in this case the map is distributed and lives across cluster nodes:
=> (type alpha)
com.hazelcast.map.impl.proxy.MapProxyImpl
Other Hazelcast data structures, such as lists for example, could be manipulated with the "same old" Java/Clojure API:
=> (def anum (hz-list :alpha-num))
#'chazel/anum
=> (add-all! anum [0 :a 1 :b 2 :c 3 :d 4 :e 5 :f 6 7 8 9])
=> (group-by int? anum)
{true [0 1 2 3 4 5 6 7 8 9],
false [:a :b :c :d :e :f]}
=> (type anum)
com.hazelcast.collection.impl.list.ListProxyImpl
user=> (def c (client-instance {:cluster-name "dev"
:hosts ["127.0.0.1"]}))
INFO chazel.core - connecting to: {:cluster-name dev, :hosts [127.0.0.1]}
user=> c
"com.hazelcast.client.impl.clientside.HazelcastClientInstanceImpl@4786a6b6"
Hazelcast has a concept of Distributed Query with quite rich SQL syntax supported.
chazel embraces it into a single function select
. Let's look at the example that is taught at Jedi Order.
Since Hazelcast internally works with Java objects, it relies on getter/setter accessors for its full SQL power. This is not that bad as it might seem at the first glance. Think Google Protobufs, or many other Java serialization protocols, the all produce objects with getters and setters.
Let's call for the Jedi Masters:
[chazel]$ boot dev ;; this will load Jedi type
Compiling 1/1 chazel.jedis...
nREPL server started..
chazel=> (require '[chazel.core :refer :all])
chazel=> (import '[chazel.jedis Jedi])
chazel=> (def masters {1 (Jedi. "Yoda" "vim")
2 (Jedi. "Mace Windu" "emacs")
3 (Jedi. "Qui-Gon Jinn" "cursive")
4 (Jedi. "Obi-Wan Kenobi" "vim")
5 (Jedi. "Luke Skywalker" "vim")
6 (Jedi. "Mara Jade Skywalker" "emacs")
7 (Jedi. "Leia Organa Solo" "emacs")
8 (Jedi. "Jaina Solo Fel" "atom")})
Jedi is an example type that has name
and editor
fields.
You guessed it right, we are going to rely on SQL query powers to finally find out which editors Jedis Masters use!
Now as we called upon the masters, let's put them into a Hazelcast map. We can use a put-all!
for that:
chazel=> (def jedis (hz-map "jedis"))
#'chazel/jedis
chazel=> (put-all! jedis masters)
Let's now run some distributed SQL on the new Jedi Master database:
chazel=> (select jedis "editor = vim")
#{#<Jedi {:name Obi-Wan Kenobi :editor vim}>
#<Jedi {:name Yoda :editor vim}>
#<Jedi {:name Luke Skywalker :editor vim}>}
chazel=> (select jedis "name like %Sky%")
#{#<Jedi {:name Luke Skywalker :editor vim}>
#<Jedi {:name Mara Jade Skywalker :editor emacs}>}
chazel=> (select jedis "name like %Sky% and editor != emacs")
#{#<Jedi {:name Luke Skywalker :editor vim}>}
niice!
In case a database / map is large, we can add field indices
chazel=> (add-index jedis "editor")
now this query will run waaay faster:
chazel=> (select jedis "editor = vim")
#{#<Jedi {:name Obi-Wan Kenobi :editor vim}>
#<Jedi {:name Yoda :editor vim}>
#<Jedi {:name Luke Skywalker :editor vim}>}
for larger datasets.
By default a distributed query will return a set:
chazel=> (type (select jedis "editor = vim"))
clojure.lang.PersistentHashSet
In case you need an actual submap: i.e. all the matching map entries (k,v pairs), just ask:
chazel=> (select jedis "editor = vim" :as :map)
{1 #object[chazel.jedis.Jedi 0x44bb1c0a "{:name Yoda :editor vim}"],
4 #object[chazel.jedis.Jedi 0x4ad0c3c5 "{:name Obi-Wan Kenobi :editor vim}"],
5 #object[chazel.jedis.Jedi 0x2725fbd0 "{:name Luke Skywalker :editor vim}"]}
chazel=> (type (select jedis "editor = vim" :as :map))
clojure.lang.PersistentArrayMap
For a better interop, you can also ask for a Hazelcast "native" type:
chazel=> (select jedis "editor = vim" :as :native)
#{#object[java.util.AbstractMap$SimpleImmutableEntry 0x69cfa867 "1={:name Yoda :editor vim}"]
#object[java.util.AbstractMap$SimpleImmutableEntry 0x3b0a56f9 "4={:name Obi-Wan Kenobi :editor vim}"]
#object[java.util.AbstractMap$SimpleImmutableEntry 0x3b498787 "5={:name Luke Skywalker :editor vim}"]}
chazel=> (type (select jedis "editor = vim" :as :native))
com.hazelcast.map.impl.query.QueryResultCollection
In case a wrong / unknown format is asked for, chazel will tell you so:
chazel=> (select jedis "editor = vim" :as :foo)
ERROR: can't return a result of a distributed query as ":foo" (an unknown format you provided). query: "editor = vim", running on: "jedis"
SQL would not be too useful if we could not do things like "I only need first 100 results out of millions you have" or "sort the results by the revenue". In more SQL like speak, these two would be: LIMIT 100
and ORDER BY "revenue"
.
Hazelcast supports both through Paging Predicates:
Hazelcast provides paging for defined predicates. With its PagingPredicate class, you can get a collection of keys, values, or entries page by page by filtering them with predicates and giving the size of the pages. Also, you can sort the entries by specifying comparators.
Think about it as LIMIT
and ORDER BY
with pagination built in: i.e. once you get a resultset back you can navigate it by pages. Pretty neat :)
With chazel it's just a couple of optional keys to the select
function.
Using Jedis Masters example from above:
chazel=> jedis
{6 #object[chazel.jedis.Jedi 0x7eb421d4 "{:name Mara Jade Skywalker :editor emacs}"],
1 #object[chazel.jedis.Jedi 0x39208ed7 "{:name Yoda :editor vim}"],
4 #object[chazel.jedis.Jedi 0x4f001c4f "{:name Obi-Wan Kenobi :editor vim}"],
5 #object[chazel.jedis.Jedi 0x417eede1 "{:name Luke Skywalker :editor vim}"],
2 #object[chazel.jedis.Jedi 0x1e9bde9b "{:name Mace Windu :editor emacs}"],
8 #object[chazel.jedis.Jedi 0x2370bda9 "{:name Jaina Solo Fel :editor atom}"],
3 #object[chazel.jedis.Jedi 0x6cdd2fec "{:name Qui-Gon Jinn :editor cursive}"],
7 #object[chazel.jedis.Jedi 0x5a7ac673 "{:name Leia Organa Solo :editor emacs}"]}
Let's bring them ALL (i.e. *
) back to the client side in pages of 3
:
chazel=> (select jedis "*" :page-size 3)
{:pages #object[chazel.Pages 0x58406675 "chazel.Pages@58406675"],
:results
#{#object[chazel.jedis.Jedi 0x170a94e7 "{:name Jaina Solo Fel :editor atom}"]
#object[chazel.jedis.Jedi 0x2b4d73f4 "{:name Leia Organa Solo :editor emacs}"]
#object[chazel.jedis.Jedi 0x6f1e19da "{:name Mara Jade Skywalker :editor emacs}"]}}
notice the chazel.Pages
under the :pages
key that is returned, let's use it to get the next page, and then the next page, and then the next:
chazel=> (def paging-jedis (select jedis "*" :page-size 3))
#'chazel/paging-jedis
chazel=> (-> paging-jedis :pages next-page)
#{#object[chazel.jedis.Jedi 0x7122e00f "{:name Obi-Wan Kenobi :editor vim}"]
#object[chazel.jedis.Jedi 0x599d002f "{:name Qui-Gon Jinn :editor cursive}"]
#object[chazel.jedis.Jedi 0x5c4e9eda "{:name Luke Skywalker :editor vim}"]}
chazel=> (-> paging-jedis :pages next-page)
#{#object[chazel.jedis.Jedi 0x7eabb220 "{:name Yoda :editor vim}"]
#object[chazel.jedis.Jedi 0x422d73b3 "{:name Mace Windu :editor emacs}"]}
chazel=> (-> paging-jedis :pages next-page)
#{}
niice!
Of course we can also filter page results with Hazelcast SQL (i.e. "editor = vim"
):
chazel=> (select jedis "editor = vim" :page-size 2)
{:pages #object[chazel.Pages 0x140ff895 "chazel.Pages@140ff895"],
:results
#{#object[chazel.jedis.Jedi 0x77b276a3 "{:name Luke Skywalker :editor vim}"]
#object[chazel.jedis.Jedi 0x562345a4 "{:name Obi-Wan Kenobi :editor vim}"]}}
Yoda here did not make to the first page, but it is comfortably watching Luke and Obi-Wan from the second / last page with Jedis who use vim
:
chazel=> (-> (select jedis "editor = vim" :page-size 2) :pages next-page)
#{#object[chazel.jedis.Jedi 0x59e58d76 "{:name Yoda :editor vim}"]}
A simple Java Comparator can be used to sort paginated results. While you can create it with a comparator functoin, in most cases (as it works 99% in Clojure) a simple function will do.
First, since in this example Jedis are Java Beans and the Hazelcast SQL resultset is a collection of SimpleImmutableEntrys let's create an editor
field accessor:
(defn jedit [m]
(let [jedi (.getValue m)]
(.getEditor jedi)))
which just wraps a couple of Java calls to get the value of the map entry and get the editor form the Jedi.
Now let's create a "comparator" function:
(defn by-editor [a b]
(compare (jedit a) (jedit b)))
which compares Jedis by the editor they use.
Let's get those pages sorted with this comparator providing it to a :order-by
optional param of select
:
chazel=> (select jedis "*" :page-size 4 :order-by by-editor)
{:pages #object[chazel.Pages 0x544d44f3 "chazel.Pages@544d44f3"],
:results
#{#object[chazel.jedis.Jedi 0x57367fa1 "{:name Qui-Gon Jinn :editor cursive}"]
#object[chazel.jedis.Jedi 0x1f14b62c "{:name Mara Jade Skywalker :editor emacs}"]
#object[chazel.jedis.Jedi 0x3b6118af "{:name Mace Windu :editor emacs}"]
#object[chazel.jedis.Jedi 0x57999413 "{:name Jaina Solo Fel :editor atom}"]}}
Hm.. did not seem to work.
Ah, remember from Query Results Format, the default resultset is a set
, hence the order is lost. Let's try to change a format to, say, a :map
:
chazel=> (select jedis "*" :page-size 4 :order-by by-editor :as :map)
{:pages #object[chazel.Pages 0x4e42e6e2 "chazel.Pages@4e42e6e2"],
:results
{8 #object[chazel.jedis.Jedi 0x2cc64579 "{:name Jaina Solo Fel :editor atom}"],
3 #object[chazel.jedis.Jedi 0x27400d5f "{:name Qui-Gon Jinn :editor cursive}"],
2 #object[chazel.jedis.Jedi 0x6908aeee "{:name Mace Windu :editor emacs}"],
6 #object[chazel.jedis.Jedi 0x56899da8 "{:name Mara Jade Skywalker :editor emacs}"]}}
now it's sorted, so as the page right after it:
chazel=> (def pages (-> (select jedis "*" :page-size 4 :order-by by-editor :as :map) :pages))
#'chazel/pages
chazel=> (next-page pages)
{7 #object[chazel.jedis.Jedi 0x6aa98c05 "{:name Leia Organa Solo :editor emacs}"],
1 #object[chazel.jedis.Jedi 0x5d4a1841 "{:name Yoda :editor vim}"],
4 #object[chazel.jedis.Jedi 0x63cd1e72 "{:name Obi-Wan Kenobi :editor vim}"],
5 #object[chazel.jedis.Jedi 0x2838423b "{:name Luke Skywalker :editor vim}"]}
Luke Skywalker comes last in this chapter, but no worries, this is just the beginning...
A continuous query cache is used to cache the result of a continuous query. After the construction of a continuous query cache, all changes on the underlying IMap are immediately reflected to this cache as a stream of events. Therefore, this cache will be an always up-to-date view of the IMap. You can create a continuous query cache either on the client or member. (more from Hazelcast docs)
We'll continue working with Jedi masters from Jedi Order:
=> (select jedis "*")
#{#object[chazel.jedis.Jedi 0x1f987361 "{:name Yoda :editor vim}"]
#object[chazel.jedis.Jedi 0x5f645ec6 "{:name Mara Jade Skywalker :editor emacs}"]
#object[chazel.jedis.Jedi 0xc9654d "{:name Mace Windu :editor emacs}"]
#object[chazel.jedis.Jedi 0x144ca20f "{:name Obi-Wan Kenobi :editor vim}"]
#object[chazel.jedis.Jedi 0x49279cf0 "{:name Jaina Solo Fel :editor atom}"]
#object[chazel.jedis.Jedi 0x6e35e872 "{:name Leia Organa Solo :editor emacs}"]
#object[chazel.jedis.Jedi 0x63b4f296 "{:name Luke Skywalker :editor vim}"]
#object[chazel.jedis.Jedi 0x1b0037fe "{:name Qui-Gon Jinn :editor cursive}"]}
Let's say we need to cache masters who use Vim editor. We also need this cache to be continuously updating whenever records are added or removed to/from the source jedis
map. In order to do that all we need to do is to create a QueryCache.
In order to create such a QueryCache, we'll use a query-cache
function that take these arguments:
- source map: which maps to create this cache for
- cache name: a internal name of this cache
- predicate: to filter the exiting source map entries
- include value?: a boolean flag => "true" (default) if this QueryCache is allowed to cache values of entries, otherwise "false"
- listener: a MapListener which will be used to listen this QueryCache
At a minimum query-cache
would need a "source map", "cache name" and "predicate":
=> (def vim (query-cache jedis "vim-cache" "editor = vim"))
here jedis
is a source map, "vim-cache"
is the cache name and "editor = vim"
is a predicate.
Let's look at vim's type:
=> (type vim)
com.hazelcast.map.impl.querycache.subscriber.DefaultQueryCache
This query cache is as "select
able" as any other map:
=> (select vim "*")
#{#object[chazel.jedis.Jedi 0x10fcece9 "{:name Luke Skywalker :editor vim}"]
#object[chazel.jedis.Jedi 0x5fe3f833 "{:name Yoda :editor vim}"]
#object[chazel.jedis.Jedi 0x2267043e "{:name Obi-Wan Kenobi :editor vim}"]}
=> (select vim "name like %Sky%")
#{#object[chazel.jedis.Jedi 0x44a03065 "{:name Luke Skywalker :editor vim}"]}
Optionally query-cache
function also takes include-value?
and a listener
which makes it 4 possible combinations:
- [source-map cache-name]
- [source-map cache-name pred]
- [source-map cache-name pred include-value?]
- [source-map cache-name pred listener include-value?]
In case only a source-map
and a cache-name
are given, query-cache
will look this cache up by name and will return nil
in case this cache does not exist, otherwise it will return a previously created cache that was created with this name.
=> (query-cache jedis "vim-cache")
#object[com.hazelcast.map.impl.querycache.subscriber.DefaultQueryCache 0x5a6895e5 "DefaultQueryCache{mapName='jedis', cacheId='fd40614a-20e3-4c97-bb9c-2ab6a82bf638', cacheName='vim-cache'}"]
=> (query-cache jedis "vc") ;; this cache does not exist
nil
Whenever underlying data in the source map changes query cache will always be upto date if these changes affect the predicate of course:
=> (put! jedis 42 (Jedi. "Hazel Caster" "vim"))
=> (select vim "*")
#{#object[chazel.jedis.Jedi 0x7f518c04 "{:name Luke Skywalker :editor vim}"]
#object[chazel.jedis.Jedi 0x4fd2044b "{:name Hazel Caster :editor vim}"]
#object[chazel.jedis.Jedi 0x206c812 "{:name Yoda :editor vim}"]
#object[chazel.jedis.Jedi 0x237fbc94 "{:name Obi-Wan Kenobi :editor vim}"]}
=> (remove! jedis 5) ;; removing "Luke Skywalker"
=> (select vim "*")
#{#object[chazel.jedis.Jedi 0x1a7320c0 "{:name Obi-Wan Kenobi :editor vim}"]
#object[chazel.jedis.Jedi 0x695908a1 "{:name Yoda :editor vim}"]
#object[chazel.jedis.Jedi 0x4d51aba4 "{:name Hazel Caster :editor vim}"]}
Nice, vim
is a pretty "view" that is also "materialized"
Continuous query cache can be created on a cluster member as well as on a cluster client. And when it is created on the client, given that there is enough memory to keep the cache, it really gains client a lot of performance.
Let's connect as a client to a remote cluster (that has Jedi type on its classpath):
=> (def client (client-instance {:hosts ["remote-hz-cluster.host"] :cluster-name "dev"}))
#'chazel/client
and work with this remote "jedis" map:
=> (def jedis (hz-map "jedis" client))
#'chazel/jedis
Measure the time it takes to run "a where editor = vim" query remotely:
=> (time (select jedis "editor = vim"))
#{#object[chazel.jedis.Jedi 0x28a045c4 "{:name Obi-Wan Kenobi :editor vim}"]
#object[chazel.jedis.Jedi 0x2b64eda6 "{:name Hazel Caster :editor vim}"]
#object[chazel.jedis.Jedi 0x566f6bd7 "{:name Yoda :editor vim}"]}
"Elapsed time: 36.261975 msecs"
Create a query cache with the same predicate:
dev=> (def vim (query-cache jedis "vim-cache" "editor = vim"))
#'chazel/vim
and time it:
=> (time (select vim "*"))
#{#object[chazel.jedis.Jedi 0x312ee7e0 "{:name Yoda :editor vim}"]
#object[chazel.jedis.Jedi 0x5ddb17e1 "{:name Hazel Caster :editor vim}"]
#object[chazel.jedis.Jedi 0x3aa47722 "{:name Obi-Wan Kenobi :editor vim}"]}
"Elapsed time: 0.355571 msecs"
more than 100 times faster: it's local and continuous.
Near Cache is highly recommended for data structures that are mostly read. The idea is to bring data closer to the caller, and keep it in sync with the source.
Here is from the official Near Cache docs:
Map or Cache entries in Hazelcast are partitioned across the cluster members. Hazelcast clients do not have local data at all. Suppose you read the key k a number of times from a Hazelcast client or k is owned by another member in your cluster. Then each
map.get(k)
orcache.get(k)
will be a remote operation, which creates a lot of network trips. If you have a data structure that is mostly read, then you should consider creating a local Near Cache, so that reads are sped up and less network traffic is created.
Near Cache can be configured on the client as well as on the server (on a particular member). The configuration can be done via XML or programmatically. chazel
adds a conveniece of an EDN based config.
For example an XML Near Cache config:
<near-cache name="myDataStructure">
<in-memory-format>(OBJECT|BINARY|NATIVE)</in-memory-format>
<invalidate-on-change>(true|false)</invalidate-on-change>
<time-to-live-seconds>(0..INT_MAX)</time-to-live-seconds>
<max-idle-seconds>(0..INT_MAX)</max-idle-seconds>
<eviction eviction-policy="(LRU|LFU|RANDOM|NONE)"
max-size-policy="(ENTRY_COUNT
|USED_NATIVE_MEMORY_SIZE|USED_NATIVE_MEMORY_PERCENTAGE
|FREE_NATIVE_MEMORY_SIZE|FREE_NATIVE_MEMORY_PERCENTAGE"
size="(0..INT_MAX)"/>
<cache-local-entries>(false|true)</cache-local-entries>
<local-update-policy>(INVALIDATE|CACHE_ON_UPDATE)</local-update-policy>
<preloader enabled="(true|false)"
directory="nearcache-example"
store-initial-delay-seconds="(0..INT_MAX)"
store-interval-seconds="(0..INT_MAX)"/>
</near-cache>
or a Java based config:
EvictionConfig evictionConfig = new EvictionConfig()
.setMaxSizePolicy(MaxSizePolicy.ENTRY_COUNT
|USED_NATIVE_MEMORY_SIZE|USED_NATIVE_MEMORY_PERCENTAGE
|FREE_NATIVE_MEMORY_SIZE|FREE_NATIVE_MEMORY_PERCENTAGE);
.setEvictionPolicy(EvictionPolicy.LRU|LFU|RANDOM|NONE);
.setSize(0..INT_MAX);
NearCachePreloaderConfig preloaderConfig = new NearCachePreloaderConfig()
.setEnabled(true|false)
.setDirectory("nearcache-example")
.setStoreInitialDelaySeconds(0..INT_MAX)
.setStoreIntervalSeconds(0..INT_MAX);
NearCacheConfig nearCacheConfig = new NearCacheConfig()
.setName("myDataStructure")
.setInMemoryFormat(InMemoryFormat.BINARY|OBJECT|NATIVE)
.setInvalidateOnChange(true|false)
.setTimeToLiveSeconds(0..INT_MAX)
.setMaxIdleSeconds(0..INT_MAX)
.setEvictionConfig(evictionConfig)
.setCacheLocalEntries(true|false)
.setLocalUpdatePolicy(LocalUpdatePolicy.INVALIDATE|CACHE_ON_UPDATE)
.setPreloaderConfig(preloaderConfig);
with chazel
would look like:
{:in-memory-format :BINARY,
:invalidate-on-change true,
:time-to-live-seconds 300,
:max-idle-seconds 30,
:cache-local-entries true,
:local-update-policy :CACHE_ON_UPDATE,
:preloader {:enabled true,
:directory "nearcache-example",
:store-initial-delay-seconds 15,
:store-interval-seconds 60},
:eviction {:eviction-policy :LRU,
:max-size-policy :ENTRY_COUNT,
:size 800000}}
and can be passed directly to the client or server Hazelcast instance.
On the client Near Cache can be passed via a :near-cache
key. For example:
(client-instance {:near-cache {:name "events"}})
would create a Hazelcast client instance with Near Cache configured for a map named "events"
.
Since only :name
is provided in this case all the other Near Cache values will be created with Hazelcast defaults.
More config options can be added of course, for example:
(client-instance {:near-cache {:name "events"
:time-to-live-seconds 300
:eviction {:eviction-policy :LRU}}})
This config can be combined with other client config options:
(client-instance {:cluster-name "dev"
:hosts ["127.0.0.1"]
:near-cache {:name "events"
:time-to-live-seconds 300
:eviction {:eviction-policy :LRU}}})
chazel
allows to compose configurations that will be passed to the cluster on startup:
(cluster-of 3 :conf (->> (connect-to {:cluster-name "foo"})
(with-near-cache {:in-memory-format :OBJECT
:local-update-policy :CACHE_ON_UPDATE
:preloader {:enabled true}}
"events")))
This would create a cluster of 3 nodes with credentials and Near Cache for a map called "events"
.
Sending work to be done remotely on the cluster is very useful, and Hazelcast has a rich set of APIs to do that.
chazel does not implement all the APIs, but it does provide a simple way of sending tasks to be executed remotely on the cluster:
(task do-work)
done.
task
here is a chazel's built-in function, and do-work
is your function.
A couple of gotchas:
do-work
must exist on both: sending and "doing the work" JVMs- in case you'd like to pass a function with arguments use
partial
(task (partial do-work arg1 arg2 ..))
In example above do-work
gets wrapped into a Runnable internal chazel Task
and gets send to the cluster to execute.
Say the function we are sending is:
(defn do-work [& args]
(println "printing remotely..." args)
(str "doing work remotely with args: " args))
If we send it with (task do-work)
, you'll see printing remotely... nil
in logs of a cluster member that picked up this task.
But you won't see doing the work...
since it was silently executed on that member.
In case you do want to know when the task is done, or you'd like to own the result of the tasks, you can send a task that will return you a future back.
chazel calls this kind of task an ftask
:
chazel=> (ftask do-work)
#<ClientCancellableDelegatingFuture com.hazelcast.client.util.ClientCancellableDelegatingFuture@6148ce19>
In case of ftask
chazel also wraps a function do-work
into Task, but now it cares of Task's Callable skills, hence we get a tasty future back. Let's deref it:
chazel=> @(ftask do-work)
"doing work remotely with args: "
and send it some args:
chazel=> @(ftask (partial do-work 42 "forty two"))
"doing work remotely with args: (42 \"forty two\")"
A task that is sent with task
of ftask
by default will be picked up by any one member to run it.
Sometimes it is needed to send a task to be executed on all of the cluster members:
chazel=> (ftask (partial do-work 42 "forty two") :members :all)
{#<MemberImpl Member [192.168.1.4]:5702> #<ClientCancellableDelegatingFuture com.hazelcast.client.util.ClientCancellableDelegatingFuture@2ae5cde4>,
#<MemberImpl Member [192.168.1.4]:5701> #<ClientCancellableDelegatingFuture com.hazelcast.client.util.ClientCancellableDelegatingFuture@7db6db4>}
here we have a small local two node cluster, and what comes back is a {member future} map. Let's get all the results:
chazel=> (def work (ftask (partial do-work 42 "forty two") :members :all))
#'chazel/work
chazel=> (into {} (for [[m f] work] [m @f]))
{#<MemberImpl Member [192.168.1.4]:5702>
"doing work remotely with args: (42 \"forty two\")",
#<MemberImpl Member [192.168.1.4]:5701>
"doing work remotely with args: (42 \"forty two\")"}
By default chazel will look for a client instance, if it is active, it will use that, if not it will get a server instance instead. But in case you'd like to use a concrete instance in order to send out tasks from you can:
(task do-work :instance your-instance)
By default chazel will use a "default"
executor service to submit all the tasks to.
But in case you'd like to pick a different one, you can:
(task do-work :exec-svc-name "my-es")
All the options can be used with task
and ftask
:
(task do-work :instance my-instance :exec-svc-name "my-es")
(ftask do-work :instance my-instance :members :all :exec-svc-name "my-es")
Hazelcast's Reliable Topic is backed by a Ringbuffer which amongst other benefits (i.e. not destructive operations, ttl, batching, etc.) sequences all the messages, which allows for an interesting replay use cases.
Since this is Hazelcast, we are dealing with a cluster of nodes, and depending on backup-count
(a.k.a. quorum) this reliable topic is well distributed, which means it allows for better locality as well as higher availability: i.e. cluster may lose nodes, but all the topic messages will be still there to consume.
Let's say we have a system that publishes payments. We can send these payments to a reliable topic, and have some consumers that would be responsible to process these payments. So let's create this reliable topic:
chazel=> (def payments (hz-reliable-topic :payments))
#'chazel/payments
and a simple functions that would process a single payment:
chazel=> (defn process-payment [p] (info "processing payment" p))
#'chazel/process-payment
We can now add this function as one of the topic listeners by calling add-message-listener
on the topic:
chazel=> (add-message-listener payments process-payment)
#uuid "f3216455-f9c8-46ef-976a-cae942b15a8d"
This listener UUID can later be used to remove-message-listener
.
Now let's publish some payments:
chazel=> (publish payments {:name "John" :amount 4200.42M})
INFO: processing payment {:name John, :amount 4200.42M}
chazel=> (publish payments {:name "Kevin" :amount 2800.28M})
INFO: processing payment {:name Kevin, :amount 2800.28M}
chazel=> (publish payments {:name "Jessica" :amount 3400.34M})
INFO: processing payment {:name Jessica, :amount 3400.34M}
You can see that each payment is picked up by the listener and processed.
So far so good, but not much different from a regular pub/sub topic. Let's make it more interesting.
Say we have some problems with payments and we need to audit every payment that was sent. With a regular topic it would be hard to do (if at all possible) since we need to audit all the payments: from the past and ongoing. With Hazelcast's Reliable Topic is not an issue, since it is backed by a Ringbuffer and all the messages are sequenced, we can just ask to replay the messages from an arbitrary sequence number.
First let's create a function that will do the audit work:
chazel=> (defn audit-payment [p] (info "auditing payment" p))
#'chazel/audit-payment
and add it as a reliable listener:
chazel=> (add-reliable-listener payments audit-payment {:start-from 0})
"d274fab1-7f0f-47f9-a53a-58b35a4c68d1"
INFO: auditing payment {:name John, :amount 4200.42M}
INFO: auditing payment {:name Kevin, :amount 2800.28M}
INFO: auditing payment {:name Jessica, :amount 3400.34M}
Interesting, you see what happened? All the payments starting from the sequence 0
(the very beginning) were simply replayed and audited: niice!
Let's publish more payments:
chazel=> (publish payments {:name "Rudolf" :amount 1234.56M})
INFO: auditing payment {:name Rudolf, :amount 1234.56M}
INFO: processing payment {:name Rudolf, :amount 1234.56M}
chazel=> (publish payments {:name "Nancy" :amount 6543.21M})
INFO: auditing payment {:name Nancy, :amount 6543.21M}
INFO: processing payment {:name Nancy, :amount 6543.21M}
Now every ongoing payment gets processed and audited, since there are two listeners attached to a topic.
Let's replay them all again, just for fun:
chazel=> (add-reliable-listener payments audit-payment {:start-from 0})
"e2bd4912-7ccb-48b7-8102-b31e5660f68d"
INFO: auditing payment {:name John, :amount 4200.42M}
INFO: auditing payment {:name Kevin, :amount 2800.28M}
INFO: auditing payment {:name Jessica, :amount 3400.34M}
INFO: auditing payment {:name Rudolf, :amount 1234.56M}
INFO: auditing payment {:name Nancy, :amount 6543.21M}
niice!
there are other options that can be provided to a reliable listener: i.e. start-from
store-seq
loss-tolerant?
terminal?
if needed
Hazelcast has map entry listeners which can be attached to maps and listen on different operations, namely:
- entry added
- entry updated
- entry removed
- entry evicted
- entry expired
- entry loaded
- entry merged
chazel has all 7 listeners available as wrapper functions and ready to roll:
- entry-added-listener
- entry-updated-listener
- entry-removed-listener
- entry-evicted-listener
- entry-expired-listener
- entry-loaded-listener
- entry-merged-listener
A chazel map entry listener would take a function and apply it every time the event takes place:
chazel=> (def m (hz-map "appl"))
#'chazel/m
chazel=> (put! m 42 1)
chazel=> m
{42 1}
nothing fancy, usual map business. now let's add an update listener:
chazel=> (def ul (entry-updated-listener (fn [k v ov] (println "updated: " {:k k :v v :ov ov}))))
#'chazel/ul
chazel=> (def id (add-entry-listener m ul))
#'chazel/id
chazel=> id
"927b9530-630c-4bbb-995f-9c74815d9ca9"
chazel=>
ov
here is an old value
that is being updated.
When the listener is added, hazelcast assigns a uuid
to it. We'll use it a bit later. For now let's see how the listener works:
chazel=> (put! m 42 2)
1
updated: {:k 42, :v 2, :ov 1}
chazel=>
chazel=> (put! m 42 3)
updated: {:k 42, :v 3, :ov 2}
2
now every time an entry gets updated a function we created above gets applied.
Since we have listener id, we can use it to remove this listener from the map:
chazel=> (remove-entry-listener m id)
true
chazel=> (put! m 42 4)
3
chazel=> m
{42 4}
all back to vanilla, no listeners involved, map business.
Serialization is a big deal when hazelcast nodes are distributed, or when you connect to a remote hazelcast cluster. chazel solves this problem by delegating it to an optional serializer.
To start off, chazel has a transit seriailzer ready to go:
user=> (require '[chazel.serializer :refer [transit-in transit-out]])
user=> (def m (hz-map "amzn"))
user=> (put! m "bids" {:opening [429 431 430 429] :nbbo [428 430 429 427]} transit-out)
#<byte[] [B@5d9d8664>
user=>
notice transit-out
, it is an optional function to put!
that will be applied to the value before the hazelcast .put
is called. In this case a value will be serialized with transit.
user=> (cget m "bids")
#<byte[] [B@638b6eec>
a default chazel's cget
will return the value the way hazelcast has it stored: as a byte array. Similarly to put!
, cget
also takes in an optional function that is applied after the value is fetched from hazelcast:
user=> (cget m "bids" transit-in)
{:opening [429 431 430 429], :nbbo [428 430 429 427]}
user=> (type (cget m "bids" transit-in))
clojure.lang.PersistentArrayMap
In case you need to use a different serializer, you can either send a pull request updating chazel.serializer, or by specifying your own "secret" serialize function in put!
and cget
.
This is a constant area of improvement and at the moment there are 2 ways to get some stats:
First is a simplistic way to find all the maps accross the cluster with their sizes (i.e. total number of values across all nodes):
chazel=> (def appl (hz-map "appl"))
#'chazel/appl
chazel=> (def goog (hz-map "goog"))
#'chazel/goog
chazel=> (map-sizes)
{"goog" {:size 0}, "appl" {:size 0}}
now let's add some values and run `(map-sizes)` again:
chazel=> (doseq [n (range 2048)] (put! goog n (str n)))
chazel=> (doseq [n (range 1024)] (put! appl n (str n)))
chazel=> (map-sizes)
{"goog" {:size 2048}, "appl" {:size 1024}}
not too much intel, but proves to be quite useful: you see all the existing maps (IMap distributed objects) as well as their sizes.
In case you need to get all stats across the cluster, there are options:
- Management Center that comes with hazelcast, but you pay for clusters over 2 nodes
- hface will give you all the stats with GUI, free for any number of nodes, but not as powerful as the management center
- built in chazel
(cluster-stats)
function, but you'll have to include an 8KB dependency to your cluster nodes which is just a callable that is able to collect node stats
Here is an example of a built in (cluster-stats)
:
chazel=> (cluster-stats)
{"Member [192.168.1.185]:5701 this"
{:master true,
:clusterName "dev",
:instanceNames ["c:goog" "c:appl" "e:stats-exec-service"],
:memberList
["192.168.1.185:5701" "192.168.2.185:5702" "192.168.2.185:5703"],
:memberState
{:runtimeProps
{:osMemory.freePhysicalMemory 2046976000,
:runtime.loadedClassCount 10130,
;;...
}}
:executorStats {:stats-exec-service {:creationTime 1462910619108, :pending 0, :started 4, :completed 3, :cancelled 0, :totalStartLatency 0, :totalExecutionTime 49}},
:multiMapStats {},
:topicStats {},
:memoryStats {:committedNativeMemory 0, :creationTime 0, :usedNativeMemory 0, :freePhysical 2046976000, :maxNativeMemory 0, :freeNativeMemory 0, :maxHeap 3817865216, :totalPhysical 17179869184, :usedHeap 985153872, :gcStats {:creationTime 0, :minorCount 17, :minorTime 198, :majorCount 2, :majorTime 314, :unknownCount 0, :unknownTime 0}, :committedHeap 1548746752},
:mapStats
{:goog
{:creationTime 1462910602378, :maxGetLatency 0, :maxPutLatency 2, :lastAccessTime 0, :maxRemoveLatency 0, :heapCost 238277, :totalGetLatencies 0, :numberOfOtherOperations 90, :ownedEntryMemoryCost 118788, :getCount 0, :hits 0, :backupCount 1, :totalRemoveLatencies 0, :backupEntryMemoryCost 119489, :removeCount 0, :totalPutLatencies 316, :dirtyEntryCount 0, :lastUpdateTime 1462910608301, :backupEntryCount 681, :lockedEntryCount 0, :ownedEntryCount 677, :putCount 2048, :numberOfEvents 0},
:appl
{:creationTime 1462910599320, :maxGetLatency 0, :maxPutLatency 68, :lastAccessTime 0, :maxRemoveLatency 0, :heapCost 119125, :totalGetLatencies 0, :numberOfOtherOperations 90, :ownedEntryMemoryCost 60004, :getCount 0, :hits 0, :backupCount 1, :totalRemoveLatencies 0, :backupEntryMemoryCost 59121, :removeCount 0, :totalPutLatencies 390, :dirtyEntryCount 0, :lastUpdateTime 1462910604627, :backupEntryCount 338, :lockedEntryCount 0, :ownedEntryCount 343, :putCount 1024, :numberOfEvents 0}},
:replicatedMapStats {},
:queueStats {},
;; lots and lots more for this member..
}
"Member [192.168.2.185]:5703"
{:master false,
:clusterName "dev",
:instanceNames ["c:goog" "c:appl" "e:stats-exec-service"],
;; lots and lots more for this member..
}
"Member [192.168.2.185]:5702"
{:master false,
:clusterName "dev",
:instanceNames ["c:goog" "c:appl" "e:stats-exec-service"],
;; lots and lots more for this member..
}
(cluster-stats)
returns a {member stats}
map with ALL the stats available for the cluster.
again in order to make it work, add a little 8KB dependency to your cluster nodes, so it can collect stats from each node / member.
Copyright © 2020 tolitius
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.