A replicated Akka Persistence journal backed by Apache Cassandra.
To include this journal plugin in your sbt project, add the following lines to build.sbt:
resolvers += "krasserm at bintray" at "http://dl.bintray.com/krasserm/maven"
libraryDependencies += "com.github.krasserm" %% "akka-persistence-cassandra" % "0.2.1"
This version of the plugin depends on Akka 2.3.2 and is cross-built against Scala 2.10.4 and 2.11.0.
The plugin requires an existing installation of Cassandra 2.0.3 or higher. You may want to follow this Getting Started guide for basic installation instructions.
- All operations required by the Akka Persistence journal plugin API are fully supported.
- The plugin uses Cassandra in a purely log-oriented way, i.e. data are only ever inserted and never updated (deletions are made only on user request or by persistent channels, see also Caveats).
- Writes of messages and confirmations are batched to optimize throughput. See batch writes for details on how to configure batch sizes. The plugin was tested to work properly under high load.
- Messages written by a single processor are partitioned across the cluster to achieve scalability with data volume by adding nodes.
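The partitioning of a single processor's messages can be pictured with a small sketch. This is an illustration only, not the plugin's actual code: it assumes that sequence numbers start at 1 and that the partition is derived from the sequence number and the configured maximum partition size alone. The names PartitionSketch and partitionNr are hypothetical.

```scala
object PartitionSketch {
  // Assumption: the partition is a pure function of the sequence number,
  // with sequence numbers starting at 1. The real plugin may differ, since
  // confirmations and deletion markers also count as partition entries.
  def partitionNr(sequenceNr: Long, maxPartitionSize: Long): Long =
    (sequenceNr - 1L) / maxPartitionSize

  def main(args: Array[String]): Unit = {
    val maxPartitionSize = 5000000L // documented default of max-partition-size
    Seq(1L, 5000000L, 5000001L, 12000000L).foreach { n =>
      println(s"sequenceNr $n -> partition ${partitionNr(n, maxPartitionSize)}")
    }
  }
}
```

Under this assumption, changing the partition size after data has been written would map existing sequence numbers to different partitions, which is one way to see why the setting should not be changed after table creation.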
To activate the plugin, add the following line to your Akka application.conf:
akka.persistence.journal.plugin = "cassandra-journal"
This will run the journal with its default settings. The default settings can be changed with the following configuration keys:
- cassandra-journal.contact-points. A comma-separated list of contact points in a Cassandra cluster. Default value is [127.0.0.1].
- cassandra-journal.keyspace. Name of the keyspace to be used by the plugin. If the keyspace doesn't exist it is automatically created. Default value is akka.
- cassandra-journal.table. Name of the table to be used by the plugin. If the table doesn't exist it is automatically created. Default value is messages.
- cassandra-journal.replication-factor. Replication factor to use when a keyspace is created by the plugin. Default value is 1.
- cassandra-journal.max-partition-size. Maximum number of entries (messages, confirmations and deletion markers) per partition. Default value is 5000000. Do not change this setting after table creation (not checked yet).
- cassandra-journal.max-result-size. Maximum number of entries returned per query. Queries are executed recursively, if needed, to achieve recovery goals. Default value is 50001.
- cassandra-journal.write-consistency. Write consistency level. Default value is QUORUM.
- cassandra-journal.read-consistency. Read consistency level. Default value is QUORUM.
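As a rough illustration of overriding some of these keys, the following sketch builds the settings programmatically with the Typesafe Config library that Akka already depends on. The addresses, keyspace name and replication factor are made-up values, and the sketch assumes that contact-points accepts a HOCON list, as the documented default [127.0.0.1] suggests. The same settings could equally be placed in application.conf.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object JournalConfigSketch {
  // Hypothetical values for illustration; only the key names are taken
  // from the plugin's documented configuration keys.
  val overrides: Config = ConfigFactory.parseString(
    """
      |akka.persistence.journal.plugin = "cassandra-journal"
      |cassandra-journal {
      |  contact-points = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
      |  keyspace = "my_app_journal"
      |  replication-factor = 3
      |}
    """.stripMargin)

  def main(args: Array[String]): Unit = {
    // Read the values back to check that the fragment parses as intended.
    println(overrides.getString("cassandra-journal.keyspace"))
    println(overrides.getInt("cassandra-journal.replication-factor"))
    println(overrides.getStringList("cassandra-journal.contact-points"))
  }
}
```

An application would typically combine such overrides with its regular configuration, for example via overrides.withFallback(ConfigFactory.load()), and pass the result to the actor system when it is created.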
The default read and write consistency levels ensure that processors can read their own writes. During normal operation, processors only write to the journal; reads occur only during recovery.
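The read-your-writes guarantee follows from general Cassandra quorum arithmetic: a quorum is a majority of the replicas, so any read quorum overlaps any write quorum. The sketch below works through the numbers. It is not part of the plugin, and the names QuorumSketch, quorum and readsSeeWrites are hypothetical.

```scala
object QuorumSketch {
  // A quorum is a majority of replicas: floor(N / 2) + 1.
  def quorum(replicationFactor: Int): Int = replicationFactor / 2 + 1

  // A read is guaranteed to overlap the latest acknowledged write
  // when the read and write replica counts together exceed N.
  def readsSeeWrites(replicationFactor: Int, writeReplicas: Int, readReplicas: Int): Boolean =
    readReplicas + writeReplicas > replicationFactor

  def main(args: Array[String]): Unit = {
    for (n <- Seq(1, 3, 5)) {
      val q = quorum(n)
      println(s"replication factor $n: quorum = $q, QUORUM/QUORUM overlaps = ${readsSeeWrites(n, q, q)}")
    }
    // With replication factor 3, reading and writing at a single replica gives no such guarantee.
    println(s"replication factor 3, ONE/ONE overlaps = ${readsSeeWrites(3, 1, 1)}")
  }
}
```

With the default replication factor of 1, a quorum is a single replica, so the guarantee holds trivially. Lowering either consistency level on a larger cluster trades this guarantee for latency.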
- Detailed tests under failure conditions are still missing.
- Persistent channel recovery times are not optimized yet (see also issue 4).
- Recovery time increases with the number of tombstones and drops to a minimum after compaction.
- Persistent channel throughput is independent of the number of tombstones after successful recovery.
- Range deletion performance (i.e. deleteMessages up to a specified sequence number) depends on the extent of previous deletions. Deletion time
  - linearly increases with the number of tombstones generated by previous permanent deletions and drops to a minimum after compaction
  - linearly increases with the number of plugin-level deletion markers generated by previous logical deletions (recommended: always use permanent range deletions)
These issues are likely to be resolved in future versions of the plugin.