Architecture (Spider)
Doradus Spider Databases: Architecture
Doradus is a Java server application that leverages and extends the Cassandra NoSQL database. At a high level, it is a REST service that sits between applications and a Cassandra cluster, adding powerful features to—and hiding complexities in—the underlying database. This allows applications to leverage the benefits of NoSQL such as horizontal scalability, replication, and failover while enjoying rich features such as full text searching, bi-directional relationships, and powerful analytic queries.
An overview of Doradus architecture is depicted below:
Key components of this architecture are summarized below:
- Apps: One or more applications access a Doradus server instance using a simple REST API. A JMX API is available to monitor Doradus and perform administrative functions.
- DoradusServer: This core component controls server startup, shutdown, and services. Entry points are provided to run the server as a stand-alone application, as a Windows service (via procrun), or embedded within another application.
- Services: The Doradus architecture encapsulates functions within service modules. Services are initialized based on the server’s doradus.yaml configuration. Services provide functions such as the REST API (an embedded Jetty server), schema processing, and physical DB access. A special class of services, called storage services, provides storage and access features for specific application types. Doradus currently provides two storage services, and both can be configured in a single instance:
  - OLAP Service: A Doradus database configured to use the OLAP storage service is termed a Doradus OLAP Database. OLAP uses online analytical processing techniques to provide dense storage and very fast processing of analytical queries. This service is ideal for applications that use immutable or semi-mutable time-series data.
  - Spider Service: A Doradus database configured to use the Spider storage service is termed a Doradus Spider Database. The Spider service supports schemaless applications, fully inverted indexing, fine-grained updates, table-level sharding, and other features suited to applications that use highly mutable and/or variable data.
- Cassandra Cluster: Doradus currently uses the Apache Cassandra NoSQL database for persistence. Future releases are intended to support other data stores. Cassandra performs the "heavy lifting" in terms of persistence, replication, load balancing, and more.
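The service-module idea described above can be pictured with a small conceptual sketch. This is not Doradus code: the class names, the `services` configuration key, and the `start_server` function are all hypothetical, and the real server is written in Java and reads doradus.yaml. The sketch only illustrates a server that starts exactly the services named in its configuration.

```python
class Service:
    """Hypothetical base class for a pluggable service module."""
    name = "service"

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class RestService(Service):
    name = "rest"      # stands in for the embedded REST listener


class SchemaService(Service):
    name = "schema"    # stands in for schema processing


class OlapService(Service):
    name = "olap"      # stands in for the OLAP storage service


class SpiderService(Service):
    name = "spider"    # stands in for the Spider storage service


# Registry of every service the (hypothetical) server knows how to start.
REGISTRY = {
    cls.name: cls
    for cls in (RestService, SchemaService, OlapService, SpiderService)
}


def start_server(config):
    """Start the services listed under the assumed 'services' key, in order."""
    started = []
    for name in config["services"]:
        if name not in REGISTRY:
            raise ValueError(f"unknown service: {name}")
        service = REGISTRY[name]()
        service.start()
        started.append(service)
    return started
```

Listing both `"olap"` and `"spider"` in the configuration would correspond to a single instance that runs both storage services.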
By default, Doradus operates in single-tenant mode, which means that all applications are stored in a single Cassandra keyspace. In multi-tenant mode, each named tenant owns one or more applications, which are stored in a separate keyspace for that tenant. Multi-tenant mode allows multiple applications to share a common Doradus cluster while providing data isolation and security. Full details on configuring and operating multi-tenant mode are described in the Doradus Administration documentation.
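The difference between the two modes can be illustrated with a minimal sketch of how an application might be mapped to a keyspace. The function name, the default keyspace name `Doradus`, and the convention that a tenant's keyspace carries the tenant's name are assumptions made for illustration, not confirmed details of the Doradus implementation.

```python
DEFAULT_KEYSPACE = "Doradus"  # assumed name of the single-tenant keyspace


def keyspace_for(app_name, tenants=None):
    """Return the keyspace that would hold the given application.

    tenants is None in single-tenant mode. In multi-tenant mode it maps
    each tenant name to the set of application names that tenant owns.
    """
    if tenants is None:
        # Single-tenant mode: every application shares one keyspace.
        return DEFAULT_KEYSPACE
    for tenant_name, apps in tenants.items():
        if app_name in apps:
            # Multi-tenant mode: assume one keyspace per tenant,
            # named after the tenant.
            return tenant_name
    raise KeyError(f"application {app_name!r} is not owned by any tenant")
```

Because each tenant's applications resolve to a different keyspace, data belonging to one tenant is never stored alongside another tenant's data.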
The minimal deployment configuration is a single Doradus instance and a single Cassandra instance running on the same machine. On Windows, these instances can be installed as services. The Doradus server can also be embedded in the same JVM as an application.
Multiple Doradus and Cassandra instances can be deployed to scale a cluster horizontally. An example of a Doradus/Cassandra multi-node cluster is shown below:
This example demonstrates several deployment features:
- One Doradus instance and one Cassandra instance are typically deployed on each node.
- Doradus instances are peers, so an application can submit requests to any Doradus instance in the cluster.
- Each Doradus instance is typically configured to use all network-near Cassandra instances. This allows it to distribute requests across local Cassandra instances and provides automatic failover should a Cassandra instance fail.
- Cassandra can be configured to know which nodes are in the same rack and which racks are in the same data center. With this knowledge, Cassandra uses replication strategies to balance network bandwidth and recoverability from node-, rack-, and data center-level failures.
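Because Doradus instances are peers, a client application is free to spread its requests over them and to retry on another instance when one is unreachable. The following is a minimal client-side sketch of that idea, not part of Doradus itself: the `PeerClient` class is hypothetical, and the `send` callable is injected by the caller so that any HTTP library can be plugged in.

```python
class PeerClient:
    """Hypothetical round-robin client with failover across peer instances."""

    def __init__(self, hosts, send):
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = list(hosts)
        self._send = send   # callable(host, path) -> response
        self._next = 0      # index of the host to try first

    def request(self, path):
        """Send the request to the next peer, skipping unreachable ones."""
        last_error = None
        for offset in range(len(self._hosts)):
            index = (self._next + offset) % len(self._hosts)
            host = self._hosts[index]
            try:
                response = self._send(host, path)
            except ConnectionError as exc:
                last_error = exc
                continue
            # Rotate so the following request starts at the next peer.
            self._next = (index + 1) % len(self._hosts)
            return response
        raise ConnectionError("no Doradus instance reachable") from last_error
```

The same pattern applies one layer down, where each Doradus instance distributes its requests across nearby Cassandra instances and fails over when one of them is unavailable.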
Details on installing and configuring Doradus/Cassandra clusters are provided in the Doradus Administration document.