
OLAP Overview

JoeWinter edited this page Sep 18, 2014 · 5 revisions

OLAP Database Overview: OLAP Overview


Online Analytical Processing (OLAP) is a decision support technology that allows large amounts of data to be analyzed. In traditional OLAP, data from OLTP databases and other sources typically undergoes an *extract/transform/load* (ETL) process that places it in a *data warehouse* or *data mart* database. This process organizes data into time-oriented *dimension tables* that facilitate subject-based analytical queries. The resulting structure supports a wide range of statistical queries that can compute aggregate results, detect trends, find data anomalies, and perform other analyses.

However, traditional OLAP has numerous drawbacks, including long ETL times, large disk space consumption, and complex, specialized schemas.

Doradus OLAP supports complex analytical queries but employs unique storage and access techniques that overcome drawbacks of traditional OLAP. Some advantages of Doradus OLAP are:

  • Data model: Applications can use the full Doradus data model, including bi-directional relationships via *link* fields. Doradus provides full referential integrity and bi-directional navigation of link fields.

  • Doradus Query Language: DQL is used for object queries, which retrieve specific objects and their values, and for aggregate queries, which perform statistical computations across large object sets. DQL features include full text searching, path expressions, quantifiers, transitive relationship searches, multi-level grouping, and other advanced search features.

  • Query speed: Most single-shard object and aggregate queries complete within a few seconds. Multi-shard query times scale linearly with the amount of data being accessed.

  • Space usage: Doradus stores data in a columnar format that compresses very well. In one test, a ~1 billion object OLAP event database required only 2 GB of disk space.

  • Schema evolution: An application’s schema can be changed at any time, allowing new tables and fields to be added. Automatic data aging is available to expire old data.

  • Load time: Data is loaded in batches, which are then merged into the corresponding shard to become visible to queries. Load rates of 250,000 objects/second or higher per node are typical, depending on object complexity.

  • Lag time: The time required to merge new batches into the live shard is typically between 1 and 30 seconds. This means data becomes visible to queries in near real time, with little lag.
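
To put the space and load figures above in perspective, here is a rough back-of-envelope sketch in Python. It is not part of Doradus: the function names are hypothetical, "2 GB" is assumed to mean 2 × 10^9 bytes, "~1 billion" is taken as exactly 10^9 objects, and the load rate is assumed to hold steady at 250,000 objects/second. The optional `nodes` parameter further assumes that throughput adds up evenly across nodes, which the text above does not state.

```python
# Back-of-envelope figures derived from the numbers quoted in the list above.
# Assumptions (not from Doradus itself): 2 GB = 2 * 10**9 bytes,
# "~1 billion" = 10**9 objects, steady load rate of 250,000 objects/s per node.

OBJECTS = 1_000_000_000
DISK_BYTES = 2 * 10**9
LOAD_RATE_PER_NODE = 250_000  # objects per second


def bytes_per_object(total_bytes: int, object_count: int) -> float:
    """Average on-disk footprint of one object."""
    return total_bytes / object_count


def load_seconds(object_count: int, rate_per_node: int, nodes: int = 1) -> float:
    """Time to load object_count objects.

    Assumes throughput adds up evenly across nodes (an illustrative
    assumption, not a documented Doradus guarantee).
    """
    return object_count / (rate_per_node * nodes)


if __name__ == "__main__":
    print(f"{bytes_per_object(DISK_BYTES, OBJECTS):.1f} bytes per object")
    secs = load_seconds(OBJECTS, LOAD_RATE_PER_NODE)
    print(f"{secs:.0f} s (~{secs / 60:.0f} min) to load on one node")
```

Under these assumptions the quoted test works out to about 2 bytes per object on disk, and loading the same billion objects on a single node would take a little over an hour. Actual figures will vary with object complexity, as noted above.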
