Schema

Schemas collected from existing TSDBs and blog posts.

Basic time series with Cassandra

http://www.rubyscale.com/post/143067470585/basic-time-series-with-cassandra

The key is the metric name, e.g. cpu.

create table simple.metrics (metric_key text, time timestamp, value double, PRIMARY KEY (metric_key, time));

The key is the metric name plus the date, e.g. cpu:2016-11-12, and offset is the position within that day.

create table simple.metrics2 (metric_key text, offset int, value double, PRIMARY KEY (metric_key, offset));
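A minimal sketch of how a client might derive the bucketed key and the offset for metrics2. The helper name `bucket_key_and_offset` is hypothetical, and the choice of UTC days with an offset in seconds is an assumption; the notes above do not fix the unit.

```python
from datetime import datetime, timezone
from typing import Tuple


def bucket_key_and_offset(metric: str, ts: datetime) -> Tuple[str, int]:
    """Split a timestamp into a per-day key and an offset in seconds (assumed unit)."""
    ts = ts.astimezone(timezone.utc)
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    key = f"{metric}:{day_start:%Y-%m-%d}"
    offset = int((ts - day_start).total_seconds())
    return key, offset


key, offset = bucket_key_and_offset(
    "cpu", datetime(2016, 11, 12, 1, 2, 3, tzinfo=timezone.utc)
)
print(key, offset)  # cpu:2016-11-12 3723
```

With one row per day and second resolution, a row holds at most 86400 columns, which keeps rows small at the cost of more rows per query.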

TODO:

Tags are not considered.
Range queries are not considered: the client has to keep track of the stored time range and needs to know the oldest bucket.
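The range problem can be sketched as follows, assuming daily buckets keyed as metric:YYYY-MM-DD. The function `bucket_keys` and its `oldest` parameter are hypothetical; `oldest` stands for the oldest stored day, which the client would have to track somewhere.

```python
from datetime import date, timedelta


def bucket_keys(metric, start, end, oldest):
    """Return the daily keys covering [start, end], clamped to the oldest stored day."""
    start = max(start, oldest)
    if end < start:
        return []  # the whole range is older than anything stored
    n_days = (end - start).days + 1
    return [f"{metric}:{(start + timedelta(days=i)).isoformat()}" for i in range(n_days)]


keys = bucket_keys("cpu", date(2016, 11, 10), date(2016, 11, 13), oldest=date(2016, 11, 12))
print(keys)  # ['cpu:2016-11-12', 'cpu:2016-11-13']
```

Each returned key is one partition to read, so a long range fans out into one query per day.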

Influx Comparison

https://github.com/influxdata/influxdb-comparisons/blob/master/bulk_load_cassandra/main.go

CREATE KEYSPACE measurements
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

CREATE TABLE measurements.series (
  series_id text,
  timestamp_ns bigint,
  value double,
  PRIMARY KEY (series_id, timestamp_ns)
) WITH COMPACT STORAGE;

TODO:

Tags are not considered.
One series needs more than one row because of the column limit (2 billion columns per row; KairosDB therefore uses rows that span 3 weeks in milliseconds).
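The arithmetic behind the 3-week row width can be checked with a short sketch. The constant and function names are hypothetical, and the split into a row base plus an offset is an assumption about how such a limit is usually respected, not code taken from KairosDB.

```python
# Column limit quoted in the note above (about 2 billion columns per row).
MAX_COLUMNS_PER_ROW = 2_000_000_000
# Row width quoted from KairosDB in the note above: 3 weeks in milliseconds.
ROW_WIDTH_MS = 3 * 7 * 24 * 60 * 60 * 1000


def split_timestamp(timestamp_ms):
    """Return (row_base, offset) such that row_base + offset == timestamp_ms."""
    offset = timestamp_ms % ROW_WIDTH_MS
    return timestamp_ms - offset, offset


print(ROW_WIDTH_MS)                        # 1814400000
print(ROW_WIDTH_MS < MAX_COLUMNS_PER_ROW)  # True
```

Even with one point per millisecond, a 3-week row stays below the limit, while a 4-week row (2419200000 ms) would not.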

KairosDB

NOTE: the tables are created through Thrift, not CQL, so the column name shows up as column1 when queried.

private void createSchema(int replicationFactor)
{
    List<ColumnFamilyDefinition> cfDef = new ArrayList<ColumnFamilyDefinition>();

    cfDef.add(HFactory.createColumnFamilyDefinition(
            m_keyspaceName, CF_DATA_POINTS, ComparatorType.BYTESTYPE));

    cfDef.add(HFactory.createColumnFamilyDefinition(
            m_keyspaceName, CF_ROW_KEY_INDEX, ComparatorType.BYTESTYPE));

    cfDef.add(HFactory.createColumnFamilyDefinition(
            m_keyspaceName, CF_STRING_INDEX, ComparatorType.UTF8TYPE));

    KeyspaceDefinition newKeyspace = HFactory.createKeyspaceDefinition(
            m_keyspaceName, ThriftKsDef.DEF_STRATEGY_CLASS,
            replicationFactor, cfDef);

    m_cluster.addKeyspace(newKeyspace, true);
}

KairosDB seems to do the same as Heroic: the row key in the data_points table is generated by the client.

TODO: why not use PRIMARY KEY ((key, timestamp, tags), timestamp_offset), so that the row_key_index and string_index tables are no longer needed?
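One possible answer to this TODO, offered as an assumption rather than something confirmed by the KairosDB code quoted here: a Cassandra partition can only be looked up by its complete partition key, so a query that names only some of the tags still needs a way to enumerate the full keys. The sketch below uses an in-memory dict as a stand-in for such an index; every name in it is hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for a row-key index: metric name -> set of full row keys.
row_key_index = defaultdict(set)


def make_row_key(metric, row_base_ms, tags):
    """Build a full row key; tags are sorted so one series always maps to one key."""
    return (metric, row_base_ms, tuple(sorted(tags.items())))


def register(metric, row_base_ms, tags):
    """Record a row key in the index, as a writer would do on first write."""
    key = make_row_key(metric, row_base_ms, tags)
    row_key_index[metric].add(key)
    return key


def find_row_keys(metric, tag_filter):
    """Return row keys of `metric` whose tags contain every pair in tag_filter."""
    wanted = set(tag_filter.items())
    return sorted(k for k in row_key_index[metric] if wanted <= set(k[2]))


register("cpu", 0, {"host": "a", "dc": "x"})
register("cpu", 0, {"host": "b", "dc": "x"})
print(find_row_keys("cpu", {"host": "a"}))
```

Without the index, the query for host=a could not be turned into a partition key, because the value of dc is unknown to the reader.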

Spotify Heroic

Heroic

CREATE KEYSPACE IF NOT EXISTS {{keyspace}}
  WITH REPLICATION = {
    'class' : 'SimpleStrategy',
    'replication_factor' : 3
  };

CREATE TABLE IF NOT EXISTS {{keyspace}}.metrics (
  metric_key blob,
  data_timestamp_offset int,
  data_value double,
  PRIMARY KEY(metric_key, data_timestamp_offset)
) WITH COMPACT STORAGE;

TODO: from metric/datastax/MetricsRowKey it seems Heroic generates metric_key from the key plus the tags (a Series Java object serialized and stored as a blob), so the key length grows with the number of tags.

Heroic stores metadata in Elasticsearch, using the following mapping:

{
  "metadata": {
    "properties": {
      "key": {
        "index": "not_analyzed",
        "type": "string",
        "doc_values": true,
        "include_in_all": false
      },
      "tags": {
        "type": "string",
        "index": "not_analyzed",
        "doc_values": true,
        "include_in_all": false
      },
      "tag_keys": {
        "type": "string",
        "index": "not_analyzed",
        "doc_values": true,
        "include_in_all": false
      }
    }
  }
}
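Returning to the metric_key TODO above, the effect of tags on key length can be illustrated with a small sketch. It uses JSON as a stand-in serializer, which is an assumption made for illustration and not Heroic's actual encoding; the function name `metric_key` is hypothetical.

```python
import json


def metric_key(key, tags):
    """Serialize key + tags into bytes; tags are sorted so ordering does not matter."""
    payload = {"key": key, "tags": sorted(tags.items())}
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


one_tag = metric_key("cpu", {"host": "a"})
two_tags = metric_key("cpu", {"host": "a", "dc": "x"})
print(len(one_tag), len(two_tags))
```

Every additional tag adds its name and value to the blob, so series with many tags pay that cost in every partition key.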
