This repository was archived by the owner on Aug 23, 2023. It is now read-only.

Feature: Use scylladb as C* compatible storage backend #888

Closed
beorn- opened this issue Apr 12, 2018 · 4 comments

beorn- commented Apr 12, 2018

I tried using ScyllaDB as a Cassandra-compatible backend, mostly for performance and cost reasons.

The table auto-creation failed with a syntax error related to compression. The fix was to create the keyspace and tables manually with:

                                                                                
CREATE TABLE IF NOT EXISTS metrictank.metric (
    key ascii,
    ts int,
    data blob,
    PRIMARY KEY (key, ts)
) WITH CLUSTERING ORDER BY (ts DESC)
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'};

CREATE TABLE metrictank.metric_512 (
    key ascii,
    ts int,
    data blob,
    PRIMARY KEY (key, ts)
) WITH CLUSTERING ORDER BY (ts DESC)
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '26', 'compaction_window_unit': 'HOURS', 'max_threshold': '32', 'min_threshold': '4', 'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.2'};

CREATE TABLE metrictank.metric_16 (
    key ascii,
    ts int,
    data blob,
    PRIMARY KEY (key, ts)
) WITH CLUSTERING ORDER BY (ts DESC)
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'HOURS', 'max_threshold': '32', 'min_threshold': '4', 'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.2'};

CREATE TABLE metrictank.metric_idx (
    partition int,
    id text,
    interval int,
    lastupdate int,
    metric text,
    mtype text,
    name text,
    orgid int,
    tags set<text>,
    unit text,
    PRIMARY KEY (partition, id)
) WITH CLUSTERING ORDER BY (id ASC)
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'};

And of course I had to disable the two keyspace-creation booleans:

cassandra-create-keyspace = false
create-keyspace = false

One of the errors was:

metrictank[18192]: 2018/04/11 16:54:20 [metrictank.go:271 main()] [E] failed to initialize cassandra. Missing sub-option 'sstable_compression' for the 'compression' option.

Dieterbe commented Apr 12, 2018

It looks like Cassandra recently introduced the 'class' property for the compression option, and 'sstable_compression' is the older one. The latest docs don't even show it anymore (http://cassandra.apache.org/doc/latest/operating/compression.html).
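To make the difference concrete, here is a minimal Go sketch that emits the compression clause in either style. The `Dialect` type, its constants, and `CompressionClause` are hypothetical names made up for this illustration, not anything that exists in metrictank; the only things taken from this thread are the two sub-option names and the compressor class.

```go
package main

import "fmt"

// Dialect is a hypothetical label for the compression option style
// that a CQL server accepts. It is not part of metrictank.
type Dialect int

const (
	// LegacyCompression uses the 'sstable_compression' sub-option,
	// which is what the error message in this issue asks for.
	LegacyCompression Dialect = iota
	// ClassCompression uses the newer 'class' sub-option.
	ClassCompression
)

// CompressionClause returns the "compression = {...}" table option
// for the given dialect and compressor class name.
func CompressionClause(d Dialect, compressor string) string {
	key := "sstable_compression"
	if d == ClassCompression {
		key = "class"
	}
	return fmt.Sprintf("compression = {'%s': '%s'}", key, compressor)
}

func main() {
	const lz4 = "org.apache.cassandra.io.compress.LZ4Compressor"
	fmt.Println(CompressionClause(LegacyCompression, lz4))
	fmt.Println(CompressionClause(ClassCompression, lz4))
}
```

Branching like this in code is exactly the kind of thing that gets awkward once more variants show up.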

Realistically, Scylla will always lag behind Cassandra to some extent, and it may make sense to use different schemas regardless.
This ties in with another observation: people may want to customize their schemas and deviate from what's hardcoded into metrictank.

For that reason, I think it makes sense to move the schemas out of the source code and into config files, one for the store schemas and one for the index schemas. We can then ship a default variant for Cassandra and one for Scylla (and could even use different defaults for different versions of Cassandra/Scylla), while also letting people easily customize the schemas and still have MT create the tables for them, which simplifies deployment.
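A rough Go sketch of what I mean, under stated assumptions: the `{keyspace}` and `{table}` placeholder syntax and the `RenderSchema` and `LoadSchema` helpers are invented here purely for illustration, and the real file format is still open.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// RenderSchema fills the hypothetical {keyspace} and {table} placeholders
// in a CQL template. The placeholder syntax is an assumption of this sketch.
func RenderSchema(tmpl, keyspace, table string) string {
	r := strings.NewReplacer("{keyspace}", keyspace, "{table}", table)
	return r.Replace(tmpl)
}

// LoadSchema reads a CQL template from a file on disk, so that the
// schema lives in configuration rather than in the source code.
func LoadSchema(path string) (string, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return "", fmt.Errorf("reading schema template %q: %w", path, err)
	}
	return string(b), nil
}

func main() {
	// Inline template standing in for the contents of a schema file.
	tmpl := "CREATE TABLE IF NOT EXISTS {keyspace}.{table} " +
		"(key ascii, ts int, data blob, PRIMARY KEY (key, ts))"
	fmt.Println(RenderSchema(tmpl, "metrictank", "metric_512"))
}
```

A Scylla variant would then just be a different template file, with no code change.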

thoughts @woodsaj ?


beorn- commented Apr 12, 2018

👍


woodsaj commented Apr 12, 2018

This sounds good to me. We already handle keyspace creation outside of metrictank so that we can set the replication settings.
Using a "schema template" config file sounds like a more robust solution.

beorn- added a commit to beorn-/metrictank that referenced this issue Apr 16, 2018
@Dieterbe

Here are some criteria that I think the feature should meet (most of these may sound obvious, but I just want to have a checklist):

  • files that can easily be switched by config management (e.g. in /etc/metrictank)
  • easy-to-use defaults (standard file names, included in packages and in Docker) that work out of the box, without the user having to choose a schema or change configs
  • easy-to-use non-defaults (by providing a modified file)
  • easy to ship extra variants (e.g. a Scylla default for the cassandra store)
  • future store implementations should be able to add their own schema files. In particular, different stores may need to set up different things: one type of store may just need to initialize a database and a table, while another may need to initialize a keyspace, multiple tables, and other settings. I think we should have 1 file per type of index/store, and leave it up to the store/idx plugin to read all the data contained therein
  • separate files for idx and store
  • should be easy to diff e.g. Cassandra and Scylla schemas, and non-default schema files (i.e. separate files)
  • ability to provide interesting non-default config files (e.g. for Scylla, or for older/experimental versions of Cassandra). A directory separate from /etc/metrictank would do, because the latter holds the recommended schema that just works for someone installing MT and starting it. It would be convenient to include this directory in the packages and Docker image, e.g. in /usr/share/metrictank/examples, so that our instructions for Scylla or other Cassandra versions are just a copy command.
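
To illustrate the "1 file per type of index/store" and "standard file names" points, here is a small Go sketch of how a plugin could locate its own schema file. The `schema-<kind>-<plugin>.toml` naming and the `SchemaPath` helper are assumptions for the sake of the example, not a settled convention.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// SchemaPath builds the path of the schema file for one plugin.
// kind must be "store" or "idx", so store and index schemas always
// live in separate files. The file naming scheme is hypothetical.
func SchemaPath(dir, kind, plugin string) (string, error) {
	if kind != "store" && kind != "idx" {
		return "", fmt.Errorf("unknown schema kind %q", kind)
	}
	if plugin == "" {
		return "", fmt.Errorf("plugin name must not be empty")
	}
	return filepath.Join(dir, fmt.Sprintf("schema-%s-%s.toml", kind, plugin)), nil
}

func main() {
	// Default location; swapping in a Scylla variant would be a file copy
	// from an examples directory over one of these paths.
	for _, kind := range []string{"store", "idx"} {
		p, err := SchemaPath("/etc/metrictank", kind, "cassandra")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(p)
	}
}
```

Because each plugin only gets a path and parses the file itself, a future store is free to put whatever it needs in there.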

beorn- added a commit to beorn-/metrictank that referenced this issue Apr 16, 2018
beorn- added a commit to beorn-/metrictank that referenced this issue Apr 17, 2018
beorn- added a commit to beorn-/metrictank that referenced this issue Apr 23, 2018
@Dieterbe Dieterbe added this to the 0.9.0 milestone May 2, 2018