This repository has been archived by the owner on Jan 27, 2023. It is now read-only.

Encrypted search for encrypted ActiveRecord models

cipherstash-archive/activestash

ActiveStash

ActiveStash is the Rails-specific gem for using CipherStash.

ActiveStash gives you encrypted search on ActiveRecord models that use application-level encryption (with libraries such as Lockbox or ActiveRecord Encryption).

When records are created or updated, they are indexed into a CipherStash collection which can be queried via an ActiveStash::Relation.

TL;DR - here's a video demo

Searchable Encrypted Rails models with ActiveStash!

What is CipherStash?

Field-level encryption is a powerful tool to protect sensitive data in your Active Record models. However, when a field is encrypted, it can't be queried! Simple lookups are impossible, let alone free-text search or range queries.

This is where CipherStash comes in. CipherStash is an Encrypted Search Index, and ActiveStash lets you perform exact, free-text and range queries on your encrypted ActiveRecord models. Queries use ActiveStash::Relation, which wraps ActiveRecord::Relation, so most of the queries you can do in ActiveRecord can be done using ActiveStash.

How does it work?

ActiveStash uses the "look-aside" pattern to create an external, fully encrypted index for your ActiveRecord models. Every time you create or update a record, the data is indexed to CipherStash via ActiveRecord callbacks. Queries are delegated to CipherStash but return ActiveRecord models so things just work.

If you've used Elasticsearch with gems like Searchkick, this pattern will be familiar to you.

Active Stash Lookaside Pattern
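
As a rough illustration of the look-aside pattern described above, here is a minimal pure-Ruby sketch. FakeCollection and FakeTable are hypothetical in-memory stand-ins written for this example; they are not ActiveStash or CipherStash classes, and the sketch performs no encryption. It only shows the flow: save hooks write to a side index, and queries go to the index first and then load rows by stash_id.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for a CipherStash collection.
class FakeCollection
  def initialize
    @docs = {}
  end

  # Store (or overwrite) the indexed fields for a record.
  def upsert(stash_id, fields)
    @docs[stash_id] = fields
  end

  def delete(stash_id)
    @docs.delete(stash_id)
  end

  # Return the stash_ids of documents that satisfy every constraint.
  def query(constraints)
    @docs.select { |_id, doc| constraints.all? { |k, v| doc[k] == v } }.keys
  end
end

# Hypothetical stand-in for a database table whose rows carry a stash_id.
class FakeTable
  def initialize(collection)
    @rows = {}
    @collection = collection
  end

  # Plays the role of an "after save" callback: store the row, then index it.
  def save(row)
    row[:stash_id] ||= SecureRandom.uuid
    @rows[row[:stash_id]] = row
    @collection.upsert(row[:stash_id], row.slice(:name, :email))
    row
  end

  # Plays the role of an "after destroy" callback: drop row and index entry.
  def destroy(row)
    @rows.delete(row[:stash_id])
    @collection.delete(row[:stash_id])
  end

  # Delegate the search to the index, then load the matching rows.
  def query(constraints)
    @collection.query(constraints).map { |id| @rows.fetch(id) }
  end
end

collection = FakeCollection.new
users = FakeTable.new(collection)
grace = users.save(name: "Grace Hopper", email: "grace@example.com")
users.save(name: "Ada Lovelace", email: "ada@example.com")
found = users.query(email: "grace@example.com")
```

The stash_id is the only link between the two stores, which is why each searchable model needs that column.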

Getting started

  1. Add ActiveStash to your Gemfile:
gem "active_stash"
  1. Install the new dependencies:
$ bundle install
  1. Sign up for a CipherStash account, and then log in with the provided Workspace ID:
$ rake active_stash:signup
$ rake active_stash:login[YOURWORKSPACEID]

Note: If you are using zsh, you may need to escape the brackets:

rake active_stash:login\['WorkspaceId'\]
  1. Any model you use with ActiveStash::Search needs to have a stash_id column, to link search results back to database records.

For example, to add a stash_id column to the database table for the User model, add the below migration:

$ rails g migration AddStashIdToUser stash_id:string:index
$ rails db:migrate
  1. Add the ActiveStash::Search mixin to your user model, and declare what fields are searchable:
# app/models/user.rb
class User < ApplicationRecord
  include ActiveStash::Search

  # Previously added application-level encryption, by either ActiveRecord Encryption or Lockbox
  encrypts :name, :email
  encrypts :dob, type: :date

  # Fields that will be indexed into CipherStash
  stash_index do
    auto :name, :email, :dob
  end

  self.ignored_columns = %w[name email dob]
end
  1. Create CipherStash collections for your models

Each model that uses ActiveStash::Search needs to have a CipherStash collection.

To create collections for all models that use ActiveStash::Search, run:

$ rake active_stash:collections:create
  1. Reindex your existing data into CipherStash with ActiveStash
$ rake active_stash:reindexall
  1. Query a user record:
$ rails c
 >> User.where(email: "grace@example.com")
 => []  # no records, because the database isn't searchable
 >> User.query(email: "grace@example.com")
 => [
   #<User:0x00000001138a42b0
    id: 7,
    name: "Grace Hopper",
    email: "grace@example.com",
    stash_id: "6481b6dd-8e0f-456c-ac27-6c668ca539f2">
 ] # a record, because CipherStash makes your encrypted database searchable

Installation

Add this line to your application's Gemfile:

gem 'active_stash'

And then execute:

$ bundle install

To use, include ActiveStash::Search in a model and define which fields you want to make searchable:

class User < ActiveRecord::Base
  include ActiveStash::Search

  # Searchable fields
  stash_index do
    auto :name, :email, :dob
  end

  # fields encrypted with EncryptedRecord
  encrypts :name
  encrypts :email
  encrypts :dob

  # ...the rest of your code
end

Any model in which you include ActiveStash::Search needs a stash_id column of type string. For example, to add this column to the table underlying your User model:

$ rails g migration AddStashIdToUser stash_id:string:index
$ rails db:migrate

The above command also ensures that an index is created on stash_id.

Configuration

ActiveStash supports all CipherStash configuration described in the docs.

In addition to configuration via JSON files and environment variables, ActiveStash supports Rails config and credentials.

For example, to use a specific profile in development, you could include the following in config/environments/development.rb:

Rails.application.configure do
  config.active_stash.profile_name = "dev-profile"

  # Other config...
end

For secrets, you can add ActiveStash config to your credentials (rails credentials:edit --environment <env>):

active_stash:
  aws_secret_access_key: your_secret

You can also use an initializer (e.g. config/initializers/active_stash.rb):

ActiveStash.configure do |config|
  config.aws_secret_access_key = Rails.application.credentials.aws.secret_access_key
end

Index Types

CipherStash supports three main types of indexes: exact, range (which allows queries like < and >), and match (which supports free-text search). Additionally, ActiveStash supports auto indexes, which are described in the next section.

Auto indexes

auto indexes automatically determine what kinds of indexes to create based on the underlying data type.

The following example adds an auto index to an encrypted email field in a model named User:

class User < ActiveRecord::Base
  include ActiveStash::Search

  encrypts :email

  stash_index do
    auto :email
  end
end

auto will create the following indexes for each data type:

String and Text

:string and :text types automatically create the following indexes. Range indexes on strings typically only work for ordering.

Indexes Created   Allowed Operators     Example
match             =~                    User.query { |q| q.name =~ "foo" }
exact             ==                    User.query(email: "foo@example.com")
range             <, <=, ==, >=, >      User.query.order(:email)

Numeric Types

:timestamp, :date, :datetime, :float, :decimal, and :integer types all have range indexes created.

Indexes Created   Allowed Operators              Example
range             <, <=, ==, >=, >, between      User.query { |q| q.dob > 20.years.ago }
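
The mapping listed above can be written down as a small lookup table. This is only a restatement of this section in Ruby; AUTO_INDEXES and auto_indexes_for are hypothetical names for this sketch and are not part of ActiveStash.

```ruby
# Index kinds that `auto` creates per column type, as listed in this section.
AUTO_INDEXES = {
  string:    %i[match exact range],
  text:      %i[match exact range],
  timestamp: %i[range],
  date:      %i[range],
  datetime:  %i[range],
  float:     %i[range],
  decimal:   %i[range],
  integer:   %i[range]
}.freeze

# Hypothetical helper. Column types not listed in this section are outside
# the scope of the sketch, so they raise a KeyError rather than guess.
def auto_indexes_for(column_type)
  AUTO_INDEXES.fetch(column_type)
end

email_indexes = auto_indexes_for(:string)
dob_indexes = auto_indexes_for(:date)
```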

Using Specific Index Types

ActiveStash provides more specific index methods for when you need finer-grained control over what types of indexes are created for a field.

These are:

  • exact
  • range
  • match
  • match_all

The following example uses each of these index types:

stash_index do
  exact :email
  range :dob
  match :name

  match_all :email, :name
end

For more information on index types and their options, see the CipherStash docs.

Match All Indexes

ActiveStash can create an index across multiple string fields so that you can perform free-text queries across all specified fields at once.

To do so, you can use the match_all DSL method and specify the fields that you want to have indexed:

stash_index do
  match_all :first_name, :last_name, :email
end

Match all indexes are queryable by passing the query term directly to the query method. So to search for the term "ruby" across :first_name, :last_name and :email you would do:

User.query("ruby")

Match Index Filter Options

You can adjust the parameters of the filters used for match indexes by passing the filter_term_bits and/or filter_size options to match or match_all:

stash_index do
  match :my_string, filter_size: 512, filter_term_bits: 6
  match_all :first_name, :last_name, :email, filter_size: 1024, filter_term_bits: 5
end

For more information on filter parameters, see the CipherStash docs.
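
This README does not describe how the filters work internally. As a rough intuition only, and assuming they behave like Bloom filters in which filter_size is the number of bit positions and filter_term_bits is the number of positions set per term (an assumption, not something stated here), the trade-off can be sketched like this. FilterSketch is a hypothetical class and is not CipherStash's implementation.

```ruby
require "digest"
require "set"

# Assumption: a Bloom-filter-like structure. Not CipherStash's actual code.
class FilterSketch
  def initialize(filter_size:, filter_term_bits:)
    @size = filter_size
    @term_bits = filter_term_bits
    @bits = Set.new # positions that are currently set
  end

  # Set filter_term_bits positions for the term.
  def add(term)
    positions(term).each { |pos| @bits << pos }
  end

  # True when every position for the term is set. False positives are
  # possible; false negatives are not.
  def maybe_include?(term)
    positions(term).all? { |pos| @bits.include?(pos) }
  end

  def set_bit_count
    @bits.size
  end

  private

  # Derive the positions by hashing the term together with a counter.
  def positions(term)
    (0...@term_bits).map do |i|
      Digest::SHA256.hexdigest("#{i}:#{term}").to_i(16) % @size
    end
  end
end

filter = FilterSketch.new(filter_size: 512, filter_term_bits: 6)
filter.add("dan")
hit = filter.maybe_include?("dan")
```

Under this model, a larger filter_size leaves more positions unset and so lowers the chance of a false match, while filter_term_bits controls how many positions each term occupies.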

Unique indexes

ActiveStash supports adding server-side unique constraints on fields.

Unique fields can be specified by using the unique DSL method in addition to the index type.

In the below example a unique constraint is added to the email field.

stash_index do
  exact :email
  unique :email
end

ActiveStash does not support adding unique constraints on match indexes.

The below example will result in a ConfigError being raised.

stash_index do
  match :email
  # Raises an error because `match` indexes don't support unique constraints
  unique :email
end
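
The rule above can be illustrated with a toy DSL. IndexSketch and this ConfigError class are hypothetical and exist only for this example; the sketch also assumes that a unique field needs at least one index that is not a match index, which is one reading of the rule rather than a documented guarantee.

```ruby
# Hypothetical error class for this sketch (not ActiveStash's own).
class ConfigError < StandardError; end

class IndexSketch
  attr_reader :indexes, :unique_fields

  def initialize
    @indexes = Hash.new { |hash, field| hash[field] = [] }
    @unique_fields = []
  end

  # Define exact/range/match so each call records the index kind for a field.
  %i[exact range match].each do |kind|
    define_method(kind) { |field| @indexes[field] << kind }
  end

  def unique(field)
    @unique_fields << field
  end

  # Assumption: unique needs an index on the field other than `match`.
  def validate!
    @unique_fields.each do |field|
      kinds = @indexes.fetch(field, [])
      if (kinds - [:match]).empty?
        raise ConfigError, "unique :#{field} needs a non-match index"
      end
    end
  end

  # Evaluate the block as a DSL, then check the constraints.
  def self.build(&block)
    sketch = new
    sketch.instance_eval(&block)
    sketch.validate!
    sketch
  end
end

valid = IndexSketch.build do
  exact :email
  unique :email
end
```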

Create a CipherStash Collection

Before you can index your models, you need a CipherStash collection. ActiveStash will create indexes as defined in your models.

All you need to do is create the collection by running:

$ rake active_stash:collections:create

This command will create collections for all the models you have set up to use ActiveStash.

(Re)indexing

To index your encrypted data into CipherStash, use the reindex task:

$ rake active_stash:reindexall

If you want to just reindex one model, for example User, run:

$ rake active_stash:reindex[User]

You can also reindex in code:

User.reindex

Depending on how much data you have, reindexing may take a while, but you only need to do it once. ActiveStash will automatically index (and delete) data as records are created, updated and deleted.

Uniqueness Validations

Standard ActiveRecord uniqueness validations won't work when data is encrypted. While it can work in some cases when using deterministic encryption, we generally recommend against this approach (deterministic encryption is vulnerable to inference and chosen-plaintext attacks).

Instead, you can include ActiveStash::Validations and your uniqueness validation will now work via a query to the CipherStash index on the validated field. If there is no index on the field then validations will fail.

Note that uniqueness validations in ActiveStash are always case sensitive.

Also note that ActiveStash::Validations does not currently support the :scope or :conditions options. All other options are supported.

class Person < ApplicationRecord
  include ActiveStash::Search
  include ActiveStash::Validations

  validates :email, uniqueness: true
end
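
To make the idea concrete, here is a minimal sketch of a uniqueness check that queries an index instead of comparing ciphertext in the database. ExactIndexSketch and unique_in_index? are hypothetical names for this example, and the index is a plain in-memory hash rather than an encrypted one.

```ruby
# Hypothetical exact index: maps a value to the stash_ids that hold it.
class ExactIndexSketch
  def initialize
    @entries = Hash.new { |hash, value| hash[value] = [] }
  end

  def add(value, stash_id)
    @entries[value] << stash_id
  end

  def lookup(value)
    @entries.fetch(value, [])
  end
end

# A value is unique when no *other* record holds it. The comparison is
# exact, so values that differ only by case count as different.
def unique_in_index?(index, value, own_stash_id: nil)
  (index.lookup(value) - [own_stash_id]).empty?
end

index = ExactIndexSketch.new
index.add("grace@example.com", "id-1")

taken      = unique_in_index?(index, "grace@example.com")
other_case = unique_in_index?(index, "Grace@example.com")
own_record = unique_in_index?(index, "grace@example.com", own_stash_id: "id-1")
```

The own_stash_id argument models updating an existing record, which should not conflict with itself.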

Current limitations

Presently, ActiveStash provides no means to update the schema of a CipherStash collection. Therefore, if you need to make any changes to the Collection schema itself (by changing the contents of a stash_index block) you must drop your collection and recreate it.

If your indexed model is called User for example, you should run the following commands:

$ rake active_stash:drop[User]
$ rake active_stash:collections:create
$ rake active_stash:reindex[User]

Support for zero-downtime Collection schema changes and reindexing is being actively worked on and will be available soon.

When to Reindex Your Collection

These are the rules for when you must re-index your collection:

  1. You have imported, deleted or updated data in the table that backs your ActiveStash model via some external mechanism, OR
  2. You have added or removed a string/text column from the table that backs your ActiveStash model and you are using a dynamic_match index in your model

When to Drop, Recreate and Reindex Your Collection

This is the rule to determine when you must drop, recreate and reindex your collection:

  1. Whenever you add or modify one or more ActiveStash index definitions in your model

See Current Limitations for instructions on what commands to run to accomplish this.

NOTE: technically, you do not need to reindex your collection if you remove an index definition from your model. Removing the definition does not remove the index stored in CipherStash: the index can no longer be used in queries, but it still incurs CPU and network costs to keep it up to date.

Running Queries

To perform queries over your encrypted records, you can use the query method. For example, to find a user by email address:

User.query(email: "person@example.com")

This will return an ActiveStash::Relation which extends ActiveRecord::Relation so you can chain most methods as you normally would!

To constrain by multiple fields, include them in the hash:

User.query(email: "person@example.com", verified: true)

To order by dob, do:

User.query(email: "person@example.com").order(:dob)

Or to use limit and offset:

User.query(verified: true).limit(10).offset(20)

This means that ActiveStash should work with pagination libraries like Kaminari.
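
For reference, this is the arithmetic a pagination library performs to turn a page number into limit and offset values. page_window is a hypothetical helper written for this sketch; it is not part of ActiveStash or Kaminari.

```ruby
# Hypothetical helper: translate a 1-based page number into limit/offset.
def page_window(page:, per_page:)
  raise ArgumentError, "page must be >= 1" if page < 1

  { limit: per_page, offset: (page - 1) * per_page }
end

window = page_window(page: 3, per_page: 10)

# In a Rails app, the result could feed the chain shown above, e.g.:
#   User.query(verified: true).limit(window[:limit]).offset(window[:offset])
```

Page 3 with 10 records per page gives the same limit(10).offset(20) call as the example above.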

You also don't have to provide any constraints at all, and can just use the encrypted indexes for ordering! To order all records by dob descending and then by created_at, do (note the call to query with no args first):

User.query.order(dob: :desc, created_at: :asc)

Retrieving just Stash IDs

If you need to "join" the results of an ActiveStash query against another model, it can be wasteful to load all the model objects just to have to throw them away again. In this case, you can use the ActiveStash::Relation#stash_ids method to execute the query and only return the record IDs, like this example to find young employees:

youngster_ids = User.query { |q| q.dob > "2000-01-01".to_date }.stash_ids
young_employees = Employee.joins(:user).where("users.stash_id IN (?)", youngster_ids)

Advanced Queries

More advanced queries can be performed by passing a block to query. For example, to find all users born after January 1, 1998:

User.query { |q| q.dob > "1998-01-01".to_date }

Or, to perform a free-text search on name:

User.query { |q| q.name =~ "Dan" }

To combine multiple constraints, make multiple calls in the block:

User.query do |q|
  q.dob > "1998-01-01".to_date
  q.name =~ "Dan"
end
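
If you are curious how a block like this can turn operator calls into query constraints, the following pure-Ruby sketch shows one common technique: the block argument hands out field objects whose operators record a constraint instead of comparing values. QuerySketch, FieldSketch, Constraint and build_query are hypothetical names; ActiveStash's real query builder may work differently.

```ruby
require "date"

# Hypothetical record of one constraint, e.g. (:dob, :>, some_date).
Constraint = Struct.new(:field, :op, :value)

class FieldSketch
  def initialize(name, sink)
    @name = name
    @sink = sink
  end

  # Each operator appends a constraint to the shared list.
  %i[< <= == >= > =~].each do |op|
    define_method(op) { |value| @sink << Constraint.new(@name, op, value) }
  end
end

class QuerySketch
  attr_reader :constraints

  def initialize
    @constraints = []
  end

  # q.dob, q.name, ... return a field object for any zero-argument call,
  # except conversion methods such as to_ary or to_s.
  def method_missing(name, *args)
    return super if !args.empty? || name.to_s.start_with?("to_")

    FieldSketch.new(name, @constraints)
  end

  def respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?("to_") || super
  end
end

# Run the block against a fresh QuerySketch and return what it collected.
def build_query
  q = QuerySketch.new
  yield q
  q.constraints
end

constraints = build_query do |q|
  q.dob > Date.new(1998, 1, 1)
  q.name =~ "Dan"
end
```

Each line in the block adds one entry, which is why multiple calls combine into multiple constraints.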

Overriding the Collection Name

To set a different collection name, you can set one in your model:

class User < ActiveRecord::Base
  include ActiveStash::Search
  self.collection_name = "mycollection"
end

Setting a Default Scope

If you plan to use encrypted queries for all the data in your model, you can set a default scope:

class User < ActiveRecord::Base
  include ActiveStash::Search

  def self.default_scope
    ActiveStash::Relation.new(self)
  end
end

Now, all queries will use the CipherStash collection, even if you don't call query. For example, this will use encrypted indexes to order:

User.order(:dob)

# Without a default scope you'd need to call
User.query.order(:dob)

Managing Access Keys

Access keys are secret credentials that allow your application to authenticate to CipherStash when it is running in a non-interactive environment (such as production or CI). ActiveStash provides rake tasks to manage the access keys for your workspace.

To create a new access key:

$ rake active_stash:access_key:create[keyname]

To list all the access keys currently associated with your workspace:

$ rake active_stash:access_key:list

Finally, to delete an access key:

$ rake active_stash:access_key:delete[keyname]

Every access key must have a unique name, so you know what it is used for (and so you don't accidentally delete the wrong one). You can have as many access keys as you like.

Collection Management

Drop a Collection

You can drop a collection directly in Ruby:

User.collection.drop!

Or via the included Rake task. This command takes the name of the model that is attached to the collection.

$ rake active_stash:collections:drop[User]

List Stash Enabled Models

A rake task is provided to list all of the models in your application that have been configured to use CipherStash.

$ rake active_stash:collections:list

Create a Collection

You can also create a collection for a specific model in Ruby:

User.collection.create!

Or via a Rake task:

$ rake active_stash:collections:create

Assess

ActiveStash Assess is a tool to identify where sensitive data lives in your database, and track your progress on encrypting it.

ActiveStash Assess comes in two parts:

Rake task

ActiveStash includes a Rake task for assessing sensitive data used in a Rails application.

This command will print results to stdout in a human-readable format and write a results file to active_stash_assessment.yml in the Rails project root. We recommend you commit this file to your repo, so you can track your progress on encrypting these fields over time.

To run an assessment and generate a report, run:

$ rake active_stash:assess

This will print results to stdout in a human-readable format:

User:
- User.name is suspected to contain: names (AS0001)
- User.email is suspected to contain: emails (AS0001)
- User.gender is suspected to contain: genders (AS0001)
- User.ccn is suspected to contain: credit card numbers (AS0003)
- User.dob is suspected to contain: dates of birth (AS0001)

Online documentation:
- https://docs.cipherstash.com/assess/checks#AS0001
- https://docs.cipherstash.com/assess/checks#AS0003

Assessment written to: /Users/you/your-app/active_stash_assessment.yml

You can follow those links to learn more about why this data is considered sensitive, why adversaries want it, and what regulations and compliance frameworks cover this data.
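
If you want to post-process the human-readable output, for example to group fields by check ID, a small parser is enough. The sketch below relies only on the line format shown in the sample output above; the REPORT text is copied from that sample, and the structure of active_stash_assessment.yml is not shown in this README, so the sketch does not read that file.

```ruby
# Sample lines copied from the example output above.
REPORT = <<~TEXT
  User:
  - User.name is suspected to contain: names (AS0001)
  - User.email is suspected to contain: emails (AS0001)
  - User.gender is suspected to contain: genders (AS0001)
  - User.ccn is suspected to contain: credit card numbers (AS0003)
  - User.dob is suspected to contain: dates of birth (AS0001)
TEXT

# Matches "- Model.field is suspected to contain: kind (CHECK)".
FINDING = /\A- (?<model>\w+)\.(?<field>\w+) is suspected to contain: (?<kind>.+) \((?<check>AS\d+)\)\z/

# Keep only the lines that describe a finding.
findings = REPORT.each_line(chomp: true).filter_map do |line|
  m = FINDING.match(line)
  m && { model: m[:model], field: m[:field], kind: m[:kind], check: m[:check] }
end

# Group field names by check ID.
by_check = findings
  .group_by { |finding| finding[:check] }
  .transform_values { |group| group.map { |finding| finding[:field] } }
```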

The active_stash:assess Rake task also writes a results file to active_stash_assessment.yml in your Rails project root.

RSpec matcher

After a report has been generated, you can use the encrypt_sensitive_fields RSpec matcher to verify that a model encrypts fields (using either Lockbox or ActiveRecord Encryption) that were reported as sensitive by rake active_stash:assess.

First, make sure that the matcher is required in spec/rails_helper.rb:

# near the top of rails_helper.rb
require 'active_stash/matchers'

Next, add the matcher to the spec for the model that you'd like to test. The following example verifies that all fields reported as sensitive for the User model are encrypted:

describe User do
  it "encrypts sensitive fields", pending: "unenforced" do
    expect(described_class).to encrypt_sensitive_fields
  end
end

This helps you keep track of what fields you need to encrypt, as you incrementally roll out Application Level Encryption on your app.

As the example above shows, we recommend you start out by marking the test as pending. This will stop the test from failing while you incrementally encrypt database fields. Once you have encrypted all the fields identified by ActiveStash Assess, remove the pending so your tests will fail if the database field becomes unencrypted.

Development

After checking out the repo, run bin/setup to install dependencies.

The test suite depends on a running PostgreSQL instance being available on localhost. You'll need to export a PGUSER env var before running the test suite.

Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine:

  1. Build the gem by running rake gem.

  2. This will create a gem file (active_stash-x.x.x.gem) in the ./pkg folder.

  3. Install the gem by running gem install with the gem file name, e.g. gem install ./pkg/active_stash-x.x.x.gem

Making a Release

If you have push access to the GitHub repository, you can make a release by doing the following:

  1. Run git version-bump -n <major|minor|patch> (see the semver spec for what each of major, minor, and patch version bumps represent).

  2. Write a changelog for the release, in Git commit style (headline on the first line, blank line, then Markdown text as you see fit). Save/exit your editor. This will automatically push the newly-created annotated tag, which will in turn kick off a release build of the gem and push it to RubyGems.org.

  3. Run rake release to automagically create a new GitHub release for the project.

... and that's it!

Need help?

Head over to our support forum, and we'll get back to you super quick!

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/cipherstash/activestash. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.

License

ActiveStash is available under the CipherStash Client Library Licence Agreement.

Code of Conduct

Everyone interacting in the ActiveStash project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.