Skip to content

Low-tech search for ActiveRecord with MySQL or PostgreSQL

License

Notifications You must be signed in to change notification settings

makandra/minidusen

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

79 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

makandra Minidusen

Tests

Low-tech search solution for ActiveRecord with MySQL or PostgreSQL

Minidusen lets you filter ActiveRecord models with a single query string. It works with your existing MySQL or PostgreSQL schema by mostly relying on simple LIKE queries. No additional indexes, tables or indexing databases are required.

This makes Minidusen a quick way to implement find-as-you-type filters for index views:

A list of records filtered by a query

Supported queries

Minidusen accepts a single, Google-like query string and converts it into WHERE conditions for an ActiveRecord scope.

The following type of queries are supported:

  • foo (case-insensitive search for foo in developer-defined columns)
  • foo bar (rows must include both foo and bar)
  • "foo bar" (rows must include the phrase "foo bar")
  • -bar (rows must not include the word bar)
  • filetype:pdf (developer-defined filter for file type)
  • some words 'a phrase' filetype:pdf -excluded -'excluded phrase' -filetype:pdf (combination of the above)

Limitations

Since Minidusen doesn't use an index, it scales linearly with the amount of of text that needs to be searched. Yet LIKE queries are pretty fast and we have found this low-tech approach to scale well for many thousand records.

It's probably not a good idea to use Minidusen for hundreds of thousands of records, or for very long text columns. For this we recommend to use PostgreSQL with pg_search or full-text databases like Solr.

Another limitation of Minidusen is that it only filters, but does not rank. A record either matches or not. Minidusen won't tell you if one record matches better than another record.

Installation

In your Gemfile say:

gem 'minidusen'

Now run bundle install and restart your server.

Basic Usage

Our example will be a simple address book:

class Contact < ApplicationRecord
  validates_presence_of :name, :street, :city, :email
end

We create a new class ContactFilter that will describe the searchable columns:

class ContactFilter
  include Minidusen::Filter

  filter :text do |scope, phrases|
    columns = [:name, :email]
    scope.where_like(columns => phrases)
  end
end

We can now use ContactFilter to filter a scope of Contact records:

# We start by building a scope of all contacts.
# No SQL query is made.
all_contacts = Contact.all
# => ActiveRecord::Relation

# Now we filter the scope to only contain contacts with "gmail" in either :name or :email column.
# Again, no SQL query is made.
gmail_contacts = ContactFilter.new.filter(all_contacts, 'gmail')
# => ActiveRecord::Relation

# Inspect the filtered scope.
gmail_contacts.to_sql
# => "SELECT * FROM contacts WHERE name LIKE '%gmail%' OR email LIKE '%gmail%'"

# Finally we load the scope to produce an array of Contact records.
gmail_contacts.to_a
# => Array

Filtering scopes with existing conditions

Note that you can also pass a scope with existing conditions to ContactFilter#filter. The returned scope will contain both the existing conditions and the conditions from the filter:

published_contacts = Contact.where(published: true)
# => ActiveRecord::Relation

published_contacts.to_sql
# => "SELECT * FROM contacts WHERE (published = 1)"

gmail_contacts = ContactFilter.new.filter(published_contacts, 'gmail')
# => ActiveRecord::Relation

gmail_contacts.to_sql
# => "SELECT * FROM contacts WHERE (published = 1) AND (name LIKE '%gmail%' OR email LIKE '%gmail%')"

How where_like works

The example above uses where_like. You can call where_like on any scope to produce a new scope where the given array of column names must contain all of the given phrases.

Let's say we call ContactFilter.new.filter(Contact.published, 'foo "bar baz" bam'). This will call the block filter :text do |scope, phrases| with the following arguments:

scope == Contact.published
phrases == ['foo', 'bar baz', 'bam']

The scope scope.where_like(columns => phrases) will now represent the following SQL query:

SELECT * FROM contacts
WHERE (name LIKE "%foo%" OR email LIKE "%foo") AND (email LIKE "%foo%" OR email LIKE "%foo")

You can also use where_like to find all the records not matching some phrases, using the :negate option:

Contact.where_like(name: 'foo', negate: true)

Filtering associated records

Minidusen lets you find text in associated records.

Assume the following model where a Contact record may be associated with a Group record:

class Contact < ApplicationRecord
  belongs_to :group

  validates_presence_of :name, :street, :city, :email
end

class Group < ApplicationRecord
  has_many :contacts

  validates_presence_of :name
end

We can filter contacts by their group name by joining the groups table and filtering on a joined column. Note how the joined column is qualified as groups.name (rather than just name):

class ContactFilter
  include Minidusen::Filter

  filter :text do |scope, phrases|
    columns = [:name, :email, 'groups.name']
    scope.includes(:group).where_like(columns => phrases)
  end
end

Supporting qualified field syntax

Google supports queries like filetype:pdf that filters records by some criteria without performing a full text search. Minidusen gives you a simple way to support such search syntax.

Let's support a query like email:foo@bar.com to explictly search for a contact's email address, without filtering against other columns.

We can learn this syntax by adding a filter:email instruction to our ContactFilter class:

class ContactFilter
  include Minidusen::Filter

  filter :email do |scope, email|
    scope.where(email: email)
  end

  filter :text do |scope, phrases|
    columns = [:name, :email]
    scope.where_like(columns => phrases)
  end

end

We can now explicitly search for a user's e-mail address:

ContactFilter.new.filter(Contact, 'email:foo@bar.com').to_sql
# => "SELECT * FROM contacts WHERE email='foo@bar.com'"

Caveat

If you search for a phrase containing a colon (e.g. deploy:rollback), Minidusen will mistake the first part as a – nonexistent – qualifier and return an empty set.

To prevent that, search for a phrase:

"deploy:rollback"

Supported Rails versions

Minidusen is tested on:

  • Rails 6.1
  • Rails 7.0
  • Rails 7.1
  • Rails 7.2
  • MySQL 5.6
  • PostgreSQL

If you need support for platforms not listed above, please submit a PR!

Development

  • There are tests in spec. We only accept PRs with tests.
  • We currently develop using the Ruby version in .ruby-version. It is required to change the Ruby Version to cover all Rails version or just use Gitlab CI.
  • Put your database credentials into spec/support/database.yml. There's a database.sample.yml you can use as a template.
  • Create a database minidusen_test in both MySQL and PostgreSQL.
  • There are gem bundles in the project root for each combination of ActiveRecord version and database type that we support.
  • You can bundle all test applications by saying bundle exec rake matrix:install
  • You can run specs from the project root by saying bundle exec rake matrix:spec. This will run all gemfiles compatible with your current Ruby.

If you would like to contribute:

  • Fork the repository.
  • Push your changes with passing specs.
  • Send me a pull request.

Note that we're very eager to keep this gem lightweight. If you're unsure whether a change would make it into the gem, open an issue.

Credits

Henning Koch from makandra

About

Low-tech search for ActiveRecord with MySQL or PostgreSQL

Resources

License

Stars

Watchers

Forks

Languages