Sunspot 2.0 README
The purpose of this document is to provide a framework for the development of Sunspot 2.0 using the README-driven development model. Sunspot 2.0 does not yet exist, but this document aims to describe the functionality that we hope to achieve when we build it.
Just add Sunspot to your `Gemfile`:
gem 'sunspot'
# Install the optional packaged Solr server (recommended for development):
group :test, :development do
  gem 'sunspot_solr'
end
If you are not using Bundler, install the gem from your shell:
sudo gem install sunspot
# To install the optional packaged Solr server (recommended for development):
sudo gem install sunspot_solr
Then add the dependency to your `environment.rb`:
config.gem 'sunspot'
And if you installed the `sunspot_solr` gem, add the following gem dependency to your project's `config/environments/development.rb` and `config/environments/test.rb`:
config.gem 'sunspot_solr'
Note: The `sunspot_rails` gem no longer needs to be installed. As of version 2.0, Sunspot will automatically include Rails integration if you load it in a Rails environment; it still works fine in a non-Rails environment as well.
Sunspot can be run without any explicit configuration, but if you're using Rails, the easiest way to maintain a consistent configuration across your team is to use the built-in generator to add configurations to your project:
$ script/rails generate sunspot
This will create the following new files in the current directory:
config/sunspot.yml
solr/conf/schema.xml
solr/conf/solrconfig.xml
solr/data/.gitignore
The `sunspot.yml` file is used for application-level configuration of Sunspot. This includes a URL at which Sunspot can access Solr; if the hostname of this URL is `localhost` or `127.0.0.1`, the `sunspot-solr` executable will also use the configuration you've specified to start up the bundled Solr instance. An example `sunspot.yml` might look like this:
production:
  solr: http://solr.my-host.com/solr
development:
  solr:
    url: http://localhost:8982/solr
    max_memory: 1024M
    # TK more configuration options here
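The localhost rule described above can be sketched in plain Ruby. This is only an illustration built on the standard `yaml` and `uri` libraries; the helper names `solr_url_for` and `local_solr?` are hypothetical and are not part of Sunspot, and the sketch assumes the two configuration shapes shown in the example (a bare URL, or a hash with a `url` key).

```ruby
require 'yaml'
require 'uri'

# Sample configuration mirroring the sunspot.yml example above.
SAMPLE_CONFIG = <<~YAML
  production:
    solr: http://solr.my-host.com/solr
  development:
    solr:
      url: http://localhost:8982/solr
      max_memory: 1024M
YAML

# Hostnames treated as "this machine" (taken from the text above).
LOCAL_HOSTS = %w[localhost 127.0.0.1].freeze

# Hypothetical helper: returns the Solr URL for an environment, accepting
# either the short form (solr: <url>) or the long form (solr: { url: ... }).
def solr_url_for(config, environment)
  solr = config.fetch(environment).fetch('solr')
  solr.is_a?(Hash) ? solr.fetch('url') : solr
end

# Hypothetical helper: true when the URL points at the local machine, which
# is the case in which a bundled Solr could be started from these settings.
def local_solr?(url)
  LOCAL_HOSTS.include?(URI.parse(url).host)
end

config = YAML.safe_load(SAMPLE_CONFIG)
development_url = solr_url_for(config, 'development')
production_url = solr_url_for(config, 'production')
```

With this sample, `local_solr?(development_url)` is true and `local_solr?(production_url)` is false, so only the development settings would apply to a bundled Solr instance.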
The files in the `solr/conf` directory are used directly by the bundled Solr instance when you run it locally. You probably won't need to change them, but if you need advanced customization of Solr's behavior, these files are where you can do that. The `solr/data` directory contains your actual Solr index on disk, and is thus excluded from Git for you.
To start Solr in your development environment, simply run:
$ sunspot solr start # sunspot_solr gem must be installed
If you run this from the root of a Rails project, Sunspot will detect that and use your `config/sunspot.yml` if it's present.
Sunspot is designed to index and search Ruby objects that are persisted to a separate primary data store. Sunspot supports ActiveRecord, DataMapper, Mongoid, and MongoMapper [TK what else?] out of the box; it's quite easy to add support for other persistence layers. See the documentation for Sunspot::Adapter.
Configuring a model class for search primarily consists of defining which fields Sunspot should index, and setting those fields up with various options. Fields do not need to correspond to database columns; Sunspot will happily index the return value of any method your object responds to.
The examples in this README all assume we're building a straightforward blogging platform. Let's start with a simple configuration for our `Post` model.
class Post < ActiveRecord::Base
  include Sunspot::Searchable

  searchable do
    fulltext :title
    fulltext :body
    integer :blog_id
  end
end
The `searchable` block is where all Sunspot configuration is performed. Here we have three fields: the `title` and `body` fulltext fields, and the `blog_id` integer field. These fields exemplify the two basic field types in Sunspot: fulltext fields and attribute fields.
Fulltext fields always have the type `fulltext`, and are used for keyword search. Solr breaks apart the data from fulltext fields into individual words, and when a fulltext search is performed, documents are matched against search terms on a word-by-word basis.
Attribute fields, on the other hand, are scalar data, and are indexed as-is without any analysis. Attribute fields can have several scalar types: `string`, `integer`, `long`, `float`, `double`, `date`, `time`, and `boolean` are the main ones. You can think of attribute fields as equivalent to columns in a database: they can be used for filtering search results to a certain scope (e.g. only return results with a `blog_id` of 1); ordering results; and faceting, a topic we will cover in more depth later in this README.
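The difference between the two field types can be illustrated with a small plain-Ruby sketch. It does not use Sunspot or Solr: the `tokenize` helper is a deliberately crude stand-in for Solr's analysis (which is configurable and does considerably more), and requiring every search term to match is an assumption of this sketch, not a statement about Solr's defaults. All names here are hypothetical.

```ruby
# Crude stand-in for fulltext analysis: lowercase the text and split it
# into alphanumeric words.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Plain Ruby object standing in for an indexed document.
Doc = Struct.new(:id, :body, :blog_id)

DOCS = [
  Doc.new(1, 'Searching with Solr is fast', 1),
  Doc.new(2, 'Fast cars and slow trains', 2),
  Doc.new(3, 'Notes on Solr schemas', 1)
].freeze

# Fulltext-style matching: compare word by word. A document matches when
# every search term appears among its tokenized words (sketch assumption).
def fulltext_match(docs, query)
  terms = tokenize(query)
  docs.select { |doc| (terms - tokenize(doc.body)).empty? }
end

# Attribute-style filtering: the value is compared as-is, with no analysis.
def filter_by_blog(docs, blog_id)
  docs.select { |doc| doc.blog_id == blog_id }
end

solr_ids = fulltext_match(DOCS, 'SOLR').map(&:id)
fast_in_blog_one = filter_by_blog(fulltext_match(DOCS, 'fast'), 1).map(&:id)
```

Here `solr_ids` contains documents 1 and 3 regardless of the query's letter case, while `fast_in_blog_one` narrows the two documents mentioning "fast" down to the one whose `blog_id` is exactly 1.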
The above example uses the simplest method of populating fields; Sunspot will simply call the method named by the field, and index the return value if it's non-nil. If you wish to give the field a different name from the method that populates it, use the `:using` option:
searchable do
  integer :my_blog_id, :using => :blog_id
end
This will populate a `my_blog_id` field in Solr using the return value of the `Post#blog_id` method.
If you wish to populate a field with data that is not defined by a method on your model class, you can pass a block to the field definition; the block is evaluated in the context of the model instance, and the return value is indexed. For instance, perhaps we wish to index the number of comments on a given post:
searchable do
  integer(:comments_count) { comments.count }
end
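The three ways of populating a field described above (by method name, with `:using`, and with a block) could be modeled roughly as follows. This is a self-contained sketch, not Sunspot's internals: `FieldSketch` and `PostSketch` are hypothetical names, and the only behavior taken from the text is that blocks run in the context of the model instance and that nil values are not indexed.

```ruby
# Hypothetical stand-in for a field definition.
class FieldSketch
  attr_reader :name

  def initialize(name, using: nil, &block)
    @name = name
    @source = using || name # method to call when no block is given
    @block = block
  end

  # Blocks are evaluated in the context of the model instance; otherwise
  # the source method is called on the model.
  def value_for(model)
    @block ? model.instance_exec(&@block) : model.public_send(@source)
  end
end

# Plain Ruby object standing in for a persisted Post.
PostSketch = Struct.new(:blog_id, :comments, :subtitle)

FIELDS = [
  FieldSketch.new(:blog_id),
  FieldSketch.new(:my_blog_id, using: :blog_id),
  FieldSketch.new(:comments_count) { comments.count },
  FieldSketch.new(:subtitle)
].freeze

# Builds the hash of values that would be sent to the index, skipping nils.
def document_for(model, fields)
  fields.each_with_object({}) do |field, doc|
    value = field.value_for(model)
    doc[field.name] = value unless value.nil?
  end
end

post = PostSketch.new(7, %w[first second], nil)
document = document_for(post, FIELDS)
```

For this post, `document` holds `blog_id` and `my_blog_id` (both 7) and a `comments_count` of 2; the nil `subtitle` is left out.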
A special type of attribute field is a reference field. These are fields that hold references to other persistent objects; they're particularly useful for faceting. For example, instead of our `blog_id` field above, we might simply index `blog` as a reference field:
searchable do
  reference :blog
end
Now instead of working with an integer when using this field, we'll be working with actual `Blog` objects.
TK
The following options are available on all fields:
`:stored` - By default, Sunspot does not add field data to Solr in a way that allows Solr to return that field data in search results; instead, Sunspot only stores the object's class name and primary key, and uses that information from the search result to load the original object out of the primary database. You can override this behavior on a per-field basis to instruct Solr to return the field data in search results; in certain cases, this can allow you to bypass looking up the original objects in the database altogether, giving you a performance boost.
`:as` - **Advanced.** Usually, Sunspot constructs an internal field name for your fields based on the field type and options you've set; Sunspot's built-in Solr schema is set up to follow the same naming conventions. In certain cases, such as legacy schemas or for functionality not supported by Sunspot, you may want to override this and directly set the field name that will be used internally.
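The trade-off behind `:stored` can be pictured with a plain-Ruby sketch. Nothing here is Sunspot's API: the `Hit` structure, the in-memory `DATABASE`, and both helper methods are hypothetical, and the sketch only models the two paths the description names, loading objects by class name and primary key versus reading a stored value directly from the search result.

```ruby
# Hypothetical in-memory stand-in for the primary database.
PostRecord = Struct.new(:id, :title)

DATABASE = {
  'PostRecord' => {
    1 => PostRecord.new(1, 'Hello Solr'),
    2 => PostRecord.new(2, 'Second post')
  }
}.freeze

# Hypothetical shape of a raw search hit: the class name and primary key
# are always present; stored holds values only for fields marked as stored.
Hit = Struct.new(:class_name, :primary_key, :stored)

HITS = [
  Hit.new('PostRecord', 1, { title: 'Hello Solr' }),
  Hit.new('PostRecord', 2, { title: 'Second post' })
].freeze

# Default path: use class name and primary key to load the original objects.
def load_objects(hits, database)
  hits.map { |hit| database.fetch(hit.class_name).fetch(hit.primary_key) }
end

# Stored-field path: read the value from the hit itself, with no lookup
# against the primary database.
def stored_values(hits, field)
  hits.map { |hit| hit.stored.fetch(field) }
end

titles_from_database = load_objects(HITS, DATABASE).map(&:title)
titles_from_index = stored_values(HITS, :title)
```

Both paths yield the same titles here; the second one never touches `DATABASE`, which is where the potential performance gain comes from.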
TK
TK
TK
$ sunspot reindex
TK
TK
TK
TK
TK
TK
TK
TK
Post.search do
  where(:blog_id => 1)
  where(:comments_count).gt(0)
end
TK
TK
TK
TK
TK