Elasticsearch on Rails, a primer

Elasticsearch is a full-text search server powered by Lucene, the Apache Software Foundation text-search library that also powers Solr.

Elasticsearch is essentially a distributed, real-time document store in which every field is indexed and searchable. It was built to be distributed from the ground up, can be used for real-time analytics, supports multi-tenancy, exposes a REST API and uses JSON for its query DSL: features that make it quite attractive to developers.

Rails example application

Let’s install the Elasticsearch server on our machine. The project download page presents both the code and the instructions clearly, so you shouldn’t have any problem with it. If you’re on OS X you can install it via Homebrew with the following command: brew install elasticsearch

By default the server runs on port 9200, so after starting the service make sure everything is fine by running curl -X GET http://localhost:9200/ in the terminal.
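
If you would rather run the same check from Ruby, here is a minimal sketch using only the standard library. The helper name elasticsearch_info is hypothetical (it is not part of any gem), and the host, port and timeout defaults are assumptions that mirror the curl command above.

```ruby
require 'net/http'
require 'json'

# Hypothetical helper, not part of any gem: returns the parsed JSON banner
# served at the server root, or nil when the server cannot be reached.
def elasticsearch_info(host: 'localhost', port: 9200, timeout: 2)
  http = Net::HTTP.new(host, port)
  http.open_timeout = timeout
  http.read_timeout = timeout
  response = http.get('/')
  response.is_a?(Net::HTTPSuccess) ? JSON.parse(response.body) : nil
rescue StandardError
  # Connection refused, timeouts and malformed JSON all end up here.
  nil
end

info = elasticsearch_info
if info
  puts "Elasticsearch is up: #{info.inspect}"
else
  puts 'Elasticsearch is not reachable on localhost:9200'
end
```

A nil result usually just means the service has not been started yet.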

If everything works as expected we can now create the app and add the gems:


rails new demoapp
cd demoapp
echo "gem 'elasticsearch-rails'" >> Gemfile
echo "gem 'elasticsearch-model'" >> Gemfile
bundle

Create the Article model and populate the db with some data that we will search against later:


rails g model article title:string body:text
rake db:migrate
rails c

hello_world  = Article.create title: 'Hello World', body: 'Lovely day!'
power_search = Article.create title: 'Power Search', body: 'Elasticsearch rules the world!'

(These variable names will be used in later examples, so take note of them.)

Let’s include some modules into the Article class to make it searchable:


class Article < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end

Add this line to the top of the app Rakefile to include the handy elasticsearch rake tasks:

require 'elasticsearch/rails/tasks/import'

The new tasks can be listed with bundle exec rake -D elasticsearch.

Records created before the callbacks were included are not in the Elasticsearch index yet, so we need to import them:

bundle exec rake environment elasticsearch:import:all

We can do the very same thing in the rails console:

Article.import

The search DSL

Let’s play in the rails console making the simplest possible search:

search = Article.search 'world'

This will return an object of class Elasticsearch::Model::Response::Response.

If you just need to show data without going through the actual models in the database, you can use the results method, which conveniently wraps the JSON returned by Elasticsearch in Ruby objects:


search.results.map { |res| res.title }
# => ["Hello World", "Power Search"]

If you want the actual database records that were found, just call records on that object:

search.records

The above query could be rewritten using the Elasticsearch JSON DSL syntax as follows:

Article.search('{"query": {"match": {"_all": "world"}}}').records

Luckily the gem lets us use either JSON or Ruby hash syntax, and of course we prefer the latter, don’t we? All the following examples use Ruby hashes.
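
To see that the two syntaxes describe the same query, here is a small sketch that needs neither the gem nor a running server. It only compares the JSON string used above with its Ruby hash counterpart; the constant names are made up for this example.

```ruby
require 'json'

# The JSON string used in the previous example.
JSON_QUERY = '{"query": {"match": {"_all": "world"}}}'

# The same query written as a Ruby hash (constant names are illustrative only).
HASH_QUERY = { query: { match: { _all: 'world' } } }

# Serializing the hash and parsing both forms yields the same structure.
def same_query?(json_string, hash)
  JSON.parse(json_string) == JSON.parse(hash.to_json)
end

puts same_query?(JSON_QUERY, HASH_QUERY)
```

In the console, the hash form would be passed straight to Article.search, exactly like the string.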

Boosting

The following example uses the caret to give a boost to the title field:


records = Article.search(query: {multi_match: {query: 'world', fields: ['title^10', 'body']}}).records
records.size # => 2
records.first == hello_world # => true

Let’s change the order of the records by boosting the body field instead:


records = Article.search(query: {multi_match: {query: 'world', fields: ['title', 'body^10']}}).records
records.first == power_search # => true
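
Since the two queries differ only in where the caret goes, the hash can be built by a tiny helper. This is just a sketch: boosted_fields and boosted_query are hypothetical names, not part of the gem, and a boost of 1 is assumed to mean "no caret".

```ruby
# Hypothetical helper: turns { title: 10, body: 1 } into ['title^10', 'body'].
def boosted_fields(boosts)
  boosts.map { |field, boost| boost == 1 ? field.to_s : "#{field}^#{boost}" }
end

# Hypothetical helper: wraps the field list in a multi_match query hash
# shaped like the ones used in the examples above.
def boosted_query(term, boosts)
  { query: { multi_match: { query: term, fields: boosted_fields(boosts) } } }
end

p boosted_query('world', title: 10, body: 1)
p boosted_query('world', title: 1, body: 10)
```

The resulting hashes could then be passed to Article.search in place of the literal ones.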

Fuzzy search

When it comes to full-text search, the ability to match misspelled words is essential. We can accomplish that easily using the fuzziness param:


records = Article.search(query: {match: {_all: {query: 'wold', fuzziness: 2}}}).records
records.first.title # => "Hello World"

fuzziness represents the maximum allowed Levenshtein distance. It accepts an integer between 0 and 2 (where 2 means the fuzziest search possible) or the string “AUTO”, which derives the edit distance from the character length of each term in the query.
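
To make the idea of edit distance concrete, here is a plain Ruby sketch. It is not how Elasticsearch implements fuzzy matching internally: the levenshtein method is the textbook dynamic-programming version, which counts a swap of two adjacent letters as two edits. The auto_fuzziness method is a hypothetical helper that mirrors the documented default thresholds for “AUTO” (exact match up to 2 characters, one edit for 3 to 5 characters, two edits beyond that).

```ruby
# Textbook Levenshtein distance: the minimum number of single-character
# insertions, deletions or substitutions needed to turn a into b.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution (or match)
      ].min
    end
    previous = current
  end
  previous.last
end

# Hypothetical helper mirroring the default "AUTO" thresholds.
def auto_fuzziness(term)
  case term.length
  when 0..2 then 0
  when 3..5 then 1
  else 2
  end
end

puts levenshtein('wold', 'world') # one missing letter
puts auto_fuzziness('wold')       # four characters long
```

The misspelled “wold” is one edit away from “world”, so it falls within both fuzziness: 2 and the distance that “AUTO” would allow for a four-letter term.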

Conclusions

So far we’ve just scratched the surface of Elasticsearch’s capabilities and complexity. There is so much more to learn (tokenizers, analyzers, indexes, facets), but that is far beyond the scope of this short article. I suggest you go deeper by reading the Elasticsearch documentation and the elasticsearch-rails gem README.
