Elasticsearch is a full-text search server powered by Lucene, the Apache Software Foundation text-search engine also used by Solr.
Elasticsearch is essentially a distributed, real-time document store in which every field is indexed and searchable. It was built to be distributed from the ground up, can be used for real-time analytics, supports multi-tenancy, exposes a REST API and uses JSON for its query DSL: features that make it quite attractive to developers.
Rails example application
Let’s install the Elasticsearch server on our machine. The project download page has both the code and the instructions, clearly laid out, so you shouldn’t have any problem with it. If you’re on OS X you can install it via Homebrew using the following command:
brew install elasticsearch
By default the server runs on port 9200, so after starting the service make sure everything is fine by running this in the terminal:
curl -X GET http://localhost:9200/
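The same check can be done from Ruby, which is handy if you want to guard a script or a rake task. This is only a minimal sketch using the standard library: the elasticsearch_up? helper is a name made up for this example, and the URL is the default one mentioned above.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: true when a GET on the given URL answers with a 2xx status.
# The default URL is the one used in this article; adjust it for your setup.
def elasticsearch_up?(url = 'http://localhost:9200/')
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, open_timeout: 2, read_timeout: 2) do |http|
    http.get(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess)
rescue StandardError
  # Connection refused, timeouts, DNS failures: all treated as "not running".
  false
end

up = elasticsearch_up?
puts(up ? 'Elasticsearch is up' : 'Elasticsearch is not reachable')
```

Swallowing every error keeps the sketch short; in a real script you would probably want to log what went wrong.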
If everything works as expected we can now create the app and add the gems:
rails new demoapp
cd demoapp
echo "gem 'elasticsearch-rails'" >> Gemfile
echo "gem 'elasticsearch-model'" >> Gemfile
bundle
Create the Article model and populate the db with some data that we will search against later:
rails g model article title:string body:text
rake db:migrate
rails c
hello_world = Article.create title: 'Hello World', body: 'Lovely day!'
power_search = Article.create title: 'Power Search', body: 'Elasticsearch rules the world!'
(These variable names are used later in the examples, so take note of them.)
Let’s include some modules into the Article class to make it searchable:
class Article < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end
Add this line to the top of the app Rakefile to include the handy elasticsearch rake tasks:
require 'elasticsearch/rails/tasks/import'
The new tasks can be listed with:
bundle exec rake -D elasticsearch
Remember that in order for existing records to be found by Elasticsearch we need to import them:
bundle exec rake environment elasticsearch:import:all
We can do the very same thing in the rails console:
Article.import
The search DSL
Let’s play in the Rails console, starting with the simplest possible search:
search = Article.search 'world'
This will return an object of class Elasticsearch::Model::Response::Response.
If you just need to show data without going through the actual models in the database you can use the results method, which conveniently wraps the JSON returned by Elasticsearch in Ruby objects:
search.results.map { |res| res.title }
# => ["Hello World", "Power Search"]
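To get a feel for what that wrapping means, here is a rough, self-contained sketch that does something similar by hand. It is not the gem's implementation: SAMPLE_RESPONSE is a hand-written response with made-up ids and scores, and Hit and wrap_hits are names invented for this example.

```ruby
require 'json'

# Hand-written sample shaped like an Elasticsearch search response.
# The ids and scores are made up for illustration.
SAMPLE_RESPONSE = <<~RESPONSE
  {"hits": {"total": 2, "hits": [
    {"_id": "1", "_score": 0.5, "_source": {"title": "Hello World", "body": "Lovely day!"}},
    {"_id": "2", "_score": 0.3, "_source": {"title": "Power Search", "body": "Elasticsearch rules the world!"}}
  ]}}
RESPONSE

# Hypothetical stand-in for the gem's result objects.
Hit = Struct.new(:id, :score, :title, :body, keyword_init: true)

# Turns every hit of the raw JSON response into a Hit object.
def wrap_hits(json)
  JSON.parse(json).dig('hits', 'hits').map do |hit|
    Hit.new(id: hit['_id'], score: hit['_score'],
            title: hit.dig('_source', 'title'), body: hit.dig('_source', 'body'))
  end
end

p wrap_hits(SAMPLE_RESPONSE).map(&:title) # => ["Hello World", "Power Search"]
```

No database is touched here: everything comes from the JSON document, which is why this approach suits read-only views.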
If you want the actual matching records, you just need to call records on that object:
search.records
The above query could be rewritten using the Elasticsearch JSON DSL syntax as follows:
Article.search('{"query": {"match": {"_all": "world"}}}').records
Luckily the gem lets us use either JSON or Ruby hash syntax, and of course we prefer the latter, don’t we? All the following examples use Ruby hashes.
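If you want to convince yourself that the two notations describe the same query, you can compare them in plain Ruby without touching Elasticsearch. A small sketch using only the standard json library; the constant and method names are made up for this example.

```ruby
require 'json'

# The query from above, written once as a JSON string...
JSON_QUERY = '{"query": {"match": {"_all": "world"}}}'

# ...and once as a Ruby hash.
HASH_QUERY = { query: { match: { _all: 'world' } } }

# Serializes the hash, parses both sides and compares the resulting structures.
def same_query?(json, hash)
  JSON.parse(json) == JSON.parse(JSON.generate(hash))
end

puts same_query?(JSON_QUERY, HASH_QUERY) # => true
```

Symbol keys become JSON strings when the hash is serialized, so using symbols on the Ruby side makes no difference to the query that is sent.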
Boosting
The following example uses the caret to give a boost to the title field:
records = Article.search(query: {multi_match: {query: 'world', fields: ['title^10', 'body']}}).records
records.size # => 2
records.first == hello_world # => true
Let’s change the order of the records by boosting the body field instead:
records = Article.search(query: {multi_match: {query: 'world', fields: ['title', 'body^10']}}).records
records.first == power_search # => true
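Since the two queries above differ only in where the caret goes, the hash can be built by a small helper. The sketch below is plain Ruby: boosted_multi_match is a hypothetical name, not something provided by the gem, and it only assembles the hash that would be passed to Article.search.

```ruby
# Builds a multi_match query hash from a term and a { field => boost } map.
# A boost of 1 (or nil) leaves the field name untouched.
def boosted_multi_match(term, boosts)
  fields = boosts.map do |field, boost|
    (boost.nil? || boost == 1) ? field.to_s : "#{field}^#{boost}"
  end
  { query: { multi_match: { query: term, fields: fields } } }
end

query = boosted_multi_match('world', { title: 10, body: 1 })
p query[:query][:multi_match][:fields] # => ["title^10", "body"]
```

With the model from this article you would then call Article.search(query).records as in the examples above.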
Fuzzy search
When it comes to full-text search, the ability to match misspelled words is essential. We can accomplish that easily using the fuzziness param:
records = Article.search(query: {match: {_all: {query: 'wold', fuzziness: 2}}}).records
records.first.title # => "Hello World"
fuzziness represents the maximum allowed Levenshtein distance. It accepts an integer between 0 and 2 (where 2 means the fuzziest search possible) or the string “AUTO”, which generates an edit distance based on the character length of the terms in the query.
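To make the idea of edit distance concrete, here is a small plain-Ruby sketch. levenshtein is the textbook dynamic-programming algorithm, not the code Elasticsearch or Lucene actually run. auto_fuzziness is a hypothetical helper that mirrors the length thresholds the Elasticsearch documentation gives for “AUTO”, as far as I know them (up to 2 characters: exact match, 3 to 5: one edit, longer: two edits), so treat those numbers as an assumption to check against the docs for your version.

```ruby
# Textbook Levenshtein distance: the minimum number of single-character
# insertions, deletions and substitutions needed to turn a into b.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Assumed "AUTO" thresholds: the allowed edit distance grows with the term length.
def auto_fuzziness(term)
  case term.length
  when 0..2 then 0
  when 3..5 then 1
  else 2
  end
end

puts levenshtein('wold', 'world') # => 1
puts auto_fuzziness('wold')       # => 1
```

One missing letter separates 'wold' from 'world', which is why the search above finds "Hello World" with room to spare when fuzziness is 2.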
Conclusions
So far we’ve just scratched the surface of Elasticsearch’s capabilities and complexity. There is so much more to learn (tokenizers, analyzers, indexes, facets), but that is far beyond the scope of this short article. I suggest you go deeper by reading the Elasticsearch documentation and the elasticsearch-rails gem README.