Solr vs. ElasticSearch – Feature Comparison

Recently, we were discussing moving away from our good ol’ Solr-based search engine to a more distributed environment built on top of ElasticSearch. The first thing I did was list all the features I was actively using in Solr and compare them one-to-one with their ElasticSearch counterparts. My full comparison is below.

Note that this is not a comprehensive list of all Solr features; rather, it reflects the features actually used in a live Solr setup.

Legend

✓ Feature is natively supported in ElasticSearch
~ Feature is supported in ElasticSearch using a plugin and/or workaround
× Feature is not supported in ElasticSearch

 

Filters

Solr Feature | Exists in ElasticSearch | Notes
Query by date range | ✓ |
Text search across multiple fields (copyField) | ✓ |
Filter on distance | ✓ |
Filter in/out boolean value | ✓ |
Filter in/out numeric value(s) | ✓ |
Complex boolean statements (combinations of AND/OR) | ✓ |
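
As a rough illustration of the filter features listed here, this sketch builds one ElasticSearch request body as a plain Python dict. Every field name (`title`, `description`, `published_at`, `in_stock`, `price`, `location`, `brand`, `discontinued`) is made up for the example. The `bool`/`filter` layout is the one used by ElasticSearch 2.0 and later; the 1.x releases expressed the same idea with a `filtered` query, so check it against the version you run.

```python
import json


def build_filtered_search(text, since, until, lat, lon, radius="10km"):
    """Build a search body combining full-text search with several filters.

    All field names are hypothetical; adjust them to your own mapping.
    """
    return {
        "query": {
            "bool": {
                # Scored full-text part: one query across several fields,
                # roughly the job a Solr copyField catch-all field does.
                "must": {
                    "multi_match": {
                        "query": text,
                        "fields": ["title", "description"],
                    }
                },
                # Non-scoring filters; every entry must match (AND).
                "filter": [
                    # Date range
                    {"range": {"published_at": {"gte": since, "lte": until}}},
                    # Boolean value
                    {"term": {"in_stock": True}},
                    # Any of several numeric values
                    {"terms": {"price": [10, 20, 30]}},
                    # Distance from a point
                    {
                        "geo_distance": {
                            "distance": radius,
                            "location": {"lat": lat, "lon": lon},
                        }
                    },
                    # Nested bool: OR via "should", NOT via "must_not".
                    {
                        "bool": {
                            "should": [
                                {"term": {"brand": "acme"}},
                                {"term": {"brand": "globex"}},
                            ],
                            "minimum_should_match": 1,
                            "must_not": [{"term": {"discontinued": True}}],
                        }
                    },
                ],
            }
        }
    }


body = build_filtered_search(
    "wireless headphones", "2014-01-01", "2014-12-31", 40.71, -74.0
)
print(json.dumps(body, indent=2))
```

The dict can be serialized and sent to the `_search` endpoint with whatever HTTP client you prefer; nothing here depends on a particular client library.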

Sorting Flexibility

Solr Feature | Exists in ElasticSearch | Notes
Ability to override sort by score | ✓ |
Sort by field(s) | ✓ |
Sort by custom function | ~ | Using scripting
Sort by custom query | ~ | Using scripting
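
A minimal sketch of what these sorting options might look like in a request body, again as plain Python dicts. The helper names and the fields `price` and `published_at` are invented for illustration. The script clause uses the `lang`/`source` form from recent ElasticSearch releases; script syntax has changed several times across versions (the releases contemporary with this comparison used mvel), so treat that part as an assumption.

```python
def build_sorted_search(query, sort_fields):
    """Sort by the given (field, order) pairs, then by relevance score.

    Listing explicit fields first is what overrides the default
    sort-by-score behaviour; "_score" is kept as the final tie-breaker.
    """
    sort = [{field: {"order": order}} for field, order in sort_fields]
    sort.append("_score")
    return {"query": query, "sort": sort}


def build_script_sort(source, params, order="asc"):
    """Build a numeric script-based sort clause (sort by custom function)."""
    return {
        "_script": {
            "type": "number",
            "script": {"lang": "painless", "source": source, "params": params},
            "order": order,
        }
    }


body = build_sorted_search(
    {"match_all": {}}, [("price", "asc"), ("published_at", "desc")]
)
# Hypothetical custom function: sort by price scaled by a parameter.
body["sort"].insert(
    0, build_script_sort("doc['price'].value * params.factor", {"factor": 1.1})
)
print(body["sort"])
```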

Boost Functions and Custom Equations

Solr Feature | Exists in ElasticSearch | Notes
Ability to set custom scoring | ✓ | Using Function Score Query
Use distance in scoring | ~ | Supports three pre-set decay functions. Additional equations using plugins
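
The three pre-set decay functions are `linear`, `exp`, and `gauss`. To give a feel for how they turn a distance into a score multiplier, the sketch below reimplements the formulas as I understand them from the function score documentation. It is not ElasticSearch's own code, and `decay_score` and its parameter names are mine. The parameters mirror the query's `scale`, `offset`, and `decay` settings: the score is 1.0 within `offset` of the origin and drops to `decay` at a distance of `offset + scale`.

```python
import math


def decay_score(kind, distance, scale, offset=0.0, decay=0.5):
    """Approximate the multiplier produced by a decay function.

    kind     -- "gauss", "exp", or "linear"
    distance -- distance of the document from the origin
    scale    -- distance past the offset at which the score equals `decay`
    offset   -- radius around the origin inside which the score stays 1.0
    decay    -- score at distance offset + scale (0 < decay < 1)
    """
    if scale <= 0 or not 0.0 < decay < 1.0:
        raise ValueError("scale must be > 0 and decay must be in (0, 1)")

    # Distance that actually counts, after subtracting the free offset.
    d = max(0.0, abs(distance) - offset)

    if kind == "gauss":
        sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
        return math.exp(-(d ** 2) / (2.0 * sigma_sq))
    if kind == "exp":
        lam = math.log(decay) / scale
        return math.exp(lam * d)
    if kind == "linear":
        s = scale / (1.0 - decay)
        return max((s - d) / s, 0.0)
    raise ValueError("unknown decay function: %r" % kind)


# Compare the three curves at a few distances (km), with scale = 5 km.
for kind in ("gauss", "exp", "linear"):
    print(kind, [round(decay_score(kind, km, scale=5.0), 3) for km in (0, 2, 5, 10)])
```

In a real query the same three settings go under `gauss`, `exp`, or `linear` inside a `function_score` query, keyed by the geo or numeric field being scored.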

Functions

Solr Feature | Exists in ElasticSearch | Notes
Custom query within functions | ✓ | Using scripting
Term Frequency (tf) | ✓ | Using text scripting
Inverse Document Frequency (idf) | ✓ | Per shard (like Solr), using text scripting
sum, sub, div, product | ✓ | mvel scripting
min, max | ✓ | mvel scripting
sqrt, pow, exp | ✓ | mvel scripting
abs | ✓ | mvel scripting
duration (ms) | ✓ | mvel scripting – time()
if-then-else block | ✓ | mvel scripting
default values (def) | ✓ | mvel scripting
distance | ✓ | mvel scripting – distance(), arcDistance(), distanceInKm(), arcDistanceInKm()

Additional Features

Data Import Handler

Solr Feature | Exists in ElasticSearch | Notes
Feed from PostgreSQL | ~ | External plugin – JDBC River. Could not find complex joins or sub-entities. This link might be useful, though.
Feed from Solr | ~ | External plugin – Solr River
Feed from ElasticSearch | × | Couldn’t find elasticsearch-river-elasticsearch
Custom functions and row modifications | × | All data should be mapped using the query (single-shot)

Plugins and extensibility

Solr Feature | Exists in ElasticSearch | Notes
Ability to create plugins | ✓ |
Post-processing ($skipDoc) | × |
Plugin: Conditional Entities within dih/river | × |
Plugin: Cached Entities within dih/river | × |

Faceting

Solr Feature | Exists in ElasticSearch | Notes
Facet by field | ✓ | Term Facet
Facet by query (multiple fields) | ✓ | Query Facets
Custom labels for facet result | ✓ |
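
For a sense of how these three rows translate into a request, here is a small sketch using the aggregation framework, which superseded the facet API (facets were deprecated during the 1.x line and removed in 2.0). The field names `brand`, `on_sale`, and `price` and the aggregation names are invented for the example. The key of each aggregation acts as its custom label in the response.

```python
def build_facet_request(query):
    """Build a body that returns only aggregation counts, no hits."""
    return {
        "query": query,
        "size": 0,  # counts only; skip the documents themselves
        "aggs": {
            # Facet by field: one bucket per distinct brand value.
            "brands": {"terms": {"field": "brand", "size": 10}},
            # Facet by query over multiple fields: a single bucket counting
            # documents that are on sale AND cheaper than 50.
            "cheap_and_on_sale": {
                "filter": {
                    "bool": {
                        "filter": [
                            {"term": {"on_sale": True}},
                            {"range": {"price": {"lt": 50}}},
                        ]
                    }
                }
            },
        },
    }


body = build_facet_request({"match": {"title": "headphones"}})
print(sorted(body["aggs"]))
```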

Grouping / Variety

  • SOLR: grouping works but breaks count (fixed by additional facets)
  • ES: Field collapsing is not supported (ticket aged 4 years)

Spell Checking

  • SOLR: Implements an index-based spellchecker, which is considered rather weak.
  • ES: Using Suggester component
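
As a rough sketch of the suggester approach, the code below builds a term suggester request and applies the top-ranked option for each token of a response. The field name `title`, the suggestion name `spelling`, and both helper functions are assumptions made for the example. The sample response is hand-written to match the documented shape, not captured from a real cluster.

```python
def build_spellcheck_request(text, field="title"):
    """Build a term-suggester body; 'spelling' is an arbitrary label."""
    return {"suggest": {"spelling": {"text": text, "term": {"field": field}}}}


def apply_top_suggestions(entries):
    """Rebuild the text, replacing each token with its first option.

    Assumes options arrive best-first, which I believe is the term
    suggester's default ordering; tokens with no options are kept as typed.
    """
    corrected = []
    for entry in entries:
        options = entry.get("options", [])
        corrected.append(options[0]["text"] if options else entry["text"])
    return " ".join(corrected)


# Hand-written sample of the "suggest" section of a response.
sample_entries = [
    {
        "text": "wireles",
        "offset": 0,
        "length": 7,
        "options": [{"text": "wireless", "score": 0.85, "freq": 12}],
    },
    {"text": "headphones", "offset": 8, "length": 10, "options": []},
]

print(build_spellcheck_request("wireles headphones"))
print(apply_top_suggestions(sample_entries))
```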

Text Analysis Chain

Solr Feature | Exists in ElasticSearch | Notes
lower case | ✓ |
character mapping (to ascii) | ✓ | Using mapping char filter
regex replacements | ✓ | Using pattern replace char filter
tokenizers | ✓ | See the tokenizers docs
stop words | ✓ | Using Stop token filter
synonyms | ✓ | Using Synonym token filter
shingles | ✓ | Using Shingle token filter
stemmer (porter) | ✓ | Using Porter, kstem, or snowball
min length filter | ✓ | Using Length token filter
Ability to use separate analyzers for index vs query | ✓ | Using query string dsl
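
To show how such a chain might be assembled, here is a sketch of index settings that wire the components from this table into two custom analyzers, one for indexing and one for querying. All names (`to_ascii`, `my_stop`, `index_text`, and so on), the sample mappings, and the synonym list are invented for illustration. The field mapping uses `analyzer` plus `search_analyzer`, which is one way I know of to separate the two analyzers besides the query string DSL mentioned above; older releases used `index_analyzer` and the `string` field type instead of `text`, so adapt it to your version.

```python
# Token filters that ship with ElasticSearch and need no definition here.
BUILT_IN_FILTERS = {"lowercase", "porter_stem"}


def build_index_settings():
    """Return index settings defining an index-time and a query-time analyzer."""
    analysis = {
        "char_filter": {
            # Character mapping (to ASCII)
            "to_ascii": {"type": "mapping", "mappings": ["é => e", "ü => u"]},
            # Regex replacement: treat dashes and underscores as spaces
            "split_dashes": {
                "type": "pattern_replace",
                "pattern": "[-_]",
                "replacement": " ",
            },
        },
        "filter": {
            "my_stop": {"type": "stop", "stopwords": "_english_"},
            "my_synonyms": {"type": "synonym", "synonyms": ["tv, television"]},
            "my_shingles": {
                "type": "shingle",
                "min_shingle_size": 2,
                "max_shingle_size": 3,
            },
            "my_min_length": {"type": "length", "min": 2},
        },
        "analyzer": {
            "index_text": {
                "type": "custom",
                "char_filter": ["to_ascii", "split_dashes"],
                "tokenizer": "standard",
                "filter": ["lowercase", "my_stop", "my_synonyms",
                           "porter_stem", "my_min_length", "my_shingles"],
            },
            # Query side: same normalization, but no synonyms or shingles.
            "query_text": {
                "type": "custom",
                "char_filter": ["to_ascii", "split_dashes"],
                "tokenizer": "standard",
                "filter": ["lowercase", "my_stop", "porter_stem", "my_min_length"],
            },
        },
    }
    return {"settings": {"analysis": analysis}}


def build_title_field_mapping():
    """Field mapping fragment using different analyzers for index and query."""
    return {
        "title": {
            "type": "text",
            "analyzer": "index_text",
            "search_analyzer": "query_text",
        }
    }


settings = build_index_settings()
print(sorted(settings["settings"]["analysis"]["analyzer"]))
```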

Other

Solr Feature | Exists in ElasticSearch | Notes
Both cached and non-cached filters | ✓ | Using _cache:false
Default field value (index-time) | ✓ | Using null_value

Debugging and Analysis

SOLR: Detailed scoring for each result, text analyzer emulator in admin
ES: Using Explain API or plugins

Conclusions

Features

Feature-wise, ElasticSearch is catching up to Solr very fast, and in some aspects it is surpassing it, especially with the new aggregation framework.

Speed

On a single node, speed is very similar, although the stability and maturity of Solr make it the more reliable choice for single-node applications.

Distributed Setup

I couldn’t test this thoroughly, and this will vary hugely based on server setup, ZooKeeper optimizations, and other factors. However, ElasticSearch has the better reputation in this domain since it is engineered from the ground up to support distributed and cloud-based environments.

2 thoughts on “Solr vs. ElasticSearch – Feature Comparison”

  1. Solr encourages you to understand a little more about what you’re doing, and the chance of you shooting yourself in the foot is somewhat lower, mainly because you’re forced to read and modify the 2 well-documented XML config files in order to have a working search app.

    • I agree. Elasticsearch does hide a lot of these details (sometimes in a harmful way).

      I recommend that people start learning using a do-it-yourself technology like solr. Once they get the concepts and start feeling that Solr’s config-first approach is redundant, they can move to a more abstracted solution like elasticsearch.

      (This goes for other technologies too; bare-bones web development vs. full-fledged MVC frameworks is a good example.)
