How ElasticSearch and Solr Handle Data

We recently found a post that compares and contrasts the strengths and weaknesses of Solr and ElasticSearch. The Sematext Blog points us to “Solr vs. ElasticSearch: Part 2-Data Handling,” which covers how each open source search technology handles data, indexing, and language analysis. When it comes to data indexing, both accept documents through a Java API and HTTP calls, but ElasticSearch has two features that Solr does not: multiple document types inside a single index and nested documents. Nested documents let you go beyond a flat document structure by embedding sub-documents inside a parent, and multiple document types in a single index is pretty self-explanatory.
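To make the nested-document idea concrete, here is a minimal sketch in Python. The field names, the `blog_post` type, and the `flatten` helper are our own illustrations, not taken from the Sematext post, and the `"string"` field type is what older ElasticSearch releases used (later versions renamed it). The point is only to show what a flat structure loses.

```python
# Hypothetical mapping: "comments" is declared as nested, so each comment is
# kept as its own sub-document. Field type names vary by version.
blog_post_mapping = {
    "blog_post": {
        "properties": {
            "title": {"type": "string"},
            "comments": {
                "type": "nested",
                "properties": {
                    "author": {"type": "string"},
                    "text": {"type": "string"},
                },
            },
        }
    }
}

# A document with structure: each comment keeps its author and text together.
blog_post = {
    "title": "Solr vs. ElasticSearch",
    "comments": [
        {"author": "alice", "text": "Nice comparison"},
        {"author": "bob", "text": "What about faceting?"},
    ],
}


def flatten(doc):
    """Illustrative helper: squash lists of sub-documents into multi-valued fields."""
    flat = {}
    for key, value in doc.items():
        is_child_list = (
            isinstance(value, list)
            and value
            and all(isinstance(item, dict) for item in value)
        )
        if is_child_list:
            for child in value:
                for child_key, child_value in child.items():
                    flat.setdefault(f"{key}.{child_key}", []).append(child_value)
        else:
            flat[key] = value
    return flat


flat_post = flatten(blog_post)
```

In `flat_post` the authors and the comment texts end up in two separate lists, so the search engine can no longer tell which author wrote which comment. Nested documents avoid that loss.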

Both Solr and ElasticSearch allow index manipulation, but they go about it very differently.

Solr lets you control all cores that live inside your cluster with the CoreAdmin API: you can create cores, rename, reload, or even merge them into another core. In addition to the CoreAdmin API, Solr enables you to use the Collections API to create, delete, or reload a collection.
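As a rough illustration of what those calls look like, the sketch below builds CoreAdmin and Collections API URLs in Python without sending them. The host, port, and the core and collection names are assumptions made for the example; check the parameters against the documentation for your Solr version before relying on them.

```python
from urllib.parse import urlencode

# Assumed local Solr instance; adjust to your own deployment.
SOLR_BASE = "http://localhost:8983/solr"


def core_admin_url(action, **params):
    """Build a CoreAdmin API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/cores?{query}"


def collections_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


# Core and collection names below are hypothetical.
create_core = core_admin_url("CREATE", name="products", instanceDir="products")
rename_core = core_admin_url("RENAME", core="products", other="catalog")
reload_core = core_admin_url("RELOAD", core="catalog")
merge_cores = core_admin_url("MERGEINDEXES", core="catalog", srcCore="staging")
create_collection = collections_url("CREATE", name="logs", numShards=2)
```

Each URL can then be requested with any HTTP client, for example `urllib.request.urlopen(create_core)` against a running Solr server.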

ElasticSearch, on the other hand, allows you to control indices using its HTTP API:

During creation you can specify the number of shards an index should have, and you can decrease or increase the number of replicas with nothing more than a single API call. You cannot change the number of shards yet. Of course, you can also define mappings and analyzers during index creation, so you have all the control you need to index a new type of data into your cluster.
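A minimal sketch of those two calls, again in Python and again without touching the network: one request creates an index with a fixed shard count, the other changes only the replica count afterward. The host, the index name `articles`, and the helper function names are assumptions for illustration.

```python
import json

# Assumed local ElasticSearch node; adjust to your own deployment.
ES_BASE = "http://localhost:9200"


def create_index_request(index, shards, replicas):
    """Describe the request that creates an index with the given settings."""
    body = {
        "settings": {
            "number_of_shards": shards,      # fixed once the index exists
            "number_of_replicas": replicas,  # can be changed later
        }
    }
    return ("PUT", f"{ES_BASE}/{index}", json.dumps(body))


def set_replicas_request(index, replicas):
    """Describe the single settings call that changes the replica count."""
    body = {"index": {"number_of_replicas": replicas}}
    return ("PUT", f"{ES_BASE}/{index}/_settings", json.dumps(body))


create_request = create_index_request("articles", shards=5, replicas=1)
scale_request = set_replicas_request("articles", replicas=2)
```

Each tuple holds the HTTP method, the URL, and the JSON body, so it can be handed to whichever HTTP client you prefer. Mappings and analyzers would go into the same creation body alongside `settings`.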

Partial document updates are supported too, but in practice the search engine reindexes the whole document on its side, so it only appears to be updated in place. Reading the rest of the article, you will see that Solr has better multilingual capabilities and that ElasticSearch has the better internal search engine. Still, when it comes to deciding which is the better search application, it comes down to what your project needs. If, however, you decide to go with Apache Lucene, you may want to check out LucidWorks.
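For readers curious about the partial-update point above, here is a toy in-memory model in Python of "update by reindexing." It is not how either engine is implemented; the `ToyIndex` class and its methods are hypothetical and only show that a partial update amounts to merging the changes into the stored document and indexing the result again.

```python
class ToyIndex:
    """Toy stand-in for a search index that can only index whole documents."""

    def __init__(self):
        self._docs = {}
        self.reindex_count = 0  # how many full-document index operations ran

    def index(self, doc_id, doc):
        """Store (or replace) the whole document under doc_id."""
        self._docs[doc_id] = dict(doc)
        self.reindex_count += 1

    def partial_update(self, doc_id, changes):
        """Merge changes into the stored document, then reindex all of it."""
        merged = {**self._docs[doc_id], **changes}
        self.index(doc_id, merged)

    def get(self, doc_id):
        """Return a copy of the stored document."""
        return dict(self._docs[doc_id])


toy = ToyIndex()
toy.index("1", {"title": "Data Handling", "views": 1})
toy.partial_update("1", {"views": 2})
```

After the update, `toy.get("1")` returns the merged document and `toy.reindex_count` is 2: the "partial" update cost a second full index operation.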

Whitney Grace, October 19, 2012

Sponsored by ArnoldIT.com, developer of Augmentext.