Almost a year old and having survived field testing by IT professionals, SolrCloud has steadily gained recognition as the heavy-duty distributed mode of its parent software, Solr. The Sematext Blog crew got to test SolrCloud’s capabilities when they were handed a project involving search over Big Data stored in an HBase cluster. Read about it in “The New SolrCloud: Overview.” Because of the large volume of data, the team needed a search cluster that could handle the data's current and future growth while meeting their indexing and reliability requirements. They did not want to use ElasticSearch, because Solr was designed specifically for large search clusters.
And so we took the opportunity to use SolrCloud and some of its features not present in previous versions of Solr. In particular, we wanted to make use of Distributed Indexing and Distributed Searching, both of which SolrCloud makes possible. In the process we looked at a few JIRA issues, such as SOLR-2358 and SOLR-2355, and we got familiar with relevant portions of SolrCloud source code. This confirmed SolrCloud would indeed satisfy our needs for the project and here we are sharing what we’ve learned.
Venture into the rest of the article to learn about the setup architecture and how the team configured SolrCloud to handle indexing and queries. It goes to show that Apache Lucene is the premier search base of choice on which to build a search application. If you have massive loads of data that need a search engine, LucidWorks’ software was developed to ease the burden.
Whitney Grace, October 18, 2012
Sponsored by ArnoldIT.com, developer of Augmentext.
“They did not want to use ElasticSearch, because Solr was designed specifically for large search clusters”
Seriously?
The “cloud” bits of Solr were bolted on in the 4.0 release. ElasticSearch was designed with distributed clusters in mind from the ground up.