If your Sitecore-based project has a custom SOLR index and it grows too large (more than one million documents), the rebuild may fail with a SOLR connection timeout exception. Today, I’ll tell you how we dealt with this issue.
Problem
In our Sitecore project, we built a custom product index containing more than two million documents. This index has a custom crawler with complex logic, which includes reading thousands of JSON files and calling different APIs (e.g., product ingestion, third-party data storage).
To implement the search functionality, we used the SOLR Cloud topology (from the SearchStax vendor) with the Switch On Rebuild feature (Zero downtime rebuild).
A custom automation process triggers an index rebuild every night, outside working hours. It takes about six hours, give or take, to process two million documents.
Once, when the index contained 500,000–600,000 documents, we got the following SOLR connection timeout exception:
    Exception: SolrNet.Exceptions.SolrConnectionException
    Message: The operation has timed out
       at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters)
       at SolrNet.Impl.SolrConnection.Post(String relativeUrl, String s)
       at SolrNet.Impl.LowLevelSolrServer.SendAndParseHeader(ISolrCommand cmd)
       at Sitecore.ContentSearch.SolrProvider.SwitchOnRebuildSolrSearchIndex.PerformRebuild(Boolean resetIndex, Boolean optimizeOnComplete, IndexingOptions indexingOptions, CancellationToken cancellationToken)
       at Sitecore.ContentSearch.SolrProvider.SolrSearchIndex.Rebuild(Boolean resetIndex, Boolean optimizeOnComplete)
This happened when the rebuild process had been completed and Sitecore sent a request to SOLR to optimize the index.
The optimize request is sent from Sitecore.ContentSearch.SolrProvider.SwitchOnRebuildSolrSearchIndex.PerformRebuild(Boolean resetIndex, Boolean optimizeOnComplete, IndexingOptions indexingOptions, CancellationToken cancellationToken).
Sitecore enthusiasts have already discussed this issue.
Solution
To resolve this issue, we had to increase the ConnectionTimeout value. We changed it to 600,000 ms, which equals 10 minutes. And it worked!
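For reference, here is a minimal sketch of a Sitecore patch file that raises the timeout. It assumes the value in question is the ContentSearch.Solr.ConnectionTimeout setting, which is expressed in milliseconds; check the setting name against the Solr configuration files of your Sitecore version before using it.

```xml
<!-- Sketch only: assumes the timeout is controlled by the
     ContentSearch.Solr.ConnectionTimeout setting (milliseconds). -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <setting name="ContentSearch.Solr.ConnectionTimeout">
        <!-- 10 minutes = 10 * 60 * 1000 ms = 600000 ms -->
        <patch:attribute name="value">600000</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```

Keeping the change in a separate patch file, rather than editing the stock config, makes it easier to carry through Sitecore upgrades.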
But our index was growing fast, and when it exceeded 2 million documents, we got the SOLR timeout issue again.
We increased that parameter to 1,200,000 ms (20 minutes), then to 3,600,000 ms (1 hour), but that didn’t help. We had to find the root cause of the problem and fix it.
Let’s take a look at how the official SOLR documentation describes optimization:
As you can see, optimization may improve the query performance when the index has become fragmented by many updates.
In our case, we had a daily full rebuild process, which means that we had a fresh index every day. And, as noted in the SOLR documentation, we did not need to optimize it:
Optimizing is not recommended unless it can be performed regularly as it may lead to a significantly larger portion of the index consisting of deleted documents than would normally be the case.
To avoid the optimization process, you can develop your own type of index inherited from the one you used before.
In our case, it was the SwitchOnRebuildSolrCloudSearchIndex (we used index swapping for zero downtime).
Then, you need to override the Rebuild() method and call the base method, passing false for the optimizeOnComplete parameter. This means that Sitecore will not send the Optimize command to SOLR:
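    // The original listing named one further base type after SwitchOnRebuildSolrCloudSearchIndex.
    public class NotOptimizedSwitchOnRebuildSolrCloudSearchIndex : SwitchOnRebuildSolrCloudSearchIndex
    {
        // Assumption: the first three parameters of each constructor
        // (name, mainalias, rebuildalias) mirror the base-class constructors.
        public NotOptimizedSwitchOnRebuildSolrCloudSearchIndex(
            string name, string mainalias, string rebuildalias,
            string activecollection, string rebuildcollection,
            ISolrOperationsFactory solrOperationsFactory,
            IIndexPropertyStore propertyStore) : base(
                name, mainalias, rebuildalias, activecollection, rebuildcollection,
                solrOperationsFactory, propertyStore)
        { }

        public NotOptimizedSwitchOnRebuildSolrCloudSearchIndex(
            string name, string mainalias, string rebuildalias,
            string activecollection, string rebuildcollection,
            IIndexPropertyStore propertyStore) : base(
                name, mainalias, rebuildalias, activecollection, rebuildcollection,
                propertyStore)
        { }

        public NotOptimizedSwitchOnRebuildSolrCloudSearchIndex(
            string name, string mainalias, string rebuildalias,
            string activecollection, string rebuildcollection,
            IIndexPropertyStore propertyStore,
            ISolrProviderContextFactory providerContextFactory,
            string @group) : base(
                name, mainalias, rebuildalias, activecollection, rebuildcollection,
                propertyStore, providerContextFactory, @group)
        { }

        // Rebuild without asking SOLR to optimize the index afterwards.
        // resetIndex: true is an assumption; optimizeOnComplete: false is the actual fix.
        public override void Rebuild()
        {
            base.Rebuild(resetIndex: true, optimizeOnComplete: false);
        }
    }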
Then, insert the created type into the index config:
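A minimal sketch of such a patch could look like the following. The index id (products_index) and the MyProject.Search namespace and assembly name are hypothetical placeholders; replace them with your own. Only the type attribute of the existing index definition is changed, so the rest of the index configuration stays as it was.

```xml
<!-- Sketch only: "products_index" and "MyProject.Search" are hypothetical names. -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="products_index">
            <!-- Swap the stock index type for the subclass that skips optimization -->
            <patch:attribute name="type">MyProject.Search.NotOptimizedSwitchOnRebuildSolrCloudSearchIndex, MyProject.Search</patch:attribute>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
```

Because the subclass keeps the same constructor parameters as its base type, the existing param elements of the index definition should not need to change.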
And that should do the trick! Happy coding, dear Sitecorian. See you next time.
P.S. Thanks to my colleague Vadim Birkos for brainstorming this issue.