We are on Sitecore 9.2 and Solr 7.5, with Solr hosted on Azure VMs.
All of a sudden, rebuilding the Solr indexes started failing, and we saw the exceptions below in the logs:
Job started: Index_Update_IndexName=sitecore_weblive_index|#Exception: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> SolrNet.Exceptions.SolrConnectionException:
500 65 org.apache.solr.common.SolrException java.io.FileNotFoundException
java.io.FileNotFoundException: C:\solr\server\solr\sitecore_weblive_index\data\tlog\tlog.0000000000000631423 (The file or directory is corrupted and unreadable)
org.apache.solr.common.SolrException: java.io.FileNotFoundException: C:\solr\server\solr\sitecore_weblive_index\data\tlog\tlog.0000000000000631423 (The file or directory is corrupted and unreadable)
at org.apache.solr.update.TransactionLog.<init>(TransactionLog.java:191)
at org.apache.solr.update.UpdateLog.newTransactionLog(UpdateLog.java:445)
at org.apache.solr.update.UpdateLog.ensureLog(UpdateLog.java:1104)
at org.apache.solr.update.UpdateLog.deleteByQuery(UpdateLog.java:618)
at org.apache.solr.update.DirectUpdateHandler2.deleteByQuery(DirectUpdateHandler2.java:498)
We were also seeing the exception below quite frequently:
Job started: Index_Update_IndexName=sitecore_master_index|#Exception: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> SolrNet.Exceptions.SolrConnectionException:
500 5 this IndexWriter is closed
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:749)
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:763)
at org.apache.lucene.index.IndexWriter.deleteDocuments(IndexWriter.java:1525)
at org.apache.solr.update.DirectUpdateHandler2.deleteByQuery(DirectUpdateHandler2.java:492)
at org.apache.solr.update.processor.RunUpdateProcessor.processDelete(RunUpdateProcessorFactory.java:78)
at org.apache.solr.update.processor.UpdateRequestProcessor.processDelete(UpdateRequestProcessor.java:59)
Troubleshooting Step 1:
We mostly saw these exceptions on the indexes for the Master and publishing target databases. While rebuilding the index for the Master database, we saw the exception below:
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
The stack trace shows that the issue originates in Solr and is probably not caused by Sitecore. Our first question was whether the Solr server had enough free disk space. We followed this article for more information.
But when we checked, the Solr VMs had enough space allocated.
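A free-space check like the one described above can be scripted. The following is only a minimal sketch in Python: the function name `free_space_report` and the 20% threshold are assumptions made for illustration, not something Sitecore or Solr prescribes.

```python
import shutil


def free_space_report(path, min_free_fraction=0.2):
    """Return (free_gib, free_fraction, ok) for the volume that holds `path`.

    `min_free_fraction` is an illustrative threshold, not an official
    Solr or Sitecore recommendation.
    """
    usage = shutil.disk_usage(path)          # total, used, free in bytes
    free_fraction = usage.free / usage.total
    free_gib = usage.free / 1024 ** 3
    return free_gib, free_fraction, free_fraction >= min_free_fraction


# Example call (the path is the Solr home used in this post):
# free_gib, fraction, ok = free_space_report(r"C:\solr\server\solr")
# print(f"{free_gib:.1f} GiB free ({fraction:.0%}), enough: {ok}")
```

Running this on the Solr VM against the drive that holds the index folders gives a quick yes/no answer before digging further.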
Troubleshooting Step 2:
When rebuilding the indexes for the publishing target databases, on the other hand, we got the exception below:
java.io.FileNotFoundException: C:\solr\server\solr\sitecore_weblive_index\data\tlog\tlog.0000000000000631423 (The file or directory is corrupted and unreadable)
This again points to a file-system error. We tried deleting the files from our index folder (weblive_index\data\index) and then rebuilding.
Please take a backup of the folder before doing any delete operation. Please also check that your CM can connect to the Solr instance without any network problems.
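One way to check connectivity from the CM server is to call the core's ping handler (`/admin/ping`), which Solr exposes per core. This is a rough sketch, assuming Python is available on the CM box. The helper names and the example base URL are hypothetical, and with a self-signed HTTPS certificate the call will report a failure unless the certificate is trusted on that machine.

```python
import json
import urllib.error
import urllib.request


def ping_url(base_url, core):
    """Build the ping URL for one core, e.g. .../solr/<core>/admin/ping."""
    return f"{base_url.rstrip('/')}/{core}/admin/ping?wt=json"


def solr_core_is_up(base_url, core, timeout=5.0):
    """Return True only if the ping handler answers with status OK."""
    try:
        with urllib.request.urlopen(ping_url(base_url, core), timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Network error, TLS error, HTTP error, or a non-JSON response.
        return False
    return isinstance(body, dict) and body.get("status") == "OK"


# Example call (the host name is a placeholder, not our real server):
# solr_core_is_up("https://solr.example:8983/solr", "sitecore_weblive_index")
```

A `False` result here only says that the ping failed. It does not tell a network problem apart from a core that failed to load, so the Solr log is still the place to look next.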
But we were unable to delete the files under the tlog folder: Windows reported that the files were in use by another process.
Troubleshooting Step 3:
We tried the following steps:
- Stop the “SOLR” service.
- Remove the contents of the “C:\solr\server\solr\sitecore_weblive_index\data” folder.
- Start the “SOLR” service.
- Populate the schema and rebuild the “sitecore_weblive_index” index.
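The backup-then-clear part of these steps can be sketched as a small Python helper. This is an illustration under assumptions, not the exact procedure we ran: the function name `backup_and_clear` and the backup path in the example are hypothetical, and the Solr service must already be stopped, otherwise the tlog files stay locked and the delete fails.

```python
import shutil
from pathlib import Path


def backup_and_clear(data_dir, backup_dir):
    """Copy `data_dir` to `backup_dir`, then empty `data_dir`.

    Returns the number of top-level entries removed. The copy runs first,
    and copytree raises if `backup_dir` already exists, so an earlier
    backup is never overwritten.
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    shutil.copytree(data_dir, backup_dir)

    removed = 0
    for child in data_dir.iterdir():
        if child.is_dir():
            shutil.rmtree(child)   # e.g. the index and tlog folders
        else:
            child.unlink()
        removed += 1
    return removed


# Example call, to be run only while the Solr service is stopped
# (the backup location is a placeholder):
# backup_and_clear(r"C:\solr\server\solr\sitecore_weblive_index\data",
#                  r"D:\backup\sitecore_weblive_index_data")
```

The data folder itself is kept and only its contents are removed. Starting the service and rebuilding the index from Sitecore remain manual steps.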
Please see the following articles about setting up Solr and creating indexes for more details:
- https://sitecorediaries.org/2020/05/24/how-to-create-a-custom-solr-index/
- https://doc.sitecore.com/en/developers/90/platform-administration-and-architecture/walkthrough–setting-up-solr.html
To find out which process is holding a file, you can use the “Process Explorer” tool: https://www.screencast.com/t/M6E1fI25Q4T
Please see the following articles for more details:
- https://helpcenter.gsx.com/hc/en-us/articles/115015880627-How-to-Identify-which-Windows-Process-is-Locking-a-File-or-Folder
- https://superuser.com/questions/117902/find-out-which-process-is-locking-a-file-or-folder-in-windows
After a couple of attempts, we were able to delete the tlog files, create a new index, and rebuild it, which succeeded this time.
Unfortunately, we were unable to trace why the tlog files got corrupted. We will most likely follow the steps above if we face similar issues again.
But as a permanent solution, we are planning to move our Solr instances to SearchStax soon. It is a cloud-based Solr solution that comes in two versions: Managed Solr and Solr SaaS. With Solr being open source, we have faced many issues like this where we could not figure out the actual root cause and ended up with either a workaround or a temporary fix. SearchStax, by contrast, provides end-to-end management of your Solr instances. Looking forward to it.
Hope this helps!