We are on Sitecore 9.0.1 and on Azure PaaS. We use Redis Cache for Session management.
We were experiencing high CPU and memory usage on the CD app service. The website was also throwing 502 request timeout errors, and we noticed the following exception in Application Insights:
[TimeoutException: Timeout performing EVAL, inst: 0, mgr: Inactive, err: never, queue: 26953, qu: 26953, qs: 6, qc: 8, wr: 6, wq: 6, in: 9, ar: 9, IOCP: (...]
StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl(Message message, ResultProcessor`1 processor, ServerEndPoint server) +661
StackExchange.Redis.RedisBase.ExecuteSync(Message message, ResultProcessor`1 processor, ServerEndPoint server) +122
StackExchange.Redis.RedisDatabase.ScriptEvaluate(String script, RedisKey[] keys, RedisValue[] values, CommandFlags flags) +141
RedisConnectionWrapper.TryTakeWriteLockAndGetData(String sessionId, DateTime lockTime, Object& lockId, ISessionStateItem...
StackExchangeClientConnection.RetryForScriptNotFound(Func`1 redisOperation) +146
StackExchangeClientConnection.RetryLogic(Func`1 redisOperation) +159
StackExchangeClientConnection.Eval(String script, String[] keyArgs, Object[] valueArgs) +669
RedisSessionStateProvider.GetItemFromSessionStore(Boolean isWriteLockRequired, HttpContext context, String id, Boolean& ...
RedisSessionStateProvider.GetItemExclusive(HttpContext context, String id, Boolean& locked, TimeSpan& lockAge, Object& ...
System.Web.SessionState.SessionStateModule.GetSessionStateItem() +169
We raised a support ticket, and they shared a knowledge base article that explains the root cause and the relevant solution very precisely.
Description:
If a burst of traffic reaches the application while no free threads are available, the Timeout Exception is thrown as a result of the Redis driver design. The Redis driver blocks the request thread until a response from the Redis Server has been received and data is fully parsed by the callback. A lack of free threads to invoke the callback for parsing received data in a timely manner (one second by default) leads to a timeout exception.
Technical background:
A thread pool is allowed to create new worker threads to process incoming load under certain conditions. Adding more threads is beneficial only if free CPU resources are available. A thread pool injects new threads when the CPU usage is below 80%.
Since the CPU performance counter shows the system state for the previous second, the load produced by newly created threads is reflected only after a second. To avoid overloading the CPU, the thread pool therefore creates no more than 2 threads per second.
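To make the effect of that limit concrete, here is a rough back-of-the-envelope sketch in Python. It is a simplified model and not how the .NET thread pool is actually implemented: it assumes threads up to the configured minimum are created without delay and that growth beyond the minimum is capped at 2 threads per second. The burst size (100 threads) and the starting minimum (8) are hypothetical numbers chosen only for illustration.

```python
import math


def seconds_to_reach(target_threads: int, min_threads: int, threads_per_second: int = 2) -> int:
    """Estimate how long the pool needs to grow from min_threads to target_threads.

    Simplified model: threads up to min_threads are available immediately;
    beyond that, at most threads_per_second new threads are injected.
    """
    missing = max(0, target_threads - min_threads)
    return math.ceil(missing / threads_per_second)


# Hypothetical burst that needs 100 worker threads at once.
print(seconds_to_reach(100, min_threads=8))    # 46
print(seconds_to_reach(100, min_threads=100))  # 0
```

Under these assumptions, a pool that starts at 8 threads needs about 46 seconds to catch up with the burst, far longer than the one-second default timeout, while a pool whose minimum already covers the burst has no ramp-up delay at all. That is the motivation for raising the minimum thread count in the solution.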
Solution (XP 8.0.0 – 9.0.2):
Since we were on Sitecore 9.0.1, the following solution from the knowledge base article applied to us.
To secure free threads and tolerate network issues, the application needs to:
- Allow the thread pool to quickly scale up to a certain minimum number of threads to bypass the thread pool growth speed limit.
- Limit the number of concurrent requests that ASP.NET processes at one time to prevent thread pool clogging.
- Limit the maximum size of the thread pool to prevent CPU over-usage (jams).
- Increase operationTimeoutInMilliseconds to tolerate application CPU saturation (the thread does not receive CPU time).
- Define connectionTimeoutInMilliseconds and retryTimeoutInMilliseconds to tolerate network failures.
To configure an application to secure free threads and tolerate network issues:
- Put the Sitecore.Support.210408 support patch assembly (Sitecore.Support.210408.dll) into the \bin folder.
- Put the Sitecore.Support.210408.config file into the \App_Config\Include\zzz folder.
- In both the Web.config file and the \App_Config\Sitecore\Marketing.Tracking\Sitecore.Analytics.Tracking.config file (or the \App_Config\Include\Sitecore.Analytics.Tracking.config file for Sitecore XP 8.0.0-8.2.7), define:
  - operationTimeoutInMilliseconds="5000"
  - retryTimeoutInMilliseconds="16000"
  - connectionTimeoutInMilliseconds="3000"
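For reference, this is roughly what the private session state provider entry in Web.config might look like with the three timeout attributes added. Treat it as a sketch: the provider name, type, connectionString, pollingInterval and applicationName values are assumptions based on a typical Sitecore Redis setup and may differ in your solution. Only the three timeout attributes come from the article.

```xml
<sessionState mode="Custom" customProvider="redis" timeout="20">
  <providers>
    <!-- name, type, connectionString, pollingInterval and applicationName
         are assumed values; keep whatever your solution already uses. -->
    <add name="redis"
         type="Sitecore.SessionProvider.Redis.RedisSessionStateProvider, Sitecore.SessionProvider.Redis"
         connectionString="redis.sessions"
         pollingInterval="2"
         applicationName="private"
         operationTimeoutInMilliseconds="5000"
         retryTimeoutInMilliseconds="16000"
         connectionTimeoutInMilliseconds="3000" />
  </providers>
</sessionState>
```

The same three attributes would go on the Redis provider entry in Sitecore.Analytics.Tracking.config.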
Notes:
- The configuration values (both thread numbers and operation timeouts) are given for illustration purposes only and act only as a starting point.
- The final values must be tuned per-solution as a result of load testing.
- The source code of the patch: ConfigureThreadPool.cs.
Hope this helps!