Short version: Stopping the Distributed cache service gave me great performance! From 6.10 s to 79 ms
Something is just a liiitle bit off…?
Long story: This is a bit of reality right here…
I was about to give up on one of my lab SharePoint 2013 environments because it was so extremely slow all the time.
Warmup scripts, reloads, more memory, more CPU, stopping services, stopping search…nothing helped.
I had a constant load time of 6+ seconds on all aspx pages, 6.10-6.20 something. Even when the page had just loaded and I pressed F5 to reload, it still took 6.10 seconds.
This was an environment that got on your nerves…
So, after spending all day looking for a solution, or rather for the little issue that caused this, I more or less gave up.
– CPU peaked at 40% on SQL; SharePoint cranked it up to 18% at most…
– Memory consumption was at 25% of the 12GB SharePoint had…
– SQL was lightning fast to all other SharePoint farms…
– Network utilization showed about 100Kbps at the most…
I scoured the internet as usual and found nothing but the standard: add more memory, add more CPU, stop services, stop search…
None of that helped and I had tried it all…
Then…when all hope was lost, I got on a call with my excellent SharePoint buddy Mattias Gutke and we talked about the issue. His server, on a laptop with SSD disks, showed a 50-100 ms load time on all pages, and a reload did not even produce a flicker…
Then, as often happens, we came to discuss the Distributed cache service, what it did and why it was there and so on…I had already had a look at it but could not find any reason why a default cache would give me this lousy performance. Then I had a look at the timestamp in the F12 Developer Tools – Network tab – Start capturing. I saw home.aspx load, and it took the usual 6.10 seconds.
The timestamp could be found in the detailed view and on the response header.
I memorized the timestamp (which was in GMT) and opened up my ULS log. In the log, at the exact time of the response header, I saw errors from the distributed cache.
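The matching step can be sketched in Python. This is only an illustration under assumptions, not the procedure used above: it assumes the ULS log is tab-separated with a first column formatted like `04/17/2013 21:15:33.10` in the server's local time, and that you know the server's UTC offset. The function names, the offset, and the sample lines are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumed layout of the first ULS column, e.g. "04/17/2013 21:15:33.10".
ULS_TIME_FORMAT = "%m/%d/%Y %H:%M:%S.%f"


def response_time_utc(date_header):
    """Parse an HTTP Date response header (sent in GMT) into an aware UTC datetime."""
    return parsedate_to_datetime(date_header).astimezone(timezone.utc)


def uls_entries_near(lines, when_utc, server_utc_offset_hours, window_seconds=7.0):
    """Return log lines written up to window_seconds before when_utc (or 1 s after).

    The window defaults to 7 s so that it covers a page load of roughly 6 s.
    """
    server_tz = timezone(timedelta(hours=server_utc_offset_hours))
    hits = []
    for line in lines:
        # First column only; a trailing "*" is assumed to mark a continuation entry.
        stamp = line.split("\t", 1)[0].strip().rstrip("*")
        try:
            logged = datetime.strptime(stamp, ULS_TIME_FORMAT).replace(tzinfo=server_tz)
        except ValueError:
            continue  # header row, or a line that does not start with a timestamp
        delta = (when_utc - logged).total_seconds()
        if -1.0 <= delta <= window_seconds:
            hits.append(line)
    return hits


# Hypothetical sample data: server at UTC+2, response header read from the F12 tools.
sample_log = [
    "Timestamp\tProcess\tArea\tMessage",
    "04/17/2013 21:15:33.10\tw3wp.exe\tSharePoint Foundation\tdistributed cache error",
    "04/17/2013 21:10:00.00\tw3wp.exe\tSharePoint Foundation\tunrelated entry",
]
when = response_time_utc("Wed, 17 Apr 2013 19:15:38 GMT")
for entry in uls_entries_near(sample_log, when, server_utc_offset_hours=2):
    print(entry)
```

To use it on a real log, read the file into a list of lines and pass the Date value copied from the response header; widen or narrow `window_seconds` to match the load time you measured.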
I decided that troubleshooting the distributed cache would have to wait, it was getting late…but, before disconnecting the Lync call with Mattias, we decided to try and see just what would happen if I stopped the distributed cache service and loaded the page.
No sooner said than done:
Now, I loaded the same site:
With the Distributed cache service running:
Notice any difference? Now my SharePoint farm is lightning fast!!! From 6.10 seconds down to 79 ms!
Why is this, you ask? No idea, something misconfigured or perhaps this is standard when using a single SharePoint server…anyway, today I don’t care.
Stop the service and the performance is great!
Hope this may help you as it did me!
Thanks to:
Mattias Gutke at CAG. Again, my SharePoint sparring partner no 1…
___________________________________________________________________________________________________
Enjoy!
Regards
Disabling the disty cache should never be a knee-jerk reaction though! Not even on a dev environment. It would be interesting to see the stack trace of those exceptions in the ULS?!
Anders, hardly a knee-jerk reaction 🙂
‘Warmup scripts, reloads, more memory, more CPU, stopping services, stopping search…nothing helped…’
I spent hours. I could have chosen to spend yet more hours or days on troubleshooting, but I did not feel like it this time…
I still have the logs though…
Regards // Thomas
nice trick, done it. witnessed it – performance increased. thanks!
Thank you. This saved many days worth of troubleshooting with Microsoft Support
NP. You should, however, try to find the root cause and fix it. Even if performance is restored, in a production environment you would want to have the distributed cache functionality restored. Some features will not even work without it.
But, you are welcome!
Regards // Thomas
Marked improvement! Can’t even imagine how much time I just saved trying to ‘tune’ SP2013 to improve the otherwise painfully slow responsiveness! Thank you!!
Thanks, works like a charm. Thanks 2 for Mattias Gutke at CAG. Again, my SharePoint sparring partner no 1…
I think stopping Distributed Cache is not a solution! Additionally, there will be a lot of unexpected ULS log entries regarding Distributed Cache. Investigate which account the service is running under and register that account as a managed account. Then restart the Distributed Cache service.
This helped me decrease the response time from 6000 ms to 200 ms.
Hi.
Thanks for your feedback.
You are right when it comes to a production or test environment where you rely on the DC service.
But in test/lab, when you only want to get things going fast…stopping it does the trick 9 times out of 10.
If you read the entire post, I do not say stopping it is the solution…😃
Regards
// Thomas
Hi again.
To prove that I agree 😃
Locate my post named ‘How to: Change the Distributed Cache Service managed account 2013’ that I wrote a very long time ago…
Could not get the link in, using the phone…
Regards
// Thomas
Thanks for this – it improved my performance dramatically. I found the culprit by removing the DC service from all machines in the farm except one, using Remove-SPDistributedCacheServiceInstance, and then moving the service onto each of the other servers, one at a time, until performance dropped off.
Cheers
Gareth
My god! It’s sooooo fast now. Thanks ever so much!
🙂 As it should be even with the DC running. My advice: try to figure out why it failed before and fix it. Some components require it to be running.
Regards
// Thomas
Great help, thanks, Disabling distributed cache did work.
So, FYI for anyone having SharePoint performance issues: I found something while dealing with a client who had a huge drop in site load time. I was stumped for some time. One thing that seemed to help with load time was enabling the BLOB cache, which many articles talk about. It takes the file types you choose, such as jpg, and stores them locally on the SharePoint server. If you have multiple WFE servers this could possibly cause a problem, as this cache won’t clear out until you force it. So a nightly or hourly script, depending on the load of the servers, could clear this data if these items change throughout the day. In my situation they don’t change much at all. The biggest thing that helped me, and one I could not find an answer for on the internet, was in IIS, in the application pools area. I reviewed all the settings and noticed that the “Recycling…” configuration was set to recycle the AppPool automatically “After every 10 transactions”. This was my culprit. Changing this setting to recycle every hour, which may very well still be excessive, boosted SharePoint performance drastically. I am now getting pages to load in under 1 second. I don’t have a huge number of users logging in, but if it helps someone else overcome this problem, I’m glad I was able to help!
Recycling your application pool every hour is totally unnecessary unless you have custom code with memory leaks! Essentially, what you do is throw every single user that has a session on the WFE off the server. If I were a user on that farm I would be very unhappy with that solution 🙂 If you have memory issues you can consider setting Specific time(s) to a time of day when you have few or no users, but in general the App Pool will recycle by itself if it uses excessive amounts of RAM. Whoever set the Fixed # of requests clearly did not have a clue how that affects performance and end users on the server…
I agree 100% with Anders.
In addition, you would probably need to run a wakeup script after each recycle to avoid an uproar among the users….
Thanks for the update. The SharePoint solution is not being used consistently by all users throughout the day. For now only a few users are using it in production. Eventually it will be their all-in-one solution for most of the business except their ERP system. I’m only dealing with about 70 users or so. But I thought it was interesting that someone had set that parameter to recycle the app pool after 10 transactions. We have custom programming in SharePoint for web parts, but nothing extreme that is causing the AppPool to fail or act differently than before. I will update the system to recycle in the early morning. Thanks for the tip. I wasn’t quite sure what the “Recommended” setting was for the AppPool.
Would be interesting to know why the problem appears and how to solve it properly!!
If you need to remove a server from the Cache Cluster, the safe way to do this is first to use Stop-SPDistributedCacheServiceInstance with the -Graceful parameter. This transfers any cached data to another server, and can therefore take some time to perform. Afterwards you can safely run Remove-SPDistributedCacheServiceInstance to make the current server a non-Cache Host.
I did it too. My goodness, SharePoint 2013 finally works!
Hi.
It’s great, isn’t it? If it’s production, try to fix the underlying issue as well. DC is required for some services in SP2013.
But short term, this usually does the trick.
Regards
// Thomas
Thomas you keep saying try to fix the underlying issue – do you or anyone else know what the underlying issue would be for a SP 2013 Server configured with all the proper specs to be so slow with DC turned on? And fast when turned off? Turning off is 100% faster. Thx.
Hi Kevin.
The times I have had this issue and successfully fixed it, it was the service accounts on the DC service.
But it can probably be a number of things causing it, the ULS log may tell you a lot about what the root cause is, in its own way…
Regards
// Thomas
I didn’t want to stop the Distributed Cache service so I went with your suggestion of running it with a different managed account and this seems to have done the trick. SharePoint pages now load much faster. Thanks!
http://www.sbrickey.com/Tech/Blog/Post/SharePoint_DEV_Environment_Tuning
very useful , Thanks 🙂
I have read a bunch of docs of Microsoft about sharepoint 2013 and the solution is on a wordpress blog 🙂
Thank you very much
Thanks!! That works.
it works! thank you very much.
I applied it on my test farm, which consists of a single machine.
Excellent post, thanks!
I did try this on my test environment and indeed the load time dropped, thank you.