Distributed Cache Status Down, CacheHostInfo is null

This problem troubled me for two days straight while I had to stay in the data center (with its crazily cold air-con) to troubleshoot, and it was driving me crazy.

I have two (2) SharePoint 2013 app servers that are meant to run the Distributed Cache service, each allocated 8 GB of RAM. When I set up one of them as a Distributed Cache server (running alone), it was fine. The issue came when I tried to start the Distributed Cache service on the other SharePoint app server (within the same SharePoint farm): the AppFabric Caching Service got stuck at “Starting” status for more than 5 minutes. After that, an error was written to the System log saying the service had terminated unexpectedly.

I followed the TechNet guide and it didn't work.

I tried “Remove-SPDistributedCacheServiceInstance”; it ran for a very long time and then threw a timeout error for “net.tcp://<servername>/”.

In Central Admin > Manage services on server, I selected the non-working app server and found that the Distributed Cache service was not started. When I clicked the “Start” link, it threw a friendly “Cachehostinfo is null” message…

I used Get-SPServiceInstance to find the broken Distributed Cache instance, deleted it and reran “Add-SPDistributedCacheServiceInstance”. It ran, again, for a very long time.
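
For reference, that delete-and-re-add attempt looks roughly like the sketch below. It assumes you run it in the SharePoint 2013 Management Shell on the broken app server itself; the service name string used in the filter is the one commonly shown in Microsoft's guidance, so check it against your own Get-SPServiceInstance output first.

[sourcecode]
# Sketch only: run on the non-working app server, in the SharePoint 2013 Management Shell.
# Find the Distributed Cache service instance that belongs to this server.
$instance = Get-SPServiceInstance | Where-Object {
    $_.Service.ToString() -eq "SPDistributedCacheService Name=AppFabricCachingService" -and
    $_.Server.Name -eq $env:COMPUTERNAME
}

# Delete the broken instance (only if one was found), then register the server again.
if ($instance) { $instance.Delete() }
Add-SPDistributedCacheServiceInstance
[/sourcecode]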

I also tried:

[sourcecode]
Remove-CacheHost
Add-CacheHost -ProviderType "SPDistributedCacheProvider" -ConnectionString "Data Source=<SPSQL ALIAS>;Initial Catalog=Config;Integrated Security=True;Enlist=False"
#####
[/sourcecode]

Still no luck.

After checking the Windows System log, I noticed this error had been thrown at some point during the process.
[sourcecode]
Event ID 4
####
The kerberos client received a KRB_AP_ERR_MODIFIED error from the server <My Broken Server Name>.
The target name used was <My Service Account>. This indicates that the password used to encrypt the kerberos
service ticket is different than that on the target server.
####
[/sourcecode]
Fishy enough.
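
If you want to check your own server for the same error without scrolling through Event Viewer, something like the following should pull it out of the System log. This is a minimal sketch: it filters on Event ID 4 and the KRB_AP_ERR_MODIFIED text quoted above, and the limit of 200 events is an arbitrary choice.

[sourcecode]
# Sketch only: list recent Event ID 4 entries in the System log that mention KRB_AP_ERR_MODIFIED.
# -ErrorAction SilentlyContinue suppresses the error Get-WinEvent raises when nothing matches.
Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 4 } -MaxEvents 200 -ErrorAction SilentlyContinue |
    Where-Object { $_.Message -like '*KRB_AP_ERR_MODIFIED*' } |
    Select-Object TimeCreated, ProviderName, Message |
    Format-List
[/sourcecode]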

If you are having this problem, you can try the following steps.

My Resolution

  1. Check the network connection on both of the SharePoint servers that are running the Distributed Cache.
  2. Go to the Properties of the connection.
  3. Double-click “Internet Protocol Version 4 (TCP/IPv4)”.
  4. Ensure both servers are using the SAME and WORKING DNS servers. In my case there were two Domain Controllers (one for backup replication purposes) and, for some reason, the DCs could not talk to each other.
  5. Remove the Alternate DNS Server for now.
  6. Run Remove-SPDistributedCacheServiceInstance and then Add-SPDistributedCacheServiceInstance on the non-working app server.
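
The DNS check in steps 1 to 4 can also be done from PowerShell instead of clicking through the adapter properties. The sketch below assumes two hypothetical server names (SPAPP01 and SPAPP02) and that you have rights to query WMI on both; it only lists the DNS servers each adapter is configured with, so you can compare them side by side.

[sourcecode]
# Sketch only: SPAPP01 / SPAPP02 are hypothetical names, replace them with your two cache hosts.
$servers = "SPAPP01", "SPAPP02"

foreach ($server in $servers) {
    # List the DNS servers configured on every IP-enabled network adapter.
    Get-WmiObject Win32_NetworkAdapterConfiguration -ComputerName $server -Filter "IPEnabled = TRUE" |
        Select-Object @{ Name = "Server"; Expression = { $server } },
                      Description,
                      @{ Name = "DnsServers"; Expression = { $_.DNSServerSearchOrder -join ", " } }
}

# Step 6, run on the non-working app server in the SharePoint 2013 Management Shell:
Remove-SPDistributedCacheServiceInstance
Add-SPDistributedCacheServiceInstance
[/sourcecode]

If the two servers list different DNS servers, or one of them points at a Domain Controller that is not replicating, that is the first thing to fix before re-adding the service instance.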

Hope it solves your problem =)
