title: Guidance for troubleshooting unexpected cluster failover
description: Provides guidance to find the root cause of an unexpected failover in a Windows-based failover cluster.
ms.date: 01/15/2025
author: kaushika-msft
ms.author: kaushika
manager: dcscontentpm
audience: itpro
ms.topic: troubleshooting
ms.reviewer: kaushika
ms.custom:
- sap:clustering and high availability\root cause of an unexpected failover
- pcy:WinComm Storage High Avail

Unexpected cluster failover troubleshooting guidance

A cluster doesn't trigger a failover unless one of its components (software or hardware) has an actual problem. When a problem occurs, the cluster performs a basic recovery step, and the affected resource fails over to another node. Possible causes include:

- Resource failure
- Networking issue, such as node eviction
- Cluster Shared Volume (CSV) disk failure

Troubleshooting checklist

1. Identify the timestamp of the failover in the System event log. Then, search for events from the source Microsoft-Windows-FailoverClustering and check for Event ID 1069, 1146, or 1230.
2. Match the time zone of the System event log to the GMT time zone used in the cluster log.
   [!NOTE] To quickly find the time zone difference, search for "The current time is".
3. Go to the occurrence timestamp in the cluster log and identify the corresponding line. You might find an error such as:
   - Resource <name> IsAlive has indicated failure
   - IsAlive sanity check failed
   [!NOTE] The error can differ depending on the issue.
4. Scroll up in the cluster log and check whether an earlier error might be the actual cause.
5. Scroll down in the cluster log and search for "Group move" or "Move of group" for the affected resource. Note the exact timestamp and the destination node.
6. Switch to the cluster log of the destination node and check how the resource behaves when it comes online. If the resource comes online, you'll find the following entries:
   - Resource <name> has come online
   - Group move for <name> has completed
   Otherwise, you'll find the following entry:
   - Online for resource <name> failed
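
The checklist above can be partly automated once the logs are exported to text. The following Python sketch is illustrative only, not an official tool. It assumes that each cluster log line carries a timestamp in the form `YYYY/MM/DD-HH:MM:SS.mmm`, and that you already know the UTC offset of the System event log. The function names, the 60-second search window, and the sample log lines (including the resource name `DemoDisk`) are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Event IDs and search strings taken from the checklist.
FAILOVER_EVENT_IDS = {1069, 1146, 1230}
FAILURE_MARKERS = ("IsAlive has indicated failure", "IsAlive sanity check failed")
MOVE_MARKERS = ("Group move", "Move of group")

# Assumption: cluster log timestamps look like 2025/01/15-09:15:02.250 (GMT).
TS_RE = re.compile(r"(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})")


def is_failover_event(source, event_id):
    """Step 1: keep only FailoverClustering events with the listed IDs."""
    return source == "Microsoft-Windows-FailoverClustering" and event_id in FAILOVER_EVENT_IDS


def to_utc(local_time, utc_offset_hours):
    """Step 2: convert a local event log time to GMT using a known offset."""
    return local_time - timedelta(hours=utc_offset_hours)


def parse_ts(line):
    """Return the timestamp found in a log line, or None if there is none."""
    match = TS_RE.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y/%m/%d-%H:%M:%S.%f")


def scan(lines, event_utc, window_seconds=60):
    """Steps 3 and 5: collect failure and group-move lines near the event time."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        ts = parse_ts(line)
        if ts is None or abs(ts - event_utc) > window:
            continue
        if any(marker in line for marker in FAILURE_MARKERS):
            hits.append(("failure", line))
        elif any(marker in line for marker in MOVE_MARKERS):
            hits.append(("move", line))
    return hits


# Hypothetical sample lines; real cluster log content will differ.
sample = [
    "000010a4.00001f30::2025/01/15-09:14:58.100 INFO  [RES] Resource DemoDisk: periodic health check",
    "000010a4.00001f30::2025/01/15-09:15:02.250 ERR   [RES] Resource DemoDisk IsAlive has indicated failure",
    "000010a4.00002b14::2025/01/15-09:15:03.010 INFO  [RCM] Move of group DemoGroup to another node",
    "000010a4.00002b14::2025/01/15-11:00:00.000 INFO  [RCM] Group move for DemoGroup has completed",
]

# Event log time recorded at UTC+1 (assumed offset for this example).
event_utc = to_utc(datetime(2025, 1, 15, 10, 15, 2), 1)
hits = scan(sample, event_utc)
for category, line in hits:
    print(category, "|", line)
```

A script like this only narrows the search. Steps 4 and 6 still require reading the surrounding lines, and the destination node's log, by hand.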

More information

For more information, see the following articles:
