DR Failover Checks - Environment
Updated over 8 months ago

Overview

In an actual disaster, the last thing you want is for your DR failovers to fail. Druva preemptively flags issues in your AWS environment and DR plan configuration that can cause DR failovers to fail. Fix any identified issues before disaster strikes, and fail over your VMs with confidence when you need to.

You can run the DR Failover environment checks on demand. Druva also runs these checks every day at 10:30 PM UTC.
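To see when the next scheduled run falls in your own time zone, the daily 10:30 PM UTC schedule can be converted with a few lines of Python. This is a minimal sketch; the function name `next_scheduled_check` is illustrative and is not part of any Druva API.

```python
from datetime import datetime, time, timedelta, timezone

# Daily schedule stated in this article: 10:30 PM UTC.
DAILY_CHECK_UTC = time(22, 30, tzinfo=timezone.utc)


def next_scheduled_check(now: datetime) -> datetime:
    """Return the next 10:30 PM UTC run strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), DAILY_CHECK_UTC)
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    upcoming = next_scheduled_check(datetime.now(timezone.utc))
    # astimezone() with no argument converts to the machine's local time zone.
    print("Next scheduled check (local time):", upcoming.astimezone())
```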

Druva checks the following as a part of DR Failover Checks - Environment:

  • Incorrect IP settings

  • Incorrect instance type

  • Subnet checks

  • DR copy availability

Accessing the DR Failover Checks - Environment

You can access the DR Failover Checks - Environment from the DR Plans page, or directly from the DR plan’s Overview page.

  1. Log in to the Management Console.

  2. From the drop-down next to All Organizations, select the Organization in which you’ve configured VMs for disaster recovery.

  3. On the menu bar at the top, click Disaster Recovery.

  4. On the DR Plans page, the Failover Check Status (Environment) column displays the status of the latest DR failover environment check for each DR plan.

    DR Plans page - Failover check status column.png

    Clicking the status takes you to the Overview page of the corresponding DR plan. You can also select the DR plan from the DR plan drop-down and then view the Overview page. The Overview page in the DR plan shows you the Failover Checks - Environment and Failover Checks - Guest OS cards.

    Overview page - Check cards.png


    Clicking the hyperlinks in either of these cards takes you to a filtered Virtual Machines page. This page is filtered by the check status and gives you details on how to fix the identified issues.

    Virtual Machines page- filtered.png

Overview page

The Overview page has the Failover Checks - Environment card and the Failover Checks - Guest OS card. The following tables describe the fields in each of these cards.

Failover Checks - Environment

Failover Checks - Environment.png

Field

Description

Subnet

This column lists the subnets in the VPC that have an issue that could cause failovers to fail.

Issues

This column explains the issue identified with the associated subnet. Click more to see details about the issue and how to resolve it. See Subnet checks for issue resolution.

x VMs
Incorrect IP settings

Lists the number of VMs whose IP settings are incorrect. The number of VMs is a hyperlink that takes you to the Virtual Machines page, filtered to display all the VMs with incorrect IP settings. Click the hyperlinks in the Failover Checks - Environment column on the Virtual Machines page to view details and learn how to fix the issue.

x VMs
DR copy unavailable

Lists the number of VMs whose DR copy is unavailable. The number of VMs is a hyperlink that takes you to the Virtual Machines page, filtered to display all VMs whose DR copy is unavailable. Click the hyperlinks in the Failover Checks - Environment column on the Virtual Machines page to view details and learn how to fix the issue.

x VMs
Incorrect instance type

Lists the number of VMs whose instance type is incorrect. The number of VMs is a hyperlink that takes you to the Virtual Machines page, filtered to display all the VMs with incorrect instance types. Click the hyperlinks in the Failover Checks - Environment column on the Virtual Machines page to view details and learn how to fix the issue.

Fix any identified issues and then rerun the check by clicking Rerun Check. Ensure that all the checks have passed before running a failover job.

Failover Checks - Guest OS

Failover Checks - Guest OS.png

Field

Description

X/Y
VMs with issues

X of a total of Y VMs in the DR plan have issues with the guest OS that could cause failovers to fail.

Guest OS check status of VMs

(Circular chart)

This is a graphical representation of the guest OS check statuses of all VMs in the DR plan. Hovering over the colored segments gives you the number of VMs with a specific issue with the guest OS check.

Status

The number of VMs with a specific Guest OS check status. Clicking the hyperlink next to the VM status takes you to the Virtual Machines page, which is now filtered to show all the VMs with the specific status. The statuses are:

  • Successful

  • Failed

  • Warnings

  • Not initiated

  • Invalid credentials

  • Missing credentials

See DR failover checks - Guest OS for more information.

Virtual Machines page

The Failover Checks - Environment column gives details on potential causes of failover failures due to AWS environment and DR plan configuration issues. You may see the following issues in this column:

  • Incorrect IP settings

  • DR copy unavailable

  • Incorrect instance type

The following section describes the checks in detail and tells you how to resolve the identified issues.

Issue resolution

The following tables describe the scenarios in which each of these checks fails and how to resolve the identified issues.

Incorrect IP settings

The IP settings for both the Production and Test environments are checked.

Error

Resolution

Static IP x.x.x.x for subnet <subnet ID> is already in use.

The selected static IP address is already being used. Select another IP address in the specified subnet.

  1. Click Change Failover Settings for the VM whose Failover Checks - Environment has failed with this error.

  2. In the Change Failover Settings dialog box, change the static IP address under Production or Test settings (as indicated in the error message) to another available static IP in the subnet.

► See image

static IP in use.png
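Before choosing a replacement static IP, you may want to confirm that the candidate address is free. The following is a minimal sketch, assuming you have the boto3 library and credentials that allow `DescribeNetworkInterfaces`; the function name is illustrative, and this is not how Druva itself performs the check.

```python
def is_private_ip_in_use(ec2_client, subnet_id: str, ip_address: str) -> bool:
    """Return True if a network interface in the subnet already holds the IP.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    response = ec2_client.describe_network_interfaces(
        Filters=[
            {"Name": "subnet-id", "Values": [subnet_id]},
            {"Name": "addresses.private-ip-address", "Values": [ip_address]},
        ]
    )
    return len(response["NetworkInterfaces"]) > 0
```

With boto3 installed, a call might look like `is_private_ip_in_use(boto3.client("ec2"), "subnet-0abc", "10.0.1.25")`, where the subnet ID and IP address are placeholders. The sketch only detects addresses attached to network interfaces; it does not account for addresses that AWS reserves in every subnet.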

No IPs in subnet <subnet ID> are available.

There aren't any available IP addresses in the specified subnet. Select another subnet by updating the network settings or release IP addresses from the existing subnet using the AWS Management Console.

To select another subnet:

  1. Go to the Network Mappings page of your DR plan.

  2. Edit the network mapping and select another subnet in your production or test network as indicated in the error message.

  3. Save the network mapping.

► See image

No IPs in subnet.png
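If you are deciding between subnets, it can help to see how many free addresses each one has. A minimal sketch, assuming boto3 and credentials that allow `DescribeSubnets`; the function name is illustrative.

```python
def available_ip_count(ec2_client, subnet_id: str) -> int:
    """Return the number of unused IPv4 addresses AWS reports for the subnet.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    response = ec2_client.describe_subnets(SubnetIds=[subnet_id])
    return response["Subnets"][0]["AvailableIpAddressCount"]
```

A result of 0 corresponds to the situation described by this error: pick a subnet with a higher count, or release addresses in the current one.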

VM with target private subnet has public IP enabled.

A VM with a target private subnet must only have private IP enabled. Disable the public IP by changing the failover settings.

  1. Click Change Failover Settings for the VM whose Failover Checks - Environment has failed with this error.

  2. In the Change Failover Settings dialog box, change the Public IP field to None in the Production or Test failover settings as indicated in the error message.

  3. Save the failover settings.

VM with target public subnet has public IP disabled.

A VM with a target public subnet must have public IP enabled. Enable the public IP for the virtual machine in the failover settings.

  1. Click Change Failover Settings for the VM whose Failover Checks - Environment has failed with this error.

  2. In the Change Failover Settings dialog box, set the Public IP field to either Elastic (and enter an elastic IP address) or Auto Assign. Change the setting for the Production or Test failover settings as indicated in the error message.

  3. Save the failover settings.

► See image

public IP disabled.png
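The two public IP rules above can be summarized as a single consistency rule between the subnet type and the Public IP field. The sketch below only illustrates that rule; the function name is hypothetical, and the option names (None, Elastic, Auto Assign) are taken from the steps in this article.

```python
from typing import Optional

# Public IP options named in the Change Failover Settings steps above.
PUBLIC_IP_OPTIONS = {"None", "Elastic", "Auto Assign"}


def public_ip_setting_issue(subnet_is_public: bool, public_ip: str) -> Optional[str]:
    """Return the matching error text, or None when the setting is consistent."""
    if public_ip not in PUBLIC_IP_OPTIONS:
        raise ValueError(f"Unknown Public IP option: {public_ip!r}")
    has_public_ip = public_ip != "None"
    if subnet_is_public and not has_public_ip:
        return "VM with target public subnet has public IP disabled."
    if not subnet_is_public and has_public_ip:
        return "VM with target private subnet has public IP enabled."
    return None
```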

Static IP x.x.x.x is not in CIDR.

The static IP address is not in the subnet CIDR. Enter another IP address that is in the subnet CIDR in the failover settings.

  1. Click Change Failover Settings for the VM whose Failover Checks - Environment has failed with this error.

  2. In the Change Failover Settings dialog box, change the static IP address in the Private IP field to another static IP address in the subnet CIDR. Change the setting for the Production or Test failover settings as indicated in the error message.

  3. Save the failover settings.

► See image

IP not in CIDR.png
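You can verify that a candidate static IP falls inside the subnet CIDR before entering it. The sketch below uses Python's standard `ipaddress` module; the function name is illustrative. It also excludes the first four addresses and the last address of the block, which AWS reserves in every IPv4 subnet. IPv4 is assumed throughout.

```python
import ipaddress


def is_assignable_static_ip(ip: str, cidr: str) -> bool:
    """Return True if `ip` is inside `cidr` and is not an AWS-reserved address."""
    network = ipaddress.ip_network(cidr)
    address = ipaddress.ip_address(ip)
    if address not in network:
        return False
    # AWS reserves the first four addresses and the last address of each subnet.
    offset = int(address) - int(network.network_address)
    return 4 <= offset < network.num_addresses - 1
```

For example, with the subnet CIDR 10.0.1.0/24, the address 10.0.1.25 is assignable, while 10.0.2.25 is outside the CIDR and 10.0.1.3 is reserved.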

Elastic IP x.x.x.x. in the subnet <subnet ID> is not available.

The specified elastic IP address is unavailable. Enter another elastic IP address available in your AWS account or use Auto Assign in the failover settings.

  1. Click Change Failover Settings for the VM whose Failover Checks - Environment has failed with this error.

  2. In the Change Failover Settings dialog box, change the Public IP address to another available elastic IP address or select Auto Assign. Change the setting for the Production or Test failover settings as indicated in the error message.

  3. Save the failover settings.

► See image

Elastic IP not available.png

Unable to validate IP settings as the subnet <subnet ID> does not exist.

The specified subnet may have been deleted. Select a valid subnet in the Network Mapping settings.

  1. Go to the Network Mappings page of your DR plan.

  2. Edit the network mapping and select another subnet in your production or test network as indicated in the error message.

  3. Save the network mapping.

► See image

No IPs in subnet.png

Unable to verify if the static IP is available. Ensure that the IAM role assigned to the Druva AWS proxy has the DescribeNetworkInterfaces permission.

Ensure that the IAM role assigned to the Druva AWS proxy has the DescribeNetworkInterfaces permission. See the Druva IAM Policy article for more details.

Incorrect instance type

The instance type for both Production and Test environments is checked.

Error

Resolution

Instance type is incompatible with the VM hardware configuration.

Select an instance type that is compatible with the hardware configuration of the VM in terms of CPU, RAM, AWS region, and OS. Alternatively, you can also let Druva automatically assign a compatible instance type by selecting the Auto Assign option. See Manage disaster recovery failover for details.

Instance type is unsupported in the availability zone.

Select an instance type that is available in the availability zone of the subnet.

To determine the availability zones in which the instance type is available, perform the following tasks:

  1. In the search bar of the AWS Management Console, look for and go to the VPC service.

  2. In the navigation pane on the left, click Subnets.

  3. Select the subnet associated with your DR plan. The Details pane at the bottom will show you the availability zone associated with this subnet. Make a note of the availability zone.

►See image

subnet and AZ.png

  4. In the search bar of the AWS Management Console, look for and go to the EC2 service.

  5. In the navigation pane on the left, click Instance Types under Instances.

  6. The pane on the right shows you all the availability zones in which the instance type is available. Ensure that the instance type that you select in the DR plan is available in the availability zone associated with your subnet.

►See image

Instance types in availability zones.png

Alternatively, you can also let Druva automatically assign a supported instance type by selecting the Auto Assign option. See Manage disaster recovery failover for details.
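The console steps above can also be approximated from a script. The following is a minimal sketch, assuming boto3 and credentials of your own that allow `DescribeSubnets` and `DescribeInstanceTypeOfferings` (the latter is not mentioned in this article as part of the Druva IAM policy). The function names are illustrative.

```python
def subnet_availability_zone(ec2_client, subnet_id: str) -> str:
    """Return the availability zone of the given subnet."""
    response = ec2_client.describe_subnets(SubnetIds=[subnet_id])
    return response["Subnets"][0]["AvailabilityZone"]


def instance_type_offered_in_az(ec2_client, instance_type: str, availability_zone: str) -> bool:
    """Return True if AWS offers the instance type in the availability zone.

    `ec2_client` is expected to behave like boto3.client("ec2") created for
    the region that contains the availability zone.
    """
    response = ec2_client.describe_instance_type_offerings(
        LocationType="availability-zone",
        Filters=[
            {"Name": "instance-type", "Values": [instance_type]},
            {"Name": "location", "Values": [availability_zone]},
        ],
    )
    return len(response["InstanceTypeOfferings"]) > 0
```

A typical use is to look up the zone with `subnet_availability_zone` and pass the result to `instance_type_offered_in_az` together with the instance type selected in the DR plan.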

Subnet checks


📝 Note
We do not support validations of:

  • VPC associated with multiple CIDRs

  • Private NAT gateways


Error

Resolution

SQS endpoint is unhealthy as: Endpoint vpce-XXXX state is not available.

Ensure that the SQS endpoint is in an available state.

  1. From the AWS Management Console, go to the VPC service.

  2. On the VPC Dashboard, click Endpoints.

  3. Ensure that the vpce-XXXX endpoint is in an available state. If the endpoint cannot be made available, delete the vpce-XXXX endpoint and create it again. See Create SQS endpoint for details.

►See image

SQS Endpoint available.png

AWS SQS Services not reachable from subnet.

Check the connectivity between the subnet and the AWS SQS service. Ensure that either the subnet has internet connectivity through an Internet Gateway or NAT or has an endpoint to the AWS SQS service. See Create SQS endpoint for details.

SQS endpoint is unhealthy as: Private DNS is not enabled for endpoint vpce-XXXX.

Enable the Private DNS name option in the endpoint settings through the AWS Management Console.

  1. In the AWS Management Console, search for and go to the VPC service.

  2. Under Virtual Private Cloud, click Endpoints.

  3. Select the endpoint vpce-XXXX, click Actions, and then click Modify Private DNS names.

►See image

Private DNS1.png

  4. On the Modify Private DNS names screen, enable the Enable Private DNS Name option.

  5. Click Modify Private DNS names.

►See image

Private DNS2.png
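The endpoint state and private DNS conditions described above can be read from the data that the EC2 `DescribeVpcEndpoints` API returns for an interface endpoint. The sketch below inspects one entry of that response; the function name and the issue strings are illustrative, and Druva's own validation logic may differ.

```python
from typing import List


def endpoint_issues(endpoint: dict) -> List[str]:
    """Return a list of problems found in one VpcEndpoints entry.

    `endpoint` is one item from the "VpcEndpoints" list returned by
    describe_vpc_endpoints for an interface endpoint such as SQS.
    """
    issues = []
    # Compare case-insensitively, since the state may be reported as "available".
    if endpoint.get("State", "").lower() != "available":
        issues.append("state is not available")
    if not endpoint.get("PrivateDnsEnabled", False):
        issues.append("private DNS is not enabled")
    return issues
```

With boto3, the entry could be obtained from `boto3.client("ec2").describe_vpc_endpoints(VpcEndpointIds=["vpce-XXXX"])["VpcEndpoints"][0]`, where vpce-XXXX is the endpoint named in the error message.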

SQS endpoint is unhealthy as: Security groups of vpc endpoint vpce-XXXX do not allow inbound HTTPS traffic for the VPC.

The security groups must allow inbound HTTPS traffic for the VPC CIDR block.

  1. In the AWS Management Console, search for and go to the VPC service.

  2. Under Virtual Private Cloud, click Endpoints.

  3. Select the endpoint vpce-XXXX, and in the lower pane click the Security Groups tab.

  4. Click the Group ID.

  5. On the Security Groups page, edit the Inbound Rules to allow HTTPS traffic for the VPC CIDR block.

►See image

Edit Inbound Rules.png
Edit Inbound Rules 2.png

SQS endpoint is unhealthy as: Security groups of vpc endpoint vpce-XXXX do not allow outbound HTTPS traffic for the VPC.

The security groups must allow outbound HTTPS traffic for the VPC CIDR block.

  1. In the AWS Management Console, search for and go to the VPC service.

  2. Under Virtual Private Cloud, click Endpoints.

  3. Select the endpoint vpce-XXXX, and in the lower pane click the Security Groups tab.

  4. Click the Group ID.

  5. On the Security Groups page, edit the Outbound Rules to allow HTTPS traffic for the VPC CIDR block.

►See image

Outbound Rules.png
Outbound Rules 2.png

SQS endpoint is unhealthy as: Security groups of vpc endpoint vpce-XXXX do not allow inbound and outbound HTTPS traffic for the VPC.

The security groups must allow inbound and outbound HTTPS traffic for the VPC CIDR block.

  1. In the AWS Management Console, search for and go to the VPC service.

  2. Under Virtual Private Cloud, click Endpoints.

  3. Select the endpoint vpce-XXXX, and in the lower pane click the Security Groups tab.

  4. Click the Group ID.

  5. On the Security Groups page, edit the Inbound and Outbound Rules to allow HTTPS traffic for the VPC CIDR block.
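To review a security group's rules without clicking through each one, the inbound and outbound rules can be evaluated against the VPC CIDR block. This is a sketch under stated assumptions: it works on one entry of the "SecurityGroups" list returned by the EC2 `DescribeSecurityGroups` API, it considers only IPv4 CIDR rules (rules that reference other security groups or prefix lists are ignored), and the function names are illustrative. It is not Druva's validation logic.

```python
import ipaddress
from typing import Iterable, List

HTTPS_PORT = 443


def allows_https_for_cidr(permissions: Iterable[dict], vpc_cidr: str) -> bool:
    """Return True if any rule permits TCP 443 for the whole VPC CIDR block."""
    vpc_network = ipaddress.ip_network(vpc_cidr)
    for rule in permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            port_ok = True  # "-1" means all protocols and all ports
        elif protocol == "tcp":
            port_ok = rule.get("FromPort", 0) <= HTTPS_PORT <= rule.get("ToPort", 65535)
        else:
            continue
        if not port_ok:
            continue
        for ip_range in rule.get("IpRanges", []):
            rule_network = ipaddress.ip_network(ip_range["CidrIp"], strict=False)
            if vpc_network.subnet_of(rule_network):
                return True
    return False


def https_rule_gaps(security_group: dict, vpc_cidr: str) -> List[str]:
    """Return the directions ("inbound", "outbound") that lack an HTTPS rule."""
    gaps = []
    if not allows_https_for_cidr(security_group.get("IpPermissions", []), vpc_cidr):
        gaps.append("inbound")
    if not allows_https_for_cidr(security_group.get("IpPermissionsEgress", []), vpc_cidr):
        gaps.append("outbound")
    return gaps
```

An empty result means both directions allow HTTPS for the VPC CIDR block; otherwise, the returned directions indicate which rules to edit.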

AWS S3 and SQS Services not reachable from subnet.

Check the connectivity between the subnet and the AWS SQS and AWS S3 services. Ensure that either the subnet has internet connectivity via an internet gateway or NAT or has an endpoint to the AWS SQS and AWS S3 services. See Create Amazon S3 and SQS endpoint for details.

AWS S3 Service not reachable from subnet.

Check the connectivity between the subnet and the AWS S3 service. Ensure that either the subnet has internet connectivity via an internet gateway or NAT or has an endpoint to the AWS S3 service. See Create Amazon S3 endpoint for details.

S3 endpoint state is not available.

Ensure that the S3 endpoint is in an available state. If the S3 endpoint cannot be made available, delete the S3 endpoint and create a new one. See Create Amazon S3 endpoint for details.

Subnet no longer exists.

The selected subnet no longer exists. Select a different subnet from the Network Mappings page.

  1. Go to the Network Mappings page of your DR plan.

  2. Edit the network mapping and select another subnet in your production or test network as indicated in the error message.

  3. Save the network mapping.

Unable to fetch information about the subnet.

Perform the following actions:

  • Ensure that the IAM role assigned to the Druva AWS proxy has the DescribeSubnets permission. See Druva IAM Policy for details.

  • Check the connectivity between the Druva AWS proxy and the AWS EC2 service.

Unable to fetch information about the VPC endpoints. Ensure that the IAM role assigned to the Druva AWS proxy has the DescribeVpcEndpoints permission.

Ensure that the IAM role assigned to the Druva AWS proxy has the DescribeVpcEndpoints permission. See Druva IAM Policy for details.

Information for NAT gateway <nat-xxxxx> is unavailable. Ensure the NAT gateway exists in the subnet.

Ensure the NAT gateway exists in the subnet. If the NAT gateway <nat-xxxxx> has been deleted, add an active NAT gateway route in the subnet’s route table.

Information for NAT gateway <nat-xxx> is unavailable. Ensure the NAT gateway is in an available state.

Ensure the NAT gateway is in an available state. If the NAT gateway <nat-xxxxx> has been deleted or is unavailable, add an active NAT gateway route in the subnet’s route table.

Information for NAT gateway <nat-xxx> is unavailable. Ensure the Druva AWS proxy has the DescribeNatGateways permission

Ensure that the IAM role of the Druva AWS proxy has the DescribeNatGateways permission. See Druva IAM Policy for details.
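For reference, the four EC2 permissions named in the error messages in this article can be expressed as a single IAM policy statement. The sketch below builds such a statement in Python and prints it as JSON. It covers only the permissions mentioned on this page and is not the complete Druva IAM policy; see the Druva IAM Policy article for the authoritative version.

```python
import json
from typing import Iterable

# Permissions named in the error messages on this page.
REQUIRED_EC2_ACTIONS = [
    "ec2:DescribeNetworkInterfaces",
    "ec2:DescribeSubnets",
    "ec2:DescribeVpcEndpoints",
    "ec2:DescribeNatGateways",
]


def build_policy_document(actions: Iterable[str]) -> dict:
    """Return an IAM policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # EC2 Describe* actions are granted on all resources.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy_document(REQUIRED_EC2_ACTIONS), indent=2))
```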

DR Copy Availability

Error

Resolution

One or more EBS snapshots (snap-xxx,snap-yyy) in the latest DR copy do not exist.

Update the DR Copy (Restore DR Copy).

  1. Go to the Virtual Machines page of your DR plan.

  2. Select the VM identified in the error message, click more options, and click Update DR Copy.

►See image

Update DR Copy.png
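If you want to confirm which of the snapshots listed in the error message are actually missing, the snapshot IDs can be looked up in your AWS account. A minimal sketch, assuming boto3 and credentials that allow `DescribeSnapshots`; the function name is illustrative. Filtering by `snapshot-id` is used so that missing IDs are simply absent from the result.

```python
from typing import List, Sequence


def missing_snapshots(ec2_client, snapshot_ids: Sequence[str]) -> List[str]:
    """Return the snapshot IDs that AWS does not report as existing.

    `ec2_client` is expected to behave like boto3.client("ec2"). For a long
    list of IDs the response may be paginated, which this sketch does not handle.
    """
    response = ec2_client.describe_snapshots(
        Filters=[{"Name": "snapshot-id", "Values": list(snapshot_ids)}],
    )
    found = {snapshot["SnapshotId"] for snapshot in response["Snapshots"]}
    return [snapshot_id for snapshot_id in snapshot_ids if snapshot_id not in found]
```

Any ID in the returned list no longer exists, which is the situation that Update DR Copy resolves.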

The DR copy job was not run for this VM. Update the DR copy.

Fix any identified issues and then rerun the check by clicking Rerun Check on the Overview page of the DR plan.
