Troubleshoot common disaster recovery scenarios
Updated over 9 months ago

This topic lists the scenarios that you might encounter while executing Disaster Recovery.

The failover EC2 instance may not show some disks and their contents.

Resolution:

  1. Log into the EC2 instance.

  2. Run the following command to list all the block devices:

    fdisk -l



    This command lists all the block devices with their size and other details.

    Consider a 64 GB disk whose content was not visible post failover as shown in the following output:

    Disk /dev/sda: 42.9 GB, 42949672960 bytes, 83886080 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk label type: dos
    Disk identifier: 0x000b6fff
    Device Boot         Start         End      Blocks   Id   System
    /dev/sda1   *        2048    60448767    30223360   83   Linux
    /dev/sda2        60448768    83886079    11718656   82   Linux swap / Solaris

    Disk /dev/sdb: 1649.3 GB, 1649267441664 bytes, 3221225472 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes

    Disk /dev/sdc: 64.4 GB, 64424509440 bytes, 125829120 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk label type: dos
    Disk identifier: 0x000be2b0
    Device Boot         Start         End      Blocks   Id   System
    /dev/sdc1            2048   125829119    62913536   83   Linux

  3. Note the block device name of the disk whose contents are not visible. In this example, the 64 GB disk is /dev/sdc and its partition is /dev/sdc1.

  4. Ensure that all the entries in the /etc/fstab file refer to the block device names as shown in the output of step 2. For example, the /etc/fstab file should have the following entry:

    [root@BackupProxy ~]# cat /etc/fstab
    # /etc/fstab
    # Created by anaconda on Fri Jan 20 17:48:45 2017
    #
    # Accessible filesystems, by reference, are maintained under '/dev/disk'
    # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
    #
    UUID=7f822040-5b0f-4a8c-b587-6abde319fd2b /                       ext4    defaults        1 1
    /dev/sdc1                                 /var                    ext4    defaults        1 2
    UUID=1c129f59-7b86-4b2c-b33e-175c65b672f1 swap                    swap    defaults        0 0

    Verify the /dev/sdc1 entry against the fdisk output: the device name in /etc/fstab must match the block device name that fdisk reports.
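The check in step 4 can be scripted. The following is a minimal sketch, not a Druva-provided tool: the function name check_fstab_devices is hypothetical, and it only inspects entries written as /dev/... paths (UUID= and LABEL= entries are skipped). It reports every such entry that does not exist as a block device on the instance.

```shell
# Report /dev/... entries in an fstab file that are not block devices.
# check_fstab_devices is an illustrative name, not part of any product.
check_fstab_devices() {
    fstab_file="${1:-/etc/fstab}"
    # Comment lines and UUID=/LABEL= entries do not start with /dev/,
    # so awk passes through only device-path entries (device, mount point).
    awk '$1 ~ /^\/dev\// { print $1, $2 }' "$fstab_file" |
    while read -r dev mnt; do
        if [ ! -b "$dev" ]; then
            echo "MISSING: $dev (mount point $mnt)"
        fi
    done
}
```

Running check_fstab_devices with no argument reads /etc/fstab. Any line it prints names an entry to correct against the device names listed by fdisk -l.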

How to resolve the post-boot script errors

Resolution: To resolve post-boot script errors, perform the following steps:

  1. On the Management Console, click the dropdown next to All Organizations and select the required organization.

  2. In the menu bar, click Disaster Recovery.

  3. In the left navigation pane, click Jobs.

  4. On the jobs listing page, click the Job ID of the corresponding DR failover job for which you want to view the error details. You can also filter the jobs page by Job Type to only view DR Failover jobs.

  5. The Summary tab in the job details page displays the error code along with a link to the documentation for more details.

  6. Click the Recovery Workflow tab, and analyze the virtual machines for errors.

  7. For errors in the post-boot script, click the Logs tab and click Download Detail Logs to download the log files containing errors for that job. The PhoenixLogs-<JobID>.zip file is downloaded.

  8. Navigate to the PhoenixLogs-<JobID>/failover/<number>/Phoenix.<timestamp>-<JobID>.zip/Phoenix-TW-<timestamp>-<JobID>_phase3.zip/Logs folder. It contains the following two files:

    • stderr.log: Contains the detailed error message of the script failure.

    • stdout.log: Contains the execution log of the script.

  9. Resolve the error in the script and execute the recovery workflow again.
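Because the logs in step 8 sit inside nested archives, expanding them by hand is tedious. The following sketch assumes the unzip utility is installed and that you run it on a copy of the download, because it deletes each archive after expanding it. The function name show_postboot_errors and the .d directory suffix are illustrative.

```shell
# Expand every nested .zip under a directory, then print each stderr.log.
# Run this on a copy: each archive is removed once it has been expanded.
show_postboot_errors() {
    root_dir="$1"
    while :; do
        # Pick one archive that has not been expanded yet.
        archive=$(find "$root_dir" -type f -name '*.zip' | head -n 1)
        [ -n "$archive" ] || break
        unzip -q -o "$archive" -d "${archive%.zip}.d" || return 1
        rm -f "$archive"
    done
    # Print the path of each stderr.log followed by its contents.
    find "$root_dir" -type f -name 'stderr.log' -print -exec cat {} \;
}
```

For example, copy PhoenixLogs-<JobID>.zip into an empty working directory and pass that directory to the function.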

How to update my existing IAM policy for the Druva AWS proxy

You can update the existing DruvaIAMPolicy in your AWS account. To update the policy, perform the following steps:

  1. On the AWS Management Console, select the Services tab and click IAM.

  2. In the left pane, click the Roles tab.

  3. On the Roles page, type DruvaIAMRole in the search box.

  4. Click the DruvaIAMRole role to view the role details.

  5. On the Summary page, click the Permissions tab.

  6. In the Permissions policies section, click the DruvaIAMPolicy option.

  7. On the Edit DruvaIAMPolicy page, click the JSON tab, replace the content in the editor with the policy content copied from the policy document, and click Review policy.

  8. Click Save changes.
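The same update can be sketched with the AWS CLI. This assumes that DruvaIAMPolicy is a customer managed policy (not an inline policy) and that the AWS CLI is configured with permission to modify it. The function name, the policy ARN, and the JSON file path are placeholders rather than values from this article.

```shell
# Publish a new default version of a customer managed IAM policy.
# Arguments: <policy-arn> <path-to-policy-json>. Both are placeholders.
update_druva_policy() {
    policy_arn="$1"
    policy_file="$2"
    [ -r "$policy_file" ] || { echo "cannot read $policy_file" >&2; return 1; }
    # Set DRY_RUN=1 to print the command instead of running it.
    ${DRY_RUN:+echo} aws iam create-policy-version \
        --policy-arn "$policy_arn" \
        --policy-document "file://$policy_file" \
        --set-as-default
}
```

A managed policy holds at most five versions, so the call fails if five already exist; delete an older, non-default version first in that case.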

How to fix the failover failure when an instance is not reachable from AWS SQS

If the VPC endpoint for SQS is not configured, perform the following steps:

  1. Check the subnet, security group, and the public IP address settings chosen for the failover instance.

  2. In your AWS account, go to the VPC service. Under the Subnets section, enter the subnet ID.

  3. In the Route table tab, check the target for the destination 0.0.0.0/0.

  4. If the target is igw-xxxx, the subnet is a public subnet. For the public subnets, set the public IP address settings as Auto-Assign or <some_elastic_ip>.

  5. If the target is nat-xxxx, then the subnet is a private subnet. For the private subnets, set the public IP settings as None.
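The rule in steps 4 and 5 can be expressed as a small helper. This is a sketch: classify_route_target is a hypothetical name, and the commented AWS CLI call assumes that the subnet has an explicit route table association (a subnet that uses the VPC's main route table returns nothing for this filter).

```shell
# Classify a subnet from the target of its default route (0.0.0.0/0).
classify_route_target() {
    case "$1" in
        *igw-*) echo "public" ;;   # set the public IP to Auto-Assign or an Elastic IP
        *nat-*) echo "private" ;;  # set the public IP to None
        *)      echo "unknown" ;;
    esac
}

# Possible usage, with <subnet-id> as a placeholder:
# target=$(aws ec2 describe-route-tables \
#     --filters "Name=association.subnet-id,Values=<subnet-id>" \
#     --query 'RouteTables[].Routes[?DestinationCidrBlock==`0.0.0.0/0`][].[GatewayId,NatGatewayId]' \
#     --output text)
# classify_route_target "$target"
```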

If VPC endpoint for SQS is configured, perform the following steps:

  1. Check the subnet and security group settings chosen for the failover instance.

  2. Check if the chosen subnet is present in the subnets chosen for the SQS endpoint.
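Step 2 can also be checked from the command line. The helper below is a sketch with a hypothetical name, and the commented AWS CLI call uses <region> and <subnet-id> as placeholders. It assumes that the SQS endpoint is an interface endpoint for the standard com.amazonaws.<region>.sqs service name.

```shell
# Succeed if the first argument appears among the remaining arguments.
subnet_in_endpoint() {
    wanted_subnet="$1"
    shift
    for candidate in "$@"; do
        [ "$candidate" = "$wanted_subnet" ] && return 0
    done
    return 1
}

# Possible usage:
# endpoint_subnets=$(aws ec2 describe-vpc-endpoints \
#     --filters "Name=service-name,Values=com.amazonaws.<region>.sqs" \
#     --query 'VpcEndpoints[].SubnetIds[]' --output text)
# subnet_in_endpoint "<subnet-id>" $endpoint_subnets || echo "subnet is not attached to the SQS endpoint"
```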

If the issue is still not resolved, contact Support.

Where can I find logs if my proxy activation fails

When the Druva AWS proxy activation fails, perform the following steps to view the logs:

  1. On the AWS Management Console, click Services, and click CloudFormation.

  2. On the CloudFormation page, click the stack for which the proxy activation failed.

  3. On the Stacks page, click the Events tab, and view the error description in the Status reason column.
    However, the Events tab does not display an activation failure message. Instead, CloudFormation logs a cfn-signal send failure status error. To debug the instance launch, log in to the instance and check the /var/log/cfn-init.log file.
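Rather than paging through the whole file, you can filter it for likely failures. This is a sketch: the function name is hypothetical, and the search terms "error" and "fail" are an assumption about how failures are worded in the log, not a documented format.

```shell
# Print lines that mention errors or failures, with their line numbers.
show_cfn_init_errors() {
    log_file="${1:-/var/log/cfn-init.log}"
    [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 1; }
    # A log with no matching lines is reported, not treated as a failure.
    grep -n -i -E 'error|fail' "$log_file" || echo "no error lines found in $log_file"
}
```

Reading the log may require root privileges, depending on the file permissions on the instance.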
