Monday, 2 January 2017

When Nothing Is Left, Avtar Restore To The Rescue

There are multiple ways to restore a virtual machine in vSphere Data Protection.


When all of these fail, there is one more option to restore a virtual machine. I am not sure what it is officially called; I refer to it as a command line restore of a virtual machine using avtar.

**Before proceeding, please do not perform this in your production environment, as the process is pretty tricky and can cause data loss if not done right. This is the last of the last resorts. If restores are failing, the first step should be to fix the restores themselves. Involve a VMware resource to perform this. That's as much as I can say. Beyond that, it's your call and your risk.**

The steps are pretty simple; you just need to be sure and careful about what is being selected. I ran into this issue while working on one of the cases logged with us. I cannot use the output from that session, so I had to reproduce it in my lab.

So, having said that, let's have a look at the setup. I have a virtual machine on one of my ESXi hosts, and the name of the VM is Jump. It is a Windows box with one virtual hard drive of 40 GB. The SCSI controller used here is 0:0. Then, I have a 512 GB VDP appliance deployed, which has 4 drives. The SCSI controllers by default are 0:0, 1:0, 2:0, 3:0.

With this, let's have a look at the steps:

1. It is always good to restore this disk to a new VM rather than to an existing VM because it reduces the complexity and the risk by a large factor. Let's say your VM has 8 drives, drives 6 and 8 have gone corrupt, and there is no other means of restore available. If you perform the avtar level restore against that VM, it is quite confusing to tell which disk has to be chosen, and you might end up overwriting a different VMDK.

So, to be safe, create a new VM with a new hard disk that has the same type of provisioning as the old one. It is not a hard requirement for the new drive to have the same provisioning, but it reduces the post-restore work a great deal. For example, you would not have to Storage vMotion the drives to change the provisioning.

Now, when you create this new VMDK, please use a unique SCSI controller. The SCSI controller used here should not be the same as that of any existing drive on the original VM or the VDP VM. Also, the drive created should be at least 1 GB larger than the source disk. If my source disk was 40 GB, I will create this new VMDK as 41 GB. Once the disk is created, keep the VM powered off, and add the same disk to the VDP appliance as well.
Basically, go to Edit Settings on the VDP appliance > Add > Hard Disk > Use existing hard disk, browse to the datastore where this new VM resides, and add the hard drive. While adding the drive, use the same SCSI controller that was used on the newly created VM.

This completes step 1. Now switch to the command line of the VDP appliance for the rest of the process.

2. We will have to obtain the LabelNum of the backup that exists for this VM, so that we know which backup to restore the contents from. First, verify that the client is available in the GSAN by running the below command:
# avmgr getl --path=/vcenter-prod.happycow.local/VirtualMachines

The output will be similar to:
1  Request succeeded
1  Jump_UqrwzzeV6zMpRI8yfCBqgQ  location: c3d109f23e18075b48f680f0821730b417260427      pswd: e06a865b7bf4d0aadf90be28de519b8c0681354e

I just have one virtual machine in this VDP, hence one entry in the GSAN output. Now, to get the labelNum of the backups for this Jump VM, run the below command:
# avmgr getb --path=/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ --format=xml

The output will be similar to:
1  Request succeeded
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<backuplist version="3.0">
  <backuplistrec flags="32768001" labelnum="1" label="Jump-1483351122072" created="1483351973" roothash="a6711baf9a0db97be019109cb7ea177ec7a8035e" totalbytes="42949988352.00" ispresentbytes="0.00" pidnum="3016" percentnew="17" expires="1488535122" created_prectime="0x1d264e0c7ce7616" partial="0" retentiontype="daily,weekly,monthly,yearly" backuptype="Full" ddrindex="0" locked="1" direct_restore="1"/>
</backuplist>

LabelNum=1 specifies that this is the first backup of the virtual machine. If I back up this VM one more time and run the same command, we will have two <backuplistrec> entries inside <backuplist>, and the labelNum counter would be incremented to 2, and so on.
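When a VM has many backups, the XML gets hard to read by eye. Below is a minimal sketch that pulls the labelnum, label, and created attributes out of each record. The function name list_backups is my own (hypothetical), and it assumes every record sits on a single line, as in the output above.

```shell
#!/bin/bash
# Print "labelnum<TAB>label<TAB>created" for every <backuplistrec> found in
# "avmgr getb ... --format=xml" output read from stdin.
# Assumption: each <backuplistrec .../> record is on a single line.
list_backups() {
    grep -o '<backuplistrec [^>]*>' | while read -r rec; do
        num=$(printf '%s\n' "$rec" | sed -n 's/.* labelnum="\([0-9]*\)".*/\1/p')
        label=$(printf '%s\n' "$rec" | sed -n 's/.* label="\([^"]*\)".*/\1/p')
        created=$(printf '%s\n' "$rec" | sed -n 's/.* created="\([0-9]*\)".*/\1/p')
        printf '%s\t%s\t%s\n' "$num" "$label" "$created"
    done
}

# Example usage on the appliance:
#   avmgr getb --path=/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ --format=xml | list_backups
```

The created value is a Unix epoch timestamp, so it can be turned into a readable date with date -d @1483351973 if needed.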

3. We will have to list out the files available for this VM. It should list out the vmx, vmdk, flat.vmdk and the nvram files for this VM backup. The command would be:
# avtar --list --labelnum=1 --path=/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ

The output will be similar to:
avtar Info <5551>: Command Line: /usr/local/avamar/bin/avtar.bin --flagfile=/usr/local/avamar/etc/usersettings.cfg --password=**************** --vardir=/usr/local/avamar/var --server=vdp-dest --id=root --bindir=/usr/local/avamar/bin --vardir=/usr/local/avamar/var --bindir=/usr/local/avamar/bin --sysdir=/usr/local/avamar/etc --list --sequencenumber=1 --account=/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ
avtar Info <7977>: Starting at 2017-01-03 00:25:06 IST [avtar Oct 14 2016 05:53:11 7.2.180-118 Linux-x86_64]
avtar Info <6555>: Initializing connection
avtar Info <5552>: Connecting to Avamar Server (vdp-dest)
avtar Info <5554>: Connecting to one node in each datacenter
avtar Info <5583>: Login User: "root", Domain: "default", Account: "/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ"
avtar Info <5580>: Logging in on connection 0 (server 0)
avtar Info <5582>: Avamar Server login successful
avtar Info <10632>: Using Client-ID='c3d109f23e18075b48f680f0821730b417260427'
avtar Info <5550>: Successfully logged into Avamar Server [7.2.80-118]
avtar Info <8745>: Backup from Linux host "/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ" (vdp-dest.happycow.local) with plugin 3016 - Windows VMWare Image
avtar Info <5538>: Backup #1 label "Jump-1483351122072" timestamp 2017-01-02 15:42:53 IST, 9 files, 40.00 GB
avtar Info <40113>: Backup #1 created by avtar version 7.2.180-118
VMConfiguration/
VMConfiguration/avamar vm configuration.xml
VMConfiguration/snapshot description.xml
VMConfiguration/vm.nvram
VMConfiguration/vm.ovf
VMConfiguration/vm.vmx
VMConfiguration/vss-manifest.zip
VMFiles/
VMFiles/1/
VMFiles/1/attributes.xml
VMFiles/1/virtdisk-descriptor.vmdk
VMFiles/1/virtdisk-flat.vmdk
avtar Info <5314>: Command completed (exit code 0: success)

4. The VMDK file obtained from the above avtar command should be accessible. To verify this, run the below command:
# avtar -x --path=/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ --labelnum=1 -O VMFiles/1/virtdisk-descriptor.vmdk

The output would be similar to:
avtar Info <5551>: Command Line: /usr/local/avamar/bin/avtar.bin --flagfile=/usr/local/avamar/etc/usersettings.cfg --password=**************** --vardir=/usr/local/avamar/var --server=vdp-dest --id=root --bindir=/usr/local/avamar/bin --vardir=/usr/local/avamar/var --bindir=/usr/local/avamar/bin --sysdir=/usr/local/avamar/etc -x --account=/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ --sequencenumber=1 -O VMFiles/1/virtdisk-descriptor.vmdk
avtar Info <7977>: Starting at 2017-01-03 00:28:29 IST [avtar Oct 14 2016 05:53:11 7.2.180-118 Linux-x86_64]
avtar Info <6555>: Initializing connection
avtar Info <5552>: Connecting to Avamar Server (vdp-dest)
avtar Info <5554>: Connecting to one node in each datacenter
avtar Info <5583>: Login User: "root", Domain: "default", Account: "/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ"
avtar Info <5580>: Logging in on connection 0 (server 0)
avtar Info <5582>: Avamar Server login successful
avtar Info <10632>: Using Client-ID='c3d109f23e18075b48f680f0821730b417260427'
avtar Info <5550>: Successfully logged into Avamar Server [7.2.80-118]
avtar Info <5295>: Starting restore at 2017-01-03 00:28:29 IST as "root" on "vdp-dest.happycow.local" (4 CPUs) [7.2.180-118]
avtar Info <40113>: Backup #1 created by avtar version 7.2.180-118
avtar Info <5949>: Backup file system character encoding is UTF-8.
avtar Info <8745>: Backup from Linux host "/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ" (vdp-dest.happycow.local) with plugin 3016 - Windows VMWare Image
avtar Info <5538>: Backup #1 label "Jump-1483351122072" timestamp 2017-01-02 15:42:53 IST, 9 files, 40.00 GB
avtar Info <5291>: Estimated size for "VMFiles/1/virtdisk-descriptor.vmdk" is 463 bytes
# comment this is an avamar backup
version=1
createType="vmfs"

# Extent description
RW 83886080 VMFS "virtdisk-flat.vmdk"

# The Disk Data Base
#DDB
dbb.adapterType = "lsilogic"
dbb.geometry.cylinders = "5221"
dbb.geometry.heads = "255"
dbb.geometry.sectors = "63"
dbb.longContentID = "9c70cdf008a0d44ace4aa9d83340427c"
dbb.thinProvisioned = "1"
dbb.toolsVersion = "10246"
dbb.uuid = "60 00 C2 98 b6 dc 27 49-cf 44 c6 73 c4 42 e2 84"
dbb.virtualHWVersion = "11"
avtar Info <5267>: Restore of "VMFiles/1/virtdisk-descriptor.vmdk" completed
avtar Info <7925>: Restored 463 bytes from selection(s) with 463 bytes in 1 files
avtar Info <6090>: Restored 463 bytes in 0.01 minutes: 4.271 MB/hour (9,673 files/hour)
avtar Info <7883>: Finished at 2017-01-03 00:28:30 IST, Elapsed time: 0000h:00m:00s
avtar Info <6645>: Not sending wrapup anywhere.
avtar Info <5314>: Command completed (exit code 0: success)

The RW line under the extent description gives the size of the VMDK in 512-byte sectors. 83886080 x 512 = 42949672960 bytes, which corresponds to 41943040 KB, which is 40960 MB, which translates to 40 GB.
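The same arithmetic can be done on the appliance shell. This is just a small sketch; the function names descriptor_sectors and sectors_to_gib are my own (hypothetical), and it assumes the descriptor has a single RW extent line as shown above.

```shell
#!/bin/bash
# Read a VMDK descriptor on stdin and print the sector count from its RW line.
descriptor_sectors() {
    awk '$1 == "RW" { print $2 }'
}

# Convert a count of 512-byte sectors to bytes.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Convert a count of 512-byte sectors to whole GB (1 GB = 1024^3 bytes here).
sectors_to_gib() {
    echo $(( $1 * 512 / 1024 / 1024 / 1024 ))
}

sectors_to_bytes 83886080   # prints 42949672960
sectors_to_gib 83886080     # prints 40
```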

5. So now on the new VM I have a 41 GB drive with SCSI 1:1 created and this is attached to the VDP appliance with the same 1:1 controller. Rescan for storage using the below command:
# echo "- - -" > /sys/class/scsi_host/host1/scan

Here "- - -" stands for the three values that host*/scan expects, i.e. the channel number, the SCSI target ID, and the LUN. We are simply replacing each value with a wildcard so that the Linux box detects anything newly attached to it. This procedure will add LUNs, but not remove them.
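If you are not sure which hostN entry the new disk hangs off, one option is to rescan every SCSI host instead of only host1. This is a sketch under that assumption, not something I needed in my lab; the function name rescan_scsi_hosts and its optional base directory argument are my own (hypothetical).

```shell
#!/bin/bash
# Write the "- - -" wildcard to the scan file of every SCSI host.
#   $1 = base directory (optional, defaults to /sys/class/scsi_host)
rescan_scsi_hosts() {
    local base="${1:-/sys/class/scsi_host}"
    local host
    for host in "$base"/host*; do
        # Skip anything without a scan file (also covers "no hosts found").
        [ -e "$host/scan" ] || continue
        echo "- - -" > "$host/scan"
        echo "rescanned ${host##*/}"
    done
}

# Example usage on the appliance (as root):
#   rescan_scsi_hosts
```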

6. Run fdisk -l. The new device should show up as a disk without any partition table on it. You should see the below line in the output:

Disk /dev/sde doesn't contain a valid partition table
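Before moving on, it does not hurt to confirm that the new device really is at least 1 GB larger than the source disk. Here is a small sketch of that check; the function name target_is_large_enough is my own (hypothetical), and /dev/sde in the usage comment is simply the device name from my lab.

```shell
#!/bin/bash
# Succeed only if the target is at least 1 GB larger than the source.
#   $1 = source size in 512-byte sectors (the RW value from the descriptor)
#   $2 = target device size in bytes
target_is_large_enough() {
    local source_bytes=$(( $1 * 512 ))
    local margin=$(( 1024 * 1024 * 1024 ))
    [ "$2" -ge $(( source_bytes + margin )) ]
}

# Example usage on the appliance, assuming /dev/sde is the new disk:
#   target_is_large_enough 83886080 "$(blockdev --getsize64 /dev/sde)" \
#       && echo "size OK" || echo "target too small"
```

With the numbers from this lab, a 40 GB source (83886080 sectors) against a 41 GB target (44023414784 bytes) passes the check.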

7. Now the next command is the restore itself. It starts as soon as you hit Enter. There is no yes or no prompt before it proceeds, so be careful about what is entered here, especially the target device.

The command would be:
# avtar -x --nostdout --account=/vcenter-prod.happycow.local/VirtualMachines/Jump_UqrwzzeV6zMpRI8yfCBqgQ --labelnum=1 -O VMFiles/1/virtdisk-flat.vmdk > /dev/sde

The labelnum can differ depending on which backup you want to restore.
This will not show any output or any progress of the restore. If the VMDK created was a thin provisioned drive, then you can log in to ESXi and run the below command:
# watch -n1 stat vm-name-flat.vmdk

This will refresh the stat output for the VMDK every second.

The output should be similar to:
Every 1s: stat Jump1-flat.vmdk                                                                                                                                                                                           2017-01-02 20:41:32
File: Jump1-flat.vmdk
Size: 44023414784     Blocks: 305152     IO Block: 131072 regular file
Device: 61f328bc5c1ebe26h/7058029830484180518d  Inode: 226524548   Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-01-02 20:39:58.000000000
Modify: 2017-01-02 20:39:58.000000000
Change: 2017-01-02 20:41:30.000000000

And this should keep refreshing until the Blocks value corresponds to 40 GB. To calculate this, use Blocks x 512 = allocated size in bytes.
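To turn the Blocks value into something readable, the calculation can be sketched as below. The function names blocks_to_mib and restore_percent are my own (hypothetical), and the percentage assumes the restore is finished once as many 512-byte blocks are allocated as the source disk has sectors.

```shell
#!/bin/bash
# Convert the Blocks value from stat (512-byte units) to whole MB.
blocks_to_mib() {
    echo $(( $1 * 512 / 1024 / 1024 ))
}

# Rough progress in whole percent.
#   $1 = Blocks value from stat
#   $2 = source disk size in 512-byte sectors (the RW value from the descriptor)
restore_percent() {
    echo $(( $1 * 100 / $2 ))
}

blocks_to_mib 305152              # prints 149
restore_percent 305152 83886080   # prints 0
```

So the stat sample above was taken right at the start of the restore, with about 149 MB written out of 40 GB.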

8. Now, detach this drive from the VDP appliance. In the end, you should have your new VM powered off with this drive still attached to it. Power on the VM and you should be able to see the data.