Wednesday, 25 January 2017

VDP Verification Job Leaves Undeleted Folders

There was an interesting case opened quite a while back which took almost few months to figure out the solution. The issue was, a backup verification job was performed for a set of VMs. The verification job used to complete successfully, however, it used to leave a folder behind with the name of the VM's verification job ID and in that folder was one file which was the vmxf file of that virtual machine.

If you took a look at the verification job logs on the VDP server, you will not see any errors or warnings. The VM will be test restored, powered on, tested for VM tools heartbeat and then powered off and deleted.

So, on the ESXi datastore, the folder left behind was something similar to this name, VDP_VERIFICATION_Windows_clone_2_1485298729541. And in this directory the file left behind was VDP_VERIFICATION_Windows_clone_2.vmxf

If you look at the hostd.log during this process, you will notice the following:

2017-01-24T23:44:04.127Z info hostd[234C2B70] [Originator@6876 sub=Libs opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] SNAPSHOT: SnapshotDeleteFile dele
ted '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541/VDP_VERIFICATION_Windows_clone_2_1485298729541.nvram'.

2017-01-24T23:44:04.128Z info hostd[234C2B70] [Originator@6876 sub=Libs opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] SNAPSHOT: SnapshotDeleteFile dele
ted '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541/vmware.log'.

2017-01-24T23:44:04.128Z info hostd[234C2B70] [Originator@6876 sub=Libs opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] SNAPSHOT: SnapshotDeleteFile dele
ted '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541/564d7f1f-1606-518d-6b4e-d8689735d481.vmem.WRITELOCK'.

2017-01-24T23:44:04.128Z info hostd[234C2B70] [Originator@6876 sub=Libs opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] SNAPSHOT: SnapshotDeleteFile dele
ted '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541/VDP_VERIFICATION_Windows_clone_2_1485298729541.vmsd~'.

2017-01-24T23:44:04.128Z info hostd[234C2B70] [Originator@6876 sub=Libs opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] SNAPSHOT: SnapshotDeleteFile dele
ted '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541/VDP_VERIFICATION_Windows_clone_2_1485298729541.vmsd.lck'.

2017-01-24T23:44:04.129Z info hostd[234C2B70] [Originator@6876 sub=Libs opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] SNAPSHOT: SnapshotDeleteFile dele
ted '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541/VDP_VERIFICATION_Windows_clone_2_1485298729541.vmsd'.

2017-01-24T23:44:04.129Z info hostd[234C2B70] [Originator@6876 sub=vm:SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFI
CATION_Windows_clone_2_1485298729541/VDP_VERIFICATION_Windows_clone_2_1485298729541.vmx opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] lck'.

2017-01-24T23:44:04.130Z info hostd[234C2B70] [Originator@6876 sub=Libs opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] SNAPSHOT: SnapshotDeleteVMInt: Couldn't remove directory '/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541'.

2017-01-24T23:44:04.130Z verbose hostd[234C2B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/57ac84be-efa08410-3cf9-549f3515a09a/VDP_VERIFICATION_Windows_clone_2_1485298729541/VDP_VERIFICATION_Windows_clone_2_1485298729541.vmx opID=3a74dd6f-58-d7b3 user=vpxuser:VMWARE.LOCAL\Administrator] Virtual machine delete completed.


If you notice here, the verification job is removing all the test restored files. However, it says it is unable to remove the VDP Verification directory. At this point, if you run a list command in the datastore of ESXi, you will notice there is a vmxf file present. And because of this VDP is unable to remove the verification folder.
During backup, VDP will only backup vmx, nvram, vmdk and flat files. It will not backup the vmxf file. So if it does not backup the vmxf file, it will neither restore the vmxf file. So the question is, what is this vmxf file? Vmxf file was initially created to contain the virtual machine meta data. Then later on, the tools information was added to this file. So what this means is, if you have a virtual machine with VM tools running, you should see a vmxf file associated with that VM. If there is no tools installed on a VM, then there will be no vmxf file.

So, if you perform a ABV of a VM without any tools, then you will not have this vmxf file left behind. So during the ABV, the VDP will test for VM tools heartbeat. So when the VM is powered On, the vmxf file gets created under the ABV VM folder. But since VDP did not create this file, it will neither delete this file. It will delete all the files that was restored by it, leaving this vmxf file under the verification folder. And the clean up in this case has to be done manually.

There is no much use of vmxf file in your production environment. If you delete this file manually and take a new backup of the VM and verify the restore point, the file is still coming back. This means that something in the "vmx" file is causing the "vmxf" file to appear.

The resolution:

1. Power off the virtual machine
2. If you vi the virtual machine's vmx file, you will see the following entry:
extendedConfigFile = "Windows_clone_2.vmxf"

3. Use the vi editor to hash this entry out.

Before edit:
extendedConfigFile = "Windows_clone_2.vmxf"
After edit:
 #extendedConfigFile = "Windows_clone_2.vmxf"

4. Save the file and reload the virtual machine's vmx. To reload, you will have to find the VM ID. To do this, run the below command:
# vim-cmd vmsvc/getallvms
5. Once you get the ID, reload it using:
# vim-cmd vmsvc/reload <vm-id>
6. Remove the vmxf file manually from the virtual machine directory

7. Power On the VM. Wait for the VM to be completely up and the tools status to be Running (Current) Now, if you browse the VM directory, the vmxf is no longer created.

8. Perform another backup for this VM and then perform the verification and this time there will be no stale files left behind.

Note:
If you make any changes to the VM tools, such as reinstall or upgrade, then another entry for vmxf will be added in the virtual machine's vmx file and this entire process has to be re-done.

Hope this helps!