Friday, 3 March 2017

VDP Backup Fails With The Error "Failed To Attach Disks"

Earlier, we saw a compatibility issue with VDP 6.1.3 in an ESXi 5.1 environment, where the backup failed with the message "Failed to attach disks". That case was covered in an earlier post.
However, this is a very generic message, and it can mean something different when VDP is running on a compatible version.

In this case, VDP 6.1.3 was running in a vSphere 6.0 environment, and the backup failed only when an external proxy was deployed. When the external proxy was removed, the backups used the internal VDP proxy and completed successfully.

With the external proxy, the logs are on the proxy machine under /usr/local/avamarclient/var.
The backup job logs had the following entries:

2017-03-02T16:13:18.762Z avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_PrepareForAccess: Disable Storage VMotion failed. Error 18000 (Cannot connect to the host) (fault (null), type GVmomiFaultInvalidResponse, reason: (none given), translated to 18000) at 4259.

2017-03-02T15:46:19.092Z avvcbimage Info <16041>: VDDK:VixDiskLibVim: Error 18000 (listener error GVmomiFaultInvalidResponse).

2017-03-02T15:46:19.092Z avvcbimage Warning <16041>: VDDK:VixDiskLibVim: Login failure. Callback error 18000 at 2444.

2017-03-02T15:46:19.092Z avvcbimage Info <16041>: VDDK:VixDiskLibVim: Failed to find the VM. Error 18000 at 2516.

2017-03-02T15:46:19.093Z avvcbimage Info <16041>: VDDK:VixDiskLibVim: VixDiskLibVim_FreeNfcTicket: Free NFC ticket.

2017-03-02T15:46:19.093Z avvcbimage Info <16041>: VDDK:VixDiskLib: Error occurred when obtaining NFC ticket for [Datastore_A] Test_VM/Test_VM.vmdk. Error 18000 (Cannot connect to the host) (fault (null), type GVmomiFaultInvalidResponse, reason: (none given), translated to 18000) at 2173.

2017-03-02T15:46:19.093Z avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_OpenEx: Cannot open disk [Datastore_A] Test_VM/Test_VM.vmdk. Error 18000 (Cannot connect to the host) at 4964.

2017-03-02T15:46:19.093Z avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_Open: Cannot open disk [Datastore_A] Test_VM/Test_VM.vmdk. Error 18000 (Cannot connect to the host) at 5002.

2017-03-02T15:46:19.093Z avvcbimage Error <0000>: [IMG0008] Failed to connect to virtual disk [Datastore_A] Test_VM/Test_VM.vmdk (18000) (18000) Cannot connect to the host

In the mcserver.log, the following was noted:

WARNING: com.avamar.mc.sdk10.McsFaultMsgException: E10055: Attempt to connect to virtual disk failed.
at com.avamar.mc.sdk10.util.McsBindingUtils.createMcsFaultMsg(McsBindingUtils.java:35)
at com.avamar.mc.sdk10.util.McsBindingUtils.createMcsFault(McsBindingUtils.java:59)
at com.avamar.mc.sdk10.util.McsBindingUtils.createMcsFault(McsBindingUtils.java:63)
at com.avamar.mc.sdk10.mo.JobMO.monitorJobs(JobMO.java:299)
at com.avamar.mc.sdk10.mo.GroupMO.backupGroup_Task(GroupMO.java:258)
at com.avamar.mc.sdk10.mo.GroupMO.execute(GroupMO.java:231)
at com.avamar.mc.sdk10.async.AsyncTaskSlip.run(AsyncTaskSlip.java:77)

The cause is an IPv6 AAAA record for the vCenter Server in DNS. VDP does not support dual-stack networking and needs either IPv4 settings or IPv6 settings, not both.

Resolution:
1. Log in to the external proxy machine with root credentials.
2. Run the following command to test DNS resolution:
# nslookup -q=any <vcenter-fqdn>

The ideal output contains only an IPv4 address:

root@vdp-dest:~/#: nslookup -q=any vcenter-prod.happycow.local
Server:         10.109.10.140
Address:        10.109.10.140#53

Name:   vcenter-prod.happycow.local
Address: 10.109.10.142

If the output looks like the following instead, an IPv6 AAAA record exists as well:

root@vdp-dest:~/#: nslookup -q=any vcenter-prod.happycow.local
Server:         10.109.10.140
Address:        10.109.10.140#53

Name:   vcenter-prod.happycow.local
Address: 10.109.10.142
vcenter-prod.happycow.local   has AAAA address ::9180:aca7:85e7:623d

3. Run the following command to give IPv4 precedence over IPv6:
echo "precedence ::ffff:0:0/96  100" >> /etc/gai.conf

4. Restart the avagent service using the below command:
# service avagent-vmware restart

After this, the backups should complete successfully. If no IPv6 entry appears in the nslookup output and the backup still fails, raise a support request with VMware.
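
The check in step 2 and the change in step 3 can also be scripted. The following is only a minimal sketch, not part of any VDP tooling: the function names has_aaaa and ensure_ipv4_precedence are made up for illustration, and the AAAA check assumes nslookup prints the "has AAAA address" wording shown in the output above (other nslookup versions may format it differently). The second function appends the precedence line only if it is not already present, so running it twice does not duplicate the entry in /etc/gai.conf.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of VDP.

# has_aaaa: reads nslookup output on stdin and succeeds (exit 0)
# if it contains an AAAA answer in the "has AAAA address" form.
has_aaaa() {
    grep -q 'has AAAA address'
}

# ensure_ipv4_precedence FILE: appends the IPv4 precedence line to FILE
# unless an identical line is already there.
ensure_ipv4_precedence() {
    conf_file="$1"
    line='precedence ::ffff:0:0/96  100'
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF "$line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf_file"
    fi
}

# Example use on the proxy (run as root, replace the FQDN with your own):
#   if nslookup -q=any vcenter-prod.happycow.local | has_aaaa; then
#       ensure_ipv4_precedence /etc/gai.conf
#       service avagent-vmware restart
#   fi
```

The example at the bottom is left commented out so that sourcing the file changes nothing on the system until the functions are called explicitly.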