Friday, 23 February 2018

VDP Backup Fails With Time Out - End

So you might be aware that a backup job in VDP can run upto 24 hours, post which it ideally times out with Time Out end status. If this is what you are facing, then you can extend the backup window overtime using this KB article here

In some cases, this might not be the situation. You might not hit the 24 hour window but still the backup fails with Time Out - End

ID Status Error Code Start Time Elapsed End Time Type Progress Bytes New Bytes Client Domain
9151917120002509 Timed Out - End 0 2018-02-21 00:01 IST 02h:59m:31s 2018-02-21 03:00 GMT Scheduled Backup 179.9 GB 15.4% Test-VM /vcenter.southpark.local/VirtualMachines

The backup logs will have the same logging that one would expect to see when the time out issue occurs.

The avtar.log will have something similar to:

2018-02-21T03:00:23.253-01:00 avtar Info <7061>: Canceled by '3016-vmimagew' - exiting...
2018-02-21T03:00:23.258-01:00 avtar Info <9772>: Starting graceful (staged) termination, cancel request (wrap-up stage)
2018-02-21T03:00:23.347-01:00 [avtar] ERROR: <0001> backstreamdir::childdone error merging history stream data during phase_do_hidden while processing directory 'VMFiles/2'
2018-02-21T03:00:23.392-01:00 [avtar] ERROR: <0001> backstreamdir::childdone error merging history stream data during phase_do_hidden while processing directory 'VMFiles'
2018-02-21T03:00:23.438-01:00 avtar Info <7202>: Backup CANCELED, wrapping-up session with Server

The backup.log will have:

2018-02-21T03:00:23.169-01:00 avvcbimage Info <9740>: skipping 2048 unchanged sectors in PAX stream at sector offset 241468416
2018-02-21T03:00:23.196-01:00 avvcbimage Info <9772>: Starting graceful (staged) termination, MCS cancel (wrap-up stage)
2018-02-21T03:00:23.212-01:00 avvcbimage Info <19692>: Cancelled by MCS with timeout=0 sec
2018-02-21T03:00:23.647-01:00 avvcbimage Info <40654>: isExitOK()=169
2018-02-21T03:00:23.647-01:00 avvcbimage Info <16022>: Cancel detected(externally cancelled by Administrator), isExitOK(0).
2018-02-21T03:00:23.657-01:00 avvcbimage Info <16041>: VDDK:VixDiskLib: VixDiskLib_Close: Close disk.


The avagent.log will have the following:

2018-02-21T03:00:23.137-01:00 avagent Info <5964>: Requesting work from 127.0.0.1
2018-02-21T03:00:23.162-01:00 avagent Info <8636>: Starting CTL cancel of workorder "BackupTest-BackupTest-1519171200009".
2018-02-21T03:00:31.454-01:00 avagent Info <6688>: Process 127319 (/usr/local/avamarclient/bin/avvcbimage) finished (code 169: externally cancelled by Administrator)
2018-02-21T03:00:31.454-01:00 avagent Warning <6690>: CTL workorder "BackupTest-BackupTest-1519171200009" non-zero exit status 'code 169: externally cancelled by Administrator'

When I checked the backup duration window, it was set only to 3 hours. Not sure how that changed

root@vdp:/usr/local/avamar/var/vdr/server_logs/#: mccli schedule show --name=/vcenter.southpark.local/VirtualMachines/BackupTest | grep "Backup Window Duration" 
Backup Window Duration 3 hours and 0 minutes

So we had to flip it back to 24 hours:

root@vdp:/usr/local/avamar/var/vdr/server_logs/#: mccli schedule edit --name=/vcenter.southpark.local/VirtualMachines/BackupTest --duration=24:00
0,22214,Schedule modified

root@vdp:/usr/local/avamar/var/vdr/server_logs/#: mccli schedule show --name=/vcenter.southpark.local/VirtualMachines/BackupTest | grep "Backup Window Duration" 
Backup Window Duration 24 hours and 0 minutes

The backup should not time out before the 24 hour window anymore. Hope this helps!