Friday, 23 February 2018

VDP Tomcat Service Crashes With "line 1, column 0, byte 0"

I was working with one of my colleagues the other day on an issue where the tomcat service would not start up.

root@vdp:/home/admin/#: dpnctl status
Identity added: /home/dpn/.ssh/dpnid (/home/dpn/.ssh/dpnid)
dpnctl: INFO: gsan status: up
dpnctl: INFO: MCS status: up.
dpnctl: INFO: emt status: down.
dpnctl: INFO: Backup scheduler status: down.
dpnctl: INFO: axionfs status: down.
dpnctl: INFO: Maintenance windows scheduler status: enabled.
dpnctl: INFO: Unattended startup status: enabled.
dpnctl: INFO: avinstaller status: down.
dpnctl: INFO: [see log file "/usr/local/avamar/var/log/dpnctl.log"]
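
On a busy status screen it is easy to miss a service that is down. Below is a minimal sketch that filters output in the format shown above and prints only the services reported as down. The function name list_down_services is made up for this post, and it assumes the status lines always follow the "dpnctl: INFO: <name> status: down." pattern seen here.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every service reported as down.
# Reads "dpnctl status" style output on stdin and matches lines of the form
#   dpnctl: INFO: <name> status: down.
# (the trailing period is treated as optional).
list_down_services() {
    sed -n 's/^dpnctl: INFO: \(.*\) status: down\.\{0,1\}$/\1/p'
}
```

It could be fed with something like "dpnctl status 2>&1 | list_down_services"; the 2>&1 is there only because I have not checked which stream dpnctl writes its INFO lines to.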

The core services seemed to be fine; only the EM tomcat was unresponsive. When we tried to restart the tomcat service as the "root" user of VDP using the command below, it failed:
# emwebapp.sh --start

Error trace:

syntax error at line 1, column 0, byte 0:
Identity added: /home/admin/.ssh/dpnid (/home/admin/.ssh/dpnid)
^
<entry key="clean_emdb_cm_queue_info_days" value="365" />
at /usr/lib/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi/XML/Parser.pm line 187

If we ran the alternative command to start the service, it would fail too, and dpnctl.log would show a similar error trace.
# dpnctl start emt

Error trace:

2018/02/22-17:11:42 syntax error at line 1, column 0, byte 0:
2018/02/22-17:11:42 Identity added: /home/admin/.ssh/dpnid (/home/admin/.ssh/dpnid)
2018/02/22-17:11:42 tomcatctl: ERROR: problem running command "[ -r /etc/profile ] && . /etc/profile ; /usr/local/avamar/bin/emwebapp.sh --start" - exit status 255

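So it looks like the emserver.xml file had been corrupted: the parser gives up at the very first byte because its input begins with the plain-text line "Identity added: ..." rather than XML. We just need to restore this particular file from a previous EM flush.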
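
Before restoring anything, a quick look at the start of the file can confirm the suspicion. The sketch below is only a rough sanity check, not an XML validator: it prints the first non-blank line of a file if that line does not begin with "<". The function name check_xml_start is made up for this post, and the check assumes the stray text really sits at the top of the file itself.

```shell
#!/bin/sh
# Hypothetical helper: rough check that a file starts like XML.
# Returns 0 if the first non-blank line begins with "<",
#         1 if it does not (the offending line is printed),
#         2 if the file cannot be read.
check_xml_start() {
    file=$1
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # First line containing a non-whitespace character, leading blanks stripped.
    first=$(sed -n '/[^[:space:]]/{s/^[[:space:]]*//;p;q;}' "$file")
    case $first in
        '<'*) return 0 ;;
        *)    printf 'unexpected first line in %s: %s\n' "$file" "$first"
              return 1 ;;
    esac
}
```

On the appliance in this post it would be pointed at /space/avamar/var/em/server_data/prefs/emserver.xml.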

To do this, follow these steps:

1. List out the latest 5 available EM backups using:
# avtar --backups --path=/EM_BACKUPS --count=5 | less

The output would be:

Date      Time    Seq       Label           Size     Plugin    Working directory         Targets
---------- -------- ----- ----------------- ---------- -------- --------------------- -------------------
2018-02-21 08:00:05   213                          57K Linux    /usr/local/avamar     var/em/server_data
2018-02-20 08:00:05   212                          57K Linux    /usr/local/avamar     var/em/server_data
2018-02-19 08:00:08   211                          57K Linux    /usr/local/avamar     var/em/server_data
2018-02-18 08:00:04   210                          57K Linux    /usr/local/avamar     var/em/server_data
2018-02-17 08:00:11   209                          57K Linux    /usr/local/avamar     var/em/server_data

2. Next, choose a label number (the Seq column in the output above) and restore that EM flush to a temporary directory. The command would be:
# avtar -x --labelnum=209 --path=/EM_BACKUPS --target=/tmp/emback

A successful restore ends with output similar to this:

avtar Info <5259>: Restoring backup to directory "/tmp/emback"
avtar Info <5262>: Restore completed
avtar Info <7925>: Restored 55.55 KB from selection(s) with 55.55 KB in 10 files, 6 directories
avtar Info <6090>: Restored 55.55 KB in 0.01 minutes: 306.5 MB/hour (56,489 files/hour)
avtar Info <7883>: Finished at 2018-02-22 21:26:53 GST, Elapsed time: 0000h:00m:00s
avtar Info <6645>: Not sending wrapup anywhere.
avtar Info <5314>: Command completed (1 warning, exit code 0: success)

3. Rename the existing (corrupted) emserver.xml file using:
# mv /space/avamar/var/em/server_data/prefs/emserver.xml /space/avamar/var/em/server_data/prefs/emserver.xml.old

4. Copy the restored file to the actual location using:
# cp -p /tmp/emback/var/em/server_data/prefs/emserver.xml /space/avamar/var/em/server_data/prefs/

5. List out the permissions on the file using:
# ls -lh  /space/avamar/var/em/server_data/prefs/

The permissions should be:

-rw------- 1 admin admin 9.4K Jul 26  2017 emserver.xml
-rw------- 1 admin admin 9.3K Feb 22 09:09 emserver.xml.old
-r-------- 1 admin admin 4.5K Jul 26  2017 preferences.dtd

6. Start the tomcat service again using:
# emwebapp.sh --start

It should now start up successfully. Hope this helps!
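
If you end up doing this more than once, steps 3 to 5 can be wrapped in a small function. This is just a sketch of the same mv / cp -p / ls sequence with a couple of safety checks added; the function name swap_in_restored is made up for this post and nothing here is VDP-specific.

```shell
#!/bin/sh
# Hypothetical helper: replace a live file with a restored copy,
# keeping the original next to it as <name>.old.
# Usage: swap_in_restored <restored_file> <live_file>
swap_in_restored() {
    restored=$1
    live=$2
    # Refuse to touch the live file if the restored copy is missing.
    [ -r "$restored" ] || { echo "restored copy not found: $restored" >&2; return 1; }
    # Keep the suspect file for later inspection instead of overwriting it.
    if [ -e "$live" ]; then
        mv "$live" "$live.old" || return 1
    fi
    # -p preserves mode and timestamps, and ownership where permitted.
    cp -p "$restored" "$live" || return 1
    # Show the resulting permissions so they can be compared by eye.
    ls -l "$live"
}
```

With the paths from this post, the call would be: swap_in_restored /tmp/emback/var/em/server_data/prefs/emserver.xml /space/avamar/var/em/server_data/prefs/emserver.xml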