Friday, 28 October 2016

VDP Stuck In A Configuration Loop

There have been a few cases logged with VMware where the newly deployed VDP appliance gets stuck in a configuration loop. Not to worry, there is now a fix for this. 

A little insight to what this is: So, we will go ahead and deploy a VDP (6.1.2 in my case) as an ova template. The deployment goes through successfully, and then we power On the VDP appliance which too completes successfully. Then, we go to the https://vdp-ip:8543/vdp-configure page and run through the configuration wizard. Everything goes here as well, the configuration wizard completes and requests you to reboot the appliance. Once the appliance is rebooted, it's going to make certain changes to the appliance, configure alarms and initialize core services. There will be a task called as "VDP: Configure Appliance" which will be initiated. Here, this task gets stuck somewhere around 45 to 70 percent. The appliance will boot up completely, however, when you go back to the vdp-configure page, you will notice that it is taking you through the configuration wizard again. You can run up to the configure storage section post which you will receive an error, as the appliance is already configured with the storage. And no matter which browser or how many times you access this vdp-configure page, you will be taken back to the configuration wizard. This will end up as an infinite loop.

This issue is mainly and mostly (almost certainly) seen only on vCenter 5.5 U3e release. This is because, the VDP uses JSAFE/BSAFE Java libraries and these do not go well with the vCenter SSL ciphers in the 5.5 U3e. To fix this, we switch from JSAFE to Java JCE libraries on the VDP appliance.

Before, we get to this, you can visit the vdr-server.log at the time of the issue (/usr/local/avamar/var/vdr/server_logs) to verify the following:

2016-10-29 01:15:40,676 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: vcenter-ignore-cert ? true
2016-10-29 01:15:40,714 WARN  [Thread-7]-vi.VCenterServiceImpl: No VCenter found in MC root domain
2016-10-29 01:15:40,714 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: visdkUrl = https:/sdk
2016-10-29 01:15:40,715 ERROR [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: Failed To Create ViJava ServiceInstance owing to Remote VCenter connection error
java.rmi.RemoteException: VI SDK invoke exception:java.lang.IllegalArgumentException: protocol = https host = null; nested exception is:
        java.lang.IllegalArgumentException: protocol = https host = null
        at com.vmware.vim25.ws.WSClient.invoke(WSClient.java:139)
        at com.vmware.vim25.ws.VimStub.retrieveServiceContent(VimStub.java:2114)
        at com.vmware.vim25.mo.ServiceInstance.<init>(ServiceInstance.java:117)
        at com.vmware.vim25.mo.ServiceInstance.<init>(ServiceInstance.java:95)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:297)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:159)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:104)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:96)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.getViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:74)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.waitForViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:212)
        at com.emc.vdp2.server.VDRServletLifeCycleListener$1.run(VDRServletLifeCycleListener.java:71)
        at java.lang.Thread.run(Unknown Source)

Caused by: java.lang.IllegalArgumentException: protocol = https host = null
        at sun.net.spi.DefaultProxySelector.select(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(Unknown Source)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(Unknown Source)
        at com.vmware.vim25.ws.WSClient.post(WSClient.java:216)
        at com.vmware.vim25.ws.WSClient.invoke(WSClient.java:133)
        ... 11 more

2016-10-29 01:15:40,715 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: Retry ViJava ServiceInstance Acquisition In 5 Seconds...
2016-10-29 01:15:45,716 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: vcenter-ignore-cert ? true
2016-10-29 01:15:45,819 WARN  [Thread-7]-vi.VCenterServiceImpl: No VCenter found in MC root domain

The mcserver.out log file should show the below:

Caught Exception : Exception : org.apache.axis.AxisFault Message : ; nested exception is:
javax.net.ssl.SSLHandshakeException: Unsupported curve: 1.2.840.10045.3.1.7 StackTrace : AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException faultSubcode:
faultString: javax.net.ssl.SSLHandshakeException: Unsupported curve: 1.2.840.10045.3.1.7 faultActor:
faultNode:
faultDetail:
{http://xml.apache.org/axis/}stackTrace:javax.net.ssl.SSLHandshakeException: Unsupported curve: 1.2.840.10045.3.1.7

To fix this:

1. Discard the newly deployed appliance completely. 
2. Deploy the VDP appliance again. Go through the ova deployment and power on the appliance. Stop here, do not go to the vdp-configure page.

3. To enable the Java JCE library we need to add a particular line in the mcsutils.pm file under the $prefs variable. The line is exactly as below:

. "-Dsecurity.provider.rsa.JsafeJCE.position=last "

4. vi the following file;
# vi  /usr/local/avamar/lib/mcsutils.pm
The original content would look like:

my $rmidef = "-Djava.rmi.server.hostname=$rmihost ";
   my $prefs = "-Djava.util.logging.config.file=$mcsvar::lib_dir/mcserver_logging.properties "
             . "-Djava.security.egd=file:/dev/./urandom "
             . "-Djava.io.tmpdir=$mcsvar::tmp_dir "
             . "-Djava.util.prefs.PreferencesFactory=com.avamar.mc.util.MCServerPreferencesFactory "
             . "-Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl "
             . "-Djavax.net.ssl.keyStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Djavax.net.ssl.trustStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Dfile.encoding=UTF-8 "
             . "-Dlog4j.configuration=file://$mcsvar::lib_dir/log4j.properties ";  # vmware/axis

After editing it would look like:

 my $rmidef = "-Djava.rmi.server.hostname=$rmihost ";
   my $prefs = "-Djava.util.logging.config.file=$mcsvar::lib_dir/mcserver_logging.properties "
             . "-Djava.security.egd=file:/dev/./urandom "
             . "-Djava.io.tmpdir=$mcsvar::tmp_dir "
             . "-Djava.util.prefs.PreferencesFactory=com.avamar.mc.util.MCServerPreferencesFactory "
             . "-Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl "
             . "-Djavax.net.ssl.keyStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Djavax.net.ssl.trustStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Dfile.encoding=UTF-8 "
             . "-Dsecurity.provider.rsa.JsafeJCE.position=last "
             . "-Dlog4j.configuration=file://$mcsvar::lib_dir/log4j.properties ";  # vmware/axis

5. Save the file
6. There is no use of restarting mcs using mcserver.sh --restart, as the VDP appliance is not yet configured and hence the core services are not yet initialized. 
7. Reboot the appliance.
8. Once the appliance is booted up, go to the configure page and begin the configuration and this should avoid the configuration loop issue.

If the VDP was already deployed and the vCenter was upgraded later, then you can follow the same steps until 6. Instead of rebooting the VDP this time, we should be good to restart the MCS using the mcserver.sh --restart --verbose command.

That's it. A permanent fix is in talks with engineering for the future VDP release.

Update:
A permanent fix is in 6.1.3 version of VDP.