vMotion Failing : Error 195887167

Problem

New install of vSphere 6.5.

When trying to vMotion a VM from one host, svesx01 (192.168.131.111), to another host, svesx02 (192.168.131.112), the process failed with the following errors:


 

Failed waiting for data. Error 195887167. Connection closed by remote host, possibly due to timeout.
Migration [175211275:8050715817357496887] failed to connect to remote host <192.168.131.112> from host <192.168.131.111>: Timeout.
vMotion migration [175211275:8050715817357496887] vMotion migration [175211275:8050715817357496887] stream thread failed to connect to the remote host <192.168.131.112>: The ESX hosts failed to connect over the VMotion network
The vMotion migrations failed because the ESX hosts were not able to connect over the vMotion network. Check the vMotion network settings and physical network configuration.
vMotion migration [175211275:8050715817357496887] failed to read stream keepalive: Connection closed by remote host, possibly due to timeout

 

Diagnostics
 

1) Ping from svesx02 to svesx01

[root@svesx02] ping 192.168.131.111

 Result: fail

 

2) Ping locally 192.168.131.111

[root@svesx01] ping 192.168.131.111

Result: fail, even against the host's own address on its local TCP/IP stack

 

3) Internal vmkping

[root@svesx01] vmkping -I vmk6 192.168.131.11

Result: Unknown interface 'vmk6': Invalid argument

 

Note: This raised alarm bells, as I've seen this before when the vMotion vmks on different hosts were in different netstacks.
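
The "Unknown interface" result is what you would expect if vmk6 sits on a non-default TCP/IP stack, because vmkping looks the interface up in the default stack unless a netstack is named. Below is a minimal sketch of repeating the test with the stack given explicitly. It assumes vmk6 is the vMotion interface, that it lives on the system vMotion stack (named "vmotion"), and that svesx02's address is the target. The helper vmkping_cmd is hypothetical and only builds the command line.

```shell
#!/bin/sh
# Assumptions (not from the original post): the vMotion vmk is vmk6, it sits on
# the system vMotion stack (named "vmotion"), and the target is svesx02.
VMK="vmk6"
NETSTACK="vmotion"
TARGET="192.168.131.112"

# Hypothetical helper: build a vmkping command that names the netstack (-S),
# so the interface is looked up in that stack instead of the default one.
vmkping_cmd() {
    echo "vmkping -I $1 -S $2 $3"
}

CMD=$(vmkping_cmd "$VMK" "$NETSTACK" "$TARGET")
echo "$CMD"

# Only execute on a host that actually has vmkping (i.e. an ESXi shell).
if command -v vmkping >/dev/null 2>&1; then
    $CMD
fi
```

If this variant recognises the interface while the plain "vmkping -I vmk6" does not, that points to a netstack placement problem rather than a physical network fault.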

 

4) I checked the netstack of each vmk on the affected and non-affected hosts:

esxcfg-vmknic -l

[Screenshots: esxcfg-vmknic -l output on the affected and non-affected hosts]
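
As an alternative view of the same information, the sketch below reduces the output of "esxcli network ip interface list" to one line per vmk with its netstack. The helper vmk_netstacks is hypothetical, and it assumes each interface block in that output carries a "Name:" and a "Netstack Instance:" field.

```shell
#!/bin/sh
# Hypothetical helper: reduce `esxcli network ip interface list` output to
# "<vmk> <netstack>" pairs. Assumes each interface block has "Name:" and
# "Netstack Instance:" fields, in that order.
vmk_netstacks() {
    awk -F': ' '
        /^ *Name:/              { name = $2 }
        /^ *Netstack Instance:/ { print name, $2 }
    '
}

# Only query esxcli where it exists (i.e. in an ESXi shell).
if command -v esxcli >/dev/null 2>&1; then
    esxcli network ip interface list | vmk_netstacks
fi
```

Running this on each host and comparing the second column makes a mismatch between hosts easy to spot.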

 

Resolution

As diagnosed above, the vMotion vmk on svesx01 was in the vMotion netstack, while those on the other hosts were in the defaultTcpipStack. To fix the issue, I deleted the affected vmks on the other hosts and re-added them using the "vMotion" stack.
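
For reference, a rough ESXi shell equivalent of that delete and re-add might look like the sketch below. It only prints the commands for review and does not run them. The interface name vmk6, the standard vSwitch port group name "vMotion", and the 255.255.255.0 netmask are assumptions, and recreate_vmk_plan is a hypothetical helper. A vmk on a distributed switch would need the dvs-name and dvport-id options instead of a port group name.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the esxcli commands to move a vmk onto
# the vMotion TCP/IP stack. A vmk cannot change stacks in place, so it is
# removed and created again.
# Arguments: <vmk> <portgroup> <ipv4> <netmask>  (all assumed values)
recreate_vmk_plan() {
    vmk="$1"; pg="$2"; ip="$3"; mask="$4"
    # 1. Remove the vmk that sits on the wrong stack.
    echo "esxcli network ip interface remove --interface-name=$vmk"
    # 2. Create the vMotion stack (may be unnecessary if it already exists).
    echo "esxcli network ip netstack add --netstack=vmotion"
    # 3. Re-create the vmk on the vMotion stack.
    echo "esxcli network ip interface add --interface-name=$vmk --portgroup-name=\"$pg\" --netstack=vmotion"
    # 4. Re-apply the static IPv4 address.
    echo "esxcli network ip interface ipv4 set --interface-name=$vmk --ipv4=$ip --netmask=$mask --type=static"
}

# Example values for svesx02; review the printed plan before running anything.
recreate_vmk_plan vmk6 "vMotion" 192.168.131.112 255.255.255.0
```

Printing the plan first is deliberate: removing a vmk is disruptive, so each line should be checked against the host's real port group and addressing before it is run.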
