VM Still Working Despite Being Powered Off

Host:     ESXi 5.0
OS:    FreeBSD

I had been having trouble with a VM that suddenly stopped backing up in Veeam due to an "NFC" error:

"Processing {vm_name} Error: NFC server [{esxihost.domain.local}] is busy.
 Failed to retrieve next FILE_PUT message. File path: [[LUN3] {vm_name}/{vm_name}.vmx]. File pointer: [0]. File size: [3567]."

Veeam asked me to connect to the host directly and "download" the .vmx file as a test via "Browse Datastore". When I did this I got an "I/O Error" in the GUI, and then HA tried to move the VM for some reason, unsuccessfully.

When I SSH'd to the ESXi host and ran "less" on the .vmx file, I got a "read error" and nothing was displayed.
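
The same read test can be scripted so it is easy to repeat against other files on the datastore. This is only a minimal sketch, assuming a POSIX-style shell like the one on the ESXi host; the function name check_readable is made up for illustration and is not an ESXi command.

```shell
#!/bin/sh
# check_readable: hypothetical helper, not part of ESXi.
# Reads the whole file and throws the data away. If the read fails
# (I/O error, missing file, no permission), cat exits non-zero.
check_readable() {
    if cat "$1" > /dev/null 2>&1; then
        echo "OK: $1"
    else
        echo "READ ERROR: $1"
        return 1
    fi
}
```

For example, check_readable "/vmfs/volumes/LUN3/{vm_name}/{vm_name}.vmx", using the placeholder path from the error message above. In my case this would be expected to report a read error, just as "less" did.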

The VM still appeared to be functioning: I could access the OS directly via SSH, and DNS (its purpose) was running fine. However, vCenter was now showing the VM as powered off completely, and my only option was "Power on". Yet the VM was still working. Odd!


Looking at the Events on the VM via vCenter I could see a couple of reasons for the HA failover attempt.

1) The operation timed out
2) The file was locked

 


After much investigation, it appeared that the .vmx was unreadable and presumably corrupted. The VM couldn't be restarted, so I used the following steps to get the service working again. As you'll see, I shut down the guest OS, removed the VM from the inventory, then created a completely new VM using the .vmdk file from the 'broken' VM.

1) Shut down the guest OS (FreeBSD in this case)

# shutdown -p now

2) Remove the VM from the inventory

   In vCenter -> right click broken VM -> Remove from Inventory

Note: Before doing this, make sure you know where the VMDK(s) of this VM are located.

3) Create a new VM and add the previously used VMDK as the primary disk
   In vCenter -> Right click cluster -> New Virtual Machine -> Follow the wizard, but instead of creating a new disk select the VMDK from the previous VM.

4) Power on VM
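
Because the .vmx could not be read, the disk names could not simply be looked up in it, so for the note in step 2 it helps to list the disk files in the VM's folder before removing anything. Below is a rough sketch, assuming the usual datastore layout where each disk has a small descriptor file (name.vmdk) next to larger data files ending in -flat.vmdk, -delta.vmdk or -ctk.vmdk. The function name list_vmdk_descriptors is invented for this example.

```shell
#!/bin/sh
# list_vmdk_descriptors: hypothetical helper, not an ESXi command.
# Prints the .vmdk descriptor files in a VM directory and skips the
# data files that sit next to them (-flat, -delta, -ctk).
list_vmdk_descriptors() {
    for f in "$1"/*.vmdk; do
        [ -e "$f" ] || continue    # glob matched nothing
        case "$f" in
            *-flat.vmdk|*-delta.vmdk|*-ctk.vmdk) ;;    # data file, skip
            *) echo "$f" ;;
        esac
    done
}
```

For example, list_vmdk_descriptors "/vmfs/volumes/LUN3/{vm_name}" run over SSH on the host. The files it prints are the ones to note down and later select in the New Virtual Machine wizard in step 3.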