News

ESXi 3.5.0 Update 4 Host Disconnects

I've run into a problem on ESXi 3.5.0 Update 4 where the ESXi host disconnects and, shortly afterward, the VMs stop responding on the network. If you catch the host quickly enough and can get to the unsupported console, this command can get you back up and running quickly:

#> /sbin/services.sh restart


It appears that this problem is caused by the CIM providers ( they show the health status for fans, firmware revisions of the storage controller, etc. ) and what has been reported as a memory leak in the sfcbd process. You can disable these CIM providers by changing the Misc.CimEnabled setting from '1' to '0' in the host's advanced properties.





After you change that setting, you'll need to either reboot the ESXi host or head back to the command line and stop the sfcbd process through its watchdog script:


#> /etc/init.d/sfcbd-watchdog stop


It's unconfirmed at this point, but is believed to be a problem that only affects those of you (me) that use Emulex HBAs.

Posted by Dominic Rivera at Wednesday, July 01, 2009.


ESX and CDP (Cisco Discovery Protocol)

One of my favorite features of ESX 3.5.0 is the inclusion of the Cisco Discovery Protocol (CDP). From an administrative perspective, it allows me to click on a bubble next to my network ports and get the switch name, switch IP, switch blade and port that the vmnic is connected to. Since documentation isn't always 100% reliable, it's nice to be able to verify it with the CDP information that you can glean from ESX. CDP is enabled by default on a new 3.5.0+ installation, but if you upgraded your ESX host from a pre-3.5.0 version it likely isn't enabled. To correct this, you need to fix it from the command line. The following command sets the ESX vSwitch to start listening for CDP information from your switch:


vmprofessional #>esxcfg-vswitch -B listen vSwitch0


After that, you just need to wait a few minutes for the information to populate and become available in the VIC. One drawback of looking at the data in the VIC is that it is presented in a non-selectable field, so you can't copy the data out. If you have a large number of ports to get information on, the process of transcribing can be cumbersome and error-prone. I developed a script to pull this data from the command line to make it more usable. Behold!


#!/usr/bin/perl -w
use strict;
use warnings;
use Getopt::Long;
use Term::ANSIColor qw(:constants);
$Term::ANSIColor::AUTORESET = 1;
use VMware::VIRuntime;

my $username = "username";
my $password = "password";

# SDK endpoints of the ESX hosts or vCenter servers to query
my @service_urls = qw|
https://esxhost_or_vc1.yourdomain.com/sdk
https://esxhost_or_vc2.yourdomain.com/sdk
|;

foreach my $service_url (@service_urls) {
    print "Gathering host information from server: ";
    print BLUE "$service_url\n";
    Vim::login(service_url => $service_url, user_name => $username, password => $password);
    my $hostViews = Vim::find_entity_views(view_type => 'HostSystem');

    foreach my $host (@$hostViews) {
        print "Host: ", $host->name, ".\n";
        # fullName ends in "build-NNNNN"; keep only the build number
        my $version = $host->config->product->fullName;
        $version =~ s/.*-(\d+)/$1/;
        # Only hosts at ESX 3.5.0 ( build 64607 ) or later could have CDP information
        unless ( $version >= 64607 ) {
            print "Host: ", $host->name, " < ESX 3.5.0, Skipping \n";
            next;
        }
        my $netMgr = Vim::get_view(mo_ref => $host->configManager->networkSystem);
        my @physicalNicHintInfo = $netMgr->QueryNetworkHint();
        foreach my $hints (@physicalNicHintInfo) {
            foreach my $hint ( @{$hints} ) {
                # NICs without CDP data have no connectedSwitchPort
                next unless defined $hint->connectedSwitchPort;
                my $device = $hint->device;
                my $port   = $hint->connectedSwitchPort->portId;
                my $switch = $hint->connectedSwitchPort->devId;
                my $vlan   = $hint->connectedSwitchPort->vlan;
                $vlan = '' unless defined $vlan;
                print "$device $switch $port $vlan\n";
            }
        }
        print "\n";
    }
    # Log out of this server before moving on to the next one
    Vim::logout();
}



I've been thinking about extending the script a bit, but haven't found time yet. Since CDP yields the name of the physical switch, it's possible to use this data to determine whether your vSwitch has redundant links that head to separate physical switches.
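
As a rough sketch of that redundancy check (in Python rather than Perl, and not part of the script above): assume you have already collected one record per uplink containing the vSwitch name, the vmnic, and the CDP devId of the physical switch. The vSwitch-to-vmnic mapping is an assumption here, since the script above doesn't gather it, and the function name and sample data are made up for illustration.

```python
from collections import defaultdict


def find_non_redundant_vswitches(uplinks):
    """Return the vSwitches whose uplinks reach fewer than two physical switches.

    uplinks: iterable of (vswitch, vmnic, switch_dev_id) tuples, where
    switch_dev_id is the CDP devId reported for that vmnic.
    """
    switches_by_vswitch = defaultdict(set)
    for vswitch, _vmnic, switch_dev_id in uplinks:
        switches_by_vswitch[vswitch].add(switch_dev_id)
    return sorted(
        name
        for name, switches in switches_by_vswitch.items()
        if len(switches) < 2
    )


if __name__ == "__main__":
    # Hypothetical sample data: vSwitch1 has both uplinks on the same switch
    sample = [
        ("vSwitch0", "vmnic0", "switch-a"),
        ("vSwitch0", "vmnic1", "switch-b"),
        ("vSwitch1", "vmnic2", "switch-a"),
        ("vSwitch1", "vmnic3", "switch-a"),
    ]
    for name in find_non_redundant_vswitches(sample):
        print(f"{name}: all uplinks land on one physical switch")
```

A vSwitch with a single uplink is flagged as well, which seems like the right call if the goal is to spot single points of failure.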

Posted by Dominic Rivera at Tuesday, June 23, 2009.


MAC address cloning

If you've spent much time converting systems from physical to virtual, you've inevitably run into some application that has been licensed and keyed off of the MAC address of the physical system. When you're moving that system from physical to virtual, you've got a decision to make: You can either contact the software vendor and get a new license; or you can clone the MAC address from the physical server to the virtual server.

By default, MAC addresses for virtual machines aren't static. If you go the new-license route, you need to assign the VM a static MAC address in the range 00:50:56:00:00:00 to 00:50:56:3f:ff:ff before you have the new license generated for you by the vendor. Once you have selected an appropriate address, edit the .vmx file for the VM and append the following lines:


ethernet0.addressType = "static"
ethernet0.address = "00:50:56:00:00:01"
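
If you hand out static addresses often, a small check like the following Python sketch can catch an address that falls outside the range quoted above before it ends up in a .vmx file. The function names are mine and purely illustrative; the only facts baked in are the range and the two .vmx keys shown above.

```python
import re

# Six colon-separated hex octets, already lowercased
MAC_PATTERN = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")


def is_valid_static_mac(mac):
    """True if mac lies in 00:50:56:00:00:00 through 00:50:56:3f:ff:ff."""
    mac = mac.strip().lower()
    if not MAC_PATTERN.match(mac):
        return False
    octets = mac.split(":")
    # The first three octets must be 00:50:56 and the fourth at most 0x3f
    return octets[:3] == ["00", "50", "56"] and int(octets[3], 16) <= 0x3F


def vmx_static_mac_lines(mac, nic="ethernet0"):
    """Return the two .vmx lines for a static MAC, or raise if out of range."""
    if not is_valid_static_mac(mac):
        raise ValueError(f"{mac} is outside the static MAC range")
    mac = mac.strip().lower()
    return [f'{nic}.addressType = "static"', f'{nic}.address = "{mac}"']


if __name__ == "__main__":
    for line in vmx_static_mac_lines("00:50:56:00:00:01"):
        print(line)
```

This only validates the format and range. Keeping track of which addresses you have already handed out is still up to you.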


There are two important things to note here. First, you need to manually keep track of the MAC addresses that you're assigning so you don't cause yourself headaches later. Second, when you give an address like this to a software vendor for licensing, make sure you know how their licensing model works with respect to virtual machines before you call. It will be apparent to them that this is a generated address, and that may lead them to question your licensing.

So, let's say you don't want to deal with the software vendor at all and you prefer to just clone the MAC address of the physical system. You can use the following method to specify *any* MAC address from within the guest OS:

Windows:
Right-click the network connection that you want to modify and select "Properties", then click the "Configure" button on the adapter, select the "Advanced" tab, and change the NetworkAddress property from "Not Present" to the value "0000DEADBEEF", or whatever address it is that you're cloning.




You'll want to do this on the VM's console as you're likely to lose your remote connection at this point.

Linux:

Stop your networking service and bring the interface down ( the service command I use here is RedHat-centric )

$> service network stop
$> ifconfig eth0 hw ether 00:00:DE:AD:BE:EF
$> service network start


By default, ESX sets the vSwitch security policies for Forged Transmits and MAC Address Changes to allow, but many administrators (myself included) set both to deny to assert some extra control over the virtual environment. If you change the MAC address from inside the guest OS, you will need to make sure that both MAC Address Changes and Forged Transmits are set to allow if you want packets to flow in and out of your VM correctly.

Just recently, I became aware of another method to change the MAC address of a VM to any address by editing the .vmx file:


ethernet0.addressType = "static"
ethernet0.Address = "00:00:DE:AD:BE:EF"
ethernet0.checkMACAddress = "false"


The last line in the snippet above prevents ESX from doing the validity check on the MAC address for the VM. I believe this to be the best option for almost all scenarios as it gives you a number of key benefits:

  • You don't have to contact your software vendor for a new key
  • You don't have to change the security on your vSwitch to accommodate the change
  • If you upgrade your virtual hardware, you won't have to update the virtual NIC with your custom MAC address again, as you would with the guest OS MAC change method.

    The final and most important lesson is that if you decide to clone the MAC address, you need to make sure that the physical machine that you took the MAC address from doesn't get "recycled" and hooked back up to the network. Rent a wood chipper if necessary.
    Posted by Dominic Rivera at Monday, June 22, 2009.


    Protecting your SAN LUNs during installation

    When you're installing a new ESX host, it's important to take precautions to make sure that you're not overwriting a SAN LUN where your precious VMs reside. To keep those LUNs safe, you have a few options which I'm going to list from most cautious to least cautious.

  • Physically disconnect the fibre that runs into the server
  • Unpresent the LUNs from the ESX host through zoning on your SP/Switches
  • Remove the drivers from the installation media ( check out this script if you're interested )
  • Run a script in %pre of the kickstart which unloads all of the fibre kernel modules ( Emulex and Qlogic ) so that you can't inadvertently install onto a SAN LUN. The following script was originally posted on VMTN by user timmp. You need to inject this script into the %pre section of your kickstart.

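    %pre
    #!/bin/sh

    # Remove the loaded HBA modules from the kernel so the installer can't see any SAN LUNs

    # Unload the QLogic ( qla* ) modules
    remove_qla(){
        for i in $(lsmod | grep qla | awk '{print $1}'); do
            echo "Will remove: $i" >> /dev/tty1
            rmmod "$i"
            sleep 1
        done
    }

    # Unload the Emulex ( lpfc* ) modules
    remove_lpfc(){
        for i in $(lsmod | grep lpfc | awk '{print $1}'); do
            echo "Will remove: $i" >> /dev/tty1
            rmmod "$i"
            sleep 1
        done
    }

    # The QLogic pass runs twice; a module that was still in use the first time gets a second attempt
    remove_qla
    sleep 2
    remove_qla
    remove_lpfc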
  • Use the ignoredisk kickstart command to mask your LUNs during installation

    ignoredisk --drives=sda,sdb,sdc,sdd,sde,sdf,sdg,sdh,sdi,sdj,sdk,etc,etc

  • If you're using Compaq/HP gear, the scsi devices are named differently, and if you specify them explicitly you *should* be safe.

    # Bootloader options
    bootloader --location=mbr --driveorder=cciss/c0d0
    # Disk Partitioning
    clearpart --all --initlabel --drives=cciss/c0d0
    part /boot --fstype ext3 --size 250 --ondisk cciss/c0d0 --asprimary
    part / --fstype ext3 --size 5192 --ondisk cciss/c0d0 --asprimary
    part swap --fstype swap --size 1600 --ondisk cciss/c0d0 --asprimary
    part /var/log --fstype ext3 --size 4096 --ondisk cciss/c0d0
    part /tmp --fstype ext3 --size 4096 --ondisk cciss/c0d0
    part /home --fstype ext3 --size 2048 --ondisk cciss/c0d0
    part None --fstype vmfs3 --size 8192 --ondisk cciss/c0d0 --grow
    part None --fstype vmkcore --size 100 --ondisk cciss/c0d0




    You can use one or more of those options to keep your data safe while you install/upgrade your ESX hosts. If you inadvertently destroy a VMFS volume, it's extremely hard to recover that data.
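
Typing out a long ignoredisk drive list by hand is tedious, so here is a Python sketch that generates one. It assumes the usual Linux naming sequence ( sda through sdz, then sdaa, sdab and so on ); the function names and the idea of keeping sda as the local install disk are illustrative assumptions, so check which device really is your local disk before using the output.

```python
from itertools import product
from string import ascii_lowercase


def sd_device_names(count):
    """Return the first `count` names in the sequence sda..sdz, sdaa, sdab, ..."""
    names = []
    length = 1
    while len(names) < count:
        for letters in product(ascii_lowercase, repeat=length):
            names.append("sd" + "".join(letters))
            if len(names) == count:
                break
        length += 1
    return names


def ignoredisk_line(count, keep=()):
    """Build an ignoredisk line masking the first `count` sd devices except `keep`."""
    drives = [name for name in sd_device_names(count) if name not in keep]
    return "ignoredisk --drives=" + ",".join(drives)


if __name__ == "__main__":
    # Mask sdb through sdp and leave sda ( assumed to be the local disk ) visible
    print(ignoredisk_line(16, keep=("sda",)))
```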
    Posted by Dominic Rivera at Friday, June 19, 2009.


    displayName != Name on disk

    On a couple of occasions, I've run into ESX hosts that have died and either didn't have HA enabled or didn't have HA working. So while I could see all of the guests on the ESX host through vCenter ( they show as disconnected ), I still had to go and manually register those VMs ( using the VIC datastore browser or the vmware-cmd command ). What really sucks in this situation is locating the VMs when you have a large number of datastores. What sucks even more is when the VM's display name doesn't match the name of the folder on the filesystem. You can still find the VMs by glancing at the .vmx for the VM and looking for the displayName attribute, but it's easier to keep on top of things by having the VMs named correctly on the filesystem. I wrote a quick script to compare the displayName to the name of the folder on the filesystem and then spit out the ones that don't match. To correct a mismatch, you can use Storage VMotion or a cold migration to move the VM from one datastore to another. If neither of those options is available, you can do a manual rename, though I don't encourage anyone to do that because there are a number of files that you need to rename to make that work. Script below:



    #!/usr/bin/perl -w
    use strict;
    use warnings;
    use VMware::VIRuntime;

    my $username = 'vi_username';
    my $password = 'vi_password';
    my $url = 'https://vcenter.yourdomain.com/sdk';

    Vim::login( service_url => $url, user_name => $username, password => $password );

    my $vms = Vim::find_entity_views(
        view_type => 'VirtualMachine',
    );

    foreach my $vm ( @$vms ){
        my $path = $vm->summary->config->vmPathName;
        my $name = $vm->name;

        # example syntax for $path: [datastore] folder/name.vmx
        # Skip paths that don't match so a stale $1 is never compared
        next unless $path =~ m/\[.+\]\s(.+)\//;
        my $folder = $1;

        # Print out the VMs that have a folder that doesn't match their displayName
        unless ( $name eq $folder ){
            print "Mismatch: $name -> $folder\n";
        }
    }

    Vim::logout();

    Posted by Dominic Rivera at Wednesday, June 17, 2009.


    Import / Export Custom Attributes

    In a post a few days ago on how to export permissions and DRS affinity rules, one reader wrote and asked me whether the same could be done for custom attributes. As it turns out, you can, in just a few lines of PowerShell. To export the attributes, select the Virtual Machines tab, deselect all but the attributes that you want to export ( keep the Name column, since the import below looks each VM up by name ), then select File -> Export and choose a format of .csv. To import those values, we'll use that same export file and a few lines of PowerShell. Assuming we name our attributes "Business Line" and "Maintenance Window", the import would look like the following:


    foreach ($vm in (import-csv "export.csv")){
        $myvm = get-vm $vm.Name
        set-customfield -entity $myvm -name "Business Line" -value $vm."Business Line"
        set-customfield -entity $myvm -name "Maintenance Window" -value $vm."Maintenance Window"
    }


    If you have your system inventory information in a separate database, this could be very useful to keep your vCenter up to date with your current system inventory.

    Posted by Dominic Rivera at Wednesday, June 17, 2009.


    SLES 10 - Upgrading to the LSI Logic SCSI Adapter

    It turns out that in my environment, a number of VMs were 'recycled', meaning that the group that runs them took an old 32-bit SUSE system, wiped the disk, and installed SLES 10 x64. An interesting artifact of that setup is that those VMs are still using the older BusLogic virtual SCSI adapter and vmxnet driver instead of the LSI Logic adapter and E1000 driver that would be provisioned to a VM were it created new today. This problem reared its head when I was upgrading a cluster from ESXi 3.5.0 Update 4 to ESXi 4.0. It turns out that ESXi 4.0 no longer supports the BusLogic SCSI adapter with an OS type of SLES 10 x64. The VM would migrate to the new host ( with 2 warnings, one about the SCSI adapter and one about the network adapter ), but the VMotion would ultimately fail the VM back onto the old host.

    By no means am I unfamiliar with Linux, but most of my experience has been with RedHat-based distros. When I cracked open modprobe.conf on the SLES box, the drivers all referenced /bin/true, and it took me nearly an hour to get the VM switched over to the newer LSI Logic controller so I could finish upgrading the cluster. Below is a snippet on how to migrate from the BusLogic adapter to the LSI Logic adapter in SLES 10.



    First, boot the VM with the old BusLogic controller

    Then, load the LSI Logic modules


    bash:# modprobe mptbase
    bash:# modprobe mptscsih
    bash:# modprobe mptspi


    Next, edit /etc/sysconfig/kernel, and change the line "INITRD_MODULES=" from

    INITRD_MODULES="Buslogic "

    to

    INITRD_MODULES="Buslogic mptbase mptscsih mptspi "

    Finally, generate a new initrd

    bash:# mkinitrd


    Then shut down the VM, change the SCSI adapter type from BusLogic to LSI Logic, and power the VM back on.
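
If you have more than a handful of these VMs to convert, the INITRD_MODULES edit is the step most worth scripting. The Python sketch below rewrites the contents of /etc/sysconfig/kernel as text; the function name is hypothetical, it only appends modules that are missing, and it leaves running mkinitrd and swapping the adapter to you.

```python
import re


def add_initrd_modules(kernel_config, modules=("mptbase", "mptscsih", "mptspi")):
    """Return kernel_config with `modules` appended to INITRD_MODULES if missing."""

    def rewrite(match):
        current = match.group(1).split()
        for module in modules:
            if module not in current:
                current.append(module)
        return 'INITRD_MODULES="' + " ".join(current) + '"'

    new_text, count = re.subn(
        r'^INITRD_MODULES="([^"]*)"', rewrite, kernel_config, flags=re.MULTILINE
    )
    if count == 0:
        raise ValueError("no INITRD_MODULES line found")
    return new_text


if __name__ == "__main__":
    before = 'INITRD_MODULES="Buslogic "\n'
    print(add_initrd_modules(before), end="")
```

Running it a second time on its own output changes nothing, so it is safe to re-run if you lose track of which VMs you have already touched.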

    Posted by Dominic Rivera at Wednesday, June 17, 2009.

