SRM 5.8 Plugin does not display in the vSphere Web Client

Today, while logging into the vSphere Web Client to document the SRM testing process, I noticed that the SRM plugin did not show on the home screen.  However, when I logged into the protected site, it was there.  Here are my troubleshooting steps:

1. Logged into the SRM server and noticed the service was not running.  I started the service and tried logging into the web client, but the plugin was still not showing.

2. I then rebooted the SRM box and logged back into the web client.  Still no SRM plugin.

3. I restarted the vCenter service and web management services on the vCenter box. Still no SRM plugin.

4. Finally, I restarted the web client service on the vCenter box. Logged into the web client, and voila! Plugin was showing.

The root cause is that the SRM service must be running.  If it is not, start the service and then restart the web client service on the vCenter server.

Update Manager Error: sysimage.fault.SSLCertificateError

I had to redeploy a vCenter Server Appliance recently and got an error when opening vCenter: sysimage.fault.SSLCertificateError.

This is caused by the certificate on the vCenter Server changing, which means Update Manager needs to be re-registered with the vCenter Server.  Luckily, VMware provides a utility to do just that.  Go to your vCenter Update Manager server (the appliance does not include this, so it will typically be installed on a separate Windows server).

Go to “C:\Program Files (x86)\VMware\Infrastructure\Update Manager” and open the file “VMwareUpdateManagerUtility”.

Next, enter the credentials for your vCenter Server and hit Login.

Once the window opens, you’ll want to click “Re-register to vCenter Server”. This brings up another login screen.  You can use the same credentials here that you used to open the utility.

Finally, click Apply and you will receive a notification that you need to restart the VMware vSphere Update Manager service in order for the settings to take effect.

Restart the service and open the vSphere Client again.  The error will be gone and you’ll be able to use Update Manager again.

ESXi 5.1 Host out of sync with VDS

A recent issue I was having was that our ESXi 5.1 hosts would go “out of sync” with the VDS.  The only fix that would work was rebooting the host.  After digging into the log file, I discovered that the host was failing to get state information from the VDS. The entries are below:

value = "Failed to get DVS state from vmkernel Status (bad0014)= Out of memory",
             }
          ],
          message = "Operation failed, diagnostics report: Failed to get DVS state from vmkernel Status (bad0014)= Out of memory"

The issue is a bug in the version of 5.1 we were running (Update 2 at the time): a memory leak on the host when VMs use E1000 NICs.  Because these VMs were created a long time ago, they defaulted to the E1000.  The fix is to update to the latest build of ESXi, which includes a fix for the issue.  Also, don’t use E1000 NICs; always go with VMXNET3.  Problem solved!
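If you want to check other hosts for the same symptom, the log signature quoted above is easy to search for. The following is only a minimal Python sketch under my own assumptions: the function name `find_dvs_oom_lines` is hypothetical, and it works on lines you have already read from whichever host log you are checking rather than opening any particular file.

```python
# Marker text copied from the log entries quoted above.
DVS_STATE_MARKER = "Failed to get DVS state from vmkernel"
OUT_OF_MEMORY_MARKER = "Out of memory"


def find_dvs_oom_lines(lines):
    """Return (line_number, text) for every line showing the DVS out-of-memory failure."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Both fragments must appear on the same line to count as a match.
        if DVS_STATE_MARKER in line and OUT_OF_MEMORY_MARKER in line:
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Sample lines for illustration only; the first mirrors the excerpt above.
    sample = [
        'value = "Failed to get DVS state from vmkernel Status (bad0014)= Out of memory",',
        "some unrelated log line",
    ]
    for number, text in find_dvs_oom_lines(sample):
        print(number, text)
```

A match only tells you the host is showing the symptom; the actual fix is still the ESXi update and the move to VMXNET3.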

Lost Path Redundancy to Storage Device

After installing 3 new hosts, I kept getting errors for Storage Connectivity stating “Lost path redundancy to storage device naa…….”.  We had 2 fibre cards and one of the paths was being marked as down.  I spent a couple weeks troubleshooting and trying different path selection techniques.  Still, we would randomly get alerts that the redundant path has gone down.  The only fix was to reboot the host, as not even a rescan would bring the path back up.

So after some trial and error, I found a solution.  The RCA isn’t necessarily complete yet, but I believe it was a problem with the fibre switch having outdated firmware while we were using new fibre cards in our hosts.  With the path selection policy set to Fixed, the host would randomly pick an HBA to use for each datastore.  Some datastores would use path 2 and some would use path 4.

The solution I came up with was to manually set the preferred path on each datastore (we have about 40, so it was no easy task).  You go into your host configuration, choose Storage, pick a datastore and go into Properties.  Inside this window, select Manage Paths from the bottom right and you should see your HBAs listed.  There is a column marked Preferred with an asterisk showing which HBA to prefer for the datastore (see the image below).  I went through and manually set the preferred path to be hba2 instead of letting VMware pick the path. When set manually, the path selection is also persistent across reboots.

[Image: storage path selection]

Since manually setting the preferred path, the hosts have been stable and we have not gotten any more errors about path redundancy.  This is pretty much a band-aid fix, but at least we are not rebooting hosts 2-3 times per week.

My VCP5-DCV Experience

I recently took and passed my VCP5-DCV (510) exam.  To be honest, it wasn’t what I was expecting, and in hindsight there were ways I could have prepared better.  Hence this post: to share my experience with those preparing for or thinking about taking the exam.

My first tip would be to have confidence going into the exam.  If you’re thinking about or preparing to take the exam, then you already have experience with vCenter and virtualization.  So walk into the test center like a boss and own that exam.

Second, take as many practice exams as you can find.  I found a few that helped.  A great site I found was here (mwpreston.net).  VMware also has practice exams on their mylearn site. One tip for the mylearn site: if you get all the questions right, you can’t take the practice exam again.  So purposely get some questions wrong so you can continue using it.

Third, if you can, build a lab and play with things you normally don’t use.  Add some virtual iSCSI adapters if you don’t use iSCSI, etc.  Be prepared and expect to see things on the test that you have never used before and use your best judgement to answer the questions.

Fourth and finally, read as much as you can.  I’m sure we’ve all been in the situation of cramming for an exam: the night before, trying to read the entire textbook to suck as much knowledge out of it as possible.  Instead, I would suggest giving yourself ample time to prepare and get comfortable with topics on vCenter, vDS and Standard Switches, and storage adapters.

With all that said, the first tip is the most important.  Have confidence that you know this stuff and that you have prepared.  Don’t second-guess yourself when answering the questions; go with your gut feeling.  Take the practice exams, study the basics, and pass!

Best of luck and if you have any other questions, drop me a line.

Use Windows to Send an SNMP Trap

As a follow-up to my post last week, I learned that you can use Windows to send an SNMP trap.  The functionality is built into Windows, as long as you have enabled SNMP via Computer Management (in Windows 2008 R2 and later).  Once it is enabled in Features, you can open the menu to configure SNMP by running the following command from the Run menu:

evntwin

Once open, you can configure the trap that you require by utilizing the GUI that opens with evntwin.

Here you will add the event that you want sent to your SNMP server.  This is what is used as the trap definition.  You can then open the properties for the trap definition and get the Enterprise OID, which can be used in your application, such as SolarWinds, to view the trap.

Once that is set up, you now need to tell the SNMP Service your community string and where you’re going to send the traps that you’ve created.  In the Services panel, you should have one for SNMP. Right-click and choose Properties.  You’ll see the typical service tabs plus some that are specific to SNMP.

First, go to the agent tab and fill in the details about your organization.

Next, you’ll want to go into the traps tab and fill in your community name and traps destination.

Finally, the security tab is where you’ll give rights to the community and trap sender.  localhost is in there by default; you’ll also want to add the same server that you’re sending traps to from the previous tab.

Once all that is complete, you’re all set.  I typically restart the service so it picks up the new settings. It’s a rather simple process to get going, but figuring it out was the tough part.

As always, hope it helped and feel free to leave a comment or question.

Use PowerShell to Write to Event Log

One of my first tasks in my new position was to use PowerShell to write to the event log.  The purpose of this was to monitor locked out users in an application and forward them to our SolarWinds application via an SNMP trap.  The application itself would write a .csv file to a share containing user information for those that are locked out.  The first part of this was to get PowerShell to read the file and put the details into the Event Log so they could later be sent to SolarWinds as a trap.

The script is below, with some helpful comments.  The placeholder values (The source name, File Path, and File Name) are what you will need to change according to your environment:

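# Write the contents of the file to the event log.
# The source must already be registered for this log (for example with New-EventLog).
# Out-String joins the lines of the file into the single string that -Message expects.
Write-EventLog -LogName Application -Source "The source name" -EventId 1234 -EntryType Error -Message (Get-Content "File Path" | Out-String)

# Rename the file with a date/time stamp
$date = Get-Date -UFormat "%Y%m%d@%H%M%S"

## These will become parameters in our function later
$locationPath = "File Path"   # folder only, ending in a backslash, since it is concatenated below
$fileName = "File Name"
$extension = ".csv"

$old = $locationPath + $fileName + $extension
$new = $fileName + "_" + $date + $extension   # -NewName takes a file name, not a full path
Rename-Item -Path $old -NewName $new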

The top part of the script gets the content from the file and writes it to the event log that you specify.  The bottom part can be adapted to whatever you want to do; as written, it renames the file that was read, appending the date and time to the file name.  It’s a rather basic script but a good starting point for doing more with PowerShell.  The script was also added to a scheduled task in order to get the data into the event log.
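If the naming scheme is easier to follow outside PowerShell, here is the same idea as a small Python sketch. It is an illustration only, not part of the original script: the helper name `stamped_name` and the sample file name `lockouts` are made up for this example.

```python
from datetime import datetime


def stamped_name(file_name, extension, when):
    """Build '<file_name>_<YYYYMMDD@HHMMSS><extension>', the pattern the script uses."""
    return file_name + "_" + when.strftime("%Y%m%d@%H%M%S") + extension


if __name__ == "__main__":
    # A fixed timestamp is used here so the result is predictable.
    print(stamped_name("lockouts", ".csv", datetime(2015, 3, 9, 14, 5, 30)))
    # prints: lockouts_20150309@140530.csv
```

Stamping down to the second keeps each processed file unique, so a scheduled run never collides with the file from the previous run.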

As a follow up, I’ll write another post describing how to use Windows to send an SNMP trap. The functionality is built into Windows and can be used to send traps for other types of events.

New Job, New Quirks

As some of you know, I’ve recently accepted a new position as a Senior System Engineer, mostly focusing on virtualization and networking.  The genre of the blog will stay the same but I expect it to also expand to some different technologies that I’ll be working with now including Citrix and some new Compellent storage systems.

And as I encounter issues and document them, I’ll continue to share them with the web as well.

Cheers!

Unable to remove a datastore from vCenter Server Inventory

I recently had an issue where I was unable to remove a datastore from the vCenter Server Inventory.  The datastore was grayed out, and right-clicking it gave no options.  After some digging and some research in SQL, I found a way to do this manually in the vCenter database.  Every datastore is given a unique ID that can be found and removed inside the database.

Warning: Always make a SQL backup before attempting any manual database changes.  You never know when things might break and you need to restore.

So here we go:

  1. Stop the vCenter Server Service
  2. Open SQL Management Studio
  3. Run the following against your vCenter Server database (This will give you the datastore ID):

select ID from VPX_ENTITY where name = 'datastore_name'

  4. Now we have the ID and can remove it from the database
  5. Run the following 3 queries individually, replacing <ID> with the numeric ID returned by the previous query (do not leave it as the literal text ID, since a condition like ID=ID matches every row):

delete from VPX_DS_ASSIGNMENT where DS_ID=<ID>;
delete from VPX_VM_DS_SPACE where DS_ID=<ID>;
delete from VPX_DATASTORE where ID=<ID>;

  6. Finally, run the following:

delete from VPX_ENTITY where ID=<ID>;

If you want to verify that everything went correctly, you can run the following (each query should return no rows):

select * from VPX_DS_ASSIGNMENT where DS_ID=<ID>;
select * from VPX_VM_DS_SPACE where DS_ID=<ID>;
select * from VPX_DATASTORE where ID=<ID>;
select * from VPX_ENTITY where ID=<ID>;

Now you’ve removed the datastore from the database and can start the vCenter Server Service again. If you don’t see that it has been removed, a reboot may help. I rebooted my server just to be on the safe side.
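Because the order of the deletes matters and the ID has to be substituted into each one, it can help to generate the statements rather than type them by hand. The following is a minimal Python sketch, assuming only the table and column names used above; the function name `datastore_cleanup_statements` and the sample ID 1234 are hypothetical. It only builds the text of the statements and does not connect to any database.

```python
# Table/column pairs in the order used above: child tables first, VPX_ENTITY last.
CLEANUP_ORDER = [
    ("VPX_DS_ASSIGNMENT", "DS_ID"),
    ("VPX_VM_DS_SPACE", "DS_ID"),
    ("VPX_DATASTORE", "ID"),
    ("VPX_ENTITY", "ID"),
]


def datastore_cleanup_statements(entity_id):
    """Return the four DELETE statements for one datastore ID, in order."""
    # Refuse anything that is not a plain integer, so a leftover placeholder
    # such as the text "ID" can never end up in a WHERE clause.
    if isinstance(entity_id, bool) or not isinstance(entity_id, int):
        raise ValueError("entity_id must be an integer")
    return [
        "delete from {} where {}={};".format(table, column, entity_id)
        for table, column in CLEANUP_ORDER
    ]


if __name__ == "__main__":
    # 1234 is a made-up ID; use the value returned by the select on VPX_ENTITY.
    for statement in datastore_cleanup_statements(1234):
        print(statement)
```

Review the printed statements before pasting them into SQL Management Studio, and take the SQL backup first, exactly as with the manual approach.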

You can check out this VMware KB for more info.

ESXi Hosts Disconnecting Randomly

A recent issue we experienced was hosts disconnecting from vCenter and reconnecting.  A host would drop and then randomly come back, and this went on for an hour or more.  The VMs never saw any issues, nor was there any type of outage.  It was just that vCenter could no longer see the host.

After quite a bit of troubleshooting, I started digging around in the vCenter Server Settings (Administration > vCenter Server Settings).  In this menu, there is a tab for Runtime settings.  I noticed that we only had the vCenter Server Name filled in and not the vCenter Server Managed IP. The window looks as follows:

[Image: vCenter Runtime Settings]

After completing all the fields in this window, the hosts magically all reconnected and have not dropped again.  This is because the hosts use these settings to check in with the vCenter box; the settings tell each host who it’s being managed by.  As you can guess, if the host doesn’t know who’s managing it, it doesn’t know who to check in with.

The more curious part was that this field had never been filled out, yet the disconnects didn’t start immediately.  That made troubleshooting more difficult and made us all panic as we started getting numerous alerts for hosts dropping.

As a best practice, even if you only have 1 vCenter server, fill out all of these fields and ensure they are correct, especially if you want the hosts to check in with the correct vCenter server and you don’t want the heart attack of seeing numerous hosts suddenly disconnecting from vCenter.