NSX Alarm Check Script (Using the NSX REST API)

In my previous post I created a script that gets the last backup status from NSX Manager, so that I could quickly check multiple NSX Managers. Today I needed to bring the alarms raised on all of these NSX Managers into one single location, which meant creating an NSX Alarm Check Script.
‘But surely the NSX Management Pack would allow you to do this?’ you may ask. Unfortunately it is missing some of the alarms which get raised on the NSX Managers, such as expiring passwords. That one is an annoyance if you do not notice until after the password has expired and you are having LDAP issues.

Now luckily this time, we CAN use the NSX REST API to get these details, and I had a script lying around which could provide a skeleton for this. You can find that script here: NSX Backup Check Script

In order to adapt this script to use REST, we need to change Invoke-WebRequest to Invoke-RestMethod.
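As a rough sketch of why that swap helps, assuming the response body is JSON: Invoke-WebRequest returns the raw text, which has to be parsed by hand, while Invoke-RestMethod returns the parsed object directly. The sample body below is a trimmed, illustrative stand-in for a real response, so the snippet runs without an NSX Manager.

```powershell
# Illustrative stand-in for the JSON body an NSX Manager might return (values are made up).
$body = '{"result_count":1,"results":[{"id":"xxxx","status":"OPEN","severity":"WARNING"}]}'

# With Invoke-WebRequest the body arrives as text in .Content and needs an explicit parse:
#   $response = Invoke-WebRequest -Uri "https://$nsxmgr/api/v1/alarms" -Headers $header -UseBasicParsing
#   $parsed   = $response.Content | ConvertFrom-Json
$parsed = $body | ConvertFrom-Json

# With Invoke-RestMethod the same parse happens for you when the response is JSON:
#   $result = Invoke-RestMethod -Uri "https://$nsxmgr/api/v1/alarms" -Headers $header -Method Get
# Either way the alarm fields are then available as properties.
$parsed.results[0].severity
```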

Interrogating NSX REST API

I used the documentation from VMware {code} to find this API and how to handle the results. Luckily this is a lot more detailed than the web API. You can find the NSX API details here:

https://developer.vmware.com/apis/547/nsx-t

So we want to request /api/v1/alarms in order to return a list of all alarms on the NSX Managers.

$result = Invoke-RestMethod -Uri https://$nsxmgr/api/v1/alarms -Headers $Header -Method 'GET' -UseBasicParsing

Handling the Outputs

Running this command will give a response similar to this:

{
  "result_count": 4,
  "results": [
      {
        "id": "xxxx",
        "status": "OPEN",
        "feature_name": "manager_health",
        "event_type": "manager_cpu_usage_high",
        "feature_display_name": "Manager Health",
        "event_type_display_name": "CPU Usage High",
        "node_id": "xxxx",
        "last_reported_time": 1551994806,
        "description": |
          "The CPU usage for the manager node identified by 
           appears to be\nrising.",
        "recommended_action": |
          "Use the top command to check which processes have the most CPU
           usages, and\nthen check \/var\/log\/syslog and these processes'
           local logs to see if there\nare any outstanding errors to be
           resolved.",
        "node_resource_type": "ClusterNodeConfig",
        "severity": "WARNING",
        "entity_resource_type": "ClusterNodeConfig",
      },
      ...
  ]
}

From this output I wanted to pull out the severity, status, alarm description and the node which was impacted, so I pulled the results into an array and assigned the items to variables.

$nsxAlarms = $result.results
foreach ($nsxAlarm in $nsxAlarms) {
    # _create_time is divided by 1000 to turn it into seconds before adding it to 1970-01-01
    $nsxAlarmCreated = (Get-Date 01.01.1970).AddSeconds([int]($nsxAlarm._create_time/1000)).ToString("yyyy/MM/dd HH:mm:ss")
    # Time the log entry is written, as opposed to the time the alarm was created
    $timestamp = (Get-Date).ToString("yyyy/MM/dd HH:mm:ss")
    $nsxAlarmSeverity = $nsxAlarm.severity
    $nsxAlarmStatus = $nsxAlarm.status
    $nsxAlarmNode_display_name = $nsxAlarm.node_display_name
    $nsxAlarmDescription = $nsxAlarm.description
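The epoch conversion can also be written with .NET's DateTimeOffset, which avoids the [int] cast. This is a minimal sketch, assuming .NET Framework 4.6 or later and that _create_time is in epoch milliseconds (which the division by 1000 suggests). ConvertFrom-EpochMilliseconds is a hypothetical helper name, not part of the original script, and the result is expressed in UTC.

```powershell
# Hypothetical helper: turn an epoch-millisecond value such as _create_time
# into a "yyyy/MM/dd HH:mm:ss" string in UTC.
function ConvertFrom-EpochMilliseconds {
    param ([long]$EpochMs)

    $utc = [DateTimeOffset]::FromUnixTimeMilliseconds($EpochMs).UtcDateTime
    # InvariantCulture keeps "/" and ":" literal regardless of regional settings.
    $utc.ToString("yyyy/MM/dd HH:mm:ss", [System.Globalization.CultureInfo]::InvariantCulture)
}

# Illustrative value only, not taken from a real alarm.
$created = ConvertFrom-EpochMilliseconds -EpochMs 1551994806000
$created   # 2019/03/07 21:40:06
```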

From here I wanted to only include any alarms which had not been marked acknowledged or resolved to avoid constantly reporting a condition which was known about.

if($nsxAlarm.status -ne "ACKNOWLEDGED" -and $nsxAlarm.status -ne "RESOLVED"){ 
    Add-Content -Path $exportpath -Value "$timestamp [$nsxAlarmSeverity] $NSXMGR - Alarm Created $nsxAlarmCreated Status $nsxAlarmStatus Affected Node $nsxAlarmNode_display_name Description  $nsxAlarmDescription"
}
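The same status check can be expressed as a pipeline filter, which is handy if you want the remaining alarms as a collection rather than writing each one out inside the loop. This is only a sketch: the alarm objects below are illustrative stand-ins for $result.results, and the status and severity values are examples rather than a complete list.

```powershell
# Illustrative alarm objects; in the real script these come from $result.results.
$sampleAlarms = @(
    [pscustomobject]@{ status = 'OPEN';         severity = 'WARNING'  }
    [pscustomobject]@{ status = 'ACKNOWLEDGED'; severity = 'HIGH'     }
    [pscustomobject]@{ status = 'RESOLVED';     severity = 'CRITICAL' }
    [pscustomobject]@{ status = 'SUPPRESSED';   severity = 'MEDIUM'   }
)

# Keep everything that has not been acknowledged or resolved (-notin needs PowerShell 3.0+).
$activeAlarms = @($sampleAlarms | Where-Object { $_.status -notin 'ACKNOWLEDGED', 'RESOLVED' })

$activeAlarms.Count   # 2
```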

It is also possible to skip this check in the script and have the API filter by status instead, using the following request. However, I wanted to pull in all alarms for my specific use case.

GET /api/v1/alarms?status=OPEN
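If you do want the API to do the filtering, the only change to the script is the URI passed to Invoke-RestMethod. A small sketch of building that URI follows; Get-NsxAlarmUri and the host name nsx01.example.com are hypothetical and used purely for illustration.

```powershell
# Hypothetical helper: build the alarms URI, adding the status filter only when one is supplied.
function Get-NsxAlarmUri {
    param (
        [string]$NsxManager,
        [string]$Status
    )

    $uri = "https://$NsxManager/api/v1/alarms"
    if ($Status) {
        $uri += "?status=$Status"
    }
    $uri
}

$filteredUri = Get-NsxAlarmUri -NsxManager 'nsx01.example.com' -Status 'OPEN'
$filteredUri   # https://nsx01.example.com/api/v1/alarms?status=OPEN

# The result would then be used in place of the hard-coded URI, for example:
#   $result = Invoke-RestMethod -Uri $filteredUri -Headers $header -Method Get
```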

As per the previous script, this was wrapped in a try/catch, and the catch block calls a function which tests whether the host is up. A full explanation can be found in the post about that script here: NSX Backup Check Script

The Final NSX Alarm Check Script

param ($nsxmgr)

$curDir = &{$MyInvocation.PSScriptRoot}
$exportpath = "$curDir\..\Logs\NSXAlarmCheck.log"
$credPath = "$curDir\$nsxmgr.cred"
$scriptName = &{$MyInvocation.ScriptName}

add-type @"
   using System.Net;
   using System.Security.Cryptography.X509Certificates;
   public class TrustAllCertsPolicy : ICertificatePolicy {
      public bool CheckValidationResult(
      ServicePoint srvPoint, X509Certificate certificate,
      WebRequest request, int certificateProblem) {
      return true;
   }
}
"@
[System.Net.ServicePointManager]::CertificatePolicy = New-Object TrustAllCertsPolicy

function catchFailure {
    $timestamp = (Get-Date).ToString("yyyy/MM/dd HH:mm:ss")
    if (Test-Connection -BufferSize 32 -Count 1 -ComputerName $nsxmgr -Quiet) {
        Add-Content -Path $exportpath -Value "$timestamp [ERROR] $NSXMGR - $_"
    }
    else {
        Add-Content -Path $exportpath -Value "$timestamp [ERROR] $NSXMGR - Host Not Found"
    }
    exit
}

if (!$nsxmgr) {
    Write-Host "please provide parameter 'nsxmgr' in the format '$scriptName -nsxmgr [FQDN of NSX Manager]'"
    exit
}

if (-Not(Test-Path -Path  $credPath)) {
    $username = Read-Host "Enter username for NSX Manager" 
    $pass = Read-Host "Enter password" -AsSecureString 
    $password = [System.Runtime.InteropServices.Marshal]::PtrToStringAuto([System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($pass))
    $userpass  = $username + ":" + $password

    $bytes= [System.Text.Encoding]::UTF8.GetBytes($userpass)
    $encodedlogin=[Convert]::ToBase64String($bytes)
    
    Set-Content -Path $credPath -Value $encodedlogin
}

$encodedlogin = Get-Content -Path $credPath

$authheader = "Basic " + $encodedlogin
$header = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$header.Add("Authorization",$authheader)

try {
    $result = Invoke-RestMethod -Uri https://$nsxmgr/api/v1/alarms -Headers $Header -Method 'GET' -UseBasicParsing

    $nsxAlarms = $result.results
    foreach ($nsxAlarm in $nsxAlarms) {
        $nsxAlarmCreated = (Get-Date 01.01.1970).AddSeconds([int]($nsxAlarm._create_time/1000)).ToString("yyyy/MM/dd HH:mm:ss")
        $timestamp = (Get-Date).ToString("yyyy/MM/dd HH:mm:ss")
        $nsxAlarmSeverity = $nsxAlarm.severity
        $nsxAlarmStatus = $nsxAlarm.status
        $nsxAlarmNode_display_name = $nsxAlarm.node_display_name
        $nsxAlarmDescription = $nsxAlarm.description

        if ($nsxAlarm.status -ne "ACKNOWLEDGED" -and $nsxAlarm.status -ne "RESOLVED") {
            Add-Content -Path $exportpath -Value "$timestamp [$nsxAlarmSeverity] $NSXMGR - Alarm Created $nsxAlarmCreated Status $nsxAlarmStatus Affected Node $nsxAlarmNode_display_name Description  $nsxAlarmDescription"
        }
    }
}
catch { catchFailure }

Overview

The final script above can be used as a skeleton for any other API called with Invoke-RestMethod, and it can just as easily be adapted for a web API. I will be following up this post with further updates that adapt the script to use PowerCLI, which requires a different credential store.