Add robust monitoring of Azure Virtual Desktop using Azure Monitor alerts

Azure Virtual Desktop has become a hot topic. COVID forced the adoption of remote working on a scale nothing else has in recent memory. Here's how to monitor the environment so you can stay on top of access and desktop-experience issues before your users tell you about them.


Enable Azure Monitor for your session hosts and install the agent. Update the list of event logs it captures.
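On a workspace that still uses the legacy Log Analytics agent, the FSLogix log can be added to the captured event logs from PowerShell. This is a minimal sketch, assuming the Az.OperationalInsights module and a signed-in session. The function name and the data source name are made up for illustration, and hosts on the newer Azure Monitor Agent would use a data collection rule instead.

```powershell
# Sketch only: adds the FSLogix operational log to a workspace's collected
# Windows event logs (legacy agent configuration).
# Add-FslogixEventCollection and 'FSLogix-Apps-Operational' are hypothetical names.
function Add-FslogixEventCollection {
    param(
        [Parameter(Mandatory)][string] $ResourceGroupName,
        [Parameter(Mandatory)][string] $WorkspaceName
    )
    $dataSource = @{
        ResourceGroupName  = $ResourceGroupName
        WorkspaceName      = $WorkspaceName
        Name               = 'FSLogix-Apps-Operational'
        EventLogName       = 'Microsoft-FSLogix-Apps/Operational'
        CollectErrors      = $true
        CollectWarnings    = $true
        CollectInformation = $true
    }
    New-AzOperationalInsightsWindowsEventDataSource @dataSource
}
```

Nothing runs until the function is called, for example `Add-FslogixEventCollection -ResourceGroupName '(rgName)' -WorkspaceName '(LAWName)'`.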


Conduct Log queries using Kusto Query Language to extract useful information.

Set alerts based on query results.


Monitor FSLogix using the Event signal. FSLogix is the recommended profile solution for Azure Virtual Desktop: it stores each user's profile in a container on a file share and attaches it to the session host at sign-in. There are lots of considerations when using FSLogix, and I'm only going to touch on one event ID. Many different types of errors are all logged under the same EventID: 26.

Condition Signal: Event
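Because so many different errors share EventID 26, it helps to see what is actually being logged before tuning the alert. This is a rough sketch, assuming the FSLogix log is already collected into the workspace's Event table, the Az.OperationalInsights module is installed, and you are signed in. The function name is hypothetical and the workspace ID is the workspace's GUID.

```powershell
# Sketch: group FSLogix EventID 26 entries by host and message text.
$fslogixQuery = @'
Event
| where EventLog == "Microsoft-FSLogix-Apps/Operational" and EventID == 26
| summarize Count = count() by Computer, RenderedDescription
| order by Count desc
'@

# Get-FslogixEvent26 is a made-up helper name; it only runs the query above.
function Get-FslogixEvent26 {
    param([Parameter(Mandatory)][string] $WorkspaceId)
    (Invoke-AzOperationalInsightsQuery -WorkspaceId $WorkspaceId -Query $fslogixQuery).Results
}
```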



Monitor for "No available resources" error.

WVDErrors
| where CodeSymbolic == "ConnectionFailedNoHealthyRdshAvailable" and Message contains "Could not find any SessionHost available in specified pool"



Monitor for Failed Connections.

WVDConnections
| where State =~ "Started"
| extend CState = iff(SessionHostOSVersion == "<>", "Failure", "Success")
| where CState =~ "Failure"
| extend Multi = split(_ResourceId, "/")
| project ResourceAlias, ResourceGroup = Multi[4], HostPool = Multi[8], SessionHostName, UserName, CState, CorrelationId, TimeGenerated
| join kind= leftouter (WVDErrors) on CorrelationId
| extend DurationFromLogon=datetime_diff("Second",TimeGenerated1,TimeGenerated)
| project  TimeStamp=TimeGenerated, DurationFromLogon, UserName, ResourceAlias, SessionHost=SessionHostName, Source, CodeSymbolic, ErrorMessage=Message, ErrorCode=Code, ErrorSource=Source ,ServiceError, CorrelationId
| order by TimeStamp desc
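The Multi[4] and Multi[8] indexes in the query come from splitting the resource ID on "/". The same split can be reproduced locally to see why those positions hold the resource group and host pool. The resource ID below is a made-up example shaped like the _ResourceId column.

```powershell
# Hypothetical host pool resource ID (placeholder subscription, group and pool names).
$resourceId = '/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg-avd/providers/microsoft.desktopvirtualization/hostpools/hp-prod'

$parts = $resourceId -split '/'
# The leading '/' makes element 0 an empty string, so the segments are:
# 1 subscriptions, 2 <subId>, 3 resourcegroups, 4 <rgName>, 5 providers,
# 6 <provider namespace>, 7 hostpools, 8 <host pool name>
$resourceGroup = $parts[4]
$hostPool      = $parts[8]

"ResourceGroup=$resourceGroup HostPool=$hostPool"
```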



Monitor Session hosts available Memory when it drops to under 1GB.

Perf
| where ObjectName == "Memory"
| where CounterName == "Available MBytes"
| where CounterValue <= 1024



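Monitor session host memory as % Committed Bytes In Use when it's above 80%. This counter is the ratio of committed memory to the commit limit (physical RAM plus page files), so a high value means the host is running short of memory and leaning on the page file.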

Signal: % Committed Bytes in Use
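As a worked example of what the 80% threshold means: the counter is committed bytes divided by the commit limit, where the commit limit is roughly physical RAM plus page file size. The host sizes below are invented for illustration and the function name is hypothetical.

```powershell
# Sketch of the arithmetic behind "% Committed Bytes In Use".
function Get-CommittedBytesPercent {
    param(
        [Parameter(Mandatory)][double] $CommittedBytes,
        [Parameter(Mandatory)][double] $CommitLimitBytes
    )
    [math]::Round(100 * $CommittedBytes / $CommitLimitBytes, 1)
}

# Hypothetical host: 16 GB RAM + 4 GB page file = 20 GB commit limit, 17 GB committed.
$percent = Get-CommittedBytesPercent -CommittedBytes 17GB -CommitLimitBytes 20GB
$wouldAlert = $percent -ge 80   # same comparison as the alert: GreaterThanOrEqual 80
"$percent% committed, alert: $wouldAlert"
```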




Monitor for when Session hosts are simply Out of Memory.
WVDErrors
| where CodeSymbolic == "OutOfMemory" and Message contains "The user was disconnected because the session host memory was exhausted."

Here's the JSON to deploy the whole thing.
Action Group:

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "actionGroupName": {
      "type": "string",
      "metadata": {
        "description": "Unique name (within the Resource Group) for the Action group."
      }
    },
    "actionGroupShortName": {
      "type": "string",
      "metadata": {
        "description": "Short name (maximum 12 characters) for the Action group."
      }
    },
    "alertEmailAddress": {
      "type": "string",
      "metadata": {
        "description": "Should be Azure_Alerts@domain.com"
      }
    }
  },
  "resources": [
    {
      "type": "Microsoft.Insights/actionGroups",
      "apiVersion": "2019-03-01",
      "name": "[parameters('actionGroupName')]",
      "location": "Global",
      "properties": {
        "groupShortName": "[parameters('actionGroupShortName')]",
        "enabled": true,
        "emailReceivers": [
          {
            "name": "Email RR Desk",
            "emailAddress": "[parameters('alertEmailAddress')]",
            "useCommonAlertSchema": true
          }
        ]
      }
    }
  ],
  "outputs":{
      "actionGroupId":{
          "type":"string",
          "value":"[resourceId('Microsoft.Insights/actionGroups',parameters('actionGroupName'))]"
      }
  }
}
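One way to deploy the action group template is with New-AzResourceGroupDeployment, checking the 12-character short name limit first and returning the actionGroupId output for use in the alert template. This is a sketch under assumptions: the function names are hypothetical, the template is saved to a local file, and the Az.Resources module is installed with a signed-in session.

```powershell
# Checks the limit stated in the template's actionGroupShortName description.
function Test-ActionGroupShortName {
    param([Parameter(Mandatory)][AllowEmptyString()][string] $ShortName)
    $ShortName.Length -ge 1 -and $ShortName.Length -le 12
}

# Hypothetical wrapper: deploys the template and returns the action group resource ID.
function New-ActionGroupDeployment {
    param(
        [Parameter(Mandatory)][string] $ResourceGroupName,
        [Parameter(Mandatory)][string] $TemplateFile,
        [Parameter(Mandatory)][string] $ActionGroupName,
        [Parameter(Mandatory)][string] $ShortName,
        [Parameter(Mandatory)][string] $Email
    )
    if (-not (Test-ActionGroupShortName -ShortName $ShortName)) {
        throw "Short name '$ShortName' must be 1 to 12 characters."
    }
    $deployment = New-AzResourceGroupDeployment -ResourceGroupName $ResourceGroupName -TemplateFile $TemplateFile -TemplateParameterObject @{
        actionGroupName      = $ActionGroupName
        actionGroupShortName = $ShortName
        alertEmailAddress    = $Email
    }
    # The template's "outputs" section exposes the ID under this key.
    $deployment.Outputs['actionGroupId'].Value
}
```

The returned ID is the value the alert template expects in its actiongroups_EmailDesk_externalId parameter.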

Alerts in JSON:

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "client_Name": {
       "defaultValue": "(Client Name)",
       "type": "String"
    },
    "subscription_Id": {
       "defaultValue": "/subscriptions/(subId)",
       "type": "String"
    },
    "activityLogAlerts_name1": {
       "defaultValue": "WVD Service Health Alert",
       "type": "String"
    },
    "scheduledRule_name1": {
        "defaultValue": "WVD 'No available resources'",
        "type": "String"
    },
    "scheduledRule_name2": {
        "defaultValue": "WVD Available Host Memory",
        "type": "String"
    },
    "scheduledRule_name3": {
        "defaultValue": "WVD Failed Connections",
        "type": "String"
    },
    "scheduledRule_name4": {
        "defaultValue": "WVD Error - Out of Memory",
        "type": "String"
    },
    "metricAlerts_name1": {
        "defaultValue": "WVD Pct Processor committed bytes utilization",
        "type": "String"
    },
    "metricAlerts_name2": {
        "defaultValue": "(storAcctName) Capacity Alert",
        "type": "String"
    },
    "metricAlerts_name3": {
        "defaultValue": "WVD FSLogix Errors",
        "type": "String"
    },
    "storageAcct_Region": {
        "defaultValue": "westus2",
        "type": "String"
    },
    "storageAcct_ThresholdTB": {
        "defaultValue": "24TB",
        "type": "String"
    },
    "storageAcct_ThresholdBytes": {
        "defaultValue": "26388279066624",
        "type": "String"
    },
    "workspaces_externalId": {
        "defaultValue": "/subscriptions/(subId)/resourceGroups/(rgName)/providers/Microsoft.OperationalInsights/workspaces/(LAWName)",
        "type": "String"
    },
    "storageAccounts_externalId": {
        "defaultValue": "/subscriptions/(subId)/resourceGroups/(rgName)/providers/Microsoft.Storage/storageAccounts/(storAcctName)",
        "type": "String"
    },
    "actiongroups_EmailDesk_externalId": {
      "defaultValue": "/subscriptions/(subId)/resourceGroups/(rgName)/providers/microsoft.insights/actionGroups/(actionGroupName)",
      "type": "String"
    }
  },
  "variables": {},
  "resources": [
    {
      "type": "microsoft.insights/activityLogAlerts",
      "apiVersion": "2020-10-01",
      "name": "[concat(parameters('client_Name'), '- ',parameters('activityLogAlerts_name1'))]",
      "location": "Global",
      "properties": {
        "scopes": [
          "[parameters('subscription_Id')]"
        ],
        "condition": {
          "allOf": [
            {
              "field": "category",
              "equals": "ServiceHealth"
            },
            {
              "field": "properties.impactedServices[*].ServiceName",
              "containsAny": [
                "Windows Virtual Desktop"
              ]
            },
            {
              "field": "properties.impactedServices[*].ImpactedRegions[*].RegionName",
              "containsAny": [
                "East US",
                "East US 2",
                "Global",
                "South Central US",
                "West US",
                "West US 2"
              ]
            }
          ]
        },
        "actions": {
          "actionGroups": [
            {
              "actionGroupId": "[parameters('actiongroups_EmailDesk_externalId')]",
              "webhookProperties": {}
            }
          ]
        },
        "enabled": true,
        "description": "[concat(parameters('client_Name'), '- ',parameters('activityLogAlerts_name1'))]"
      }
    },
    {
        "type": "microsoft.insights/scheduledqueryrules",
        "apiVersion": "2021-02-01-preview",
        "name": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name1'))]",
        "location": "eastus2",
        "properties": {
          "displayName": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name1'))]",
          "description": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name1'))]",
          "severity": 1,
          "enabled": true,
          "evaluationFrequency": "PT5M",
          "scopes": [
            "[parameters('workspaces_externalId')]"
          ],
          "windowSize": "PT15M",
          "criteria": {
            "allOf": [
              {
                "query": "WVDErrors\n| where CodeSymbolic == \"ConnectionFailedNoHealthyRdshAvailable\" and Message contains \"Could not find any SessionHost available in specified pool\"\n",
                "timeAggregation": "Count",
                "operator": "GreaterThan",
                "threshold": 20,
                "failingPeriods": {
                  "numberOfEvaluationPeriods": 1,
                  "minFailingPeriodsToAlert": 1
                }
              }
            ]
          },
          "autoMitigate": false,
          "actions": {
            "actionGroups": [
              "[parameters('actiongroups_EmailDesk_externalId')]"
              ]
              }
      }
    },
    {
        "type": "microsoft.insights/metricalerts",
        "apiVersion": "2018-03-01",
        "name": "[concat(parameters('client_Name'), '- ',parameters('metricAlerts_name1'))]",
        "location": "global",
        "properties": {
            "description": "[concat(parameters('client_Name'), '- ',parameters('metricAlerts_name1'))]",
            "severity": 2,
            "enabled": false,
            "scopes": [
                "[parameters('workspaces_externalId')]"
            ],
            "evaluationFrequency": "PT5M",
            "windowSize": "PT5M",
            "criteria": {
                "allOf": [
                    {
                        "threshold": 80,
                        "name": "Metric1",
                        "metricNamespace": "Microsoft.OperationalInsights/workspaces",
                        "metricName": "Average_% Committed Bytes In Use",
                        "operator": "GreaterThanOrEqual",
                        "timeAggregation": "Maximum",
                        "criterionType": "StaticThresholdCriterion"
                    }
                ],
                "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria"
            },
            "autoMitigate": false,
            "targetResourceType": "Microsoft.OperationalInsights/workspaces",
            "actions": [
                {
                    "actionGroupId": "[parameters('actiongroups_EmailDesk_externalId')]",
                    "webHookProperties": {}
                }
            ]
        }
    },
    {
        "type": "microsoft.insights/metricalerts",
        "apiVersion": "2018-03-01",
        "name": "[concat(parameters('client_Name'), '- ',parameters('metricAlerts_name2'))]",
        "location": "global",
        "properties": {
            "description": "[concat(parameters('metricAlerts_name2'), '- ',parameters('storageAcct_ThresholdTB'), ' ',parameters('metricAlerts_name2'))]",
            "severity": 1,
            "enabled": true,
            "scopes": [
                "[concat(parameters('storageAccounts_externalId'), '/fileServices/default')]"
            ],
            "evaluationFrequency": "PT5M",
            "windowSize": "PT1H",
            "criteria": {
                "allOf": [
                    {
                        "threshold": "[parameters('storageAcct_ThresholdBytes')]",
                        "name": "Metric1",
                        "metricNamespace": "microsoft.storage/storageaccounts/fileservices",
                        "metricName": "FileCapacity",
                        "dimensions": [
                            {
                                "name": "FileShare",
                                "operator": "Include",
                                "values": [
                                    "fshare"
                                ]
                            }
                        ],
                        "operator": "GreaterThanOrEqual",
                        "timeAggregation": "Average",
                        "criterionType": "StaticThresholdCriterion"
                    }
                ],
                "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria"
            },
            "autoMitigate": false,
            "targetResourceType": "Microsoft.Storage/storageAccounts/fileServices",
            "targetResourceRegion": "[parameters('storageAcct_Region')]",
            "actions": [
                  {
                    "actionGroupId": "[parameters('actiongroups_EmailDesk_externalId')]",
                    "webhookProperties": {}
                  }
            ]
        }
    },
    {
        "type": "microsoft.insights/scheduledqueryrules",
        "apiVersion": "2021-02-01-preview",
        "name": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name2'))]",
        "location": "eastus2",
        "properties": {
          "displayName": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name2'))]",
          "description": "[concat(parameters('scheduledRule_name2'), ' below 1024(1GB)')]",
          "severity": 2,
          "enabled": true,
          "evaluationFrequency": "PT5M",
          "scopes": [
            "[parameters('workspaces_externalId')]"
          ],
          "windowSize": "PT5M",
          "criteria": {
            "allOf": [
              {
                "query": "Perf\n| where ObjectName == \"Memory\"\n| where CounterName == \"Available MBytes\"\n| where CounterValue <= 1024\n",
                "timeAggregation": "Count",
                "operator": "GreaterThanOrEqual",
                "threshold": 1,
                "failingPeriods": {
                  "numberOfEvaluationPeriods": 1,
                  "minFailingPeriodsToAlert": 1
                }
              }
            ]
          },
          "autoMitigate": false,
          "actions": {
            "actionGroups": [
              "[parameters('actiongroups_EmailDesk_externalId')]"
            ]
          }
        }
    },
    {
        "type": "microsoft.insights/scheduledqueryrules",
        "apiVersion": "2021-02-01-preview",
        "name": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name3'))]",
        "location": "eastus2",
        "properties": {
          "displayName": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name3'))]",
          "description": "[concat(parameters('scheduledRule_name3'), ' - More than 10 failed connections in 15 minutes.')]",
          "severity": 2,
          "enabled": true,
          "evaluationFrequency": "PT5M",
          "scopes": [
            "[parameters('workspaces_externalId')]"
          ],
          "windowSize": "PT15M",
          "criteria": {
            "allOf": [
              {
                "query": "WVDConnections\n| where State =~ \"Started\" and Type =~\"WVDConnections\"\n| extend Multi=split(_ResourceId, \"/\") | extend CState=iff(SessionHostOSVersion==\"<>\",\"Failure\",\"Success\")\n| where CState =~\"Failure\"\n| order by TimeGenerated desc\n| where State =~ \"Started\" | extend Multi=split(_ResourceId, \"/\")\n| project ResourceAlias, ResourceGroup=Multi[4], HostPool=Multi[8], SessionHostName, UserName, CState=iff(SessionHostOSVersion==\"<>\",\"Failure\",\"Success\"), CorrelationId, TimeGenerated\n| join kind= leftouter (WVDErrors) on CorrelationId\n| extend DurationFromLogon=datetime_diff(\"Second\",TimeGenerated1,TimeGenerated)\n| project  TimeStamp=TimeGenerated, DurationFromLogon, UserName, ResourceAlias, SessionHost=SessionHostName, Source, CodeSymbolic, ErrorMessage=Message, ErrorCode=Code, ErrorSource=Source ,ServiceError, CorrelationId\n| order by TimeStamp desc\n",
                "timeAggregation": "Count",
                "operator": "GreaterThanOrEqual",
                "threshold": 10,
                "failingPeriods": {
                  "numberOfEvaluationPeriods": 1,
                  "minFailingPeriodsToAlert": 1
                }
              }
            ]
          },
          "autoMitigate": false,
          "actions": {
            "actionGroups": [
              "[parameters('actiongroups_EmailDesk_externalId')]"
              ]
              }
        }
    },
    {
        "type": "microsoft.insights/metricAlerts",
        "apiVersion": "2018-03-01",
        "name": "[concat(parameters('client_Name'), '- ',parameters('metricAlerts_name3'))]",
        "location": "global",
        "properties": {
          "description": "[parameters('metricAlerts_name3')]",
          "severity": 1,
          "enabled": true,
          "scopes": [
            "[parameters('workspaces_externalId')]"
          ],
          "evaluationFrequency": "PT5M",
          "windowSize": "PT15M",
          "criteria": {
            "allOf": [
              {
                "threshold": 0,
                "name": "Metric1",
                "metricNamespace": "Microsoft.OperationalInsights/workspaces",
                "metricName": "Event",
                "dimensions": [
                  {
                    "name": "EventLog",
                    "operator": "Include",
                    "values": [
                      "Microsoft-FSLogix-Apps/Operational"
                    ]
                  },
                  {
                    "name": "EventID",
                    "operator": "Include",
                    "values": [
                      "26"
                    ]
                  }
                ],
                "operator": "GreaterThan",
                "timeAggregation": "Total",
                "criterionType": "StaticThresholdCriterion"
              }
            ],
            "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria"
          },
          "autoMitigate": false,
          "targetResourceType": "Microsoft.OperationalInsights/workspaces",
          "actions": [
              {
                "actionGroupId": "[parameters('actiongroups_EmailDesk_externalId')]",
                "webhookProperties": {}
              }
            ]
        }
    },
    {
        "type": "microsoft.insights/scheduledqueryrules",
        "apiVersion": "2021-02-01-preview",
        "name": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name4'))]",
        "location": "eastus2",
        "properties": {
          "displayName": "[concat(parameters('client_Name'), '- ',parameters('scheduledRule_name4'))]",
          "description": "[parameters('scheduledRule_name4')]",
          "severity": 1,
          "enabled": true,
          "evaluationFrequency": "PT5M",
          "scopes": [
            "[parameters('workspaces_externalId')]"
          ],
          "windowSize": "PT30M",
          "criteria": {
            "allOf": [
              {
                "query": "WVDErrors\n| where CodeSymbolic == \"OutOfMemory\" and Message contains \"The user was disconnected because the session host memory was exhausted.\"\n",
                "timeAggregation": "Count",
                "operator": "GreaterThan",
                "threshold": 20,
                "failingPeriods": {
                  "numberOfEvaluationPeriods": 1,
                  "minFailingPeriodsToAlert": 1
                }
              }
            ]
          },
          "autoMitigate": false,
          "muteActionsDuration": "PT1H",
          "actions": {
            "actionGroups": [
              "[parameters('actiongroups_EmailDesk_externalId')]"
            ]
          }
        }
    }
]
}
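The storageAcct_ThresholdBytes default is simply 24 TB expressed in bytes using binary units. If you want a different capacity threshold, the matching pair of parameter values can be computed like this. The function name is hypothetical; the parameter names are the ones from the template above.

```powershell
# Sketch: derive the byte threshold for the storage capacity alert.
function ConvertTo-ThresholdBytes {
    param([Parameter(Mandatory)][int] $Terabytes)
    # PowerShell's TB suffix is binary: 1TB = 1024^4 bytes.
    [int64]$Terabytes * 1TB
}

$thresholdBytes = ConvertTo-ThresholdBytes -Terabytes 24   # 26388279066624

# Both template parameters are strings, so the number is converted to text.
$templateParameters = @{
    storageAcct_ThresholdTB    = '24TB'
    storageAcct_ThresholdBytes = "$thresholdBytes"
}
$templateParameters
```

A hashtable like this can be merged with the other parameter values and passed to New-AzResourceGroupDeployment through -TemplateParameterObject.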


Simplify deployment with PowerShell Remoting

Sure, you can use PowerShell Desired State Configuration, but let's say you don't want to. Use PSRemoting.


Create a domain using Azure Active Directory Domain Services. Create a user, add them to the AAD DC Administrators group in Azure AD, and change their password.


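Note: AADDS synchronizes all users and groups from Azure Active Directory, but a user cannot authenticate to the new domain until their password has been changed. This is a security feature: Azure AD does not hold the password hashes the domain needs for NTLM and Kerberos until a password change generates them. How would you feel if Azure was able to copy your password? Not good, eh?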


Once that's complete, let's join a VM to the new domain. Create a VM, ensure it can route/connect to the same network the AADDS servers were provisioned in.


Open Azure Cloud Shell and choose PowerShell.


Save the PowerShell below as a .ps1 file and upload it to Cloud Shell.


Execute and follow prompts.


Connect-AzAccount -UseDeviceAuthentication
$vmName = Read-host "VM Name?"
$rgName = Read-host "Resource Group Name?"
Write-Host "Enabling PSRemoting and disabling Cert checking..." -ForegroundColor Green
Install-Module pswsman -Force
Disable-WSManCertVerification -All
Enable-AZVMPSRemoting -Name $vmName -ResourceGroupName $rgName -Protocol https -OsType Windows
Write-Host "Adding VM to Domain and rebooting..." -ForegroundColor Green
Invoke-AzVMCommand -Name $vmName -ResourceGroupName $rgName -ScriptBlock {
    # -Force already suppresses the confirmation prompt, so -Confirm is dropped
    Add-Computer -DomainName "domain.com" -Restart -Force
    } -Credential (Get-Credential)

Export Azure File Shares files to CSV

Just like it sucks to search directories and file names, it sucks to output files and their sizes.

Note: You'll need Subscription ID, Resource Group Name, Storage Account Name, and Azure Share Name

# Set mandatory parameters
param 
(
    [Parameter(Mandatory=$true)]
    [string] $subscriptionId,
    [Parameter(Mandatory=$true)]
    [string] $resourceGroupName,
    [Parameter(Mandatory=$true)]
    [string] $storageAccountName,
    [Parameter(Mandatory=$true)]
    [string] $storageShareName
)
# Sign in once and set Azure context
Connect-AzAccount | Out-Null
$context = Set-AzContext -SubscriptionId $subscriptionId
# Repeat Loop start
Do {
    # Get Storage Context
    Write-Host "Searching for files in $storageShareName in $storageAccountName :" -ForegroundColor Green
    Write-Host "=================================================================" -ForegroundColor Green
    $key = Get-AzStorageAccountKey -ResourceGroupName $resourceGroupName -Name $storageAccountName
    $storContext = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $key.Value[0]
    $getAzSF = Get-AzStorageFile -ShareName $storageShareName -Context $storContext
    # Iterate through top-level directories, find files and export Dir, Name and Length
    $(foreach ($dir in $getAzSF)
        {
            Get-AzStorageFile -ShareName $storageShareName -Context $storContext -Path $($dir.Name) | Get-AzStorageFile | Select-Object -Property @{L='Directory';E={$($dir.Name)}}, Name, Length
        }) | Export-Csv -Path ".\$storageShareName.csv" -NoTypeInformation
    Write-Host "Data exported to CSV!"
    Write-Host ""
    $repeat = Read-Host "Repeat? (Y/N)"
}
While ($repeat -eq "Y")
# Repeat Loop end
Write-Host ""
Write-Host "EXITING... " -ForegroundColor Yellow -BackgroundColor Black
Write-Host ""
# Disconnect
Disconnect-AzAccount > $null
Write-Host "ACCOUNT HAS BEEN DISCONNECTED" -ForegroundColor Yellow -BackgroundColor Black
# end
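The Directory column in the export comes from a calculated property: Select-Object accepts a hashtable with a label (L) and an expression (E) alongside ordinary property names. The sketch below shows the same pattern on stand-in objects so it can be tried without a storage account. The file names and sizes are invented.

```powershell
# Stand-in objects with the Name and Length properties the script selects.
$files = @(
    [pscustomobject]@{ Name = 'user1.vhdx'; Length = 1073741824 }
    [pscustomobject]@{ Name = 'user2.vhdx'; Length = 536870912 }
)
$dir = [pscustomobject]@{ Name = 'profiles' }

# L = label (column name), E = expression evaluated for each row.
$rows = $files | Select-Object -Property @{L='Directory';E={$dir.Name}}, Name, Length

# ConvertTo-Csv shows what Export-Csv would write, without touching disk.
$csv = $rows | ConvertTo-Csv -NoTypeInformation
$csv
```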

Searching Azure File Share to match string

It sucks. There's a search directory box in Explorer, so wtf Azure?! Can't search by anything except full directory name even with wildcards. Until now. Search for Directories and Files that match your string.

Note: You'll need Subscription ID, Resource Group Name, and Storage Account name of where you want to search.

# Set mandatory parameters
param 
(
    [Parameter(Mandatory=$true)]
    [string] $subscriptionId,
    [Parameter(Mandatory=$true)]
    [string] $resourceGroupName,
    [Parameter(Mandatory=$true)]
    [string] $storageAccountName
)
# Sign in once and set Azure context
Connect-AzAccount | Out-Null
$context = Set-AzContext -SubscriptionId $subscriptionId
# Get Storage Context
$key = Get-AzStorageAccountKey -ResourceGroupName $resourceGroupName -Name $storageAccountName
$storContext = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $key.Value[0]
# Repeat Loop start
Do {
    # Search string (-match treats it as a regular expression)
    $searchString = Read-Host -Prompt "Please enter string to search for"
    # Get Share Names
    $shareNames = Get-AzRmStorageShare -ResourceGroupName $resourceGroupName -StorageAccountName $storageAccountName
    foreach ($sName in $shareNames)
    {
        $getAzSF = Get-AzStorageFile -ShareName $sName.Name -Context $storContext
        Write-Host "Searching Directories for $searchString in $storageAccountName/$($sName.Name) :" -ForegroundColor Green
        Write-Host "====================================================" -ForegroundColor Green
        # Iterate through top-level entries, find Dir matches
        foreach ($dir in $getAzSF)
        {
            if ($dir.Name -match $searchString)
            {
                Write-Host "Path:  $($dir.Name) " -ForegroundColor Yellow
            }
        }
        Write-Host "Searching Files for $searchString in $storageAccountName/$($sName.Name) :" -ForegroundColor Green
        Write-Host "====================================================" -ForegroundColor Green
        # Iterate through directories and test each file name on its own,
        # so only the files that match are printed
        foreach ($dir in $getAzSF)
        {
            $files = Get-AzStorageFile -ShareName $sName.Name -Context $storContext -Path $($dir.Name) | Get-AzStorageFile
            foreach ($file in $files)
            {
                if ($file.Name -match $searchString)
                {
                    Write-Host "Path:  $($dir.Name) " -ForegroundColor Yellow
                    Write-Host "File Name: $($file.Name) " -ForegroundColor Yellow
                }
            }
        }
    }
    # Ask once per search, after every share has been checked
    $repeat = Read-Host "Repeat? (Y/N)"
}
While ($repeat -eq "Y")
# Repeat Loop end
Write-Host ""
Write-Host "EXITING... " -ForegroundColor Yellow -BackgroundColor Black
Write-Host ""
# Disconnect
Disconnect-AzAccount | Out-Null
Write-Host "ACCOUNT HAS BEEN DISCONNECTED" -ForegroundColor Yellow -BackgroundColor Black
# end
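One thing to keep in mind with the script above: -match treats the search string as a regular expression, so characters like parentheses or dots change what matches. If you want a literal search, [regex]::Escape is one option. The file names below are invented to show the difference.

```powershell
# Hypothetical file names and a search string containing regex metacharacters.
$names = @('report(final).docx', 'report-final.docx')
$searchString = '(final)'

# Unescaped, the parentheses form a regex group, so both names match "final".
$loose = @($names | Where-Object { $_ -match $searchString })

# Escaped, the parentheses are matched literally, so only the first name matches.
$exact = @($names | Where-Object { $_ -match [regex]::Escape($searchString) })

"$($loose.Count) loose match(es), $($exact.Count) literal match(es)"
```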