AZ-104 Domain 05: Monitor and Maintain Azure Resources

Azure Key Vault — Application Access

Azure Key Vault stores secrets, keys, and certificates for applications. When an Azure-hosted application needs to retrieve a secret from Key Vault, the recommended pattern is to use a managed identity (system-assigned or user-assigned) rather than storing credentials in application configuration. The managed identity authenticates to Key Vault through Microsoft Entra ID (formerly Azure Active Directory), so the application itself holds no credentials that need to be managed or rotated.

Granting an Application Access to Key Vault

Enable a managed identity on the Azure resource (App Service, VM, Function App, Container Instance, etc.).
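Assign the managed identity the Key Vault Secrets User role on the vault (for read-only secret access) or Key Vault Secrets Officer for read/write. These roles take effect only when the vault uses the Azure RBAC permission model; a vault on the legacy access policy model instead needs an access policy that grants the identity Get (and, if required, List) on secrets.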
The application's code uses the DefaultAzureCredential or ManagedIdentityCredential SDK classes to authenticate automatically — no credentials are stored in code or configuration.
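
The three steps above can be sketched in Python. This is a minimal sketch, assuming the azure-identity and azure-keyvault-secrets packages are installed and that the code runs on a resource whose managed identity already has secret read access on the vault. The helper names vault_url and read_secret, and the vault and secret names in the usage note, are hypothetical.

```python
def vault_url(vault_name: str) -> str:
    """Build the vault URI for the Azure public cloud from a vault name."""
    return f"https://{vault_name}.vault.azure.net"


def read_secret(vault_name: str, secret_name: str) -> str:
    """Fetch one secret value using whatever identity the environment provides."""
    # Imported lazily so this module still loads where the Azure SDK is absent.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    # DefaultAzureCredential tries several credential sources in turn. On an
    # Azure host with a managed identity enabled, it uses that identity, so
    # no password or key appears in code or configuration.
    credential = DefaultAzureCredential()
    client = SecretClient(vault_url=vault_url(vault_name), credential=credential)
    return client.get_secret(secret_name).value
```

Calling read_secret("contoso-vault", "db-password") on an App Service or VM with a managed identity would return the secret value; both names are placeholders.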

Key Vault Network Access — Region Is Irrelevant

Key Vault access is not restricted by Azure region or resource group membership by default. Any application with the appropriate role assignment can access the vault over the internet or via private endpoint. Network-level restrictions are configured through the vault's Networking blade — either allowing all networks, selected VNets/IP ranges, or requiring a private endpoint. The region of the application relative to the vault has no bearing on access permissions.

Azure Backup Vault vs Recovery Services Vault — Workload Mapping

Azure Backup uses two distinct vault types, and each data source maps to exactly one vault type. Understanding this mapping is critical because trying to back up a workload to the wrong vault type will fail — the option simply will not appear in the portal for that vault.

Data Source / Workload	Azure Backup Vault	Recovery Services Vault
Azure Virtual Machines	No	Yes
Azure Managed Disks (standalone)	Yes	No
Azure Blobs (operational backup)	Yes	No
Azure Files (file shares)	No	Yes
Azure Database for PostgreSQL	Yes	No
Azure Kubernetes Service (AKS)	Yes	No
SQL Server in Azure VM	No	Yes
SAP HANA in Azure VM	No	Yes
On-premises (MARS / MABS)	No	Yes

⚠ Blobs vs File Shares — Different Vaults, Same Storage Account

If a storage account contains both a blob container and a file share, they back up to different vault types. The file share backs up to a Recovery Services vault. The blob container backs up to an Azure Backup vault (as operational backup). There is no vault type that handles both. This is the most common source of confusion in backup vault selection questions.
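
The mapping table can be expressed as a simple lookup, which makes the "blobs and file shares need two vaults" point easy to check. This is an illustrative sketch only: the dictionary copies the table above, and the function name vaults_needed is hypothetical.

```python
BACKUP_VAULT = "Azure Backup vault"
RECOVERY_SERVICES_VAULT = "Recovery Services vault"

# One vault type per workload, copied from the workload mapping table.
VAULT_FOR_WORKLOAD = {
    "Azure Virtual Machines": RECOVERY_SERVICES_VAULT,
    "Azure Managed Disks": BACKUP_VAULT,
    "Azure Blobs": BACKUP_VAULT,
    "Azure Files": RECOVERY_SERVICES_VAULT,
    "Azure Database for PostgreSQL": BACKUP_VAULT,
    "Azure Kubernetes Service": BACKUP_VAULT,
    "SQL Server in Azure VM": RECOVERY_SERVICES_VAULT,
    "SAP HANA in Azure VM": RECOVERY_SERVICES_VAULT,
    "On-premises (MARS / MABS)": RECOVERY_SERVICES_VAULT,
}


def vaults_needed(workloads):
    """Return the set of vault types required to cover every listed workload."""
    return {VAULT_FOR_WORKLOAD[workload] for workload in workloads}


# A storage account holding both a blob container and a file share:
print(vaults_needed(["Azure Blobs", "Azure Files"]))
```

The result contains both vault types, which matches the callout: no single vault covers blobs and file shares together.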

Backup for Azure Managed Disks — Sequence & Managed Identity

Backing up a standalone Azure managed disk (not an entire VM) requires an Azure Backup vault — not a Recovery Services vault. The Backup vault needs a managed identity to authenticate to the disk and the disk's resource group when performing backup and restore operations. This managed identity must be granted the necessary permissions before a backup policy can be configured successfully.

Correct Sequence for Managed Disk Backup

Create an Azure Backup vault — disk backup uses a Backup vault, not a Recovery Services vault.
Configure a managed identity on the Backup vault — a system-assigned managed identity enables the vault to authenticate when accessing the disk and snapshot resources during backup.
Assign the required permissions — the vault's managed identity needs the Disk Backup Reader role on the disk and the Disk Snapshot Contributor role on the resource group where snapshots will be stored.
Create a backup policy and configure the backup — select the specific disk, set the backup schedule, and set the retention period.
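
Step 3 of the sequence is the one most often missed, so a small pre-flight check can help when reasoning about it. The sketch below is a plain Python model, not an Azure SDK call: the scope labels and the function name missing_disk_backup_roles are hypothetical, and the role names come from the sequence above.

```python
# Roles the Backup vault's managed identity needs, keyed by where they are assigned.
REQUIRED_ROLES = {
    "disk": "Disk Backup Reader",
    "snapshot_resource_group": "Disk Snapshot Contributor",
}


def missing_disk_backup_roles(granted):
    """Return the required roles the vault identity does not yet hold.

    granted maps a scope label ("disk" or "snapshot_resource_group") to the
    set of role names already assigned to the vault's managed identity there.
    """
    return {
        scope: role
        for scope, role in REQUIRED_ROLES.items()
        if role not in granted.get(scope, set())
    }


# Identity has the disk role but not the snapshot resource group role.
print(missing_disk_backup_roles({"disk": {"Disk Backup Reader"}}))
```

An empty result means both assignments are in place and the backup can be configured.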

⚠ Disk Backup Uses Backup Vault, Not Recovery Services Vault

A Recovery Services vault backs up entire VMs, Azure Files, and SQL/SAP workloads — it cannot back up standalone managed disks. When the requirement is to back up only a single data disk (Disk2) rather than the entire VM, the correct vault type is the Azure Backup vault, and a managed identity on that vault is required for the operation to succeed.

Recovery Services Vault — Region Constraints, Vault Selection & Moving VMs

A Recovery Services vault can only protect resources located in the same Azure region as the vault itself. Resource group membership is completely irrelevant — two resources can be in the same resource group but different regions, and the vault can only protect the one in its own region. This constraint applies to every workload the vault supports.

RSV Protects Same-Region Resources Only

A vault in Central US can back up VMs in Central US, regardless of which resource group they are in.
That same vault cannot back up a VM in West US even if both are in the same resource group.
For Azure Files, the storage account containing the share must be in the same region as the vault.
Diagnostic settings for the vault (e.g. Azure Backup Reports logs) that send data to a storage account require that storage account to be in the same region as the vault. A Log Analytics workspace used as the destination can be in any region.
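
The backup-protection rule in this list reduces to a single comparison: region decides, resource group does not. A minimal sketch, assuming a hypothetical Resource record and function name can_protect:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    region: str
    resource_group: str


def _normalize(region: str) -> str:
    """Treat "Central US" and "centralus" as the same region name."""
    return region.replace(" ", "").lower()


def can_protect(vault_region: str, resource: Resource) -> bool:
    """A Recovery Services vault protects only resources in its own region."""
    # resource.resource_group is deliberately ignored: it plays no part.
    return _normalize(vault_region) == _normalize(resource.region)


vm1 = Resource("VM1", "Central US", "RG1")
vm2 = Resource("VM2", "West US", "RG1")  # same resource group, other region
print(can_protect("centralus", vm1), can_protect("centralus", vm2))
```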

Moving a VM from One RSV to Another

A VM can only be actively protected by one Recovery Services vault at a time. To switch a VM from RSV1 to RSV2, you must first stop the backup in RSV1 (Stop Protection — either retain or delete backup data). Only after stopping the backup in RSV1 can you configure a new backup in RSV2. Attempting to configure backup in RSV2 while the VM is still registered with RSV1 will fail or produce unexpected results.

Active Backup Protection Blocks Resource Group Deletion

If a resource group contains a Recovery Services vault with active backup items — for example, a SQL database actively registered and backed up — the resource group cannot be deleted while that protection is active. The first step is to stop the backup protection of the backed-up resource. Only then can the vault be deleted, and then the resource group. Deleting the VM or storage account first does not resolve the blocking dependency — the active backup registration in the vault is what prevents deletion.

Azure Backup — File-Level Recovery & MARS Agent Restores

Azure Backup supports two distinct approaches to restoring data from a VM backup: a full VM restore (which recreates or replaces the entire virtual machine) and file-level recovery (which mounts the backup snapshot as a temporary drive, allowing individual files to be copied out). For recovering a small number of deleted files as quickly as possible, file-level recovery is the correct approach — it avoids the overhead of a full VM restore and works without requiring the original VM to be stopped.

File-Level Recovery Sequence

From the Azure portal, navigate to the Recovery Services vault → Backup Items → the VM → click File Recovery.
Select a restore point that contains the files you need to recover.
Download and run the script that mounts the backup snapshot as a drive on the local computer.
Copy the required files using File Explorer (Windows) or standard copy commands (Linux).

File Recovery from VM Snapshot — Any Windows Computer

When a VM is backed up using Azure VM Backup (without the MARS agent — i.e., snapshot-level backup), the file recovery script can be run on any Windows computer with internet connectivity, not just Azure VMs. The script mounts the recovery point as a local disk on the machine where it runs. This is different from MARS agent-level backups, which require the MARS agent to be installed on the target machine.

Restoring a MARS Agent Backup to a Different Machine

The Microsoft Azure Recovery Services (MARS) Agent backs up files and folders from a VM at the file-system level, rather than taking a full disk snapshot. When a MARS backup taken on VM1 needs to be restored to VM2, the first step is to install the MARS Agent on VM2. The agent handles the authentication to the Recovery Services vault and the restore operation. Without the agent installed on the target machine, there is no mechanism to initiate the restore there. Installing Windows Server Backup (a Windows feature) on VM2 is not the answer — Windows Server Backup handles Windows-native backup, not Azure MARS agent restores.

VM Backup — Restore Options & What "Replace Existing" Restores

When restoring a VM from an Azure Backup recovery point, two fundamental restore paths are available. The choice between them depends on whether you need to keep the original VM running alongside the restored version, or whether you want to restore directly into the existing VM in place. Understanding exactly what each option restores — and what it does not — is important for post-restore work.

Two VM Restore Paths Compared

Aspect	Replace Existing	Create New
What it does	Replaces the existing VM's disks with the backup disks in place	Creates a brand-new VM from the backup recovery point
Original VM status	Must be deallocated (stopped) first	Can remain running
VM identity	Same NIC, IP address, and VM name are preserved	New VM with a new identity
Best for	In-place recovery with minimal downtime; minimum extra cost	Side-by-side recovery; testing the restore before decommissioning the original

⚠ VM Must Be Deallocated Before "Replace Existing"

"Replace existing" cannot run while the VM is in a running state — the disks cannot be swapped while they are in use. The first step is always to deallocate (stop) the VM, then initiate the Replace existing restore. After the restore completes, the VM is started again with the restored disk contents.

VM Size Is NOT Restored by "Replace Existing"

The "Replace existing" restore operation replaces the disk contents — files, OS configuration, passwords, applications, and data — but it does not restore the VM's size (SKU). VM size is an Azure Resource Manager metadata property stored outside the disk image. If the VM was resized after the backup was taken, the size change must be re-applied manually after the restore completes. A post-restore task to "modify the size of VM1" is always required if the size changed after backup.

Restoring to a Point Before the Instant Snapshot Window

Azure Backup retains two layers of recovery points: instant snapshots (retained for a configurable 1 to 5 days under a Standard backup policy) and vault-backed recovery points (retained according to the backup policy: days, weeks, months, years). If the desired recovery point falls outside the instant snapshot retention window, for example recovering to a point 8 days ago when snapshots are kept for only 5 days, you must use a vault-backed recovery point. The "Replace existing" option is still available; it is just slower because the data must be retrieved from vault storage rather than a local snapshot.

Backup Retention Policy Hierarchy — Longest Retention Wins

Azure Backup policies for VMs allow multiple retention tiers to be configured simultaneously: daily, weekly, monthly, and yearly. When a backup point qualifies for multiple retention tiers — for example, a Sunday backup that is both the weekly and the monthly backup — the backup is retained for whichever tier has the longest retention period. The hierarchy always resolves to the most durable outcome.

Retention Hierarchy — Checking from Longest to Shortest

When determining how long a specific backup will be retained, check the tiers in order from longest to shortest: yearly → monthly → weekly → daily. The first tier that applies to that backup point determines the retention. For example, a backup taken on March 1 (a Sunday) that qualifies as a weekly backup (10 weeks retention) and also as a monthly backup (36 months retention): the monthly retention wins — it will be kept for 36 months, not 10 weeks. Never default to daily retention without first checking whether higher tiers apply to that backup point.
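
The longest-to-shortest check can be written as a short loop. This is a sketch of the reasoning, not of how Azure Backup is implemented: the function name effective_retention is hypothetical, the 30-day daily retention is an assumed value, and the weekly and monthly values come from the example above. It assumes, as the ordering implies, that higher tiers are configured with longer retention.

```python
# Tiers in the order they should be checked: longest retention first.
TIER_ORDER = ("yearly", "monthly", "weekly", "daily")


def effective_retention(qualifies_for, policy):
    """Return (tier, retention) for the first configured tier that applies.

    qualifies_for: set of tier names this recovery point counts as.
    policy: dict mapping each configured tier name to its retention label.
    """
    for tier in TIER_ORDER:
        if tier in qualifies_for and tier in policy:
            return tier, policy[tier]
    raise ValueError("recovery point matches no configured retention tier")


policy = {"daily": "30 days", "weekly": "10 weeks", "monthly": "36 months"}

# The March 1 (Sunday) backup is a daily, a weekly, and a monthly point.
print(effective_retention({"daily", "weekly", "monthly"}, policy))
```

For the March 1 backup the loop stops at the monthly tier, so the point is kept for 36 months.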

VM Backup Pre-Check Status — Passed, Warning & Error

Before Azure Backup begins protecting a virtual machine, it validates the VM's configuration and readiness through a Pre-Check evaluation. The result appears in the Recovery Services vault as one of three status levels, each with different implications for whether backup will proceed.

Pre-Check Status Levels

Status	Meaning	Backup Proceeds?	Common Causes
Passed	All validations succeeded; VM is fully ready	Yes	VM agent is current, VM is running, configuration is supported
Warning	Backup can proceed but a non-critical issue was detected	Yes — with reduced capability	Outdated Azure VM Agent (WaAppAgent.exe on Windows; walinuxagent on Linux)
Error	A blocking issue prevents backup from proceeding	No — must resolve first	Recovery Services vault unavailable; VM in unsupported configuration

⚠ Warning Most Commonly Means Outdated VM Agent

The most common cause of a Warning pre-check status is that the Azure VM Agent (WaAppAgent.exe on Windows) is not the latest version. Azure Backup can still proceed and create recovery points, but the backup may lack full functionality or face compatibility issues. To resolve the Warning and restore the pre-check to Passed, update the Azure VM Agent to the latest version. A stopped (deallocated) VM does not itself cause a Warning — the agent version is the key factor.

App Service Backup — Storage Account, Not a Vault

Azure App Service backups work differently from VM or file-share backups. App Service backup does not use either an Azure Backup vault or a Recovery Services vault. Instead, it writes backup ZIP files directly to a blob container in an Azure Storage account. The storage account must be in the same subscription as the web app.

App Service Backup Configuration

The first thing to create before configuring App Service backup is an Azure Storage account (and optionally a specific blob container within it).
Backup configuration is done in the web app's portal blade under Backup, where you specify the storage account, the schedule, and the retention count.
To exclude specific folders from the backup, create a file named _backup.filter in the web app's wwwroot directory and list the folder paths to exclude. A backup policy, a lock, or a WebJob cannot exclude folders from App Service backup.

⚠ App Service Backup Requires Standard Tier or Higher

The backup feature is not available on Free or Shared App Service plans. It requires at least the Standard pricing tier. If the staging slots option or backup option is unavailable in the portal, the first fix is to scale up the App Service plan to Standard or higher — not to add a custom domain or modify application settings.

Recovery Services Vault — Multi-User Authorization & Resource Guard

Multi-User Authorization (MUA) for Recovery Services vaults implements a two-person integrity model for critical backup operations. Without MUA, a single administrator with sufficient vault permissions can stop protection, delete backup data, or modify policies unilaterally. With MUA enabled, these operations require approval from a second authority, protecting backups from accidental or malicious changes.

Resource Guard — The Prerequisite for MUA

Before MUA can be enabled on a Recovery Services vault, a Resource Guard must first be created. The Resource Guard is a standalone Azure resource that acts as the second authority in the two-person model. It is ideally owned by a different administrator or team than the vault owner, enforcing genuine separation of duties. Any critical vault operation (stopping protection, deleting backup data, changing MUA settings) will require explicit approval from the Resource Guard owner if MUA is active.

Enabling MUA — Sequence

Create a Resource Guard (ideally in a different subscription or under a different administrator's control).
Enable Multi-User Authorization on the Recovery Services vault and link it to the Resource Guard.

The Resource Guard must exist before the MUA setting can be configured on the vault. The order cannot be reversed.

⚠ MUA Requires a Resource Guard, Not a Managed Identity or Custom Role

The correct first resource to create when enabling Multi-User Authorization (MUA) for a Recovery Services vault is a Resource Guard. A managed identity is used for authenticating applications to other services, not for MUA. An administrative unit scopes Entra ID admin permissions and has no role in backup MUA. A custom Azure role configures RBAC permissions and is also not involved in MUA. Only Resource Guard provides the second-authority mechanism that MUA depends on.

Resource Group Moves — Policy, Region & Lock Behaviour

Moving a resource to a different resource group is a metadata operation — the resource itself does not physically relocate to a new Azure region. It simply changes which resource group "owns" the resource in Azure Resource Manager. After the move, the destination resource group's policies apply to the resource, and its original Azure region is unchanged.

How Locks on Source and Destination Affect Moves

Lock Location	Lock Type	Move Allowed?	Why
Source RG	ReadOnly	No	Moving a resource modifies its metadata, which is a write operation — blocked by ReadOnly
Source RG	Delete	Yes	Delete lock only prevents deletion, not modifications
Destination RG	ReadOnly	No	Adding a resource to an RG is a write operation on that RG — blocked by ReadOnly
Destination RG	Delete	Yes	Delete lock does not prevent new resources from being added to the RG
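
The four rows of the table collapse into one rule: a ReadOnly lock on either resource group blocks the move, and a Delete lock on either does not. A minimal sketch of that rule as stated in the table (the function name move_allowed is hypothetical, and None stands for "no lock"):

```python
from typing import Optional


def move_allowed(source_lock: Optional[str], destination_lock: Optional[str]) -> bool:
    """Apply the lock table: ReadOnly on either resource group blocks the move."""
    # A move writes to both groups, so only the write-blocking lock matters.
    return "ReadOnly" not in (source_lock, destination_lock)


for source, destination in [
    ("ReadOnly", None),
    ("Delete", None),
    (None, "ReadOnly"),
    (None, "Delete"),
]:
    print(source, destination, move_allowed(source, destination))
```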

Cross-Subscription Moves and Policy Effects

Resources can be moved across subscriptions, and the move itself is not blocked by the resource being in a different subscription. After the move, the destination resource group's Azure Policies apply to the resource. For example, if a web app in West Europe moves from RG1 (with Policy1 assigned) to RG2 (with Policy2 assigned), Policy2 now applies to the web app. The app's physical location in West Europe does not change — the region is a property of the resource, not the resource group.

Resource Locks — Scope, Inheritance & Where They Apply

Resource locks prevent accidental modification or deletion of Azure resources. They can be applied at the subscription, resource group, or individual resource level, and they inherit downward — a lock on a subscription applies to all resource groups and resources within it, and a lock on a resource group applies to all resources within that group.

Two Lock Types

ReadOnly — prevents any modifications to the locked resource. Read operations are still permitted. This is the most restrictive lock — it blocks writes, deletes, and configuration changes.
Delete (CanNotDelete) — prevents deletion only. Read and write (modify) operations are still permitted. This is appropriate when you want to allow configuration changes but prevent accidental deletion.

Lock Scope — Subscription, Resource Group, and Resource Only

Locks can be applied at subscription, resource group, and individual resource scope. Management Groups and the Tenant Root Group cannot have locks applied to them — locks stop at the subscription level. A lock on a subscription is inherited by all resource groups and resources within it. When asked "at which scope can a lock be applied?", the valid answers are subscription, resource group, and resource.

⚠ Even Owners Are Subject to Locks

Resource locks apply regardless of RBAC role assignments. A user with the Owner role cannot delete a resource that has a Delete lock without first removing the lock. Removing a lock requires the Microsoft.Authorization/locks/* actions (or the broader Microsoft.Authorization/*), which among the built-in roles are granted only to Owner and User Access Administrator. This is a deliberate protection mechanism: even a highly privileged user must take the explicit extra step of removing the lock before a locked resource can be deleted.

Tags — No Inheritance, Policy-Applied Tags Are Resource-Specific

Azure resource tags are not inherited from parent resources. A tag on a subscription does not automatically propagate to resource groups within it, and a tag on a resource group does not automatically appear on the individual resources within it. Each resource maintains its own independent set of tags. This distinction becomes important when an Azure Policy with an Append effect adds tags — the tag is added only to the specific resource being deployed, not to its parents or children.

How the Append Policy Effect Works with Tags

The Append effect fires when a resource is created or updated (PUT/PATCH operation). It adds the specified tag and value to the resource being deployed.
The tag is added only to that specific resource — not to the resource group it belongs to, not to the subscription, and not to other existing resources that were not touched.
Existing resources created before the policy was assigned will not automatically receive the new tag. They are marked as non-compliant until remediated or updated.

Worked Example — Append Policy with Exclusion

Policy: Scope = Sub1, Exclusion = Sub1/RG1/VNET1, Append Tag4:value4.

Resource	Own Tags	Policy Adds	Effective Tags
Sub1	Tag1:subscription	Tag4:value4	Tag1:subscription, Tag4:value4
RG1	Tag2:IT	Tag4:value4	Tag2:IT, Tag4:value4 (NOT Tag1 — no inheritance from Sub1)
storage1 (in RG1)	Tag3:value1	Tag4:value4	Tag3:value1, Tag4:value4 (NOT Tag1 or Tag2 — no inheritance)
VNET1 (excluded)	Tag3:value2	None — excluded	Tag3:value2 only

RG1 has Tag2:IT and Tag4:value4 — not Tag2:IT only, because the policy applies to RG1 itself (only VNET1 within it is excluded). Storage1 does not inherit Tag1 from Sub1 or Tag2 from RG1.
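
The worked example can be reproduced with a few lines of Python, which makes the two rules visible side by side: nothing is inherited, and the policy tag is added to every in-scope item that is not excluded. This is a model of the example only; the function names in_scope and effective_tags are hypothetical, and the short identifiers (Sub1/RG1/VNET1) follow the notation used in the example rather than real Azure resource IDs.

```python
def in_scope(resource_id, scope):
    """True when resource_id is the scope itself or sits beneath it."""
    resource_parts, scope_parts = resource_id.split("/"), scope.split("/")
    return resource_parts[: len(scope_parts)] == scope_parts


def effective_tags(resource_id, own_tags, policy_scope, exclusions, appended):
    """Tags on one item: its own tags plus the policy tag if the policy applies."""
    tags = dict(own_tags)  # parents contribute nothing: tags are not inherited
    excluded = any(in_scope(resource_id, e) for e in exclusions)
    if in_scope(resource_id, policy_scope) and not excluded:
        tags.update(appended)
    return tags


own = {
    "Sub1": {"Tag1": "subscription"},
    "Sub1/RG1": {"Tag2": "IT"},
    "Sub1/RG1/storage1": {"Tag3": "value1"},
    "Sub1/RG1/VNET1": {"Tag3": "value2"},
}
for resource_id, tags in own.items():
    print(resource_id, effective_tags(
        resource_id, tags, "Sub1", ["Sub1/RG1/VNET1"], {"Tag4": "value4"}))
```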

Azure Policy — Effects, Scope & Exclusions

Azure Policy evaluates resources against defined rules and applies an effect when a resource is found non-compliant or when a resource operation is intercepted. Understanding how each effect works — and how exclusions carve out exceptions — is essential for both designing policy-governed environments and interpreting compliance states.

Common Policy Effects

Deny — prevents the resource operation from completing if it violates the policy. The resource is not created or updated. For a policy that says "Not allowed resource types: Microsoft.Compute/virtualMachines", an attempt to create a VM in the scope is denied at deployment time.
Append — adds fields to the resource being created/updated (most commonly used to add tags). Does not block the operation; it only adds the specified property.
DeployIfNotExists — after a resource is created or updated, if a related resource or configuration is missing, Azure Policy triggers a remediation deployment to add it. Requires a managed identity on the policy assignment.
Audit — marks non-compliant resources in the compliance report but does not block anything or make any changes.
Modify — adds, removes, or modifies tags and properties on resources at creation/update time.

Exclusions — Carving Out Exceptions Within a Policy Scope

An exclusion removes a specific sub-scope from the policy's effect. When a policy is assigned to Subscription1 with an exclusion for ContosoRG1, the policy applies everywhere in Subscription1 except ContosoRG1. Resources in ContosoRG1 are not evaluated by that policy at all. Exclusions must be at or below the assignment scope — you cannot exclude something that is outside the scope the policy is assigned to.
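
Scope and exclusion checks are both prefix tests on the resource ID. The sketch below illustrates that idea on ARM-style IDs; it is not how the Azure Policy engine is implemented, the function names are hypothetical, and "sub1" stands in for a real subscription GUID.

```python
def _segments(scope):
    """Split an ARM-style ID into lower-cased path segments."""
    return [part.lower() for part in scope.strip("/").split("/")]


def is_at_or_below(scope, parent):
    """True when scope equals parent or is nested beneath it."""
    child, ancestor = _segments(scope), _segments(parent)
    return child[: len(ancestor)] == ancestor


def policy_applies(resource_id, assignment_scope, exclusions=()):
    """Decide whether an assignment with exclusions covers resource_id."""
    for excluded in exclusions:
        # An exclusion outside the assignment scope is not valid.
        if not is_at_or_below(excluded, assignment_scope):
            raise ValueError(f"exclusion outside assignment scope: {excluded}")
    if not is_at_or_below(resource_id, assignment_scope):
        return False
    return not any(is_at_or_below(resource_id, e) for e in exclusions)


SUB = "/subscriptions/sub1"
EXCLUDED_RG = SUB + "/resourceGroups/ContosoRG1"
vm_in_excluded_rg = EXCLUDED_RG + "/providers/Microsoft.Compute/virtualMachines/vm1"
vm_elsewhere = SUB + "/resourceGroups/ContosoRG2/providers/Microsoft.Compute/virtualMachines/vm2"

print(policy_applies(vm_in_excluded_rg, SUB, [EXCLUDED_RG]))
print(policy_applies(vm_elsewhere, SUB, [EXCLUDED_RG]))
```

Comparing whole path segments rather than raw string prefixes avoids treating ContosoRG10 as if it were inside ContosoRG1.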

What Doesn't Work for Automatic NSG Rule Enforcement

Assigning a built-in policy — no built-in Azure Policy exists that automatically injects a specific port-blocking security rule into every new NSG. A custom policy definition is required.
Resource locks — locks prevent deletion or modification of resources after creation; they cannot inject security rules at NSG creation time.
Unregistering Microsoft.ClassicNetwork — this affects legacy Classic-mode networking only; it has no effect on ARM-based NSGs.

Azure Policy — Valid Assignment Scopes & Exclusion Scopes

Azure Policy definitions can be assigned at several levels in the Azure resource hierarchy. The scope determines where the policy is enforced. Separately, when a policy is assigned, you can specify exclusions — sub-scopes that are carved out and exempted from the policy's effect.

Valid Assignment Scopes

A policy definition can be assigned at any of these levels in order from broadest to narrowest:

Tenant Root Group
Management Groups
Subscriptions
Resource Groups

Policies cannot be assigned at the individual resource level (e.g., a specific VM or storage account). The assignment scope determines the broadest boundary of effect, and child scopes inherit the policy.

Valid Exclusion Scopes

Exclusions can be specified at any scope that is at or below the assignment scope. If a policy is assigned to Subscription1, valid exclusions include: Subscription1 itself (though that would exempt everything), any resource group within Subscription1, or any individual resource within those resource groups — all the way down to individual VMs and storage accounts.

⚠ Individual Resources Can Be Excluded but Not Assigned

This is the asymmetry that is frequently tested: policies can be assigned only down to resource group scope, but they can be excluded all the way down to individual resource scope. A policy on Subscription1 can have an exclusion for a single VM (Sub1/RG1/VM1), but that same policy cannot be "assigned" directly to that VM.

RBAC — Custom Role Definitions

Custom RBAC roles let you define a precise set of permissions when no built-in role satisfies the principle of least privilege. A custom role definition contains five key sections: actions (management-plane operations allowed), notActions (management-plane operations excluded), dataActions (data-plane operations allowed), notDataActions (data-plane operations excluded), and assignableScopes (where the role can be assigned).

Key Sections in a Custom Role Definition

actions — lists the management-plane operations the role permits. For example, "Microsoft.Network/virtualNetworks/*" permits all operations on VNets; "Microsoft.Network/virtualNetworks/read" permits only reading VNet configuration.
notActions — explicitly excludes specific operations from the actions list. For example, including "Microsoft.Authorization/*" in notActions prevents the role from managing access permissions, even if it is in a wildcard action.
dataActions — governs data-plane access (e.g., reading blob data, signing in to VMs). Virtual machine sign-in requires the "Microsoft.Compute/virtualMachines/login/action" data action — it is a dataAction, not an action.
assignableScopes — defines where the role can be assigned. To restrict a role to resource groups within a specific subscription: "/subscriptions/{subId}/resourceGroups/". To allow it everywhere: "/".

⚠ VM Sign-In Is a dataAction, Not an action

The ability to sign in to a virtual machine (interactive login) is a data-plane operation and must be placed in the dataActions section of the role definition, not the actions section. Similarly, reading blob data is a data action. If a custom role allows all VM management actions but users cannot sign in to the VM, the fix is to add the login data action to the dataActions section.
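
A role definition with the login operation in the right section might look like the sketch below. The role name, description, and all-zero subscription ID are placeholders, and the dictionary uses a simplified flat layout with the section names from this guide rather than the exact JSON shape any particular tool expects. The helper can_sign_in is hypothetical and only models the lookup.

```python
import json
from fnmatch import fnmatchcase

LOGIN = "Microsoft.Compute/virtualMachines/login/action"

# Hypothetical custom role: manage VMs and sign in to them as a regular user.
vm_operator_role = {
    "name": "VM Operator With Sign-In (example)",
    "description": "Manage virtual machines and sign in to them.",
    "actions": [
        "Microsoft.Compute/virtualMachines/*",
        "Microsoft.Network/virtualNetworks/read",
    ],
    "notActions": [],
    "dataActions": [LOGIN],  # sign-in is data plane, so it belongs here
    "notDataActions": [],
    "assignableScopes": ["/subscriptions/00000000-0000-0000-0000-000000000000"],
}


def _matches(operation, patterns):
    """Case-insensitive match of an operation against wildcard patterns."""
    return any(fnmatchcase(operation.lower(), p.lower()) for p in patterns)


def can_sign_in(role):
    """True when the login operation is granted by dataActions and not excluded."""
    granted = _matches(LOGIN, role.get("dataActions", []))
    excluded = _matches(LOGIN, role.get("notDataActions", []))
    return granted and not excluded


role_json = json.dumps(vm_operator_role, indent=2)
print(can_sign_in(vm_operator_role))
```

Note that the wildcard in actions does not help here: listing the login operation under actions instead of dataActions leaves can_sign_in false, which mirrors the symptom described in the callout.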

RBAC — Management Groups, Inheritance & Scope

Azure RBAC role assignments are inherited downward through the management group and subscription hierarchy. A role assigned at a management group scope applies to all subscriptions within that management group, all resource groups within those subscriptions, and all resources within those resource groups. More specific (narrower scope) assignments can add permissions but cannot revoke inherited ones, because a role assignment can only grant access. The only way to block an action that a role grants is a deny assignment, and deny assignments are created by Azure features such as deployment stacks and managed applications rather than being assigned directly by administrators.

RBAC Inheritance — Additive, Never Subtractive

A user with Reader on management group MG1 inherits Reader on all subscriptions, RGs, and resources within MG1.
If that same user is also assigned Contributor on Sub1 (within MG1), the user has Contributor rights on Sub1 and Reader on everything else in MG1.
You cannot use RBAC to remove or override an inherited role — assigning a more restrictive role at a lower scope only adds that role; it does not remove the higher-scoped one.
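
Additive inheritance amounts to taking the union of role assignments along the path from a scope up to the top of the hierarchy. The sketch below models that with a hypothetical hierarchy (MG1, Sub1, Sub2, RG1, RG2, VM1) and a hypothetical user; deny assignments are left out of the model.

```python
# Hypothetical hierarchy: each scope maps to its parent scope.
PARENT = {
    "Sub1": "MG1",
    "Sub2": "MG1",
    "RG1": "Sub1",
    "RG2": "Sub2",
    "VM1": "RG1",
}


def effective_roles(user, scope, assignments):
    """Union of the roles assigned to user at scope and at every ancestor."""
    roles = set()
    current = scope
    while current is not None:
        roles |= assignments.get((user, current), set())
        current = PARENT.get(current)  # None once the top is reached
    return roles


assignments = {
    ("alice", "MG1"): {"Reader"},
    ("alice", "Sub1"): {"Contributor"},
}

print(effective_roles("alice", "VM1", assignments))  # under Sub1
print(effective_roles("alice", "RG2", assignments))  # under Sub2
```

Because the function only ever adds to the set, there is no assignment that could make the result smaller at a lower scope, which is the point of the third bullet.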

Co-Administrator — Subscription Scope Only

The Co-Administrator role is a legacy role from the classic Azure management model. It can only be assigned at the subscription scope — not at management groups, resource groups, or individual resources. When asked "to what can you add a user as co-administrator?" the answer is the subscription, not a management group, resource group, or VM.

RBAC — Resource Policy Contributor Role

The Resource Policy Contributor role grants the ability to create, edit, and delete Azure Policy definitions and initiatives, as well as assign policies and create exemptions. It is the least-privilege role for users who need to manage the Azure Policy governance framework without having broad Contributor or Owner access to actual resources.

Who Needs Resource Policy Contributor

Users who need to create initiative definitions (groups of policy definitions) require Resource Policy Contributor at the scope where the initiatives will live — typically the subscription.
Users who need to assign initiatives to a specific resource group require Resource Policy Contributor scoped to that resource group.
Resource Policy Contributor does not grant the ability to modify the actual Azure resources that policies govern — it only governs the policy objects themselves.

⚠ Contributor ≠ Resource Policy Contributor

The plain Contributor role grants broad resource management permissions but does not include the ability to create or assign Azure Policy definitions and initiatives. Resource Policy Contributor is a distinct, narrower role specifically for policy governance tasks. Assigning Contributor to a user who needs to create policy initiatives would give them far more resource-level permissions than necessary.

Data Collection Rules (DCR) — Sources, Destinations & Query Types

Data Collection Rules define what monitoring data to collect, from which sources, and where to send it. They are a core component of the Azure Monitor data pipeline and work in conjunction with the Azure Monitor Agent installed on virtual machines. Understanding what can be a data source, what can be a destination, and which query language applies to each filter type is important for monitoring configuration scenarios.

DCR Data Sources and Destinations

Resource / Role	Valid?	Notes
Virtual Machines (with Azure Monitor Agent) as data source	Yes	VMs with the AMA installed are the primary data source type
Storage accounts as data source	No	Storage accounts are destinations, not sources in DCRs
SQL databases as data source	No	SQL databases are not valid DCR data sources or destinations
Log Analytics workspace as destination	Yes	The primary destination for log and performance data
Azure storage account as destination	Yes	For archival of collected data

XPath for Windows Event Logs, KQL for Querying After Ingestion

When configuring a DCR to filter Windows Event Log data before it is collected from VMs, the filter expressions are written in XPath 1.0 syntax. This is a collection-time filter that reduces the volume of data ingested. Once data has been ingested into a Log Analytics workspace, it is queried using KQL (Kusto Query Language). These two query languages serve different purposes at different stages of the data pipeline.
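
The two languages can be seen side by side in a small example. The dictionary below sketches the properties section of a Data Collection Rule that collects only Critical and Error events from the Application log, built in Python so it can be serialized to JSON. The workspace resource ID, the data source name, and the destination name are placeholders, and the KQL string is one possible query against the Event table after ingestion.

```python
import json

# Placeholder workspace resource ID; substitute a real one.
WORKSPACE_ID = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/rg-monitor/providers/Microsoft.OperationalInsights"
    "/workspaces/law-example"
)

dcr_properties = {
    "dataSources": {
        "windowsEventLogs": [
            {
                "name": "applicationErrors",
                "streams": ["Microsoft-Event"],
                # XPath 1.0 filter applied at collection time on the VM:
                # Application log, Level 1 (Critical) or Level 2 (Error).
                "xPathQueries": ["Application!*[System[(Level=1 or Level=2)]]"],
            }
        ]
    },
    "destinations": {
        "logAnalytics": [
            {"name": "centralWorkspace", "workspaceResourceId": WORKSPACE_ID}
        ]
    },
    # Route the collected stream to the named destination.
    "dataFlows": [
        {"streams": ["Microsoft-Event"], "destinations": ["centralWorkspace"]}
    ],
}

# KQL runs later, against data already stored in the workspace.
kql_query = 'Event | where EventLog == "Application" | summarize count() by EventLevelName'

dcr_json = json.dumps(dcr_properties, indent=2)
print(dcr_json)
```

The XPath expression decides what leaves the VM; the KQL query only shapes what is read back from the workspace.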

Collecting IIS Logs — First Step When AMA Is Already Installed

If Azure Monitor Agent is already installed on VMs and the goal is to start collecting IIS logs and sending them to a Log Analytics workspace, the first configuration step is to create a Data Collection Rule that specifies IIS logs as the data source and the Log Analytics workspace as the destination. A private endpoint and AMPLS are needed only when VMs must communicate with Azure Monitor privately. VM Insights provides performance and dependency data — not IIS application logs.

Azure Monitor Private Link Scope (AMPLS)

By default, Azure Monitor services communicate with monitoring agents over the public internet. When a requirement mandates that all VMs communicate with Azure Monitor exclusively through a VNet — using private IP addresses and keeping traffic on the Azure backbone — an Azure Monitor Private Link Scope (AMPLS) is the correct resource to create first.

AMPLS Setup Sequence

Create an Azure Monitor Private Link Scope resource.
Add the Log Analytics workspace (and any Application Insights components) to the AMPLS.
Create a private endpoint in the VNet and link it to the AMPLS.

Once configured, the VMs' Azure Monitor Agents resolve Azure Monitor endpoints to private IPs via private DNS and all monitoring traffic stays within the VNet. The AMPLS is the umbrella resource that coordinates the private connectivity across multiple Azure Monitor components.

⚠ AMPLS First, Private Endpoint Second

The AMPLS must be created and populated with Azure Monitor resources before the private endpoint is created. Creating a private endpoint directly for a Log Analytics workspace (without an AMPLS) only provides private connectivity to that one workspace, and may not cover all Azure Monitor endpoints the agents need to reach. AMPLS ensures comprehensive private coverage for the entire Azure Monitor stack.

ITSM Connector — Integrating Azure Monitor with IT Service Management

The IT Service Management Connector (ITSMC) is an Azure Monitor feature that integrates Azure alerting with external IT Service Management platforms such as ServiceNow, System Center Service Manager, Provance, and Cherwell. When an Azure Monitor alert fires, ITSMC can automatically create a work item (incident, problem, or change request) in the connected ITSM tool.

Setting Up ITSMC

The ITSM Connector must be deployed before it can be used as an action in an action group. Once deployed and connected to the ITSM platform, an action group can be configured with an ITSM action that specifies which type of work item to create and in which ITSM workspace. Automation runbooks, Function Apps, Logic Apps, and webhooks serve different integration purposes; only ITSMC provides native, bidirectional integration with supported ITSM tools.

Alert Rules vs Action Groups — Minimum Count

Designing an Azure Monitor alerting solution requires understanding two independent dimensions: how many alert rules are needed, and how many action groups are needed. These are determined by different factors and are independent of each other.

Alert Rules — One Per Signal

Each alert rule monitors exactly one signal (one metric, one activity log operation, or one log query). You cannot combine multiple signals into a single alert rule. The minimum number of alert rules therefore equals the number of distinct signals being monitored.

Action Groups — One Per Distinct Recipient Set

An action group defines a set of notification actions (email addresses, SMS numbers, webhooks, etc.). The same action group can be reused across multiple alert rules. To minimise the number of action groups, identify which alert rules notify exactly the same set of people — those can share a single action group. Each unique recipient combination requires its own action group.

Worked Example — Four Signals, Three Recipient Sets

Signal	Notify	Alert Rule	Action Group
Storage Ingress	User1 + User3	Rule1	AG-A (User1+User3)
Storage Egress	User1 only	Rule2	AG-B (User1 only)
Delete storage account	User1 + User2 + User3	Rule3	AG-C (User1+User2+User3)
Restore blob ranges	User1 + User3	Rule4	AG-A (reused)

Result: 4 alert rules (one per signal), 3 action groups (three distinct recipient sets — Ingress and Restore blob ranges share AG-A).
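
The counting in the worked example can be reproduced with a small sketch. It only models the two rules stated above (one alert rule per signal, one action group per distinct recipient set) and does not call any Azure API; the variable names are illustrative.

```python
# Signal -> set of users to notify, taken from the worked example table.
requirements = {
    "Storage Ingress": {"User1", "User3"},
    "Storage Egress": {"User1"},
    "Delete storage account": {"User1", "User2", "User3"},
    "Restore blob ranges": {"User1", "User3"},
}

# One alert rule per signal.
alert_rule_count = len(requirements)

# One action group per distinct recipient set; frozenset makes sets hashable
# so identical recipient lists collapse into a single entry.
distinct_recipient_sets = {frozenset(users) for users in requirements.values()}
action_group_count = len(distinct_recipient_sets)

print(alert_rule_count, action_group_count)  # prints: 4 3
```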

Alert Processing Rules — Suppression Without Hiding Alerts

Alert processing rules (formerly called Action Rules) operate on top of alert rules to control notification behaviour during defined windows. The most common use is to suppress notifications during planned maintenance, preventing action groups from firing while the underlying alert rule continues to evaluate and fire alerts normally.

Suppression — Portal vs Notification

When an alert processing rule with type "Suppress notifications" is active during a time window:

The alert rule still fires when the condition is met — the alert appears in the Azure portal.
The action group notifications (emails, SMS, webhooks) are suppressed — they are not sent.
After the suppression window ends, notifications resume normally for any alerts that fire from then on.

Suppression never prevents the alert from being logged or visible — it only silences the notifications during the window.
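
A minimal model of this behaviour, assuming a single fixed suppression window: the alert is always recorded, and only the notification decision depends on the window. The class and function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SuppressionWindow:
    start: datetime
    end: datetime

    def covers(self, moment: datetime) -> bool:
        # The window is treated as half-open: [start, end).
        return self.start <= moment < self.end


def process_alert(fired_at: datetime, window: SuppressionWindow) -> dict:
    """Return what happens to an alert fired at the given time."""
    return {
        # The alert rule still fires, so the alert is always visible.
        "visible_in_portal": True,
        # Action group notifications are skipped only inside the window.
        "notifications_sent": not window.covers(fired_at),
    }


maintenance = SuppressionWindow(datetime(2024, 1, 1, 22, 0), datetime(2024, 1, 2, 2, 0))
during = process_alert(datetime(2024, 1, 1, 23, 30), maintenance)
after = process_alert(datetime(2024, 1, 2, 3, 0), maintenance)
```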

Azure Monitor Alert Lifecycle — User Response States

Every fired Azure Monitor alert has a User response state that tracks whether a human has acknowledged and acted on the alert. These states are manually updated by operators and follow a lifecycle where all transitions between states are valid in any direction — there is no terminal state.

Three User Response States

State	Meaning	Can Transition To
New	Alert has fired; no operator has acted on it yet	Acknowledged or Closed
Acknowledged	An operator has seen the alert and is working on it	New or Closed
Closed	The alert has been resolved or dismissed	New or Acknowledged

⚠ All State Transitions Are Valid — No Terminal State

There is no terminal state. A Closed alert can be reopened to New or moved to Acknowledged. A New alert can go directly to Closed without passing through Acknowledged. The three states form a triangle where every direction of transition is permitted. This is tested by presenting an alert in one state and asking which states it can be changed to — the answer is always both of the other two states.
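
The triangle of transitions can be written as a one-line rule: from any state, the valid targets are the other two. The sketch below is only a model of that rule; the function name is illustrative.

```python
USER_RESPONSE_STATES = ("New", "Acknowledged", "Closed")


def allowed_transitions(current: str) -> list:
    """Return the states an alert in the given state can be changed to."""
    if current not in USER_RESPONSE_STATES:
        raise ValueError(f"unknown user response state: {current}")
    # Every state can move to both of the others; none is terminal.
    return [state for state in USER_RESPONSE_STATES if state != current]


print(allowed_transitions("Closed"))  # prints: ['New', 'Acknowledged']
```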

Action Group Notification Throttling

Azure Monitor action groups throttle certain notification channels to prevent flooding when an alert fires repeatedly. Rate limiting applies to SMS, voice, push, and email notifications. Webhooks and programmatic integrations are not subject to throttling. The full limits are documented in Azure Monitor service limits.

Notification Rate Limits by Channel

Channel	Rate Limit	Max per Hour if Alert Fires Every Minute
Email	No more than 100 emails per hour	60 emails per hour (under the 100/hr limit — all delivered)
SMS	1 per 5 minutes per phone number	12 SMS per hour
Voice call	1 per 5 minutes per phone number	12 calls per hour
Webhook / Logic App / Function App	No throttle	Once per every alert firing

⚠ Email and SMS Are Throttled Differently

Email has a higher rate limit than SMS and voice: up to 100 emails per hour are allowed, so an alert that fires once per minute (60 firings per hour) produces 60 emails, all delivered because 60 < 100. SMS and voice are throttled more aggressively at 1 per 5 minutes per phone number, so the same 60 firings produce only 12 SMS messages in that hour. Webhooks and programmatic integrations are not subject to this throttle and receive every alert firing.
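
The arithmetic behind the table can be checked with a short sketch. It is a simplified model that assumes evenly spaced firings and uses only the limits quoted in this section (100 emails per hour; 1 SMS or voice call per 5 minutes, which is 12 per hour); the dictionary and function names are illustrative.

```python
# Hourly caps derived from the limits stated in this section.
# None means the channel is not throttled.
LIMITS_PER_HOUR = {
    "email": 100,
    "sms": 60 // 5,    # 1 per 5 minutes -> 12 per hour
    "voice": 60 // 5,  # 1 per 5 minutes -> 12 per hour
    "webhook": None,
}


def delivered_per_hour(firings_per_hour: int, channel: str) -> int:
    """Return how many notifications get through in one hour."""
    limit = LIMITS_PER_HOUR[channel]
    if limit is None:
        return firings_per_hour
    return min(firings_per_hour, limit)


# An alert that fires once per minute fires 60 times per hour.
for channel in LIMITS_PER_HOUR:
    print(channel, delivered_per_hour(60, channel))
# prints: email 60, sms 12, voice 12, webhook 60 (one per line)
```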

Activity Log Alerts — Scope Inheritance & What Triggers Them

Azure Monitor activity log alerts fire based on operations recorded in the Azure Activity Log — administrative operations like creating, modifying, or deleting resources, and assigning tags. A key principle is scope inheritance: when an operation is performed on a resource, alerts scoped to that resource AND alerts scoped to its parent containers (resource group, subscription) will all fire simultaneously.

How Scope Inheritance Works for Activity Log Alerts

If Alert1 is scoped to RG1 and Alert2 is scoped to VM1 (which is inside RG1), and a user performs an administrative operation on VM1:

Alert2 fires — the operation was directly on VM1, which is in Alert2's scope.
Alert1 also fires — VM1 is inside RG1, and operations on VM1 are within the scope of RG1-scoped alerts.
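
Scope inheritance amounts to a prefix check on resource IDs: an alert covers a resource if the resource's ID equals the alert's scope or sits beneath it. The sketch below models only that check; the subscription ID, resource names, and function name are hypothetical.

```python
def in_scope(alert_scope: str, resource_id: str) -> bool:
    """Return True if the resource is the scope itself or a descendant of it."""
    scope = alert_scope.rstrip("/").lower()
    resource = resource_id.rstrip("/").lower()
    # Appending "/" avoids matching sibling names such as RG1 vs RG10.
    return resource == scope or resource.startswith(scope + "/")


RG1 = "/subscriptions/0000/resourceGroups/RG1"
VM1 = RG1 + "/providers/Microsoft.Compute/virtualMachines/VM1"
VM2 = RG1 + "/providers/Microsoft.Compute/virtualMachines/VM2"

alert_scopes = {"Alert1": RG1, "Alert2": VM1}


def alerts_fired_by(resource_id: str) -> list:
    return [name for name, scope in alert_scopes.items() if in_scope(scope, resource_id)]


print(alerts_fired_by(VM1))  # prints: ['Alert1', 'Alert2']
print(alerts_fired_by(VM2))  # prints: ['Alert1']
```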

Operations That Qualify as "Administrative Operations"

Creating, modifying, or deleting any Azure resource
Attaching or detaching a managed disk to/from a VM
Assigning or modifying resource tags (on a resource or a resource group)
Modifying NSG rules
Resizing a virtual machine
Starting or stopping a VM (in some contexts)

⚠ Attaching a Disk to a VM Triggers Alerts Scoped to That VM

Creating a new disk and attaching it to VM1 is an administrative operation on VM1. Even though a new disk resource is also created, the attachment of that disk writes to VM1's configuration — making this an operation within the scope of any alert that monitors VM1. An alert scoped to RG1 fires because VM1 is in RG1. An alert scoped to VM1 also fires because VM1 is directly modified by the attachment operation.

Diagnostic Settings — Log Destinations & Query Capability

Azure resources can send platform logs and metrics (activity logs, resource logs, and platform metrics) to one or more destinations via Diagnostic Settings. The choice of destination determines what analysis capabilities are available and whether the data can be queried interactively.

Diagnostic Settings Destinations

Destination	Query Capability	Best For
Log Analytics workspace	Yes — interactive KQL queries from the portal	Analysis, alerting, dashboards, correlation across resources
Azure Storage account	No interactive queries — must export or use tools	Long-term archival, compliance retention, batch analysis
Azure Event Hub	No — streaming only	Real-time streaming to SIEM, external analytics platforms

When Interactive Queries Are Required — Log Analytics Workspace

If the requirement specifies "run interactive queries from the Azure portal against the collected data" — for example, analysing IP addresses that connect to a load balancer — the destination must be a Log Analytics workspace. A storage account stores logs as blobs but does not support direct interactive query from the portal. An Event Hub streams data but is not a queryable store.

Azure Advisor — Identifying Underutilised Resources

Azure Advisor is a personalised, cloud-native recommendation service that analyses your Azure resource configurations and usage patterns, then surfaces actionable recommendations across five categories. It is the go-to tool when the objective is to proactively identify and right-size overprovisioned or underutilised resources — without requiring manual metric queries or scripted analysis across many VMs.

Five Recommendation Categories

Cost — identifies underutilised VMs (based on low CPU and memory usage over 30 days), idle resources, and recommends right-sizing or shutdown to reduce spending. This is the category relevant for "find underutilised VMs to reduce service tier."
Reliability — high availability recommendations including availability sets, geo-redundancy, and backup configuration.
Security — security posture improvements, integrating with Microsoft Defender for Cloud recommendations.
Operational Excellence — service health, support plans, and operational best practices.
Performance — throughput and latency improvements for specific resource types.

What Not to Use for Underutilisation Detection

Azure Monitor (Metrics blade) — shows raw performance charts for individual resources. Useful for investigation but does not proactively surface underutilisation recommendations across a fleet of 100 VMs.
Customer Insights — a Microsoft product for marketing and CRM analytics; it is not an Azure infrastructure service and has nothing to do with VM resource utilisation.

Azure Budget — Notification Only, Never Stops Resources

Azure Budgets are a cost management tool that monitors actual or forecasted spend against a defined threshold and sends notifications when that threshold is reached or projected to be reached. Budgets are notification instruments only — they never automatically stop, deallocate, resize, or otherwise modify any Azure resource when a threshold is breached.

Budgets Never Stop VMs or Modify Resources

When VM1 and VM2 reach the maximum cost defined in a budget, those virtual machines continue to run. Azure Budgets have absolutely no mechanism to stop, deallocate, or terminate resources. The only action a budget performs when a threshold is reached is to trigger the configured action group — which sends notifications (emails, SMS) to the recipients. To automate a VM shutdown based on cost, you would need an action group that calls an Azure Automation runbook or Logic App which then performs the deallocation — the budget alone cannot do it.

How Budget Alerts Work

A budget can have multiple threshold alerts, each set to a percentage of the budget (e.g., 50%, 80%, 100%, 120%).
Each threshold can be set to trigger on Actual spend (current charges) or Forecasted spend (projected charges).
Each threshold triggers an action group. If two thresholds are breached in a billing period, two separate action group notifications are sent.
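
The threshold logic can be sketched as follows. The function only reports which thresholds have been reached, mirroring the point that a budget notifies and does nothing else; the budget amount, spend figure, and function name are made up for the example.

```python
def breached_thresholds(budget_amount: float, spend: float, thresholds_percent) -> list:
    """Return the percentage thresholds that the given spend has reached."""
    return [
        percent
        for percent in sorted(thresholds_percent)
        if spend >= budget_amount * percent / 100
    ]


# Budget of 1000 with the four thresholds used as an example above.
reached = breached_thresholds(1000, 850, [50, 80, 100, 120])
print(reached)  # prints: [50, 80]

# Each reached threshold produces one notification; no resource is touched.
notifications = [f"Budget alert: {percent}% threshold reached" for percent in reached]
```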

ARM Deployment Modes — Incremental vs Complete

When deploying an ARM template to a resource group, Azure applies one of two deployment modes. The mode controls what happens to resources in the target resource group that are not defined in the template being deployed.

Incremental Mode (Default)

Azure adds or updates resources defined in the template. Resources already in the resource group that are not in the template are left untouched. Incremental mode is the safe default — redeploying a template in Incremental mode will never accidentally delete resources that were created by other means or at other times. It is additive and non-destructive.

Complete Mode

Azure adds or updates resources defined in the template, and deletes any resources in the resource group that are not in the template. Complete mode makes the resource group contain exactly and only what the template defines. It is used when a clean-slate deployment is required — for example, removing legacy resources or ensuring environment parity between environments. The requirement phrase "remove all existing resources before deploying" signals Complete mode.

⚠ Complete Mode Is Destructive — Resources Not in the Template Are Deleted

Complete mode will permanently delete any resource in the target resource group not listed in the template, including resources created manually, by other pipelines, or by prior deployments. The deletion happens automatically as part of the deployment and cannot be undone; only a resource protected by a Delete lock is spared, because the lock blocks the delete operation. Use Complete mode only when you intend the resource group to contain exactly what the template specifies and nothing else.
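
The difference between the two modes can be modelled with sets of resource names. This is a simplified sketch that ignores locks, property updates, and resource types that Complete mode treats specially; the resource names and function name are hypothetical.

```python
def resources_after_deployment(existing, template, mode="Incremental") -> set:
    """Return the resource group's contents after deploying the template."""
    existing, template = set(existing), set(template)
    if mode == "Incremental":
        # Template resources are added or updated; everything else is kept.
        return existing | template
    if mode == "Complete":
        # Anything not in the template is deleted.
        return template
    raise ValueError(f"unknown deployment mode: {mode}")


current = {"vm1", "storage1"}
template = {"vm1", "vnet1"}

print(sorted(resources_after_deployment(current, template, "Incremental")))
# prints: ['storage1', 'vm1', 'vnet1']
print(sorted(resources_after_deployment(current, template, "Complete")))
# prints: ['vm1', 'vnet1']
```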

ARM Template Deployment at Subscription Scope

When an ARM template creates a resource group and then deploys resources to it in the same template, the correct PowerShell cmdlet is New-AzDeployment (subscription-scope deployment), not New-AzResourceGroupDeployment (which requires the resource group to already exist). A subscription-scope deployment can create the resource group as part of the same template operation.

ARM Deployment History — Finding the Template Used

When resources are deployed via an ARM template, Azure records the deployment in the resource group's Deployments history. This history stores the exact template and parameter values used for every deployment to that resource group, making it possible to review, re-run, or export previous deployments.

Where to View an ARM Deployment Template

To view the template used for a specific deployment: navigate to the resource group that received the deployment, open the Deployments blade, find the deployment by name or timestamp, and click it to view its template and input parameters. The template is stored at the resource group level — it is not accessible from individual resource blades (VM1, storage2), from the subscription blade, or from storage account blob containers (unless templates were explicitly exported there).

⚠ Multi-Resource Deployments — One Entry for All Resources

When a single ARM template deploys multiple resources together (for example, a VM and a storage account), the deployment is recorded as a single entry in the resource group's Deployments blade — not one entry per resource. Navigate to the resource group (not to VM1 or storage2) and find the single deployment record to view the complete multi-resource template.

Azure Deployment Stacks — DenySettings and RBAC Override

An Azure Deployment Stack groups Azure resources managed together through a single ARM/Bicep template deployment. Stacks can be configured with DenySettings that create deny assignments on stack-managed resources. Crucially, these deny assignments take precedence over RBAC role assignments — even a user with the Owner role is blocked by a DenySettings restriction.

DenySettingsMode — Three Options

Mode	Blocks	Allows
None	Nothing — no deny assignments created	All operations governed by normal RBAC
DenyDelete	DELETE operations on stack-managed resources	Read and write (modify) operations still governed by RBAC
DenyWriteAndDelete	All write AND delete operations	Read operations only

DenySettings Override RBAC — Even Owner Is Blocked

Deny assignments created by a Deployment Stack take precedence over all RBAC role assignments. A user with the Owner role on a resource cannot delete that resource if the stack has DenySettingsMode set to DenyDelete. The deny assignment from the stack is more authoritative than any RBAC allow. To remove the restriction, the Deployment Stack itself must be deleted or modified by someone with stack management permissions.

DenyDelete — Write Operations Are Still Permitted

DenyDelete mode only blocks delete operations. Read and write (configuration modification) operations are still fully governed by RBAC. A user with Network Contributor or Contributor can still modify a VNet's IP address space, change NSG rules, or update other configuration properties — they simply cannot delete the resource. DenyDelete is appropriate when you want to prevent accidental removal while still allowing ongoing management of the resource's configuration.
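
The interaction between DenySettings and RBAC can be summarised as "deny first, then RBAC". The sketch below is a simplified model of that evaluation order using the three mode names from the table; it treats every operation as read, write, or delete, and the function name is illustrative.

```python
# Operations blocked by each DenySettingsMode, as listed in the table above.
BLOCKED_BY_MODE = {
    "None": set(),
    "DenyDelete": {"delete"},
    "DenyWriteAndDelete": {"write", "delete"},
}


def is_permitted(operation: str, rbac_allows: bool, deny_mode: str = "None") -> bool:
    """Return True if the operation succeeds on a stack-managed resource."""
    # The deny assignment is checked first and wins over any role, even Owner.
    if operation in BLOCKED_BY_MODE[deny_mode]:
        return False
    # Otherwise the outcome is whatever RBAC decides.
    return rbac_allows


# An Owner (RBAC allows everything) under DenyDelete:
print(is_permitted("delete", True, "DenyDelete"))  # prints: False
print(is_permitted("write", True, "DenyDelete"))   # prints: True
```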
