
How to Deal With Capacity Issues in Microsoft Azure

Have you ever run into capacity issues in a specific Azure region? The problem can occur when creating, starting, or resizing a virtual machine (VM).

The last thing you want is for VMs to fail to start, leaving a customer unable to work.

This article breaks down how to deal with this in Microsoft Azure. 

Author: Niels Kroeze, IT Business Copywriter

Reading time 6 minutes Published: 16 September 2025

The issue: overcrowded “hero” regions 

Most experienced Azure users have run into the dreaded deployment failure: no capacity available in the chosen region, showing this error:

Azure VM error message failed to start virtual machine

Error code: AllocationFailed or ZonalAllocationFailed
Error message: We do not have sufficient capacity for the requested VM size in this region. Read more about improving likelihood of allocation success at https://aka.ms/allocation-guidance

Additionally, new features such as Azure OpenAI are often unavailable in “crowded” regions. 

When what seemed like an infinite pool of resources suddenly refuses new workloads for lack of capacity, it’s a reality check: the cloud is bound by physical limits. It runs in physical data centres with power, cooling, racks, and servers, operated by everyday people.

At some point, the building is full, or at least the servers are all in use. Microsoft itself has the following to say about this: 

“We are continually investing in additional infrastructure and features to make sure that we always have all VM types available to support customer demand. However, you might occasionally experience resource allocation failures because of unprecedented growth in demand for Azure services in specific regions.”

Microsoft statement

The popular regions, better known as “hero regions”, are affected most often, especially West Europe (the largest hero region), one of Microsoft’s busiest regions. Capacity issues have been ongoing there for several years.

However, Azure West Europe isn’t the only affected region; other regions occasionally affected include: 

  • (Europe) UK South 
  • (North America) Canada Central 
  • (North America) East US 

Microsoft’s solution? Try another, less-impacted region/zone; that’s what the “helpful” support agent says… 

Using another region can be a workaround for your Azure environment, but it can become a serious problem because:

  • Not all cloud architectures are flexible enough to quickly onboard new regions.
  • You need certain features, and not all regions offer the same features (yet).
  • Your current cloud budget doesn’t allow you to move; some regions are more expensive than others.
  • Azure governance and policies may require deployment in the “affected” region.

For example, at Intercept, we work mostly with Dutch clients (Azure West Europe region) who can’t always afford to relocate to another region due to compliance requirements.

Many countries and industries require data to remain within defined geographic borders, and breaking these rules can result in fines or limits on your services. If a customer needs to deploy in a specific region and is being told to redirect to other continents, such as Asia or the US, halfway around the planet, their likely response would be: “Seriously??”  

The point is that this isn’t always going to work, which is why we need some workarounds.

 

Solutions 

Until the capacity issues are over for good, we need to minimise the risk of failed allocations.

1. No more autoscaling 

Customers use autoscaling to become more cost-efficient in the cloud by dynamically allocating or deallocating compute instances based on demand. But when capacity is uncertain, autoscaling can be a risk.

Think about it: if you stop (deallocate) a VM, you give up its allocation, and there is no guarantee you will get it back when you start the VM again. While intended to save money during non-peak times, deallocating may harm future operations or revenue.

In shortage periods, it’s safer to hold onto your existing capacity and consider disabling autoscaling. Keep the resources running rather than gambling on availability.
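One way to hold onto capacity without throwing away the autoscale configuration is to raise its minimum instance count to what is running today, so scale-in can no longer release those instances. The sketch below is plain Python and calls nothing in Azure; it assumes the profile is a dictionary with a `capacity` block holding `minimum`, `maximum` and `default` (loosely modelled on an autoscale profile), and the helper name `pin_capacity_floor` is hypothetical.

```python
from copy import deepcopy


def pin_capacity_floor(profile: dict, running_instances: int) -> dict:
    """Return a copy of the profile whose minimum equals the current instance count.

    Scale-out stays possible; scale-in below what is running today does not.
    """
    pinned = deepcopy(profile)
    capacity = pinned["capacity"]
    # Never lower an existing floor, and never exceed the configured maximum.
    floor = max(capacity["minimum"], min(running_instances, capacity["maximum"]))
    capacity["minimum"] = floor
    capacity["default"] = max(capacity["default"], floor)
    return pinned


# Hypothetical profile: 6 instances are running, the floor used to be 2.
profile = {"name": "business-hours", "capacity": {"minimum": 2, "maximum": 10, "default": 2}}
pinned = pin_capacity_floor(profile, running_instances=6)
print(pinned["capacity"])
```

The original profile is left untouched, so the old floor can be restored once the shortage has passed.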

 

2. On-Demand Capacity Reservations 

Microsoft’s on-demand capacity reservation feature lets you reserve compute capacity for a specific VM size in an Azure region or availability zone. In short, you pay for the reserved capacity from the moment you create the reservation, whether or not a VM is running in it, so the capacity is there when it’s time to deploy. This helps you avoid allocation failures and improves application availability and uptime.

If you expect a capacity spike, such as during a migration, or if you’re running mission-critical workloads, reserving capacity in advance is worth serious consideration.

Asking a client to pay for compute weeks before it’s needed can be a tough sell, though. But would you rather risk failed deployments, lost productivity, or even downtime?
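To make that trade-off concrete, it can help to put a number on the idle period. The sketch below assumes the reservation is billed at the VM’s hourly pay-as-you-go rate for as long as it exists; the rate and instance count are made-up figures, not Azure prices, and `idle_reservation_cost` is a hypothetical helper.

```python
HOURS_PER_WEEK = 7 * 24


def idle_reservation_cost(hourly_rate: float, instances: int, weeks_unused: float) -> float:
    """Cost of holding reserved capacity before any VM is deployed into it."""
    return hourly_rate * instances * weeks_unused * HOURS_PER_WEEK


# Hypothetical figures: 10 instances at 0.20 per hour, reserved 3 weeks early.
cost = idle_reservation_cost(hourly_rate=0.20, instances=10, weeks_unused=3)
print(f"Cost of reserving early: {cost:.2f}")
```

Setting that figure next to the estimated cost of a failed migration weekend usually makes the conversation with the client easier.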

 

3. Resize your VMs 

It’s tempting to deploy the latest SKUs, as they promise faster processors and other improvements. However, not all workloads actually need that level of performance. Newer SKUs typically run on the latest hardware, which is often where shortages occur.

Instead, pick the VM size that matches your actual workload requirements. Smaller or moderately sized SKUs, such as those in the Dv3 series, often run on a wider range of hardware, giving Azure more placement options and improving your chances of allocation success. The same applies to other resources, as almost every Azure service ultimately runs on VMs under the hood.

 

4. Avoid legacy VM sizes 

Legacy VM series (Av1, Dv1, DSv1, D15v2, DS15v2, etc.) can’t run on the newest Azure hardware. Customers still using these older SKUs may face allocation failures even when newer VMs are available.  

The solution is to migrate to equivalent newer-generation VMs, which are optimised for current hardware, offer better performance, and often come with improved pricing. 
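A quick way to find migration candidates is to scan an exported list of VM sizes for the legacy series named above. The sketch below assumes that v1 sizes carry no version suffix in their name (for example `Standard_D2` or `Standard_DS3`) and lists the two v2 exceptions explicitly; the pattern and the sample inventory are illustrative and should be checked against Microsoft’s own list of previous-generation sizes.

```python
import re

# Assumption: v1 sizes have no "_vN" suffix; the two v2 exceptions are listed by name.
_LEGACY_V1 = re.compile(r"^(Basic_A\d+|Standard_(A|D|DS)\d+)$")
_LEGACY_EXPLICIT = {"Standard_D15_v2", "Standard_DS15_v2"}


def is_legacy_size(size: str) -> bool:
    """True if the size name looks like one of the legacy series."""
    return size in _LEGACY_EXPLICIT or bool(_LEGACY_V1.match(size))


# Hypothetical inventory, e.g. exported from a subscription.
inventory = ["Standard_D2", "Standard_DS3", "Standard_D15_v2", "Standard_D4s_v5", "Standard_A2_v2"]
legacy = [size for size in inventory if is_legacy_size(size)]
print(legacy)
```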

 

5. Consider multiple regions 

Single-region deployments often limit scalability and resiliency. And if that one region happens to be a hero region, you’re also more likely to hit capacity shortages.

If your regulations, data residency requirements, and governance policies permit it, consider multi-region deployments to enhance scalability, compliance, and resilience.  

Many Azure regions have a paired region, which can make a multi-region design easier to plan.

Azure region pairs

  • A paired region is, with few exceptions, located within the same geography.
  • These regions are usually at least 300 miles apart.

“It’s important to spread workloads across multiple regions. If a region or data centre goes down, say in West Europe, you can automatically fail over, so customers don’t experience downtime; maybe a little bit, but it wouldn't be hugely noticeable.”

Simon Lee - Azure Expert & Consultant

Keep in mind, though, that when West Europe has a problem, capacity in its paired region starts to fill up too, so it’s first come, first served!
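The constraint that matters here is that a fallback region has to be both a sensible partner and one your residency rules allow. A minimal sketch of that selection, with the pairs for the regions mentioned in this article hard-coded as an assumption (verify them against Microsoft’s current list) and a hypothetical `candidate_regions` helper:

```python
# Pairs for the regions named in this article (assumed current; verify before use).
REGION_PAIRS = {
    "westeurope": "northeurope",
    "uksouth": "ukwest",
    "canadacentral": "canadaeast",
    "eastus": "westus",
}


def candidate_regions(primary: str, allowed: set[str]) -> list[str]:
    """Primary first, then its pair, keeping only regions that policy allows."""
    ordered = [primary]
    pair = REGION_PAIRS.get(primary)
    if pair:
        ordered.append(pair)
    return [region for region in ordered if region in allowed]


# Hypothetical policy: data must stay in the EU.
eu_only = {"westeurope", "northeurope", "germanywestcentral"}
print(candidate_regions("westeurope", eu_only))
```

If policy allows only `westeurope`, the list shrinks to that single region, which makes it explicit that a regional fallback is off the table and the other options in this article have to carry the load.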

 

6. Availability Zones 

Even within a single region, capacity can run out. Deploying across multiple Availability Zones spreads your workloads across different physical locations within the same region. 

This reduces the risk of hitting shortages in one zone and improves resilience against failures. It’s a simple way to improve your chances of getting capacity without relocating to a different region, as long as your architecture and governance permit it.
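As a small illustration of spreading a workload, the sketch below divides a number of instances as evenly as possible over the zones you can use. The zone names and the `spread_across_zones` helper are hypothetical; it only produces a plan and deploys nothing.

```python
def spread_across_zones(instance_count: int, zones: list[str]) -> dict[str, int]:
    """Distribute instances as evenly as possible; earlier zones absorb the remainder."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(instance_count, len(zones))
    return {zone: base + (1 if index < extra else 0) for index, zone in enumerate(zones)}


plan = spread_across_zones(7, ["1", "2", "3"])
print(plan)

# If zone 2 turns out to have no capacity, re-plan over the remaining zones.
replan = spread_across_zones(7, ["1", "3"])
print(replan)
```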

 

7. Use the Allocation success recommender tool 

If you want to see a prediction of the probability of a successful allocation in the next 7 days, you can use Microsoft’s “Allocation success recommender” self-troubleshooting tool.

Allocation success recommender tool Azure

The tool, available in the Azure Portal, allows you to check which VM sizes can be successfully deployed and how many instances are available at the time of checking for a specific Azure region, such as West Europe. 

 

Additional considerations 

As with many things, failures can happen. We recommend that you:

  • Check the Azure status page regularly.
  • Retry the operation, even immediately after the failure; this may help, as capacity often becomes available within hours.
  • Try a different VM size, region or availability zone (AZ).
  • Reserve on-demand capacity for mission-critical workloads when budget allows.
  • Use the Allocation success recommender tool.
  • Set up disaster recovery.
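The “retry, then try a different size or zone” advice can be sketched as a small fallback loop. Everything here is a stand-in: `AllocationFailed` is a local exception class representing the AllocationFailed and ZonalAllocationFailed errors, `deploy` is whatever function you use to create the VM, and `fake_deploy` only simulates a shortage so the example runs without touching Azure.

```python
import time
from itertools import product


class AllocationFailed(Exception):
    """Stand-in for an AllocationFailed / ZonalAllocationFailed deployment error."""


def deploy_with_fallback(deploy, sizes, zones, retries_per_option=2, delay_seconds=0.0):
    """Try each (size, zone) in order of preference, retrying each a few times.

    `deploy` is any callable taking (size, zone) that raises AllocationFailed
    when capacity is unavailable. Returns the (size, zone) that succeeded.
    """
    for size, zone in product(sizes, zones):
        for _attempt in range(retries_per_option):
            try:
                deploy(size, zone)
                return size, zone
            except AllocationFailed:
                time.sleep(delay_seconds)  # use a longer, growing delay in practice
    raise AllocationFailed("no capacity for any requested size/zone combination")


# Simulated deployment: pretend only Standard_D4s_v4 in zone 2 has capacity.
def fake_deploy(size, zone):
    if (size, zone) != ("Standard_D4s_v4", "2"):
        raise AllocationFailed(f"{size} unavailable in zone {zone}")


chosen = deploy_with_fallback(fake_deploy, ["Standard_D4s_v5", "Standard_D4s_v4"], ["1", "2", "3"])
print(chosen)
```

The order of `sizes` and `zones` encodes your preference: every zone is tried for the preferred size before the loop falls back to the next size.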

 

Closing thoughts 

Dealing with capacity issues can be tricky. You’ll need to be strategic to secure enough capacity for your VMs, app services, containers, databases and all other resources that ultimately depend on underlying compute instances.

We recommend you follow these practices as a temporary workaround until your desired VM type is available again in the required region. The options are many, but ensure you find the right solution that matches your use case. 
