10 Design Principles for Azure Applications

Making an application ready for Azure is exciting, but it must be planned properly. It’s not just about moving your app to the cloud; it’s about doing it the right way.

When you first design an Azure Landing Zone, you should follow these 10 design principles to make your journey to Azure seamless and successful.

Whether you’re an experienced developer or just starting with Azure, understanding these principles helps you get the most out of the platform.

Author

Niels Kroeze IT Business Copywriter

Reading time 19 minutes Published: 31 October 2025

10 Design Principles for Azure Applications

Follow these design principles to make your application more scalable, resilient, and manageable.

Principle	Explanation
1. Design for self-healing	Failures are inevitable. Hence, design your application to recover automatically when they occur.
2. Make all things redundant	Avoid single points of failure by incorporating redundancy throughout your application.
3. Minimise coordination	Minimise coordination between application services to achieve better scalability.
4. Design to scale out	Design your app to scale horizontally, adding or removing instances as demand changes.
5. Partition around limits	Use partitioning to work around limitations in, for example, databases, networking, and compute resources.
6. Design for operations	Build your application so that the operations team has the tools they need to monitor and maintain it.
7. Use managed services	If possible, opt for Software-as-a-Service (SaaS) or Platform-as-a-Service (PaaS) instead of Infrastructure-as-a-Service (IaaS).
8. Use an identity service	Use an identity-as-a-service (IDaaS) platform instead of building or operating your own identity system.
9. Design for evolution	Design for change, so the application can adapt, fix issues, and add features over time.
10. Build for the needs of the business	Make every design choice based on a clear business requirement.

Now, let’s dive deeper into each of them.

1. Design for self-healing

This principle is about handling failures, which can happen at any moment and should be considered inevitable. Failures can be of any kind (VM crashes, failed deployments, network issues, etc.) and may occur in the underlying Azure hardware, in an availability zone, or even across an entire Azure region. Think of the capacity issues common in key regions such as West Europe. Major outages, like full regional disruptions, are rare but remain a possibility.

The goal of self-healing workloads, a key concept in the Azure Well-Architected Framework, isn’t to prevent failures but to recover from them automatically and return to full operation.

Best practices

Retry failed operations: Configure your application to automatically retry failed operations. When a transient fault occurs, such as a momentary loss of connectivity or a timeout, the application needs logic to retry.
Isolate critical resources (bulkhead): Isolating critical resources using the bulkhead pattern means partitioning your application into separate groups, so a failure in one area doesn’t impact the rest. This approach helps prevent cascading failures and keeps unaffected parts of your system running smoothly.
Perform load levelling: Ensure that sudden spikes in traffic do not overwhelm the backend. Use a queue system for work items to smooth out that load.
Fail over: Prefer stateless applications, which make failover to another instance easier.
Consider degrading functionality: If you can’t work around the problem, reducing functionality may be an option. You could keep only the mission-critical functionality of your apps available while the rest is disabled, and so limit the impact.
Use availability zones: Leverage availability zones to ensure your application can fail over to another data centre within the same Azure region if part of the region experiences disruption.
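
As an illustration of the retry practice above, here is a minimal Python sketch of retrying with exponential backoff and jitter. The `TransientError` class, the `retry` helper, and the default delays are assumptions made for this example; they are not part of any Azure SDK, and many Azure client libraries already ship with their own retry policies.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def retry(operation, max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation(); on TransientError, back off exponentially with jitter and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller degrade or fail over
            # The delay cap doubles per attempt: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, cap))
```

Only errors marked as transient are retried; anything else is raised immediately, so a permanent fault is not hidden behind repeated attempts.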

Note:

Multi-region deployments in Azure usually cost more and can introduce additional latency compared to single-region setups, but they offer higher availability and resilience for critical workloads.

Self-healing workload design is also a key concept in the Azure Well-Architected Framework, which focuses on creating systems that handle failures and return to full operation automatically.

Make sure to check out all the other recommendations from Microsoft to design a self-healing application in Azure.

2. Make all things redundant

Before “making things redundant”, ask yourself:

How long can your organisation be offline?
How much data loss can your organisation permit?

Redundancy always comes at a cost, which you can estimate using the Azure Pricing Calculator. However, the true value of redundancy, such as preventing downtime or data loss, may be harder to quantify.

Best Practices

Consider business requirements: Start by defining your recovery time objective (RTO) and recovery point objective (RPO).
Consider multi-zone or multi-region architectures:
- Zone redundancy: If there’s no clear direction from the business, we recommend designing at least for zone redundancy with Azure Availability Zones (AZs). These are isolated sets of data centres within a region. By using availability zones, you can be resilient to the failure of a single data centre or an entire availability zone.
- Regional redundancy: Add regional redundancy only where there’s a clear and legitimate business requirement, as it is more costly and not a must for every application.
Redundancy in depth: Azure offers several ways to build redundancy and avoid single points of failure. Combining these features allows you to tailor your redundancy strategy to your business needs and risk tolerance.
- Place VMs behind a load balancer: This can help you avoid a single point of failure.
- Replicate databases: Azure SQL Database and Azure Cosmos DB, for example, automatically replicate within a region. You can also enable cross-region replication to the paired region.
- Consider a failover routing mechanism: Azure Traffic Manager and Azure Front Door are good options for multi-region solutions.

Note:

Always review your Azure SLAs (Service Level Agreements), as adding a new service (e.g. Azure Traffic Manager) introduces another possible point of failure and changes the combined SLA you get.
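
To make the combined-SLA point concrete, the arithmetic can be sketched in a few lines of Python. The function names and the percentages below are illustrative assumptions, not Azure’s published SLA figures, and the calculation assumes components fail independently, which is a simplification.

```python
from math import prod


def serial_availability(*slas):
    """All components must be up at once: multiply their availabilities."""
    return prod(slas)


def redundant_availability(*slas):
    """Any one component is enough: 1 minus the chance that all are down together."""
    return 1 - prod(1 - s for s in slas)


# Hypothetical figures: a web tier at 99.95% that depends on a database at 99.99%.
single_region = serial_availability(0.9995, 0.9999)                # about 99.94%
# The same stack deployed in two regions, with either region able to serve traffic.
two_regions = redundant_availability(single_region, single_region)
# A routing service in front (hypothetically 99.99%) is one more link in the chain.
end_to_end = serial_availability(two_regions, 0.9999)
```

With these example numbers the end-to-end figure lands just below the routing service’s own 99.99%, which shows why every added component belongs in the calculation.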

You can view the remaining recommendations from Microsoft for redundancy here: https://learn.microsoft.com/en-us/azure/architecture/guide/design-principles/redundancy

3. Minimise coordination

Most cloud applications consist of multiple services, such as web front ends, databases, business processes, and reporting. To achieve scalability, each service should run on multiple instances.

If you have two nodes that need to update a database table, you must consider how this is handled. For example, if a lock is placed when the first node updates the table, adding more nodes may not improve scalability if they’re all waiting for the lock to be released.

“Think of each service in your application as an independent team that communicates via messages. Instead of waiting for everyone to agree before acting, each team can make progress on its own, responding to updates as they arrive. This approach keeps the whole system moving efficiently, even if one team is temporarily delayed.”

In distributed cloud environments, minimising coordination is crucial because network latency and independent scaling can quickly become bottlenecks. When services are tightly coupled and require frequent coordination, even small delays or failures can ripple through the system, reducing performance and reliability.

By designing services to operate independently and communicate asynchronously, you enable each part of your application to scale and recover without waiting for others.

Best Practices

Use decoupled, asynchronous components: Communicate through events rather than direct calls.
Adopt eventual consistency: Avoid heavy global transactions across distributed data. Use patterns like Compensating Transaction to handle rollbacks logically after failures.
Leverage domain events: Record significant domain changes as events. Other services can subscribe and react, reducing the need for global coordination.
Consider event sourcing: Store all state changes as append-only events to reduce locking and allow easy replay of history.
Combine CQRS with event sourcing: Use event sourcing for the write model and feed the same events into a read store for query-optimised views.
Partition data and state:
- Give each service its own data store (microservices principle).
- Use database sharding to improve concurrency and scalability.
- Split large, monolithic state into smaller partitions to manage independently.
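
The event sourcing practice above can be sketched in Python as follows. The `Event` and `AccountEventStore` names and the bank-account domain are hypothetical choices for this example; a real system would persist the log in a durable store rather than in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event names)
    amount: int


class AccountEventStore:
    """Append-only event log; the current state is derived by replaying it."""

    def __init__(self):
        self._events = []

    def append(self, event):
        # Writers only ever add to the end; past events are never updated or deleted.
        self._events.append(event)

    def history(self):
        return tuple(self._events)

    def balance(self):
        # Replay every event in order to rebuild the current balance.
        total = 0
        for event in self._events:
            if event.kind == "deposited":
                total += event.amount
            elif event.kind == "withdrawn":
                total -= event.amount
            else:
                raise ValueError(f"unknown event kind: {event.kind}")
        return total
```

Because writers only append, they do not contend for a lock on a shared "balance" row, and a separate read model can consume the same events to build query-optimised views.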

You can view the remaining recommendations from Microsoft for minimising coordination here: https://learn.microsoft.com/en-us/azure/architecture/guide/design-principles/minimize-coordination

4. Design to scale out

In cloud architecture, scaling can be achieved in two main ways:

Scaling out/in (horizontal scaling): Adding or removing instances (such as VMs, containers, or app service instances) to handle changes in demand.
Scaling up/down (vertical scaling): Increasing or decreasing the resources (CPU, memory, storage) of a single instance.

Azure is optimised for horizontal scaling, which improves resilience and availability by distributing workloads across multiple resources. Whilst scaling up (vertical scaling) can provide more power to a single resource, it is limited by the maximum capacity of that resource and does not improve redundancy.

Example

For example, an e-commerce website might scale out by adding more web server instances during a sale event, ensuring that increased traffic does not overwhelm the application. When the event ends, those extra instances can be removed to save costs – often automatically.

Scaling out allows you to match resources to demand, helping control costs by only running what you need. It also supports zero-downtime deployments and rolling updates, which are essential for modern cloud-native applications.

Best Practices

Avoid stickiness: Don’t rely on session affinity, where requests from the same client are always routed to the same server. When any instance can handle any request, horizontal scaling is far more effective.
Azure auto-scaling: Use Azure’s built-in auto-scaling features to automatically adjust the number of instances based on real-time metrics.
Continuously identify bottlenecks: Regularly review performance metrics to ensure that scaling out is addressing the right bottlenecks (e.g., database, storage, or compute).
Break down workloads based on scalability needs: Separate workloads with different scaling requirements, such as public-facing sites versus internal admin portals.
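
The reasoning behind metric-based scaling can be illustrated with a small Python sketch. This is not how Azure autoscale is implemented or configured (in Azure you declare autoscale rules on the platform rather than coding them by hand); the `desired_instances` function, the 60% CPU target, and the instance bounds are assumptions chosen for the example.

```python
import math


def desired_instances(current, avg_cpu, target_cpu=60.0, minimum=2, maximum=10):
    """Estimate the instance count that brings average CPU close to target_cpu."""
    if current < 1 or target_cpu <= 0:
        raise ValueError("current must be >= 1 and target_cpu must be > 0")
    # Total load is roughly current * avg_cpu; spread it so each instance sits near the target.
    wanted = math.ceil(current * avg_cpu / target_cpu)
    # Clamp to bounds: the minimum keeps redundancy, the maximum caps cost.
    return max(minimum, min(maximum, wanted))
```

For example, four instances averaging 90% CPU give `desired_instances(4, 90.0) == 6`, while four instances averaging 15% CPU scale in to the minimum of two rather than to one, so a single point of failure is never created.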

For more details, see Microsoft’s guidance on scaling out in Azure.

5. Partition around limits

All Azure services have technology-imposed limits, such as maximum database size, IOPS, concurrent connections, and network capacity. As your application grows, you may reach these limits, which can impact scalability and performance.

Partitioning your system allows you to distribute data and workloads across multiple resources, helping you avoid bottlenecks and maintain high availability. For instance, if your application’s database approaches the maximum allowed size or throughput, sharding the database across multiple instances can help maintain performance and avoid outages.

Partitioning not only helps you avoid hitting resource limits but can also improve performance and resilience by spreading workloads across multiple resources. It’s best to plan for partitioning early in your architecture, rather than waiting until you approach resource limits.

Best Practices

Partition databases: Shard large databases to avoid limits on size, throughput, or concurrent sessions.
Partition application components: Break down monolithic applications into smaller services, each with its own data store and compute resources.
Partition storage: Use multiple storage accounts to avoid IOPS or capacity limits.
Partition compute: Distribute workloads across multiple VMs, containers, or app service plans to avoid resource exhaustion.
Partition at different levels: Consider partitioning at the database, storage, and compute levels, depending on your application’s growth and performance needs.
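
As a sketch of how database sharding might route requests, the Python snippet below maps a partition key to a shard with a stable hash. The `shard_for` function and the `customer-…` key format are hypothetical; the key you partition on depends on your own data and access patterns.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a partition key to a shard index in [0, shard_count) using a stable hash."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Python's built-in hash() is randomised per process for strings,
    # so a cryptographic digest is used to get the same answer everywhere.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Simple modulo hashing keeps the example short, but changing `shard_count` later moves most keys to a different shard. If you expect to add shards over time, a lookup map or consistent hashing is worth considering.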

For more details, see Microsoft’s recommendations on partitioning in Azure.

6. Design for operations

Design for operations means building your application so that it can be effectively monitored, managed, and supported once it’s running in Azure. Operational excellence is essential for maintaining reliability, performance, and security, and for enabling rapid troubleshooting and continuous improvement.

Modern cloud applications are dynamic and distributed, making it vital to provide operations teams with the tools and visibility they need to keep systems healthy and respond quickly to incidents.

For instance, by integrating Application Insights into your Azure web app, you can monitor response times, track errors, and analyse user behaviour in real time. Setting up alerts for high error rates or slow responses allows your operations team to react quickly and maintain a high level of service.

“Proactive monitoring and automation not only reduce downtime but also free up your team to focus on innovation rather than firefighting. Design for operations also supports compliance and governance, ensuring that audit trails and security controls are in place for regulatory requirements.”

Best Practices

Make things observable:
- Instrument your application to produce logs, metrics, and traces that provide insight into its behaviour and health.
- Use Azure-native tools like Azure Monitor, Log Analytics, and Application Insights to collect and analyse operational data.
- Ensure that critical events (errors, performance bottlenecks, security incidents) are surfaced and actionable.
Enable Proactive Monitoring:
- Set up dashboards and alerts for key performance indicators (KPIs), error rates, and resource utilisation.
- Use automated alerting to notify operations teams of issues before they impact users.
- Monitor dependencies (e.g., databases, external APIs) as well as your own application components.
Support Root Cause Analysis:
- Collect sufficient diagnostic data (logs, traces, snapshots) to enable rapid investigation of incidents.
- Use correlation IDs and distributed tracing to follow requests across microservices and infrastructure boundaries.
- Document common failure scenarios and troubleshooting steps.
Automate Operational Tasks:
- Use Azure Automation, Logic Apps, or runbooks to automate routine maintenance, scaling, and recovery tasks.
- Implement self-healing mechanisms where possible (e.g., auto-restart failed services, auto-scale under load).
Secure and Audit Operations:
- Ensure operational data is protected and access is controlled.
- Enable auditing and logging for sensitive actions (e.g., configuration changes, access to data).
- Regularly review operational security and compliance requirements.
Collaborate Across Teams:
- Provide clear documentation for operational procedures, escalation paths, and support contacts.
- Foster collaboration between development and operations (DevOps) to enable continuous improvement and rapid incident response.
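
The correlation ID practice above can be sketched with Python’s standard `logging` and `contextvars` modules. The logger name, the log format, and the `handle_request` function are assumptions for this example; in an Azure application a telemetry library would typically propagate these IDs between services for you.

```python
import contextvars
import logging
import uuid

# Holds the correlation ID of the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def build_logger(name="orders", stream=None):
    """Create a logger whose output lines all carry the correlation ID."""
    handler = logging.StreamHandler(stream)  # stream=None means stderr
    handler.setFormatter(
        logging.Formatter("%(levelname)s cid=%(correlation_id)s %(message)s")
    )
    handler.addFilter(CorrelationFilter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, incoming_id=None):
    """Reuse the caller's correlation ID if one was sent, otherwise start a new one."""
    cid = incoming_id or uuid.uuid4().hex
    token = correlation_id.set(cid)
    try:
        logger.info("order received")
        return cid  # pass this on to downstream calls, for example in a request header
    finally:
        correlation_id.reset(token)
```

Every log line written while a request is being handled carries the same `cid` value, so an operator can filter on it to follow one request across services.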

For more details, see Microsoft’s recommendations on designing for operations in Azure.

7. Use managed services

Azure offers a range of cloud service models to meet different business and technical needs: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).

Traditional IT and Legacy Applications: Many organisations still rely on legacy applications or workloads that require traditional IT infrastructure. Azure’s IaaS model enables you to migrate these applications to the cloud with minimal changes, providing virtual machines, networking, and storage that closely resemble on-premises environments. Hybrid solutions are also available for workloads that need to remain partially on-premises.
Platform as a Service (PaaS): PaaS provides a complete platform for developing, running, and managing applications without the complexity of maintaining the underlying infrastructure. It accelerates development, simplifies scaling, and reduces operational overhead. Azure App Service, Azure SQL Database, and Azure Functions are common PaaS offerings.
Software as a Service (SaaS): SaaS delivers ready-to-use software solutions over the internet, managed entirely by the provider. Think of Microsoft 365. SaaS is ideal for organisations seeking turnkey solutions for productivity, collaboration, or business processes, with no need to manage infrastructure or updates.

Figure: Shared Responsibility Model in Azure, comparing responsibilities across traditional IT, IaaS, PaaS, and SaaS.

Best Practices

Choosing the Right Model: The choice between IaaS, PaaS, and SaaS depends on your business needs, technical requirements, and existing IT landscape. IaaS is best for legacy applications or when you need full control; PaaS is ideal for modern development; SaaS is suitable for standardised business functions.
Migration and Modernisation: Azure supports various migration strategies for legacy applications, from “lift and shift” (IaaS) to refactoring or rearchitecting for PaaS or SaaS. Assess each workload to determine the most appropriate approach. By leveraging managed services in Azure, organisations can optimise for agility, reliability, and cost-effectiveness, while still supporting legacy workloads where needed.

For more details, see Microsoft’s guidance on managed services in Azure.

8. Use an identity service

Why Identity Services Matter in Azure

Every cloud application needs a way to work with identities. Identity is the foundation of modern security, including Zero Trust, and is a critical part of your application’s architecture. Managing identities securely ensures that only authorised users and services can access your resources, protects against credential theft, and supports compliance requirements.

Why not build your own identity system?

Building and maintaining your own identity system is complex, risky, and expensive. Modern identity protocols and security requirements change rapidly, and even large organisations struggle to keep up. By using a managed identity service, you benefit from continuous improvements, expert support, and reduced liability.

For example, by using Microsoft Entra ID, you can enable single sign-on (SSO) for all your applications, enforce MFA for sensitive operations, and use conditional access to block risky sign-ins. This not only simplifies user experience but also strengthens your security posture.

Best Practices

Use an Identity as a Service (IDaaS) Platform: Opt for a managed identity service like Microsoft Entra ID (formerly Azure Active Directory), Azure AD B2C, or similar platforms, rather than building your own identity system. Managed identity platforms handle credential storage, authentication protocols, and security updates, reducing your risk and operational overhead.
Avoid storing credentials yourself: Never store user credentials in your application database. Even encrypted or hashed credentials are a liability and a target for attackers. Outsource credential management to specialised providers who invest in robust security controls.
Implement modern authentication protocols: Use standards-based protocols such as OAuth2 and OpenID Connect for authentication and authorisation. These protocols are designed to mitigate real-world attacks and evolve with new threats.
Enable strong authentication: Enforce multi-factor authentication (MFA) and consider passwordless options (e.g., FIDO2 devices) to reduce the risk of compromised credentials. Use conditional access policies to require additional verification based on risk signals, device status, or location.
Support Zero Trust Principles: Treat identity as the new perimeter: verify explicitly, use least privilege access, and assume breach. Continuously authenticate and authorise every request, user, device, and session.
Manage External and Guest Users securely: Use Entra ID’s external identity features to onboard business partners, customers, and guest users securely. Apply role-based access control and restrict permissions to only what is necessary.
Audit and Monitor Identity Activities: Integrate identity logs with Azure Monitor and Log Analytics for visibility into sign-ins, access attempts, and suspicious behaviour. Regularly review audit trails and identity secure scores to improve your security posture.

By leveraging managed identity services in Azure, you simplify user management, strengthen security, and reduce operational risk.

For more details and recommendations, see Microsoft’s guidance on identity services in Azure.

9. Design for evolution

In the cloud, business requirements, technologies, and user expectations change rapidly. Designing for evolution ensures your application can adapt to new demands, integrate emerging technologies, and remain maintainable over time. This approach reduces technical debt and helps future-proof your solution.

All applications change over time, to fix bugs, add new features, or bring in new technologies. If all parts of the application are tightly coupled, it becomes very hard to introduce changes. A change in one part may break another or cause changes to ripple through the code base.

For example, if your application is built using microservices, you can update a single service to add a new feature or fix a bug without redeploying the entire system. This minimises risk and allows for faster innovation.

Best Practices

Enforce loose coupling: Build your application from loosely coupled components, so that a change affects only a single service (or part of one) rather than rippling through the system.
Deploy services independently: Design your application and release process so each service can be updated on its own. This allows faster, safer rollouts of bug fixes and new features.
Keep domain logic out of gateways: Use gateways only for infrastructure tasks like routing, load balancing, authentication, or protocol translation. Avoid embedding domain knowledge to prevent heavy dependencies.
Use modular architecture: Break your application into modules or microservices that can be developed, deployed, and scaled independently.
Embrace API-first design: Use APIs to decouple services and enable easier integration with new systems or features.
Automate testing and deployment: Implement CI/CD pipelines to streamline updates and reduce risk when making changes.
Version your APIs and services: Support backward compatibility and smooth transitions when introducing new features.
Document dependencies and interfaces: Maintain clear documentation to help teams understand how components interact and what can change safely.

By designing for evolution, you ensure your application remains adaptable, maintainable, and ready for future business and technology changes.

For more details, see Microsoft’s guidance on designing for evolution.

10. Build for the needs of the business

Every technical decision should be driven by clear business objectives. Building for the needs of the business ensures that your application delivers real value, supports strategic goals, and remains cost-effective and sustainable. This approach helps avoid over-engineering, wasted resources, and solutions that don’t meet user or stakeholder expectations.

For example, if your business requires high availability for customer-facing services, prioritise redundancy and failover in your architecture. If cost control is a key objective, leverage auto-scaling and reserved instances to optimise cloud spend.

Best Practices

Engage stakeholders early and often: Collaborate with business leaders, end users, and other stakeholders to understand requirements, priorities, and constraints.
Translate business requirements into technical specifications: Ensure that features, performance targets, and compliance needs are clearly documented and mapped to architectural decisions.
Define business objectives: Define the recovery time objective (RTO), recovery point objective (RPO), and maximum tolerable outage (MTO).
Document SLAs and SLOs: Document the service level agreements (SLAs) and service level objectives (SLOs) required by the business.
Plan for growth: Design your solution to handle more users, higher transaction volumes, and larger data storage without major architectural changes. Keep the service and data models flexible to accommodate evolving business requirements.
Prioritise features based on business impact: Focus development effort on capabilities that deliver the greatest value or address the most critical risks.
Design for agility: Build flexibility into your architecture so you can respond quickly to changing business needs, market conditions, or regulatory requirements.
Manage costs: In the cloud, you pay for what you use. Understand pricing models for compute, storage, network, and other services to avoid unexpected expenses.
Monitor business outcomes: Use metrics and feedback loops to measure how well your application supports business goals and adjust as needed.
Consider failure risks: Design your solution architecture with availability and redundancy in mind.

By building for the needs of the business, you ensure your Azure solutions deliver measurable value, remain cost-effective, and adapt to evolving requirements.

For more details, see Microsoft’s guidance on building for business.

Get in touch with us!

Do you need help implementing these principles in Azure? Intercept can help. As an experienced Azure Expert MSP, we empower business solutions on a large scale and in complex situations on Azure.

info@intercept.cloud +44 7525 633506
