Best Practices for Azure Spot VMs
Set a price cap
Setting a maximum price caps what you pay per hour for a Spot VM and gives you better control over your budget. It does not stop Azure from changing the Spot price: if the price rises above your maximum, your VM is evicted. The trade-off is availability, since a price cap adds price-based evictions on top of capacity-based ones.
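As a rough illustration, the sketch below builds the Spot-related part of a VM definition as a plain Python dictionary. The property names (priority, evictionPolicy, billingProfile.maxPrice) follow the Azure Resource Manager VM schema as we understand it. The helper names spot_profile and evicted_for_price and the example prices are hypothetical. A maxPrice of -1 means the VM is never evicted for price reasons and you pay the current Spot price, up to the standard price.

```python
from typing import Optional


def spot_profile(max_price_usd: Optional[float] = None,
                 eviction_policy: str = "Deallocate") -> dict:
    """Return the Spot-related properties for a VM definition.

    max_price_usd=None maps to -1: no price-based eviction (capacity only).
    """
    if eviction_policy not in ("Deallocate", "Delete"):
        raise ValueError("eviction_policy must be 'Deallocate' or 'Delete'")
    if max_price_usd is not None and max_price_usd <= 0:
        raise ValueError("max_price_usd must be positive, or None for no cap")
    return {
        "priority": "Spot",
        "evictionPolicy": eviction_policy,
        "billingProfile": {
            "maxPrice": -1 if max_price_usd is None else max_price_usd
        },
    }


def evicted_for_price(current_price: float, max_price: float) -> bool:
    """True if the current Spot price exceeds the cap (-1 means no cap)."""
    return max_price != -1 and current_price > max_price
```

For example, evicted_for_price(0.06, 0.05) is true, while the same price with a cap of -1 never triggers a price-based eviction.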
Review historical eviction rates
Review historical eviction rates to understand potential disruptions. While past eviction rates don’t guarantee future patterns, they can give insight into the frequency of interruptions. This will help you determine if your workloads suit Spot VMs or if you need alternative options.
Make your workloads resumable
Spot VMs can be evicted at any time, so you must design your workloads to be interruptible: save progress regularly so that work can resume on a new VM instead of starting over.
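One common way to make a batch job resumable is to checkpoint after each unit of work. The following is a minimal sketch, not a definitive implementation: the function names, the JSON checkpoint format, and the stop_after parameter (which only simulates an eviction) are all assumptions made for illustration.

```python
import json
import os
import tempfile


def load_checkpoint(path: str) -> int:
    """Return the index of the next item to process (0 if no checkpoint)."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint atomically so an eviction mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def run(items, path, process, stop_after=None):
    """Process items from the last checkpoint; stop_after simulates an eviction."""
    start = load_checkpoint(path)
    done = 0
    for i in range(start, len(items)):
        if stop_after is not None and done >= stop_after:
            return False  # interrupted before finishing
        process(items[i])
        save_checkpoint(path, i + 1)
        done += 1
    return True
```

The checkpoint file needs to live on storage that survives the eviction, such as a persistent disk or a network share. An item that finished processing just before an eviction, but before its checkpoint was written, will be processed again, so the processing step should be safe to repeat.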
Time your Spot VM usage well
Spare capacity changes with the time of day and the day of the week. For example, demand for Azure resources tends to drop during weekends or late-night hours. You can benefit by running Spot VMs outside business hours, when prices are likely to be lower.
Be flexible
Flexibility means building your orchestration to adapt. If you can, use a mix of VM types. Different VM families (D-series, F-series) might experience varying available capacity levels at any given time. This way, you can increase your chances of securing available capacity at a lower cost.
But that’s not all: deploy across multiple availability zones and regions, account for potential evictions, and prepare for various eviction scenarios to keep your workloads running smoothly and efficiently.
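In code, this kind of flexibility often comes down to a preference-ordered fallback over VM sizes and zones. The sketch below assumes a caller-supplied try_allocate function, which is a hypothetical stand-in for whatever deployment call your orchestration uses; the sizes and zone names are examples only.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


def first_available(sizes: Iterable[str],
                    zones: Iterable[str],
                    try_allocate: Callable[[str, str], bool]
                    ) -> Optional[Tuple[str, str]]:
    """Try each (size, zone) pair in preference order; return the first that succeeds."""
    for size, zone in product(sizes, zones):
        if try_allocate(size, zone):
            return size, zone
    return None


# Hypothetical stand-in: pretend only F4s_v2 in zone 2 has spare capacity.
available = {("Standard_F4s_v2", "2")}
choice = first_available(
    ["Standard_D4s_v5", "Standard_F4s_v2"],
    ["1", "2", "3"],
    lambda size, zone: (size, zone) in available,
)
```

All zones are tried for the preferred size before the search falls back to the next size. Swapping the loop order would instead prefer a zone over a VM family.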
Enable Azure Backup
Protect your Spot VM data by enabling Azure Backup so you can recover data if VMs are interrupted.
Optimise Scaling with Virtual Machine Scale Sets (VMSS)
Use VMSS to scale Spot VMs based on demand. This helps manage costs by adjusting the number of Spot VMs as load changes. A scale set can also include standard VMs alongside Spot VMs to keep the workload stable.
Spot Priority Mix: combine standard VMs with Spot VMs
Spot Priority Mix is an Azure feature that lets you mix Spot and standard VMs in the same scale set. You set a base number of standard VMs and the percentage split between standard and Spot VMs above that base, and the scale set keeps to that split as it scales.
Use Spot VMs for non-critical tasks and standard VMs for critical ones.
This way you reduce the risk of disruption and take advantage of lower prices for non-time-sensitive workloads. The mix keeps your workloads resilient even if Spot VMs are evicted, because the standard VMs continue to run without interruption.
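To see how such a mix plays out in numbers, the sketch below splits a target instance count into standard and Spot VMs from a base count of standard VMs plus a percentage of standard VMs above that base. The function and parameter names are hypothetical, and the round-up rule is an assumption: the rounding Azure applies may differ.

```python
def priority_mix(total: int, base_standard: int, standard_pct_above_base: int):
    """Split `total` instances into (standard, spot) counts."""
    if total < 0 or base_standard < 0:
        raise ValueError("counts must be non-negative")
    if not 0 <= standard_pct_above_base <= 100:
        raise ValueError("percentage must be between 0 and 100")
    base = min(total, base_standard)
    above = total - base
    # Integer ceiling: round up so the standard share never falls below the
    # requested percentage (an assumption; Azure's own rounding may differ).
    standard_above = (above * standard_pct_above_base + 99) // 100
    standard = base + standard_above
    return standard, total - standard
```

With 10 instances, a base of 2 standard VMs, and 25% standard above the base, priority_mix(10, 2, 25) gives 4 standard and 6 Spot VMs. If every Spot VM is evicted at once, the 4 standard VMs keep running.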
Build a Flexible Orchestration Mechanism
Make sure your workload can handle interruptions by building an orchestration mechanism that automatically deploys new Spot VMs after an eviction.
Continuously monitor
Monitoring is key with Spot VMs. Since Spot VMs don’t have an SLA and can be evicted at any time, you need to detect an upcoming eviction as early as possible. With that information, you can plan a graceful shutdown and automate the replacement process to keep your workload running.
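One way to get that early warning from inside the VM is to poll the Scheduled Events endpoint of the Azure Instance Metadata Service, where a pending Spot eviction appears as an event of type Preempt. The sketch below is a minimal illustration under stated assumptions: the API version shown is one we believe is valid, the function names are hypothetical, and the notice period is short (a minimum of about 30 seconds), so the shutdown handler has to be fast.

```python
import json
import urllib.request

# Instance Metadata Service endpoint for Scheduled Events.
# It is only reachable from inside an Azure VM.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout: float = 2.0) -> dict:
    """Query the metadata endpoint; the 'Metadata: true' header is required."""
    req = urllib.request.Request(SCHEDULED_EVENTS_URL, headers={"Metadata": "true"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def pending_evictions(doc: dict) -> list:
    """Return the events whose type is 'Preempt' (a Spot eviction)."""
    return [e for e in doc.get("Events", []) if e.get("EventType") == "Preempt"]
```

A monitoring loop would call fetch_scheduled_events every few seconds and, as soon as pending_evictions returns a non-empty list, flush state and signal the orchestrator to start a replacement.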
Closing thoughts
Microsoft Azure offers multiple purchasing options (pay-as-you-go, reserved instances), and we recommend you leverage them wherever possible. With Spot VMs, though, you get the deepest discounts on compute.
In addition, you have the flexibility to use whatever you want, whenever you want, and give it back to Microsoft when you no longer need it. At that point Microsoft stops charging you, just like many subscription-based models.
Keep in mind, though, that with Spot VMs you have less control. In other words, you need to be comfortable with the fact that Microsoft can take the capacity back from you at any moment.