Cost Control and Savings in Azure
Whenever a customer calls me to discuss potential savings in a cloud environment, I prepare myself for a lot of explaining. It’s not so much that cost savings in Azure are complex, or that the pricing structure is too difficult to understand; it’s more about the expectations on the customer’s side.
Almost everybody expects that I will go through the existing environment, suggest a few optimizations that will be implemented right in that meeting, and then a new, lower bill will arrive next month. A few buttons to click and instant savings, if I exaggerate.
Sure, that’s possible, and sometimes it is enough to optimize tiers and SKUs, but that’s not how major savings are driven. To really impact consumption, the following aspects need to be in place.
If you want to save, you first need to know how much you spend. More importantly, you need to know what you are actually paying for. That’s where cost control comes into play.
Azure provides a variety of utilization and consumption data, so the key is to combine them to get the most out of them. A proper cost control tool should be capable of doing that.
We tried several tools to find the right one, and I have personally tested even more in the past. The optimal solution is not just about aggregating different cost-related inputs: since we need to provide it to our customers, the speed and complexity of onboarding, as well as the management overhead, also factor into the decision.
It is my pleasure to announce that Atea has signed a partner agreement with Cloudyn, now also known as “Azure Cost Management”. The agreement covers all countries where Atea operates.
Azure Cost Management (ACM), or Cloudyn, offers pre-built and custom dashboards to review the current cost of Azure (as well as Google Cloud Platform or AWS) and break the spend down into categories such as resource types. That way you can quickly learn what the major cost driver in your environment is (usually VMs) and, more importantly, who is responsible for those resources.
Next is the Optimizer, which tells you whether VMs are properly utilized and whether there is potential to change the VM size or purchase a Reserved Instance to save more. It is common for VMs to be over-provisioned by 30% or more, and you will likely see findings such as a server that had no significant performance peak in the last month.
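To make the idea behind such a recommendation concrete, here is a minimal sketch of a right-sizing check. It is not how the Optimizer itself works; the VmUsage type, the sample readings and the 40% peak threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VmUsage:
    name: str
    # Hypothetical CPU readings (percent) collected over the last month.
    cpu_percent_samples: List[float]


def downsize_candidates(vms: List[VmUsage], peak_threshold: float = 40.0) -> List[str]:
    """Return names of VMs whose highest CPU reading stayed below the threshold.

    VMs without any samples are skipped, because no data is not evidence of low load.
    """
    return [
        vm.name
        for vm in vms
        if vm.cpu_percent_samples and max(vm.cpu_percent_samples) < peak_threshold
    ]


# Illustrative data only, not real measurements.
vms = [
    VmUsage("web-01", [12.0, 18.5, 25.0]),
    VmUsage("db-01", [55.0, 91.0, 73.5]),
]
candidates = downsize_candidates(vms)
print(candidates)
```

A real assessment would also look at memory, disk and network, and at more than a single peak value, but the principle is the same: no significant peak means a smaller size is worth considering.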
Finally, we can also organize resources for a better overview and future reference. A common requirement is to group resources for internal or external billing, or simply for an overview of spend. Grouping is usually done by department, group or project so customers can track costs based on their needs. Resource Groups and Resource Tags in Azure are ideal for this. ACM can also apply your own custom grouping and custom tags at the overview level without writing anything back to Azure or otherwise affecting resources. Additionally, it is possible to track everything against a budget, which can also be added via an API call from other tools more commonly used for budget tracking.
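The grouping and budget tracking described above can be sketched in a few lines. This is only an illustration on made-up data: the record layout, the "department" tag key and the budget figures are assumptions, not the export format of ACM or Azure.

```python
from collections import defaultdict

# Hypothetical cost records, e.g. as they might look after exporting billing data.
records = [
    {"resource": "vm-web-01", "tags": {"department": "sales"}, "cost": 120.0},
    {"resource": "sql-crm", "tags": {"department": "sales"}, "cost": 300.0},
    {"resource": "vm-build-01", "tags": {"department": "engineering"}, "cost": 80.0},
    {"resource": "storage-misc", "tags": {}, "cost": 15.0},
]

# Hypothetical monthly budgets per department.
budgets = {"sales": 400.0, "engineering": 200.0}


def cost_by_tag(records, tag_key, untagged_label="untagged"):
    """Sum cost per tag value; resources without the tag land in one bucket."""
    totals = defaultdict(float)
    for record in records:
        group = record["tags"].get(tag_key, untagged_label)
        totals[group] += record["cost"]
    return dict(totals)


def over_budget(totals, budgets):
    """Return the overspend for every group that exceeded its budget."""
    return {
        group: totals[group] - limit
        for group, limit in budgets.items()
        if totals.get(group, 0.0) > limit
    }


totals = cost_by_tag(records, "department")
overspend = over_budget(totals, budgets)
print(totals)
print(overspend)
```

The "untagged" bucket is worth keeping visible: spend that nobody owns is usually the first place to look.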
Changes Towards Saving
Now that we know everything cost-related, we can start optimizing based on the collected data. The first step is the already mentioned Optimizer, which helps us resize VMs. However, a more significant change happens when we get rid of as many VMs as possible.
When we provision a VM in Azure, all of its performance is dedicated purely to us, without sharing with anyone. That sounds great in terms of power but is certainly not so good for the bill.
In any cloud environment you save money for every second that a resource doesn’t need to run. The same goes for everything that is managed and scalable, and where the underlying resources can be shared in a safe way. That’s the reason why App Services, Azure SQL Databases and other pure cloud platform offerings are typically cheaper than VMs, not to mention that you don’t have to implement and invent everything from scratch.
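A quick back-of-the-envelope calculation shows why running time matters so much. The hourly rate and the schedule below are invented for illustration and are not taken from any Azure price list.

```python
def monthly_cost(hourly_rate: float, hours_per_day: float, days: int = 30) -> float:
    """Cost of a resource that is billed only while it runs."""
    return hourly_rate * hours_per_day * days


# Hypothetical rate of 0.25 per hour for the same resource in both scenarios.
always_on = monthly_cost(0.25, hours_per_day=24, days=30)
business_hours = monthly_cost(0.25, hours_per_day=10, days=22)
saved = always_on - business_hours

print(f"always on:      {always_on:.2f}")
print(f"business hours: {business_hours:.2f}")
print(f"saved:          {saved:.2f} ({saved / always_on:.0%})")
```

Nothing about the resource itself changed here; only the hours it runs. That is the kind of saving a dedicated, always-on VM cannot give you on its own.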
That all sounds great, but it’s easier said than done. As mentioned at the beginning, there’s no magic button for savings (or for switching platforms). Savings are driven through architecture changes! A lot can be done just by swapping building blocks, but the end of the journey towards PaaS and microservices will require code changes. So where to start?
The majority of customers start with a pure Lift & Shift: moving VMs to Azure, the stage referred to as Cloud Infrastructure Ready. Almost any application out there is ready for this step. The biggest issue is that customers often think this is what cloud migration looks like, which is the worst and most expensive mistake in cloud adoption.
The Cloud DevOps Ready steps focus mainly on better automation, a better overview of events and some VM replacements. It’s a good place to automate the deployment of resources with templates and to start thinking about CI/CD pipelines. Simplifying the infrastructure and defining it as code makes changes easier to process without manual work, and hence improves scalability. Further scalability can be achieved by going from VMs to containers, which either allow some underlying resources to be shared or let you use a very modern approach such as Azure Container Instances.
Databases are usually the most expensive part, so moving towards platform offerings is appropriate, or at least towards SQL Managed Instances if a full platform offering is not yet an option due to legacy compatibility issues.
The last step is to make everything fully Cloud Optimized: full PaaS. Don’t reinvent the wheel; think about how to replace certain integrations with serverless Azure Functions and Logic Apps, and how to split as many parts of your application as possible into individual components so they are easier to deploy, easier to troubleshoot and, more importantly, billed based on execution and per-second utilization rather than for idle states. Pay only for what you use.
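To get a feeling for what billing per execution means, a simple break-even sketch helps. The prices here are hypothetical placeholders, and the model is deliberately simplified: it looks only at a price per million executions and ignores execution time and memory, which real consumption pricing usually also charges for.

```python
def consumption_cost(executions: int, price_per_million: float) -> float:
    """Monthly cost when paying only per execution (simplified model)."""
    return executions / 1_000_000 * price_per_million


def break_even_executions(fixed_monthly_cost: float, price_per_million: float) -> int:
    """Number of executions per month at which pay-per-execution costs as much
    as a resource with a fixed monthly price."""
    return int(fixed_monthly_cost / price_per_million * 1_000_000)


# Hypothetical figures: a VM at 100.0 per month versus 0.50 per million executions.
vm_monthly = 100.0
price_per_million = 0.50

integration_cost = consumption_cost(2_000_000, price_per_million)
threshold = break_even_executions(vm_monthly, price_per_million)

print(f"2 million executions cost {integration_cost:.2f}")
print(f"break-even at {threshold:,} executions per month")
```

For an integration that fires a few million times a month, the idle VM is what you would really be paying for. Plug in your own numbers before drawing conclusions.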
The entire topic deserves far more attention than this short article can give it, but it should provide a high-level overview of where to start and what to look for. Feel free to reach out for help as well!
– Matous Rokos, Chief Cloud Solutions Architect, Atea NO