How to use Infracost as the guardrail to manage cloud cost during Terraform development
Democratizing DevOps and DevOps self-service: you may have heard these terms. This trend of recent years stems largely from the need to support cloud-native architecture and development. The DevOps team should not be a bottleneck when delivering its services to development teams; instead, developers should be able to self-serve both infrastructure CI/CD and application CI/CD for their applications.
“But there are risks associated with such DevOps self-service,” you immediately respond! What if a developer fat-fingers their Terraform configuration? In a large organization with many development teams, micromanaging each team’s IaC code to make sure nobody breaks the bank can be daunting!
How can we manage such DevOps self-service with peace of mind?
Good question! Let’s dive in!
- A developer makes changes to their Terraform configuration, and they submit a pull request.
- This pull request automatically triggers a CI workflow, which calculates the cloud cost difference before and after their changes. It displays the cost difference as a table in a pull request comment, with a detailed drill-down showing where the cost change occurs.
- If the monthly cloud cost change exceeds your predefined policy threshold, your workflow fails for further examination to ensure there is no human error in your Terraform configuration.
- The workflow can also generate an HTML report on the cloud cost for the infrastructure with the latest changes.
- You and/or the developer get an email notification with this report attached. See a sample report from our demo app below:
So, within seconds of the developer raising a Terraform configuration pull request, you and/or the developer get the cloud cost report delivered directly to your email inbox. You can see clearly what impact that developer’s code changes have on the cloud cost.
All too often, spikes in cloud cost get caught AFTER you have incurred the cost. Infracost lets DevOps, SRE, and engineers see a cost breakdown and understand costs before making changes, either in the terminal or in pull requests. This gives your team a safety net to catch abnormal cloud cost estimates caused by fat-fingering or misconfiguration in Terraform configuration.
In this story, we are going to use a simple Spring Boot application named
demo for short) to demonstrate how to incorporate Infracost into our GitHub Actions workflows to manage cloud cost before provisioning our infrastructure, in this case, a simple AWS Lambda function.
Infracost supports Open Policy Agent (OPA) policies out of the box. Policy files are written in OPA’s native query language, Rego. Infracost leverages Rego to let you write flexible and powerful cost policies defined through rules. Rules dictate which checks infrastructure changes must pass before being merged. Below is a sample OPA Infracost policy in Rego format. See line 7: I have the threshold configured as $100, which means that if your Terraform configuration changes increase the monthly cloud cost by more than $100, your CI workflow will fail.
If the workflow fails, double-check that your Terraform changes are valid and free of human error. If they are indeed valid, you can bump up the maxDiff amount so that your workflow passes and your code can be approved for merging.
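The rule this policy expresses is small enough to sketch outside Rego. The Python snippet below is only an illustration, not the policy file itself. It assumes the Infracost JSON output carries the monthly diff as a string under a key named diffTotalMonthlyCost, and the function name check_cost_policy is hypothetical. Whether the boundary value itself passes depends on how the Rego rule is written; this sketch fails only when the diff is strictly greater than the threshold.

```python
from decimal import Decimal

MAX_DIFF = Decimal("100")  # monthly threshold in USD, matching the $100 described above


def check_cost_policy(infracost_json: dict, max_diff: Decimal = MAX_DIFF) -> tuple[bool, str]:
    """Return (passed, message) for a parsed Infracost diff document.

    Assumption: the monthly diff is stored as a string under
    "diffTotalMonthlyCost"; adjust the key if your output differs.
    """
    diff = Decimal(infracost_json.get("diffTotalMonthlyCost") or "0")
    passed = diff <= max_diff
    message = f"monthly diff ${diff:.2f} vs. threshold ${max_diff:.2f}"
    return passed, message
```

Raising the threshold for a legitimate change then amounts to passing a larger max_diff, which mirrors bumping maxDiff in the policy file.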
This policy file comes straight from the sample policy file provided by the Infracost team. See cost policies for more details on the policy file. The only change we made is the
We name this policy file
infracost-policy.rego, and place it under our project’s
It’s easy to calculate the static cloud cost based on configurations in your
.tfvars file, but how do you handle dynamic cloud cost based on usage? Infracost has an answer for it! You can configure the usage details for relevant cloud resources in a file such as
infracost-usage.yml, sample snippet below for a few AWS resources for our demo app:
As you can see, there are two main sections in this usage file:
resource_type_default_usage: The usage values defined in this section apply to each resource of the given type, which is useful when defining defaults.
resource_usage: The usage values defined in this section apply to individual resources and override any value defined in the
For a complete list of all the cloud resource usage attributes you can define, refer to infracost-usage-example.yml in the infracost/infracost repository on GitHub. Be sure to bookmark this link, as you will need to refer to it when configuring your resources’ usage data.
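To make the override behavior of the two sections concrete, here is a minimal Python sketch of how a per-resource entry could be merged over a type-level default. It is not Infracost's implementation. The function resolve_usage, the resource address aws_lambda_function.demo, and the usage numbers are all hypothetical, and the attribute names should be checked against the usage example file.

```python
def resolve_usage(usage_file: dict, resource_address: str) -> dict:
    """Merge type-level defaults with per-resource overrides.

    resource_address looks like "aws_lambda_function.demo"; the part before
    the first dot is treated as the resource type. Addresses inside modules
    are not handled by this simplified sketch.
    """
    resource_type = resource_address.split(".", 1)[0]
    defaults = usage_file.get("resource_type_default_usage", {}).get(resource_type, {})
    overrides = usage_file.get("resource_usage", {}).get(resource_address, {})
    # Later keys win, so per-resource values replace the defaults.
    return {**defaults, **overrides}


# Hypothetical usage data, written as the dict a YAML parser would produce.
usage = {
    "resource_type_default_usage": {
        "aws_lambda_function": {"monthly_requests": 100000, "request_duration_ms": 500},
    },
    "resource_usage": {
        "aws_lambda_function.demo": {"monthly_requests": 250000},
    },
}

print(resolve_usage(usage, "aws_lambda_function.demo"))
```

In this sketch the demo function keeps the default request duration but takes its own request count, which is the behavior the two section descriptions above imply.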
Time to get to the core of all the above! Let’s explore how to incorporate Infracost into GitHub Actions workflows to achieve automated cloud cost management.
Step 1: Secrets configuration
The only GitHub secret we need to configure for this Infracost workflow is a repository secret with the key INFRACOST_API_KEY and your API key as its value. To get an API key, follow the instructions on Infracost’s website to install Infracost and then register for a key.
Step 2: Infracost reusable workflow
The Infracost workflow evaluates the base branch cloud cost estimate and compares it with the branch from which the pull request is raised. The difference in the cloud cost estimate is then displayed as a PR comment. Based on the sample GitHub action provided by Infracost, I have developed a GitHub Actions reusable workflow,
terraform-infracost-pr.yml, see below, to show AWS cloud cost estimates for Terraform based on pull requests. To find out what GitHub Actions’ reusable workflow is and how to use it, check out my story, A Deep Dive into GitHub Actions’ Reusable Workflows. In addition to showing the cloud cost estimates, this reusable workflow also:
- Generates an HTML report based on the latest Terraform configuration and usage file.
- Uploads the report to GitHub artifact so the caller workflow can download the report and email it as an attachment.
There are comments on each step in the above reusable workflow to explain the purpose of each step. A few key points to mention:
- Lines 33–36: Harden-Runner, developed by StepSecurity, is a security action that protects our workflow from supply chain attacks. For more details on Harden-Runner, check out my blog, A First Look at Harden-Runner: The Must-Have GitHub Action To Prevent Supply Chain Attacks.
- Lines 40–44: the “Setup Infracost” step calls Infracost’s GitHub action infracost/actions/setup@v2, which installs the latest patch version of the Infracost CLI v0.10.x and so picks up backward-compatible bug fixes and new resources. Notice the git SHA 6bdd3cb01a306596e8a614e62af7a9c0a133bc5c that we pinned for that action. This is a security hardening best practice for GitHub Actions. Pinning an action to a full-length commit SHA is currently the only way to use an action as an immutable release. Pinning to a particular SHA helps mitigate the risk of a bad actor adding a backdoor to the action’s repository.
- Lines 73–83: the “Generate Infracost diff” step runs the infracost diff CLI command to generate the diff between the base cost and the cost introduced by the PR changes. It saves the details in a JSON file, which is used to generate the report in the “Generate Infracost report” step right below it. This JSON format is also what users can upload to Infracost Cloud if they want to use the SaaS features on top of the open source product.
- Notice the last line with --policy-path, which defines the path to our infracost-policy.rego file. This allows infracost comment to evaluate the policy file and pass or fail the policy check for this workflow.
Step 3: call Infracost reusable workflow
Your app’s Infracost workflow should call the above Infracost reusable workflow. Below is a sample workflow that does so. Notice that the only trigger is a pull request, with no manual trigger, because the Infracost workflow needs two branches to compare cloud cost estimates.
Here are the two jobs in this caller workflow:
infracost: it calls the reusable workflow terraform-infracost-pr.yml to generate the report, the PR comment with the diff amount, etc.
send-email: it downloads the report generated by the above job from the GitHub artifact, then calls the reusable workflow to send an email notification to designated recipients with the report attached.
Step 4: Workflow execution result
Once the workflow finishes execution, we can find the Infracost comment in our pull request. See the screenshot below for details:
- The previous cost, the new cost, and the diff amount.
- Detailed output that you can drill down into to see which resources pushed the cost up or down.
- Whether the diff amount falls within the policy threshold for the max diff amount.
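The three headline numbers in that comment can also be read from the JSON file the diff step produces. The Python sketch below is an assumption-laden illustration: it presumes the top-level keys pastTotalMonthlyCost, totalMonthlyCost, and diffTotalMonthlyCost hold string amounts, the helper summarize_diff is hypothetical, and the sample values are made up rather than taken from the demo app.

```python
import json
from decimal import Decimal


def summarize_diff(raw_json: str) -> str:
    """Build a one-line previous/new/diff summary from Infracost JSON text."""
    data = json.loads(raw_json)
    past = Decimal(data.get("pastTotalMonthlyCost") or "0")
    new = Decimal(data.get("totalMonthlyCost") or "0")
    diff = Decimal(data.get("diffTotalMonthlyCost") or "0")
    sign = "+" if diff >= 0 else "-"
    return f"previous ${past:.2f} | new ${new:.2f} | diff {sign}${abs(diff):.2f}"


# Made-up sample document containing only the keys this sketch reads.
sample = '{"pastTotalMonthlyCost": "1.00", "totalMonthlyCost": "1.83", "diffTotalMonthlyCost": "0.83"}'
print(summarize_diff(sample))
```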
To find out exactly which Terraform file change caused this cost change, click on the “Files changed” tab in that pull request, where we see this:
Ah, the memory size change from 512 to 1024 caused a $0.83 increase per month. Not bad!
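For intuition about where a figure like $0.83 can come from, here is a back-of-the-envelope Python sketch of Lambda duration pricing. The article does not state the usage values behind its estimate, so the inputs here are assumptions chosen for illustration: 100,000 seconds of total monthly execution time and an on-demand x86 rate of $0.0000166667 per GB-second. Request charges and the free tier are ignored because memory size does not affect them in this simplified model.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # assumed on-demand x86 rate; check current AWS pricing


def lambda_duration_cost(memory_mb: int, total_seconds: float) -> float:
    """Monthly duration charge: memory in GB times execution seconds times the rate."""
    gb_seconds = (memory_mb / 1024) * total_seconds
    return gb_seconds * PRICE_PER_GB_SECOND


TOTAL_SECONDS = 100_000  # hypothetical, e.g. 200,000 requests at 500 ms each

before = lambda_duration_cost(512, TOTAL_SECONDS)
after = lambda_duration_cost(1024, TOTAL_SECONDS)
print(f"{before:.2f} -> {after:.2f} (diff {after - before:.2f})")
```

Under these assumed inputs, doubling the memory doubles the duration charge, and the difference rounds to the same $0.83 seen in the pull request.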
Meanwhile, the email recipients will receive an email with the Infracost HTML report as an attachment. See below a sample report from our demo app:
In addition to the above GitHub Actions workflow for Terraform pull requests, I also expanded the GitHub Actions Terraform workflow with a prerequisite step that runs an Infracost analysis right before Terraform init/plan/apply. This may appear redundant, but in reality it is the final gatekeeper that ensures the cloud cost is within the policy threshold before we actually provision our infrastructure through the Terraform workflow.
If, for whatever reason, the cloud cost exceeds the policy threshold, this is our last chance to catch it before Terraform does its work. So our Terraform workflow contains two jobs:
- Infracost analysis
- Terraform deployment
If the Infracost analysis job fails, the workflow fails, so the Terraform job never executes until the Infracost analysis passes.
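The gating behavior can be pictured with a tiny Python sketch. This is not how GitHub Actions is implemented; it only models the idea that jobs run in order and a failure stops everything after it. The function run_gated and the job names are hypothetical.

```python
from typing import Callable


def run_gated(jobs: list[tuple[str, Callable[[], bool]]]) -> list[str]:
    """Run jobs in order and stop at the first failure.

    Each job is a (name, callable) pair whose callable returns True on
    success. The return value lists the names of the jobs that actually ran.
    """
    executed = []
    for name, job in jobs:
        executed.append(name)
        if not job():
            break  # a failed gate blocks every later job
    return executed


# Hypothetical run where the cost analysis fails its policy check.
ran = run_gated([
    ("infracost-analysis", lambda: False),
    ("terraform-deployment", lambda: True),
])
print(ran)
```

In this run only the analysis job executes, so the deployment job is never reached.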
For details on the workflow code, refer to the links below:
In addition to running Infracost in our GitHub Actions workflows, we can also run Infracost in the terminal to get cost breakdowns and diffs, generate reports, and so on.
The following command runs an Infracost breakdown of the cost based on Terraform code:
infracost breakdown --path . --show-skipped --terraform-var-file='./.env/dev/terraform.tfvars' --no-color
The command below runs an Infracost breakdown of the cost based on Terraform code and projected usage:
infracost breakdown --path . --usage-file './.env/dev/infracost-usage.yml' --terraform-var-file='./.env/dev/terraform.tfvars' --no-color
The following command first generates a JSON output file:
infracost breakdown --path . --usage-file './.env/dev/infracost-usage.yml' --terraform-var-file='./.env/dev/terraform.tfvars' --format json --out-file infracost-with-usage.json --show-skipped
Then it renders the JSON file into HTML:
infracost output --path infracost-with-usage.json --format html --out-file report.html --show-skipped
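If you want to script the two-step JSON-then-HTML flow, a small Python wrapper is one option. The sketch below only assembles the same commands and flags shown above and hands them to subprocess.run. The function names are hypothetical, and nothing runs until you call generate_report with Infracost installed and an API key configured.

```python
import subprocess


def breakdown_cmd(path: str, var_file: str, usage_file: str, out_file: str) -> list[str]:
    """Build the 'infracost breakdown' command that writes JSON output."""
    return [
        "infracost", "breakdown",
        "--path", path,
        "--usage-file", usage_file,
        "--terraform-var-file", var_file,
        "--format", "json",
        "--out-file", out_file,
        "--show-skipped",
    ]


def html_cmd(json_file: str, out_file: str) -> list[str]:
    """Build the 'infracost output' command that renders the JSON as HTML."""
    return [
        "infracost", "output",
        "--path", json_file,
        "--format", "html",
        "--out-file", out_file,
        "--show-skipped",
    ]


def generate_report(
    path: str = ".",
    var_file: str = "./.env/dev/terraform.tfvars",
    usage_file: str = "./.env/dev/infracost-usage.yml",
    json_file: str = "infracost-with-usage.json",
    html_file: str = "report.html",
    runner=subprocess.run,
) -> None:
    """Run both steps in order; check=True raises if either command fails."""
    runner(breakdown_cmd(path, var_file, usage_file, json_file), check=True)
    runner(html_cmd(json_file, html_file), check=True)
```

The runner parameter is there so the command construction can be exercised without invoking the real CLI.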
In addition to the GitHub Actions integration and CLI mentioned above, Infracost provides developers with another great tool right in the IDE! The Infracost VS Code extension lets you see your cloud cost while developing your infrastructure configuration. See the screenshot below:
- “Total monthly cost” is displayed right above your module in your
- Click on “Total monthly cost” to drill down to the breakdown of cost per resource, displayed in a table on the right side of VS Code.
- Note that the total monthly cost calculated in the VS Code extension is based on static cost, not usage data. If your terraform.tfvars file is located in a subdirectory rather than in the same directory as the main.tf file, you may not see the monthly cost. If that happens, copy your .tfvars file into the same directory as main.tf during development or cost analysis so that the dollar amount is displayed.
Upon VS Code startup, the Infracost extension creates a .infracost folder at your Terraform root. This .infracost folder holds the engine for Infracost so that the cost calculation can be performed right within VS Code. Do not commit this .infracost folder to your git repository. You can add the line entry **/.infracost/* to your .gitignore file to ignore the .infracost folder when pushing code to git.
Infracost can be a valuable tool in aiding infrastructure design as well. If you are weighing multiple options in designing your infrastructure, such as whether to host your microservices in ECS or EKS, or whether to host your UI SPA in S3/CloudFront or in ECS as a separate service adjacent to its backend counterparts, Infracost can guide you toward the most cost-effective design, with real dollar numbers to convince your manager and your team.
Another major benefit of using Infracost is that it raises developers’ awareness of cloud spending. It is amazing to watch the spending amount change as you develop your Terraform code and compare different configuration options to find the best fit!
Democratizing DevOps with peace of mind is not impossible, and Infracost can be that guardrail. In this story, we explored the what, why, and how of Infracost. I hope you find it helpful.
The source code for this story can be found in my GitHub repositories: