Terraform 101
Read without Paywall : https://harshityadav.in/posts/Terraform-101/
Terraform in a nutshell
In its most basic form, Terraform is an application that converts configuration files known as HCL (Hashicorp Configuration Language) into real-world infrastructure, usually in Cloud providers such as AWS, Azure, or Google Cloud Platform.
Infrastructure as Code
- This concept of taking configuration files and converting them into real resources is known as IaC (Infrastructure as Code) and is the new hotness in the world of Software Engineering. And the reason it is becoming so hot right now is because this code can live alongside your app code in repos, be version controlled, and easily integrated into your CI/CD pipelines.
Why Terraform?
- Terraform is a state-driven Cloud Platform provisioning engine.
- It leverages abstraction tooling (known as providers and backends) to enable us to write code that can be interpreted and translated into consistent, and deterministic, Cloud provider-specific CRUD API calls, removing a lot of leg work and stress from us.
- Removes the need for in-house custom infrastructure provisioning scripts and/or tooling.
- Terminates human error because automated code deployments remove the human element.
It is important to note that IaC is not a new concept, in fact, it has been around in different guises since the very early days of Cloud Platforms
- AWS had CloudFormation
- Azure had ARM Templates
Terraform Core Concepts
1. Variables:
Terraform has input and output variables, it is a key-value pair. Input variables are used as parameters to input values at run time to customize our deployments. Output variables are return values of a terraform module that can be used by other configurations.Read our blog on Terraform Variables
2. Provider:
Terraform users provision their infrastructure on the major cloud providers such as AWS, Azure, OCI, and others. A provider is a plugin that interacts with the various APIs required to create, update, and delete various resources.Read our blog to know more about Terraform Providers
3. Module:
Any set of Terraform configuration files in a folder is a module. Every Terraform configuration has at least one module, known as its root module.
4. State:
Terraform records information about what infrastructure is created in a Terraform state file. With the state file, Terraform is able to find the resources it created previously, supposed to manage and update them accordingly.
5. Resources:
Cloud Providers provides various services in their offerings, they are referenced as Resources in Terraform. Terraform resources can be anything from compute instances, virtual networks to higher-level components such as DNS records. Each resource has its own attributes to define that resource.
6. Data Source:
Data source performs a read-only operation. It allows data to be fetched or computed from resources/entities that are not defined or managed by Terraform or the current Terraform configuration.
7. Plan:
It is one of the stages in the Terraform lifecycle where it determines what needs to be created, updated, or destroyed to move from the real/current state of the infrastructure to the desired state.
8. Apply:
It is one of the stages in the Terraform lifecycle where it applies the changes real/current state of the infrastructure in order to achieve the desired state.
Terraform States
Everything Terraform does revolves around State. If you don’t understand how Terraform manages and maintains State; you don’t understand how Terraform works
let’s imagine you are in a restaurant and when the waiter comes over to your table, you order the following dishes:
- Soup
- Steak
- Side of chips
- Ice cream
In Terraform terminology this is your “desired state”, in order for you to be satisfied, you want all 4 of those items you’ve ordered.
- Well, Terraform is exactly the same, and in Terraform terminology, when our “desired state” gets manifested into reality, it becomes known as the “actual state”.
When we tell Terraform to do a deployment, it will do a sequence of steps:
Terraform, there is a chance that someone has “removed” your resource, so Terraform must first check if our resource first exists in State (AKA has been created previously), and then it must check with Azure to ensure that the actual state matches what it expects to find.
Please note: The term commonly given to this chain of operations to bring desired and actual state into alignment is: Reconciliation; Terraform reconciles desired state into the actual state.
CRUD in Terraform
Create
- A desired resource doesn’t exist in our Terraform State.
- A desired resource exists in Terraform State, but ends up not actually existing in our Cloud.
Update
- A desired resource exists in Terraform State, but is configured differently in our Cloud.
Delete
- A resource gets removed from our desired state, but still exists in our Terraform State.
Working of Terraform
- It will parse our HCL configuration/code files.
- Using the information in our HCL, Terraform will build up a graph of all the resources we want to provision (desired state) and figure out any dependencies between them to try and decide a logical order they need to be created.
- Terraform will next inspect its State to better understand what it has and hasn’t deployed (if it is our first deployment, the State will be empty). This is known as the perceived state. It is a perceived state because there is a disconnect between what Terraform “thinks” and what “actually” exists.
- Terraform next performs a logical delta between our desired state, and what it knows to be our perceived state. It then decides which CRUD actions it needs to complete, and the order to perform them in, order to bring our perceived state in line with our desired state.
- Terraform next performs all necessary operations to achieve the desired state. The result of this operation will be that resources will likely start to appear in our Azure subscription and this then becomes known as the actual state.
- Terraform updates the state to reflect what it has done.
What happens when we re-run our Terraform?
- Terraform parses our HCL configuration.
- It sees our desired state is to have a Resource Group
- It checks in our State to see if there is an entry for a Resource Group with the state identifier
example
(as mentioned before, notice that the resource is identified as “example”, not the name attribute “example-resources”, just like with variables in other languages) - Terraform sees that the State contains an entry (perceived state), and so next go to Azure to query the actual state of the Resource Group.
- Azure reports back a 404 that our Provider will interpret as the Resource Group does not exist 😱
- Terraform now performs the delta between “Desired state” and “Actual state” and realizes that the necessary action to perform is to Create it.
- Terraform performs the necessary actions to create the Resource Group
- Terraform updates its State as necessary.
What if the State file is lost and Terraform is re-run?
- Terraform parses our HCL configuration/code files.
- Terraform will build up a graph of all the resources we want.
- Terraform checks the State and finds it empty.
- Terraform does its logical delta between our desired state and what is in its State, and decides it needs to create a resource (remember, Terraform hasn’t queried Azure because the state file didn’t exist/was empty)
- Terraform attempts to create a resource group named “example-resources”… BANG!!! 💥
- We now have a situation where Terraform errors because Azure already has a resource group with that name and we cannot create another one with the same name.
Terraform does not know anything about provisioned resources in your Cloud Provider unless it’s in the State
Terraform Lifecycle
Initialization
$ terraform init
- The first command we need to run is initialized.
- This command is used to initialize a working directory containing
- Terraform configuration files and instructs Terraform to interrogate the HCL files, determine the Providers needed, download them, and initialize a State if it doesn’t already exist.
- It is perfectly safe to run init multiple times, and in fact, you will likely need to do so if you add new providers or change any settings within your Terraform block.
Planning
$ terraform plan
- The next command we run is
plan
. - This command instructs Terraform to parse our HCL files, build a graph of our resources, check its state and attempt to come up with an execution plan to perform.
- It is perfectly valid for our
init
to succeed but ourplan
to fail. This is becauseinit
doesn’t really concern itself with trying to determine if any of our resources are valid or if they exist. - Only when Terraform starts to interpret our “desired state” will we start to find syntax errors in our files.
- The output of the
plan
command will be a complex list of operations that Terraform has decided to perform. - It is highly recommended that you review these changes to ensure they match with those you expected to see as destructive operations can be costly
- if they are done in error. Think of this as a “dry run” of all your Terraform.
- It is also worth noting that the output of the plan can be saved into a file and used for the next step
apply
as this ensures that there can be no confusion/discrepancies between what was planned and that which was applied.
Applying
$ terraform apply
- Simply put this operation tells Terraform to execute all its planned operations.
- This will usually cause Terraform to redo its plan (unless a plan file is provided to the command) and then present you with an “Are you sure?” prompt. This is your point of no return.
- If you say yes then Terraform is going to start really provisioning things for you.
This is probably the simplest of all the commands, but it does have some interesting abilities, such as “auto approval” and the ability to increase or decrease the parallelization of the execution.
1If you are renaming resource identifiers, make sure the change is really really really really necessary!
TLDR
- Terraform is a state-driven engine that allows us to provision cloud infrastructure easily and consistently.
- Terraform utilizes code known as HCL (Hashicorp Configuration Language)
- HCL uses the keyword
resource
to define “resources” we wish to have provisioned in our Cloud. - HCL allows us to use the configuration/output from one resource as the input for the configuration/attributes of another resource.
- Terraform interfaces with different Cloud technologies using Providers.
- When terraform runs it will parse the HCL files and build a graph of resources we want — known as Desired State
- By “chaining” resources together, it enables Terraform to make explicit dependencies between objects in its graph.
- Terraform stores knowledge about all the resources it has provisioned previously in a file known as its State file.
- The contents of the State file are known as Perceived State — the state Terraform left the environment the last time it was run using our HCL files.
- Terraform uses Backends to determine how the State should be persisted.
- There are many different Backends that can be used depending on the Cloud Service Provider we use and how we want Terraform to persist State.
- When Terraform wants to provide a resource, and that resource exists in its Perceived State, it will interrogate the Cloud Provider to determine the Actual State.
- If there are discrepancies between desired, perceived, and actual states, Terraform will determine the corrective action required to bring actual in line with desired.
- Terraform has 3 distinct lifecycle stages:
init
,plan
andapply
- If you rename a resource identifier, Terraform will act as if it is a brand new resource it’s never seen before, and purge everything to do with resources relating to the old identifier.