Terraform Understand Keys

Count / For_Each Recap

I previously wrote an article on when to use count vs for_each loops. I’ve seen a number of instances lately where people are either using a for_each loop improperly or using count in new configurations without really understanding the problems this will cause when those resources are managed or updated in the future.

To recap the previous article, for_each is preferable to count in most cases because every object that’s created is managed independently of the other objects you are looping through. By comparison, count evaluates each item as an element of a list, so removing or inserting any list item except the last shifts the index of every item after it. That may trigger re-creations that could be catastrophic depending on the type of resources that were created.

Keys In State

The general guidance to use for_each is often enough. However, I recently saw a configuration that basically turned the for_each into a count and nullified the advantage, something like:

resource "azurerm_storage" "example" {
   for_each = {for i, v in var.item_list : i => v}

   property = each.value
}

To understand the problem with that configuration, you need to understand two things: How keys work in objects, and how Terraform is recording your resources in state.

Collections and Keys

A collection is a type of variable that contains multiple items or elements. So a collection could be a List or a Map, or an Array or Object in more general terms.

As in any other programming language, to retrieve an individual element of that collection you need to reference that element using a key. With a List/Array, that would be a 0-based index (integer), so to get the first element in a list you would use var.my_list[0], or to get the fifth item it would be var.my_list[4], etc.

Maps, like Hash Tables or Objects in other languages, don’t use an index. They use a named key instead. So if you have a map:

my_map = {
  location       = "eastus",
  resource_group = "my_rg"
}

You could retrieve the location eastus with the named location key: var.my_map["location"].

For lists, the key is a number representing the location of the element in the list, and for maps, the key is the property name.
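The two kinds of key can be tried side by side in a scratch configuration. This is a minimal sketch that uses locals and outputs only, so it needs no provider; the names my_list, first_item, and location are hypothetical and chosen just for illustration.

```hcl
# Hypothetical values, used only to show the two kinds of key.
locals {
  my_list = ["storage1", "storage2", "storage3"]

  my_map = {
    location       = "eastus"
    resource_group = "my_rg"
  }
}

# Lists are addressed by a zero-based integer index.
output "first_item" {
  value = local.my_list[0] # "storage1"
}

# Maps are addressed by a named key.
output "location" {
  value = local.my_map["location"] # "eastus"
}
```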

Resources, Keys, and State

It’s important to understand keys because every resource you create with a loop in Terraform is saved under a key in the state file. You need to understand keys to understand why that example configuration may cause problems.

Let’s say you create three storage accounts using count with a List variable:

item_list = [
  "storage1",
  "storage2",
  "storage3"
]

resource "azurerm_storage" "example" {
   count = length(var.item_list)

   # each.value only exists under for_each; with count, index into the list.
   name = var.item_list[count.index]
}

After creation, those storage accounts are stored in state and can be referenced elsewhere in your configuration using:

azurerm_storage.example[0] # storage1
azurerm_storage.example[1] # storage2
azurerm_storage.example[2] # storage3

The keys in this example are the main issue, and the important consideration when making resources with loops. Because a count will always start at zero and increment by 1, if you were to remove "storage2" from var.item_list above, then the resources change to:

azurerm_storage.example[0] # storage1
azurerm_storage.example[1] # storage3

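storage3 is now azurerm_storage.example[1]. Terraform compares state by key, so it sees the resource at index 1 changing its name from storage2 to storage3 and the resource at index 2 disappearing. For a resource whose name cannot be changed in place, the result is that the original storage3 is deleted and re-created, even though nothing about it changed.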

This is why index-based keys are so dangerous in a Terraform configuration. If you are not paying close attention, it’s easy to put yourself in a difficult situation later if you need to modify a list input variable.
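The index shift can be reproduced without an Azure subscription. The sketch below is not the original configuration: it assumes Terraform 1.4 or later and uses the built-in terraform_data resource as a stand-in for the storage account, with triggers_replace mimicking a name that cannot be updated in place.

```hcl
variable "item_list" {
  type    = list(string)
  default = ["storage1", "storage2", "storage3"]
}

# terraform_data is a stand-in for a real resource such as a storage account.
resource "terraform_data" "example" {
  count = length(var.item_list)

  # Replace the instance whenever the name at this index changes,
  # mimicking a resource whose name cannot be updated in place.
  triggers_replace = var.item_list[count.index]
}
```

After an initial apply, removing "storage2" from the default and running terraform plan should propose replacing terraform_data.example[1] and destroying terraform_data.example[2], even though the storage3 entry itself never changed.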

Use Named References

Now that we understand why index-based keys are so troublesome, let’s look again at the problematic configuration example from earlier:

resource "azurerm_storage" "example" {
   for_each = {for i, v in var.item_list : i => v}

   property = each.value
}

Declaring two variables in a for loop with a List as the input creates index-based keys for each element in the list, exactly the same as a Count loop. So if we use the same var.item_list, it becomes:

{for i, v in var.item_list : i => v} = {
  0 = "storage1",
  1 = "storage2",
  2 = "storage3"
}

By now I’m sure you see right away where the problem is.

That configuration is simple to correct, and we can use it as a final illustration of named keys. If we instead use either the toset() function or use the value as the key, like {for i, v in var.item_list : v => v} (which gives each instance the same key that toset() would), our problem is fixed:

resource "azurerm_storage" "example" {
   for_each = {for index, value in var.item_list : value => value}

   property = each.value
}

The for expression now produces a map in which each key is identical to its value:

{for index, value in var.item_list : value => value} = {
  storage1 = "storage1",
  storage2 = "storage2",
  storage3 = "storage3"
}

But most importantly, every resource that is created now exists in state with its own unique identifier that is not linked in any way to the other resources created with the same loop. Each resource is instead referenced using a unique property of the resource itself:

azurerm_storage.example["storage1"]
azurerm_storage.example["storage2"]
azurerm_storage.example["storage3"]
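The toset() alternative mentioned above can be sketched the same way. This again assumes Terraform 1.4 or later and uses the built-in terraform_data resource as a hypothetical stand-in for the storage account, so it is an illustration rather than the original configuration.

```hcl
variable "item_list" {
  type    = list(string)
  default = ["storage1", "storage2", "storage3"]
}

resource "terraform_data" "example" {
  # With a set of strings, each.key and each.value are the same string.
  for_each = toset(var.item_list)

  input = each.value
}
```

With this form, removing "storage2" from the list should only affect terraform_data.example["storage2"], because the other instances keep their keys.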

Referencing Looped Resources

It’s important to understand how keys are used in order to avoid the issues described above, but there is one additional benefit to understanding them. If you know how Terraform uses keys to store objects in state, then it’s easy to understand how to reference objects created by loops elsewhere in your configuration or as outputs.

For index-based keys this is simple because the key is just a number. For name-based keys, it’s also easy to identify the key: all you need to pay attention to is the each.key attribute of the objects.

If you are looping through a variable provided to the configuration, you may or may not have control over the key, but you know it will always be the property name of the objects in the variable you are looping through:

sample_loop = {
  resource1 = {
    name        = "kevin"
    description = "silly"
  },
  resource2 = {
    name        = "kevin"
    description = "silly"
  }
}

If you are for_eaching through var.sample_loop then the each.key values will be resource1 and resource2, and could be referenced using those keys either as a looped resource:

azurerm_storage.example["resource1"]

or as a looped module:

module.example["resource2"]
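Those references can be put to work in outputs. The following is a minimal sketch, assuming Terraform 1.4 or later; terraform_data stands in for the real resource, and the output names resource1_name and names_by_key are hypothetical.

```hcl
variable "sample_loop" {
  type = map(object({
    name        = string
    description = string
  }))
  default = {
    resource1 = { name = "kevin", description = "silly" }
    resource2 = { name = "kevin", description = "silly" }
  }
}

# Stand-in resource; each instance is stored in state under its map key.
resource "terraform_data" "example" {
  for_each = var.sample_loop
  input    = each.value
}

# One instance, addressed by its key.
output "resource1_name" {
  value = terraform_data.example["resource1"].output.name
}

# All instances, keyed the same way as the input map.
output "names_by_key" {
  value = { for key, instance in terraform_data.example : key => instance.output.name }
}
```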

If you are not looping through a provided variable and are instead taking input and formatting the data as a local variable, this becomes crucial to keep in mind. You can set the keys to whatever you want, which makes your life much easier when creating complicated sets of resources that re-use objects, because you can write your configuration so that each related resource or module uses the same key when referencing the others.
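As a rough sketch of that idea, the example below reshapes a list input into a local map keyed by name, then loops two resources over the same map so that one can look up the other by each.key. Everything here is hypothetical: the teams variable, the teams_by_name local, and the use of terraform_data (Terraform 1.4 or later) in place of real resource group and storage resources.

```hcl
variable "teams" {
  type = list(object({
    name     = string
    location = string
  }))
  default = [
    { name = "alpha", location = "eastus" },
    { name = "beta", location = "westus" }
  ]
}

locals {
  # Key each entry by its name rather than by its list position.
  teams_by_name = { for team in var.teams : team.name => team }
}

# Stand-in for a resource group, one per team.
resource "terraform_data" "group" {
  for_each = local.teams_by_name
  input    = each.value.location
}

# Stand-in for a storage account, looped over the same keys.
resource "terraform_data" "storage" {
  for_each = local.teams_by_name

  # The same key finds the matching group instance.
  input = {
    name     = "${each.key}storage"
    location = terraform_data.group[each.key].output
  }
}
```

Because both loops share one set of keys, adding or removing a team touches only the instances stored under that team's name.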

That concept may be difficult to envision, but I’ll be writing two posts soon that put the concepts into practice. The first will be an in-depth look at variables created using for, and how to nest them without losing your mind. The second will go over how to use JSON data files as an easy interface for other teams to provision Terraform resources without modifying your code. Keep watch for both.