Monday, 2 October 2017

Terraform: A Complete Tutorial for Beginners

Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions.
Configuration files describe to Terraform the components needed to run a single application or your entire datacenter. Terraform generates an execution plan describing what it will do to reach the desired state, and then executes it to build the described infrastructure. As the configuration changes, Terraform is able to determine what changed and create incremental execution plans which can be applied.
The infrastructure Terraform can manage includes low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries, SaaS features, etc.
The key features of Terraform are:
  • Infrastructure as Code: Infrastructure is described using a high-level configuration syntax. This allows a blueprint of your datacenter to be versioned and treated as you would any other code. Additionally, infrastructure can be shared and re-used.
  • Execution Plans: Terraform has a "planning" step where it generates an execution plan. The execution plan shows what Terraform will do when you call apply. This lets you avoid any surprises when Terraform manipulates infrastructure.
  • Resource Graph: Terraform builds a graph of all your resources, and parallelizes the creation and modification of any non-dependent resources. Because of this, Terraform builds infrastructure as efficiently as possible, and operators get insight into dependencies in their infrastructure.
  • Change Automation: Complex changesets can be applied to your infrastructure with minimal human interaction. With the previously mentioned execution plan and resource graph, you know exactly what Terraform will change and in what order, avoiding many possible human errors.

Why Terraform:

Terraform provides a flexible abstraction of resources and providers. This model allows for representing everything from physical hardware, virtual machines, and containers, to email and DNS providers. Because of this flexibility, Terraform can be used to solve many different problems. This means there are a number of existing tools that overlap with the capabilities of Terraform. 

Multi-Tier Applications

A very common pattern is the N-tier architecture. The most common 2-tier architecture is a pool of web servers that use a database tier. Additional tiers get added for API servers, caching servers, routing meshes, etc. This pattern is used because the tiers can be scaled independently and provide a separation of concerns.
Terraform is an ideal tool for building and managing these infrastructures. Each tier can be described as a collection of resources, and the dependencies between each tier are handled automatically; Terraform will ensure the database tier is available before the web servers are started and that the load balancers are aware of the web nodes. Each tier can then be scaled easily using Terraform by modifying a single count configuration value. Because the creation and provisioning of a resource is codified and automated, elastically scaling with load becomes trivial.
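As a minimal sketch of that idea (reusing the AMI and instance type from the example later in this tutorial), scaling a web tier is a matter of changing a single count value:
resource "aws_instance" "web" {
  # Changing count is all that is needed to scale this tier up or down
  count         = 4
  ami           = "ami-0d729a60"
  instance_type = "t2.micro"
}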

Self-Service Clusters

At a certain organizational size, it becomes very challenging for a centralized operations team to manage a large and growing infrastructure. Instead it becomes more attractive to make "self-serve" infrastructure, allowing product teams to manage their own infrastructure using tooling provided by the central operations team.
Using Terraform, the knowledge of how to build and scale a service can be codified in a configuration. Terraform configurations can be shared within an organization enabling customer teams to use the configuration as a black box and use Terraform as a tool to manage their services.

Terraform basics:

Terraform must first be installed on your machine. Terraform is distributed as a binary package for all supported platforms and architectures. This tutorial will not cover how to compile Terraform from source.

Installing Terraform

To install Terraform, find the appropriate package for your system and download it. Terraform is packaged as a zip archive.
After downloading Terraform, unzip the package into a directory where Terraform will be installed. The directory will contain a single binary program named terraform. The final step is to make sure that directory is on the PATH; examples for Linux/Mac and Windows follow below.
Example for Linux/Mac - Type the following into your terminal:
PATH=/usr/local/terraform/bin:/home/your-user-name/terraform:$PATH
Example for Windows - Type the following into PowerShell:
[Environment]::SetEnvironmentVariable("PATH", $env:PATH + ({;C:\terraform},{C:\terraform})[$env:PATH[-1] -eq ';'], "User")

Verifying the Installation

After installing Terraform, verify the installation worked by opening a new terminal session and checking that terraform is available. By executing terraform with no arguments, you should see help output similar to that below:
$ terraform
Usage: terraform [--version] [--help] <command> [args]

The available commands for execution are listed below.
The most common, useful commands are shown first, followed by
less common or more advanced commands. If you're just getting
started with Terraform, stick with the common commands. For the
other commands, please read the help and docs before usage.

Common commands:
    apply              Builds or changes infrastructure
    destroy            Destroy Terraform-managed infrastructure
    fmt                Rewrites config files to canonical format
    get                Download and install modules for the configuration
    graph              Create a visual graph of Terraform resources
    import             Import existing infrastructure into Terraform
    init               Initializes Terraform configuration from a module
    output             Read an output from a state file
    plan               Generate and show an execution plan
    push               Upload this Terraform module to Atlas to run
    refresh            Update local state file against real resources
    remote             Configure remote state storage
    show               Inspect Terraform state or plan
    taint              Manually mark a resource for recreation
    untaint            Manually unmark a resource as tainted
    validate           Validates the Terraform files
    version            Prints the Terraform version

All other commands:
    state              Advanced state management
If you get an error that terraform could not be found, then your PATH environment variable was not set up properly. Please go back and ensure that your PATH variable contains the directory where Terraform was installed.
Otherwise, Terraform is installed and ready to go!

Configuration

The set of files used to describe infrastructure in Terraform is simply known as a Terraform configuration. We're going to write our first configuration now to launch a single AWS EC2 instance.
The format of the configuration files is documented in Terraform's configuration documentation. Configuration files can also be JSON, but we recommend only using JSON when the configuration is generated by a machine.
The entire configuration is shown below. We'll go over each part after. Save the contents to a file named example.tf. Verify that there are no other *.tf files in your directory, since Terraform loads all of them.
provider "aws" {
  access_key = "ACCESS_KEY_HERE"
  secret_key = "SECRET_KEY_HERE"
  region     = "us-east-1"
}

resource "aws_instance" "example" {
  ami           = "ami-0d729a60"
  instance_type = "t2.micro"
}
Replace the ACCESS_KEY_HERE and SECRET_KEY_HERE with your AWS access key and secret key, which you can generate from the IAM section of the AWS console. We're hardcoding them for now, but will extract these into variables later in this guide.
This is a complete configuration that Terraform is ready to apply. The general structure should be intuitive and straightforward.
The provider block is used to configure the named provider, in our case "aws." A provider is responsible for creating and managing resources. Multiple provider blocks can exist if a Terraform configuration is composed of multiple providers, which is a common situation.
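Multiple provider blocks can mean different providers (say, aws plus heroku) or, as in the hedged sketch below, a second configuration of the same provider distinguished by an alias; this is for illustration only and the AMI ID is a placeholder:
provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

resource "aws_instance" "west_example" {
  provider      = "aws.west"       # use the aliased provider instead of the default
  ami           = "ami-13be557e"   # placeholder AMI ID
  instance_type = "t2.micro"
}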
The resource block defines a resource that exists within the infrastructure. A resource might be a physical component such as an EC2 instance, or it can be a logical resource such as a Heroku application.
The resource block has two strings before opening the block: the resource type and the resource name. In our example, the resource type is "aws_instance" and the name is "example." The prefix of the type maps to the provider. In our case "aws_instance" automatically tells Terraform that it is managed by the "aws" provider.
Within the resource block itself is configuration for that resource. This is dependent on each resource provider and is fully documented within the providers reference. For our EC2 instance, we specify an AMI for Ubuntu, and request a "t2.micro" instance so we qualify under the free tier.
In the same directory as the example.tf file you created, run terraform plan. You should see output similar to what is copied below. We've truncated some of the output to save space.
$ terraform plan
...

+ aws_instance.example
    ami:                      "ami-0d729a60"
    availability_zone:        "<computed>"
    ebs_block_device.#:       "<computed>"
    ephemeral_block_device.#: "<computed>"
    instance_state:           "<computed>"
    instance_type:            "t2.micro"
    key_name:                 "<computed>"
    placement_group:          "<computed>"
    private_dns:              "<computed>"
    private_ip:               "<computed>"
    public_dns:               "<computed>"
    public_ip:                "<computed>"
    root_block_device.#:      "<computed>"
    security_groups.#:        "<computed>"
    source_dest_check:        "true"
    subnet_id:                "<computed>"
    tenancy:                  "<computed>"
    vpc_security_group_ids.#: "<computed>"
terraform plan shows what changes Terraform will apply to your infrastructure given the current state of your infrastructure as well as the current contents of your configuration.
If terraform plan failed with an error, read the error message and fix the error that occurred. At this stage, it is probably a syntax error in the configuration.
The output format is similar to the diff format generated by tools such as Git. The output has a "+" next to "aws_instance.example", meaning that Terraform will create this resource. Beneath that, it shows the attributes that will be set. When the value displayed is "<computed>", it means that the value won't be known until the resource is created.

Apply

The plan looks good, our configuration appears valid, so it's time to create real resources. Run terraform apply in the same directory as your example.tf, and watch it go! It will take a few minutes since Terraform waits for the EC2 instance to become available.
$ terraform apply
aws_instance.example: Creating...
  ami:                      "" => "ami-0d729a60"
  instance_type:            "" => "t2.micro"
  [...]

aws_instance.example: Still creating... (10s elapsed)
aws_instance.example: Creation complete

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

...
Done! You can go to the AWS console to prove to yourself that the EC2 instance has been created.
Terraform also puts some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. It is generally recommended to set up remote state when working with Terraform; that way, any potential secrets stored in the state file are not checked into version control.
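One way to do this (a sketch, not part of the example configuration; the bucket, key, and region are placeholders and you would run terraform init afterwards to configure the backend) is a backend block:
terraform {
  backend "s3" {
    # bucket, key, and region are placeholder values for illustration
    bucket = "my-terraform-state"
    key    = "getting-started/terraform.tfstate"
    region = "us-east-1"
  }
}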
You can inspect the state using terraform show:
$ terraform show
aws_instance.example:
  id = i-32cf65a8
  ami = ami-0d729a60
  availability_zone = us-east-1a
  instance_state = running
  instance_type = t2.micro
  private_ip = 172.31.30.244
  public_dns = ec2-52-90-212-55.compute-1.amazonaws.com
  public_ip = 52.90.212.55
  subnet_id = subnet-1497024d
  vpc_security_group_ids.# = 1
  vpc_security_group_ids.3348721628 = sg-67652003
You can see that by creating our resource, we've also gathered a lot more metadata about it. This metadata can actually be referenced for other resources or outputs, which will be covered later in the getting started guide.

Provisioning

The EC2 instance we launched at this point is based on the AMI given, but has no additional software installed. If you're running an image-based infrastructure (perhaps creating images with Packer), then this is all you need; otherwise, provisioners can install and configure software on the instance, and examples of these appear under "Creating AWS resources" later in this tutorial.
Destroying your infrastructure is a rare event in production environments. But if you're using Terraform to spin up multiple environments such as development, test, and QA, then destroying them is a useful action.

Plan

Before destroying our infrastructure, we can use the plan command to see what resources Terraform will destroy.
$ terraform plan -destroy
...

- aws_instance.example
With the -destroy flag, we're asking Terraform to plan a destroy, where all resources under Terraform management are destroyed. You can use this output to verify exactly what resources Terraform is managing and will destroy.

Destroy

Let's destroy the infrastructure now:
$ terraform destroy
aws_instance.example: Destroying...

Apply complete! Resources: 0 added, 0 changed, 1 destroyed.

...
The terraform destroy command should ask you to verify that you really want to destroy the infrastructure. Terraform only accepts the literal "yes" as an answer as a safety mechanism. Once entered, Terraform will go through and destroy the infrastructure.
Just like with apply, Terraform is smart enough to determine what order things should be destroyed. In our case, we only had one resource, so there wasn't any ordering necessary. But in more complicated cases with multiple resources, Terraform will destroy in the proper order.

Implicit and Explicit Dependencies

Most dependencies in Terraform are implicit: Terraform is able to infer dependencies based on usage of attributes of other resources.
Using this information, Terraform builds a graph of resources. This tells Terraform not only in what order to create resources, but also what resources can be created in parallel. In the elastic IP example referenced below, since the IP address depends on the EC2 instance, the two cannot be created in parallel.
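The elastic IP is not shown earlier in this post; declared with only an implicit dependency, it would look like the sketch below, where interpolating aws_instance.example.id is what creates the dependency:
resource "aws_eip" "ip" {
    # Referencing the instance's id creates an implicit dependency
    instance = "${aws_instance.example.id}"
}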
Implicit dependencies work well and are usually all you ever need. However, you can also specify explicit dependencies with the depends_on parameter which is available on any resource. For example, we could modify the "aws_eip" resource to the following, which effectively does the same thing and is redundant:
resource "aws_eip" "ip" {
    instance = "${aws_instance.example.id}"
    depends_on = ["aws_instance.example"]
}
If you're ever unsure about the dependency chain that Terraform is creating, you can use the terraform graph command to view the graph. This command outputs a dot-formatted graph which can be viewed with Graphviz.
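For example, assuming Graphviz's dot tool is installed, the graph can be rendered to an image:
$ terraform graph | dot -Tpng > graph.png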

Non-Dependent Resources

We can now augment the configuration with another EC2 instance. Because this doesn't rely on any other resource, it can be created in parallel to everything else.
resource "aws_instance" "another" {
  ami           = "ami-13be557e"
  instance_type = "t2.micro"
}
You can view the graph with terraform graph to see that nothing depends on this and that it will likely be created in parallel.
Before moving on, remove this resource from your configuration and run terraform apply again to destroy it. We won't use the second instance anymore in this guide.

Defining Variables

Let's first extract our access key, secret key, and region into a few variables. Create another file variables.tf with the following contents.
Note: the file can be named anything, since Terraform loads all files ending in .tf in a directory.
variable "access_key" {}
variable "secret_key" {}
variable "region" {
  default = "us-east-1"
}
This defines three variables within your Terraform configuration. The first two have empty blocks {}. The third sets a default. If a default value is set, the variable is optional. Otherwise, the variable is required. If you run terraform plan now, Terraform will prompt you for the values for unset string variables.
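For example, running terraform plan before assigning the required variables will produce a prompt along these lines (output abbreviated):
$ terraform plan
var.access_key
  Enter a value: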

Using Variables in Configuration

Next, replace the AWS provider configuration with the following:
provider "aws" {
  access_key = "${var.access_key}"
  secret_key = "${var.secret_key}"
  region     = "${var.region}"
}
This uses more interpolations, this time prefixed with var., which tells Terraform that you're accessing variables. This configures the AWS provider with the given variables.

Assigning Variables

There are multiple ways to assign variables. They are listed below in descending order of precedence.

Command-line flags

You can set variables directly on the command line with the -var flag. Any command in Terraform that inspects the configuration accepts this flag, such as apply, plan, and refresh:
$ terraform plan \
  -var 'access_key=foo' \
  -var 'secret_key=bar'
...
Once again, setting variables this way will not save them, and they'll have to be input repeatedly as commands are executed.

From a file

To persist variable values, create a file and assign variables within this file. Create a file named terraform.tfvars with the following contents:
access_key = "foo"
secret_key = "bar"
If a terraform.tfvars file is present in the current directory, Terraform automatically loads it to populate variables. If the file is named something else, you can use the -var-file flag directly to specify a file. These files are the same syntax as Terraform configuration files. And like Terraform configuration files, these files can also be JSON.

From environment variables

Terraform will read environment variables in the form of TF_VAR_name to find the value for a variable. For example, the TF_VAR_access_key variable can be set to set the access_key variable.
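For example, on Linux/Mac you might export these before running Terraform (a sketch; the values are placeholders):
$ export TF_VAR_access_key="your-access-key"
$ export TF_VAR_secret_key="your-secret-key"
$ terraform plan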
We don't recommend saving usernames and passwords to version control, but you can create a local secret variables file and use -var-file to load it.
You can use multiple -var-file arguments in a single command, with some checked in to version control and others not checked in. For example:
$ terraform plan \
  -var-file="secret.tfvars" \
  -var-file="production.tfvars"
Note: Environment variables can only populate string-type variables. List and map type variables must be populated via one of the other mechanisms.
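As a sketch of those other mechanisms (the variable names here are illustrative), a list variable can carry a default in the configuration and a map variable can be assigned in terraform.tfvars:
variable "availability_zones" {
  type    = "list"
  default = ["us-east-1a", "us-east-1b"]
}

variable "instance_tags" {
  type = "map"
}
and in terraform.tfvars:
instance_tags = {
  Name        = "web"
  Environment = "dev"
}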

Creating AWS resources:

  1. VPC
    resource "aws_vpc" "main" {
        cidr_block = "10.0.0.0/16"
    }
  2. Internet Gateway
    resource "aws_internet_gateway" "gw" {
        vpc_id = "${aws_vpc.main.id}"
    
        tags {
            Name = "main"
        }
    }
    
    The following arguments are supported:
    • vpc_id - (Required) The ID of the VPC in which to create the internet gateway.
    • tags - (Optional) A mapping of tags to assign to the resource.
  3. Security Group
    Basic usage
    resource "aws_security_group" "allow_all" {
      name = "allow_all"
      description = "Allow all inbound traffic"
    
      ingress {
          from_port = 0
          to_port = 0
          protocol = "-1"
          cidr_blocks = ["0.0.0.0/0"]
      }
    
      egress {
          from_port = 0
          to_port = 0
          protocol = "-1"
          cidr_blocks = ["0.0.0.0/0"]
          prefix_list_ids = ["pl-12c4e678"]
      }
    }
  4. Subnet
    variable "subnet_id" {}
    
    data "aws_subnet" "selected" {
      id = "${var.subnet_id}"
    }
    
    resource "aws_security_group" "subnet" {
      vpc_id = "${data.aws_subnet.selected.vpc_id}"
    
      ingress {
        cidr_blocks = ["${data.aws_subnet.selected.cidr_block}"]
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
      }
    }
  5. Route Table
    variable "subnet_id" {}
    
    data "aws_route_table" "selected" {
      subnet_id = "${var.subnet_id}"
    }
    
    resource "aws_route" "route" {
      route_table_id = "${data.aws_route_table.selected.id}"
      destination_cidr_block = "10.0.1.0/22"
      vpc_peering_connection_id = "pcx-45ff3dc1"
    }
  6. Route table association
    resource "aws_route_table_association" "a" {
        subnet_id = "${aws_subnet.foo.id}"
        route_table_id = "${aws_route_table.bar.id}"
    }
  7. EC2 instance
    # Create a new instance of the latest Ubuntu 14.04 on an
    # t2.micro node with an AWS Tag naming it "HelloWorld"
    provider "aws" {
        region = "us-west-2"
    }
    
    data "aws_ami" "ubuntu" {
      most_recent = true
      filter {
        name = "name"
        values = ["ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"]
      }
      filter {
        name = "virtualization-type"
        values = ["hvm"]
      }
      owners = ["099720109477"] # Canonical
    }
    
    resource "aws_instance" "web" {
        ami = "${data.aws_ami.ubuntu.id}"
        instance_type = "t2.micro"
        tags {
            Name = "HelloWorld"
        }
    }
  8. EC2 instance elastic IP
    resource "aws_eip" "lb" {
      instance = "${aws_instance.web.id}"
      vpc      = true
    }
  9. Provisioner without bastion
    Many provisioners require access to the remote resource. For example, a provisioner may need to use SSH or WinRM to connect to the resource.
    Terraform uses a number of defaults when connecting to a resource, but these can be overridden using a connection block in either a resource or provisioner. Any connection information provided in a resource will apply to all the provisioners, but it can be scoped to a single provisioner as well. One use case is to have an initial provisioner connect as the root user to setup user accounts, and have subsequent provisioners connect as a user with more limited permissions.
    # Copies the file as the root user using SSH
    provisioner "file" {
        source = "conf/myapp.conf"
        destination = "/etc/myapp.conf"
        connection {
            type = "ssh"
            user = "root"
            password = "${var.root_password}"
        }
    }
    
    # Copies the file as the Administrator user using WinRM
    provisioner "file" {
        source = "conf/myapp.conf"
        destination = "C:/App/myapp.conf"
        connection {
            type = "winrm"
            user = "Administrator"
            password = "${var.admin_password}"
        }
    }
  10. Provisioner with bastion:provisioner "file" {
    source = "configure.sh"
    destination = "/tmp/configure.sh"
    connection {
    agent = false
    bastion_host = "${var.nat_public_ip}"
    bastion_user = "ec2-user"
    bastion_port = 22
    bastion_private_key = "${file("${var.tf_home}/${var.aws_key_path}")}"
    user = "centos"
    host = "${self.private_ip}"
    private_key = "${file("${var.tf_home}/${var.aws_key_path}")}"
    timeout = "2m"
    }
    }
  11. Remote-exec provisioner
    The remote-exec provisioner invokes a script on a remote resource after it is created. This can be used to run a configuration management tool, bootstrap into a cluster, etc. To invoke a local process, see the local-exec provisioner instead. The remote-exec provisioner supports both ssh and winrm type connections.
    # Run puppet and join our Consul cluster
    resource "aws_instance" "web" {
        ...
        provisioner "remote-exec" {
            inline = [
            "puppet apply",
            "consul join ${aws_instance.web.private_ip}"
            ]
        }
    }
  12. Output
    Outputs are a way to tell Terraform what data is important. This data is output when apply is called, and can be queried using the terraform output command (a usage sketch follows after this list). Let's define an output to show us the public IP address of the elastic IP address that we create. Add this to any of your *.tf files:
    output "ip" {
        value = "${aws_eip.ip.public_ip}"
    }
    
    This defines an output variable named "ip". The value field specifies what the value will be, and almost always contains one or more interpolations, since the output data is typically dynamic. In this case, we're outputting the public_ip attribute of the elastic IP address.
    Multiple output blocks can be defined to specify multiple output variables.
  13. Modules
    Modules in Terraform are self-contained packages of Terraform configurations that are managed as a group. Modules are used to create reusable components, improve organization, and to treat pieces of infrastructure as a black box.
     Create a configuration file with the following contents: 
    provider "aws" {
        access_key = "AWS ACCESS KEY"
        secret_key = "AWS SECRET KEY"
        region = "AWS REGION"
    }
    
    module "consul" {
        source = "github.com/hashicorp/consul/terraform/aws"
    
        key_name = "AWS SSH KEY NAME"
        key_path = "PATH TO ABOVE PRIVATE KEY"
        region = "us-east-1"
        servers = "3"
    }
    
      The module block tells Terraform to create and manage a module. It is very similar to the resource block. It has a logical name -- in this case "consul" -- and a set of configurations.
     The source configuration is the only mandatory key for modules. It tells Terraform where the module can be retrieved. Terraform automatically downloads and manages modules for you. For our example, we're getting the module directly from GitHub. Terraform can retrieve modules from a variety of sources including Git, Mercurial, HTTP, and file paths.
     The other configurations are parameters to our module. Please fill them in with the proper values.
     Prior to running any command such as plan with a configuration that uses modules, you'll have to get the modules. This is done using the get command.
    $ terraform get
    ...
    
    This command will download the modules if they haven't been already. By default, the command will not check for updates, so it is safe (and fast) to run multiple times. You can use the -update flag to check and download updates.
    With the modules downloaded, we can now plan and apply it. If you run terraform plan, you should see output similar to below:
    $ terraform plan
    ...
    + module.consul.aws_instance.server.0
    ...
    + module.consul.aws_instance.server.1
    ...
    + module.consul.aws_instance.server.2
    ...
    + module.consul.aws_security_group.consul
    ...
    Plan: 4 to add, 0 to change, 0 to destroy.
    
    Conceptually, the module is treated like a black box. In the plan, however, Terraform shows each resource the module manages so you can see every detail about what the plan will do. If you'd like compressed plan output, you can specify the -module-depth flag to get Terraform to output summaries by module.
    Next, run terraform apply to create the module. Note that the resources this module creates are outside of the AWS free tier, so this will have some cost associated with it.
    $ terraform apply
    ...
    Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
    
    After a few minutes, you'll have a three-server Consul cluster up and running, without any knowledge of how Consul works, how to install it, or how to configure it into a cluster!
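As a usage sketch for the output variable defined in item 12 above (assuming the corresponding aws_eip resource has been applied), the value can be read back at any time:
$ terraform output ip
This prints the public IP recorded in the state file without touching any real infrastructure.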
