Infrastructure as Code with Terraform for cloud-agnostic infrastructure management
Introduction to Terraform
Terraform is an open-source Infrastructure as Code (IaC) tool created by HashiCorp that enables you to safely and predictably create, change, and improve infrastructure. It uses a declarative configuration language called HCL (HashiCorp Configuration Language) to manage cloud resources across multiple providers.
Core Concepts of Terraform
Key Terminology
- Provider: Plugins that interact with cloud platforms
- Resource: Infrastructure components (VMs, networks, etc.)
- Module: Reusable configuration packages
- State: Current state of managed infrastructure
- Plan: Execution plan showing what will change
- Apply: Execute changes to reach desired state
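To make these terms concrete, here is a minimal sketch that touches each of them. It assumes the `hashicorp/random` provider, chosen only because it needs no cloud credentials, and the hypothetical names `example` and `pet_name`; none of these appear elsewhere in this article.

```hcl
terraform {
  required_version = ">= 1.0"

  required_providers {
    # Provider: the plugin Terraform downloads during `terraform init`
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
  }
}

# Resource: one managed object; this one only generates a name locally
resource "random_pet" "example" {
  length = 2
}

# Output: a value read back from state after apply
output "pet_name" {
  value = random_pet.example.id
}
```

Running `terraform init`, `terraform plan`, and `terraform apply` in a directory containing this file exercises the Plan and Apply steps. With no backend configured, the State is written to a local `terraform.tfstate` file.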
Terraform Installation and Setup
Installing Terraform
# Linux/macOS with Homebrew
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
# Ubuntu/Debian
wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
# Windows with Chocolatey
choco install terraform
# Verify installation
terraform version
Basic Configuration
# main.tf - Basic Terraform configuration
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.0"
}
}
backend "s3" {
bucket = "my-terraform-state"
key = "production/terraform.tfstate"
region = "us-east-1"
}
}
# Configure AWS Provider
provider "aws" {
region = var.aws_region
profile = var.aws_profile
}
Basic Terraform Configuration
Variables and Outputs
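# variables.tf
variable "aws_region" {
description = "AWS region"
type = string
default = "us-east-1"
}
# Referenced by the provider block in main.tf
variable "aws_profile" {
description = "Named AWS CLI profile used for authentication"
type = string
default = "default"
}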
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t3.micro"
}
variable "environment" {
description = "Deployment environment"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
# outputs.tf
output "instance_public_ip" {
description = "Public IP address of the EC2 instance"
value = aws_instance.web.public_ip
}
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.main.id
}
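Because `environment` has no default, Terraform prompts for it unless a value is supplied. One common way to supply it is a `terraform.tfvars` file, which Terraform loads automatically from the working directory. The values in this sketch are illustrative only.

```hcl
# terraform.tfvars - loaded automatically from the working directory
aws_region    = "us-east-1"
instance_type = "t3.micro"

# Must satisfy the validation rule: dev, staging, or prod
environment = "dev"
```

A value outside the allowed list, such as `"qa"`, should be rejected with the `error_message` defined in the validation block. The same variable can also be set through the `TF_VAR_environment` environment variable.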
Basic Resource Configuration
# ec2-instance.tf
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "main-vpc-${var.environment}"
Environment = var.environment
}
}
resource "aws_subnet" "public" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-east-1a"
tags = {
Name = "public-subnet-${var.environment}"
}
}
resource "aws_instance" "web" {
ami = "ami-0c02fb55956c7d316"
instance_type = var.instance_type
subnet_id = aws_subnet.public.id
vpc_security_group_ids = [aws_security_group.web.id]
user_data = <<-EOF
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "Hello from $(hostname -f)" > /var/www/html/index.html
EOF
tags = {
Name = "web-server-${var.environment}"
}
}
resource "aws_security_group" "web" {
name = "web-sg-${var.environment}"
description = "Allow HTTP and SSH traffic"
vpc_id = aws_vpc.main.id
ingress {
description = "HTTP"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "SSH"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # Open to any address for demonstration only; restrict SSH to a trusted CIDR in practice
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
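As written, the subnet named `public` has no route to the internet, so the web server would not be reachable from outside the VPC. The sketch below adds the missing pieces. The resource names `main` (for the gateway) and `public` (for the route table and association) are assumptions made for this example.

```hcl
# Internet gateway attached to the VPC defined above
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

# Route table sending all non-local traffic to the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Associate the route table with the public subnet
resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
```

The instance also needs a public IP for the `instance_public_ip` output to be populated. In a custom VPC this usually means setting `map_public_ip_on_launch = true` on the subnet or `associate_public_ip_address = true` on the instance.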
Terraform Commands and Workflow
Basic Terraform Workflow
# Initialize Terraform
terraform init
# Validate configuration
terraform validate
# Format code
terraform fmt
# Create execution plan
terraform plan -out=tfplan
# Apply changes
terraform apply tfplan
# Destroy infrastructure
terraform destroy
# Show current state
terraform show
# List resources
terraform state list
Advanced Commands
# Import existing resources
terraform import aws_instance.web i-1234567890abcdef0
# Move resources
terraform state mv aws_instance.old aws_instance.new
# Remove from state
terraform state rm aws_instance.orphaned
# Refresh state (the standalone terraform refresh command is deprecated since v0.15.4)
terraform apply -refresh-only
# Workspace management
terraform workspace new development
terraform workspace select production
terraform workspace list
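Several of these state operations also have declarative equivalents that live in the configuration and go through the normal plan and review cycle. The sketch below mirrors the import, move, and remove commands using the same addresses. It assumes a newer Terraform than the `>= 1.0` constraint used earlier: `moved` blocks need 1.1 or later, `import` blocks 1.5 or later, and `removed` blocks 1.7 or later.

```hcl
# Equivalent of: terraform import aws_instance.web i-1234567890abcdef0
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}

# Equivalent of: terraform state mv aws_instance.old aws_instance.new
moved {
  from = aws_instance.old
  to   = aws_instance.new
}

# Equivalent of: terraform state rm aws_instance.orphaned
# (the resource block itself must also be deleted from the configuration)
removed {
  from = aws_instance.orphaned

  lifecycle {
    # Forget the object without destroying the real instance
    destroy = false
  }
}
```

Because these blocks are ordinary configuration, the resulting state change appears in `terraform plan` output before anything is modified.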
Terraform Modules
Creating Modules
# modules/ec2-instance/main.tf
resource "aws_instance" "this" {
ami = var.ami_id
instance_type = var.instance_type
subnet_id = var.subnet_id
vpc_security_group_ids = var.security_group_ids
key_name = var.key_name
tags = merge(
var.tags,
{
Name = var.name
}
)
}
# modules/ec2-instance/variables.tf
variable "name" {
description = "Instance name"
type = string
}
variable "ami_id" {
description = "AMI ID"
type = string
}
variable "instance_type" {
description = "Instance type"
type = string
default = "t3.micro"
}
variable "subnet_id" {
description = "Subnet ID"
type = string
}
variable "security_group_ids" {
description = "Security group IDs"
type = list(string)
}
variable "key_name" {
description = "SSH key name"
type = string
default = null
}
variable "tags" {
description = "Additional tags"
type = map(string)
default = {}
}
# modules/ec2-instance/outputs.tf
output "instance_id" {
description = "Instance ID"
value = aws_instance.this.id
}
output "private_ip" {
description = "Private IP address"
value = aws_instance.this.private_ip
}
output "public_ip" {
description = "Public IP address"
value = aws_instance.this.public_ip
}
Using Modules
# main.tf - Using the module
module "web_server" {
source = "./modules/ec2-instance"
name = "web-server-${var.environment}"
ami_id = data.aws_ami.ubuntu.id
instance_type = var.instance_type
subnet_id = aws_subnet.public.id
security_group_ids = [aws_security_group.web.id]
key_name = aws_key_pair.deployer.key_name
tags = {
Environment = var.environment
Project = "web-app"
}
}
# Data source for AMI
data "aws_ami" "ubuntu" {
most_recent = true
owners = ["099720109477"] # Canonical
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
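The module call references `aws_key_pair.deployer`, which is not defined in this article. A minimal sketch of that resource might look like the following. The variable `deployer_public_key` and the key name format are assumptions introduced for illustration.

```hcl
# Hypothetical input holding the public half of an existing SSH key pair
variable "deployer_public_key" {
  description = "SSH public key in OpenSSH format"
  type        = string
}

# Registers the public key with EC2 so instances can reference it by name
resource "aws_key_pair" "deployer" {
  key_name   = "deployer-${var.environment}"
  public_key = var.deployer_public_key
}
```

Only the public key is passed to Terraform here; the private key stays on the operator's machine and never enters the state file.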
State Management
Remote State Backends
# Backend configuration for S3
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "global/s3/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-state-lock"
encrypt = true
}
}
# Backend configuration for Azure
terraform {
backend "azurerm" {
resource_group_name = "tfstate"
storage_account_name = "tfstate12345"
container_name = "tfstate"
key = "production.terraform.tfstate"
}
}
# Backend configuration for GCS
terraform {
backend "gcs" {
bucket = "tf-state-prod"
prefix = "terraform/state"
}
}
State Locking
# DynamoDB table for state locking
resource "aws_dynamodb_table" "terraform_state_lock" {
name = "terraform-state-lock"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
tags = {
Name = "Terraform State Lock Table"
}
}
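The S3 backend also needs a bucket to exist before `terraform init` can use it. The following sketch shows one way to create it with versioning, encryption, and public access blocked, using the split-resource syntax of AWS provider 4.x. The resource label `state` is an assumption, and this would normally live in a separate bootstrap configuration, since a backend cannot store state in a bucket that the same configuration has yet to create.

```hcl
resource "aws_s3_bucket" "state" {
  bucket = "my-terraform-state-bucket"
}

# Keep previous state versions so a bad write can be rolled back
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state objects at rest
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# State files can contain secrets, so block all public access
resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```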
Advanced Terraform Features
Dynamic Blocks
# Using dynamic blocks for complex configurations
resource "aws_security_group" "web" {
name = "web-sg"
description = "Web security group"
vpc_id = aws_vpc.main.id
dynamic "ingress" {
for_each = var.security_group_rules
content {
description = ingress.value.description
from_port = ingress.value.port
to_port = ingress.value.port
protocol = ingress.value.protocol
cidr_blocks = ingress.value.cidr_blocks
}
}
}
# Variables for dynamic blocks
variable "security_group_rules" {
description = "Security group rules"
type = list(object({
description = string
port = number
protocol = string
cidr_blocks = list(string)
}))
default = [
{
description = "HTTP"
port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
},
{
description = "HTTPS"
port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
]
}
Conditional Expressions and Loops
# Conditional resources
resource "aws_instance" "web" {
count = var.create_instance ? 1 : 0
ami = var.ami_id
instance_type = var.instance_type
# ... other configuration
}
# For_each with maps
resource "aws_instance" "servers" {
for_each = var.servers
ami = each.value.ami
instance_type = each.value.instance_type
subnet_id = each.value.subnet_id
tags = {
Name = each.key
Role = each.value.role
}
}
# Variables for for_each
variable "servers" {
description = "Server configurations"
type = map(object({
ami = string
instance_type = string
subnet_id = string
role = string
}))
default = {
web1 = {
ami = "ami-123456"
instance_type = "t3.micro"
subnet_id = "subnet-123456"
role = "web"
}
app1 = {
ami = "ami-789012"
instance_type = "t3.small"
subnet_id = "subnet-789012"
role = "app"
}
}
}
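Resources created with `count` or `for_each` are collections, so outputs have to reference them accordingly. This sketch shows one way to do that for the two resources above; the output names are hypothetical.

```hcl
# Map of server name => private IP, built from the for_each instances
output "server_private_ips" {
  description = "Private IP of each server, keyed by name"
  value       = { for name, server in aws_instance.servers : name => server.private_ip }
}

# With count set to 0 or 1, one() returns the single ID or null
output "web_instance_id" {
  description = "ID of the conditional web instance, or null if not created"
  value       = one(aws_instance.web[*].id)
}
```

Referencing `aws_instance.web.id` directly would fail here, because a resource with `count` must be addressed by index or with a splat expression.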
Terraform with Multiple Providers
Multi-Cloud Configuration
# Multiple provider configuration
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.0"
}
google = {
source = "hashicorp/google"
version = "~> 4.0"
}
}
}
# AWS Provider
provider "aws" {
region = "us-east-1"
}
# Azure Provider
provider "azurerm" {
features {}
subscription_id = var.azure_subscription_id
client_id = var.azure_client_id
client_secret = var.azure_client_secret
tenant_id = var.azure_tenant_id
}
# Google Cloud Provider
provider "google" {
project = var.gcp_project_id
region = "us-central1"
}
# Cross-cloud resources
resource "aws_s3_bucket" "logs" {
bucket = "my-app-logs"
}
resource "google_storage_bucket" "backup" {
name = "my-app-backup"
location = "US"
}
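The same mechanism works for several configurations of a single provider, for example two AWS regions. The sketch below uses a provider alias; the alias `west`, the region, and the bucket name are assumptions chosen for illustration.

```hcl
# Second AWS provider configuration, distinguished by its alias
provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

# Resources opt in to the aliased configuration explicitly
resource "aws_s3_bucket" "logs_replica" {
  provider = aws.west
  bucket   = "my-app-logs-replica"
}
```

Resources without a `provider` argument continue to use the default, unaliased `aws` configuration.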
Terraform Workspaces
Environment Management
# Using workspaces for multiple environments
# Create workspaces
terraform workspace new development
terraform workspace new staging
terraform workspace new production
# Select workspace
terraform workspace select development
# Configuration using workspace
locals {
environment = terraform.workspace
common_tags = {
Environment = local.environment
Project = "my-app"
Terraform = "true"
}
instance_sizes = {
development = "t3.micro"
staging = "t3.small"
production = "t3.medium"
}
}
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = local.instance_sizes[local.environment]
tags = local.common_tags
}
# Workspace-specific variables
variable "instance_count" {
description = "Number of instances"
type = number
default = 1
}
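# Variable files are not selected per workspace automatically, and every
# *.auto.tfvars file is loaded in all workspaces. Keep one plain .tfvars
# file per environment and pass it explicitly:
# terraform apply -var-file="development.tfvars"
# development.tfvars
instance_count = 1
# production.tfvars
instance_count = 3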
Security and Best Practices
Security Considerations
# Secure Terraform configuration
# 1. Never commit secrets to version control
# 2. Use environment variables or secret management
# 3. Implement least privilege for IAM roles
# 4. Enable encryption for state files
# 5. Use private modules and providers
# Example with sensitive data
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
resource "aws_db_instance" "database" {
identifier = "prod-db"
engine = "mysql"
instance_class = "db.t3.micro"
username = "admin"
password = var.db_password # Marked as sensitive
# Additional security settings
storage_encrypted = true
backup_retention_period = 7
deletion_protection = true
}
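For point 2 in the list above, one option is to read the password from a secret manager instead of passing it as a variable. The sketch below assumes the secret already exists in AWS Secrets Manager under the hypothetical name `prod/db/password` and stores a plain string.

```hcl
# Look up the current version of an existing secret
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password" # hypothetical secret name
}

locals {
  # Mark the value sensitive so it is redacted in plan and apply output
  db_password = sensitive(data.aws_secretsmanager_secret_version.db_password.secret_string)
}
```

The database resource could then use `password = local.db_password`. Note that the value is still written to the state file, which is one reason state encryption appears in the same list.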
Infrastructure as Code Best Practices
# 1. Use version control
# 2. Implement code review processes
# 3. Use modules for reusability
# 4. Validate with terraform validate
# 5. Plan before apply
# 6. Use remote state with locking
# 7. Implement proper tagging
# 8. Use variables for configuration
# 9. Catch invalid inputs early with variable validation blocks
# 10. Regular dependency updates
# Example of well-structured configuration
# File structure:
terraform/
├── environments/
│ ├── production/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── terraform.tfvars
│ └── development/
│ ├── main.tf
│ ├── variables.tf
│ └── terraform.tfvars
├── modules/
│ ├── networking/
│ ├── compute/
│ └── database/
└── scripts/
├── plan.sh
└── apply.sh
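For point 7 in the list above, the AWS provider can apply a common set of tags to every taggable resource through `default_tags`, available since provider version 3.38. This is a sketch with illustrative tag keys and values, not a required scheme.

```hcl
provider "aws" {
  region = var.aws_region

  # Applied to every taggable resource created through this provider
  default_tags {
    tags = {
      Environment = var.environment
      Project     = "my-app"
      ManagedBy   = "terraform"
    }
  }
}
```

Tags set on an individual resource are merged with these defaults, and the resource-level value wins when the same key appears in both.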
Terraform in CI/CD Pipelines
GitHub Actions Example
# .github/workflows/terraform.yml
name: 'Terraform'
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  terraform:
    name: 'Terraform'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.3.0
      - name: Terraform Format
        id: fmt
        run: terraform fmt -check
      - name: Terraform Init
        id: init
        run: terraform init
      - name: Terraform Validate
        id: validate
        run: terraform validate -no-color
      - name: Terraform Plan
        id: plan
        run: terraform plan -input=false -no-color
        continue-on-error: true
      - name: Terraform Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -auto-approve -input=false
Troubleshooting and Debugging
Common Issues and Solutions
# 1. State locking issues
# Solution: Confirm no other run is active, then release a stale lock with terraform force-unlock LOCK_ID
# 2. Provider authentication errors
# Solution: Verify credentials and permissions
# 3. Resource dependency issues
# Solution: Use explicit dependencies with depends_on
# 4. State file conflicts
# Solution: Use remote state with proper locking
# 5. Plan/apply discrepancies
# Solution: Save the plan with -out and apply that exact plan file
# Debugging with TF_LOG
export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform.log
terraform plan
# Using terraform console for testing
terraform console
> aws_vpc.main.cidr_block
> length(aws_instance.web)
> local.common_tags
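Explicit dependencies are only needed when one resource relies on another without referencing any of its attributes. The sketch below shows the pattern with an instance that needs internet access at boot. The names `egress` and `app` are hypothetical, and `var.ami_id` and `var.instance_type` are assumed to be declared elsewhere.

```hcl
resource "aws_internet_gateway" "egress" {
  vpc_id = aws_vpc.main.id
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = aws_subnet.public.id

  # No attribute of the gateway is referenced above, so Terraform cannot
  # infer this ordering on its own
  depends_on = [aws_internet_gateway.egress]
}
```

Where an attribute reference is possible, it is usually preferable to `depends_on`, because Terraform then derives the ordering from the reference itself.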
Real-World Examples
Complete AWS Infrastructure
# Complete web application infrastructure
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 3.0"
name = "web-app-vpc"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
enable_nat_gateway = true
enable_vpn_gateway = false
tags = {
Environment = var.environment
Project = "web-app"
}
}
module "web_server" {
source = "./modules/web-cluster"
name = "web"
environment = var.environment
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.public_subnets
instance_type = var.instance_type
min_size = var.min_size
max_size = var.max_size
key_name = aws_key_pair.deployer.key_name
}
module "database" {
source = "./modules/database"
name = "webapp"
environment = var.environment
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
allocated_storage = 20
instance_class = "db.t3.micro"
username = var.db_username
password = var.db_password
backup_retention = 7
}
Conclusion
Terraform provides a consistent, declarative approach to provisioning cloud resources across multiple providers. Its module system, extensive provider ecosystem, and active community have made it one of the most widely adopted tools for Infrastructure as Code.
Key Benefits of Terraform:
- Cloud-agnostic infrastructure management
- Declarative configuration with HCL
- Strong state management and locking
- Extensive provider ecosystem
- Powerful module system for reusability
- Excellent integration with CI/CD pipelines
By following Terraform best practices for security, state management, and code organization, teams can build robust, maintainable infrastructure that scales with their application needs.