# Infrastructure as Code: Terraform

Current Terraform requirements are captured in [10-terraform-requirements-qa.md](./10-terraform-requirements-qa.md). Where this document and that Q&A differ, the Q&A is the source of truth for implementation.

To orchestrate this cloud architecture reliably, all infrastructure will be managed with **Terraform**. Terraform builds a dependency graph from the references between resources, so services are created, updated, and destroyed in the correct order.

## Terraform Dependency Graph

```mermaid
graph TD
    %% Base Infrastructure
    Network[VPC Networks] --> Redis[Memorystore Redis]
    Network --> Spanner[Cloud Spanner Instance]
    
    %% Storage
    Bucket[GCS Asset Bucket]
    
    %% IAM and Secrets
    Secret[Secret Manager] --> ServiceAccount[Cloud Run Service Account]
    ServiceAccount -->|Storage Object Creator| Bucket
    ServiceAccount -->|Database User| Spanner
    
    %% Compute
    Docker[Artifact Registry] --> CR[Cloud Run Service - Primary]
    Docker --> CR_SSE[Cloud Run Service - SSE]
    ServiceAccount --> CR
    ServiceAccount --> CR_SSE
    Spanner --> CR
    Spanner --> CR_SSE
    Redis --> CR
    
    %% Load Balancing & Edge
    NEG[Serverless NEG - Primary] --> LBBackend[Backend Service - Default]
    NEG_SSE[Serverless NEG - SSE] --> LBBackend_SSE[Backend Service - 3600s Timeout]
    
    CR --> NEG
    CR_SSE --> NEG_SSE
    
    LBBackend -->|Enables| CDN[Cloud CDN]
    
    URLMap[URL Map] -->|Path: /*| LBBackend
    URLMap -->|Path: /api/v1/notifications/stream| LBBackend_SSE
    URLMap --> LB[External HTTP/S Load Balancer]
    
    %% Security
    ArmorPolicy[Cloud Armor Policy]
    ArmorRules[Cloud Armor Rules: Throttle & Adaptive Protection]
    ArmorPolicy --> ArmorRules
    ArmorRules --> LB
```
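
As a rough illustration of how a few edges of this graph could translate into HCL, the sketch below grants the service account its bucket and database roles and attaches it to the primary Cloud Run service. Terraform infers most of the ordering from the references themselves. The resource labels `run`, `assets_writer`, `db_user`, and `primary`, the account ID, the service name, the region, and the `image` variable are assumptions made for this example; `google_storage_bucket.asset_bucket`, `google_spanner_instance.main`, and `google_spanner_database.duuble_db` are the names used in the backup section of this document.

```hcl
# Hypothetical variable: full image reference pinned by digest, supplied by CI.
variable "image" {
  type        = string
  description = "Artifact Registry image reference pinned by digest (name@sha256:<digest>)"
}

resource "google_service_account" "run" {
  account_id   = "cloud-run-app" # assumed account ID
  display_name = "Cloud Run Service Account"
}

# ServiceAccount -->|Storage Object Creator| Bucket
resource "google_storage_bucket_iam_member" "assets_writer" {
  bucket = google_storage_bucket.asset_bucket.name
  role   = "roles/storage.objectCreator"
  member = "serviceAccount:${google_service_account.run.email}"
}

# ServiceAccount -->|Database User| Spanner
resource "google_spanner_database_iam_member" "db_user" {
  instance = google_spanner_instance.main.name
  database = google_spanner_database.duuble_db.name
  role     = "roles/spanner.databaseUser"
  member   = "serviceAccount:${google_service_account.run.email}"
}

# ServiceAccount --> CR
resource "google_cloud_run_v2_service" "primary" {
  name     = "primary"     # assumed service name
  location = "us-central1" # assumed region

  template {
    service_account = google_service_account.run.email

    containers {
      image = var.image
    }
  }

  # No attribute above references the IAM grants, so this ordering is explicit.
  depends_on = [
    google_storage_bucket_iam_member.assets_writer,
    google_spanner_database_iam_member.db_user,
  ]
}
```

The `depends_on` list is only needed for edges that no attribute reference expresses; everywhere else, the reference itself is the dependency.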

## Management Strategy

*   **State Management:** Terraform state will be stored in a versioned GCS bucket. GCS encrypts objects at rest by default, and the `gcs` backend locks the state during write operations.
*   **Environment Segregation:** The first apply targets `dev`. `prod` will reuse the same modules later, and each environment gets its own state prefix, such as `env/dev` and `env/prod`.
*   **Deployment Pipeline:**
    1. The core infrastructure (VPC, Spanner, Redis, Artifact Registry) is applied first.
    2. GitHub Actions builds the `.NET Native AOT` Docker image and pushes it to Artifact Registry.
    3. The pipeline runs `terraform plan`; `terraform apply` remains manual for now.
    4. Cloud Run deploys reference the new image digest, which creates a new revision and shifts traffic to it without downtime.
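
The state layout described above could look roughly like the following sketch. The bucket name `example-tfstate` and the resource label `tfstate` are placeholders, not names taken from this project, and the state bucket is assumed to be created once from a separate bootstrap configuration rather than from the configuration that stores its state in it.

```hcl
# Bootstrap configuration (applied once, separately): the state bucket itself.
resource "google_storage_bucket" "tfstate" {
  name                        = "example-tfstate" # hypothetical bucket name
  location                    = "US"
  uniform_bucket_level_access = true
  public_access_prevention    = "enforced"

  # Keep earlier state versions so a bad apply can be rolled back.
  versioning {
    enabled = true
  }
}

# Main configuration: store state for the dev environment under env/dev.
terraform {
  backend "gcs" {
    bucket = "example-tfstate" # must match the bootstrap bucket
    prefix = "env/dev"
  }
}
```

Backend blocks cannot reference variables, so one option for `prod` is to omit `prefix` from the block and pass it at init time, for example `terraform init -backend-config="prefix=env/prod"`.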

## Specific Backup Configurations in Terraform

The disaster recovery settings are defined in Terraform code so that they stay consistent across environments:

### 1. Spanner Backup Configuration
```hcl
resource "google_spanner_database" "duuble_db" {
  instance                 = google_spanner_instance.main.name
  name                     = "duuble-db"
  version_retention_period = "7d" # Point-in-time recovery (PITR) window; 7 days is the maximum
}

resource "google_spanner_backup_schedule" "daily_30d_snapshot" {
  instance = google_spanner_instance.main.name
  database = google_spanner_database.duuble_db.name
  name     = "daily-backup-30d-retention"

  spec {
    # One backup per day, triggered at midnight UTC
    cron_spec {
      text = "0 0 * * *"
    }
  }

  # Full backups only (the alternative is incremental_backup_spec {})
  full_backup_spec {}

  # Keep each backup for 30 days
  retention_duration = "2592000s" # 30 days * 24 hours * 60 mins * 60 secs
}
```

### 2. GCS Asset Bucket Configuration
```hcl
resource "google_storage_bucket" "asset_bucket" {
  name     = "duuble-assets-bucket-long-velocity-496203-v9-${var.environment}"
  location = "US" # Multi-region location for geo-redundancy

  # Soft delete keeps accidentally deleted objects restorable during the retention window
  soft_delete_policy {
    retention_duration_seconds = 1209600 # 14 days
  }

  # Lifecycle rule to delete abandoned temporary uploads
  lifecycle_rule {
    condition {
      age            = 1
      matches_prefix = ["tmp/"]
    }
    action {
      type = "Delete"
    }
  }
}
```
