# Cloud Architecture Overview

This document provides a high-level overview of the Duuble backend infrastructure on Google Cloud Platform (GCP).

## Architecture Diagram

The system is designed to scale to 1M requests per minute (RPM), using serverless and managed services to reduce operational overhead while maintaining high performance.

```mermaid
graph TD
    Client[Mobile App Client]

    subgraph "Google Cloud Edge"
        CA[Cloud Armor WAF & Rate Limiting]
        ExtLB[External HTTPS Load Balancer]
        CDN[Cloud CDN]
    end

    subgraph "Compute Tier"
        CR[Cloud Run <br> Primary API]
        CR_SSE[Cloud Run <br> Dedicated SSE Service]
    end

    subgraph "State & Storage Tier"
        Spanner[(Cloud Spanner <br> PostgreSQL Dialect)]
        Redis[(Memorystore for Redis)]
    end

    subgraph "Asset Storage"
        GCS[(Cloud Storage)]
    end

    %% Flows
    Client -->|API Requests & Asset Reads| CA
    Client -->|Binary Uploads PUT| GCS
    
    CA --> ExtLB
    ExtLB --> CDN
    CDN --> CR
    CDN -->|Finalized Asset Reads| GCS
    ExtLB -->|/api/v1/notifications/stream| CR_SSE
    
    CR -->|Read/Write Data| Spanner
    CR -->|Sessions/Cache| Redis
    CR -->|Generate Signed URLs| GCS
    
    CR_SSE -->|Read Stream| Spanner
```
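
The routing edges in the diagram can be read as a small decision function. The sketch below is illustrative only: in practice this routing would live in the load balancer configuration, not in application code. The stream path comes from the diagram; `ASSET_PREFIX`, the backend names, and `route_request` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

# Path taken from the diagram above.
SSE_STREAM_PATH = "/api/v1/notifications/stream"
# Hypothetical prefix for finalized assets; the real path layout is not specified here.
ASSET_PREFIX = "/assets/"


@dataclass(frozen=True)
class Route:
    backend: str   # which backend receives the request
    via_cdn: bool  # whether Cloud CDN sits in front of that backend


def route_request(path: str) -> Route:
    """Pick a backend for a request path, mirroring the edges in the diagram."""
    if path == SSE_STREAM_PATH:
        # Long-lived SSE connections skip the CDN and go to the dedicated service.
        return Route(backend="cloud-run-sse", via_cdn=False)
    if path.startswith(ASSET_PREFIX):
        # Finalized asset reads are served from Cloud Storage through the CDN.
        return Route(backend="cloud-storage", via_cdn=True)
    # Everything else is an API request for the primary Cloud Run service.
    return Route(backend="cloud-run-primary", via_cdn=True)
```

For example, `route_request("/api/v1/notifications/stream")` selects the SSE service, while any other API path falls through to the primary service behind the CDN.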

## Key Components

1.  **Cloud Armor:** Enforces WAF rules and rate limiting at the edge, protecting against DDoS and other volumetric attacks before traffic reaches the compute tier.
2.  **Cloud CDN:** Edge caching layer that caches public/safe read-only API responses and finalized asset reads, leveraging HTTP `Cache-Control` headers to offload read traffic from the compute and storage tiers.
3.  **Cloud Run (Primary):** The core compute layer running the .NET backend API. It is serverless, scales automatically, and uses Native AOT compilation to keep startup times short.
4.  **Cloud Run (SSE Streamer):** A dedicated, isolated service with high concurrency limits explicitly tuned for handling long-lived Server-Sent Event connections without exhausting primary API resources.
5.  **Cloud Spanner:** The globally consistent, distributed relational database handling the core social graph and transactional data.
6.  **Memorystore (Redis):** Handles fast-moving session data, short-lived auth state (e.g., JWT refresh tokens, OTP challenge states), distributed rate limiting buckets, and application-level caching.
7.  **Cloud Storage (GCS):** Stores all immutable binary assets (images, PDFs), which clients upload directly using Signed URLs generated by the primary API.
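
To make the SSE service's job concrete, here is a minimal sketch of the `text/event-stream` wire format it would write to each long-lived connection. The function names and the `notification` event name in the usage note are assumptions for illustration, not the actual backend code (which is .NET).

```python
from typing import Optional


def format_sse_event(data: str, event: Optional[str] = None,
                     event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" field.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def heartbeat() -> str:
    """Comment line (starts with ':') that clients ignore; useful on idle streams."""
    return ": keep-alive\n\n"
```

Calling `format_sse_event("a\nb", event="notification", event_id="7")` yields an `id:` line, an `event:` line, and two `data:` lines followed by a blank line. Periodic heartbeats are one common way to stop intermediaries from closing idle connections; whether Duuble sends them is not stated here.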

## Caching Strategy

Duuble uses a two-tier caching strategy to keep response latency low at high scale:

1.  **Tier 1: Edge Caching (Cloud CDN)**
    *   Sits at the Global Load Balancer.
    *   **Use Case:** Caches highly accessed, public read-only API endpoints (e.g., `GET /auth/login/config`, public hub profiles, static configs) and finalized immutable assets.
    *   **Mechanism:** The .NET API emits `Cache-Control: public, max-age=...` headers for safe shared responses. Finalized asset paths are immutable and served through CDN-backed URLs. Cloud CDN serves cache hits directly from Google's edge POPs (Points of Presence) closest to the user, bypassing Cloud Run and reducing origin load.
2.  **Tier 2: Distributed Data Cache (Memorystore for Redis)**
    *   Sits within the VPC, where Cloud Run can reach it with low latency.
    *   **Use Case:** Fast aggregates, session state, and rate limits.
    *   **Mechanism:** Stores state that cannot be cached at the edge because it is user-specific or mutates too quickly (e.g., JWT Refresh Tokens, HyperLogLog active user counts, OTP challenge states).
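
The split between the two tiers comes down to which `Cache-Control` header a response carries. The following is a minimal sketch of that decision, assuming a hypothetical helper `cache_control_header`; the default `max_age_seconds` value is a placeholder, since the document does not specify actual TTLs.

```python
def cache_control_header(is_public: bool, max_age_seconds: int = 60,
                         immutable: bool = False) -> str:
    """Build a Cache-Control value for a response.

    Public, shared responses are eligible for edge caching (Tier 1).
    User-specific responses are marked uncacheable, so any caching for
    them has to happen in the application tier (Tier 2).
    """
    if not is_public:
        return "private, no-store"
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must be non-negative")
    directives = ["public", f"max-age={max_age_seconds}"]
    if immutable:
        # Finalized assets never change, so clients can skip revalidation.
        directives.append("immutable")
    return ", ".join(directives)
```

A public config endpoint might use `cache_control_header(True, 300)`, a finalized asset `cache_control_header(True, 31536000, immutable=True)`, and anything tied to a user `cache_control_header(False)`.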

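The rate-limit use case in Tier 2 can be sketched as a per-key counter that resets every window, which maps naturally onto a Redis counter with a TTL. The document does not say which algorithm the rate limiting buckets use, so the fixed-window approach here is an assumption chosen for brevity. The class below keeps counters in a local dictionary as a stand-in for Redis so that it runs without a server; `FixedWindowRateLimiter` and its parameters are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """In-memory stand-in for a Redis counter (INCR plus a TTL on the key)."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # key -> (window index, request count); Redis would expire old keys itself.
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str) -> bool:
        """Count one request for `key`; return False once the limit is exceeded."""
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self._counters.get(key, (window, 0))
        if stored_window != window:
            count = 0  # a new window started, so the old count no longer applies
        count += 1
        self._counters[key] = (window, count)
        return count <= self.limit
```

With `limit=2` and `window_seconds=60`, the third call to `allow("user:1")` inside the same minute returns `False`, and the counter resets when the next window begins. A shared Redis instance is what makes the limit hold across all Cloud Run instances; a local dictionary like this one would only limit a single process.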