# Storage and Offloaded Asset Uploads

A critical antipattern in serverless environments is proxying binary file uploads directly through the compute container.

## The Upload Problem at Scale

Handling 5 MB to 20 MB image and PDF uploads through Cloud Run instances would:
1. Saturate the container's RAM.
2. Tie up vCPU cycles waiting on network buffering from mobile clients.
3. Trigger unnecessary autoscaling, adding instances just to buffer bytes rather than to run application logic.

## Solution: Cloud Storage Signed URLs

We fully bypass the API servers for the upload data transfer by using GCS V4 Signed URLs.

```mermaid
sequenceDiagram
    participant Mobile as Mobile App
    participant API as Cloud Run API (Duuble)
    participant GCS as Google Cloud Storage

    Note over Mobile, API: 1. Presign Phase
    Mobile->>API: POST /uploads/presign (file metadata)
    API->>API: Validate user quotas and file extension
    API->>GCS: Generate V4 PUT Signed URL
    API-->>Mobile: Return Signed URL + required checksum header

    Note over Mobile, GCS: 2. Upload Phase (No Compute API Impact)
    Mobile->>GCS: PUT binary file straight to URL with x-goog-content-sha256
    GCS-->>Mobile: 200 OK

    Note over Mobile, API: 3a. Complete Phase (Success)
    Mobile->>API: POST /uploads/complete (uploadId, checksum)
    API->>API: Verify checksum matches upload intent
    API->>GCS: Verify Object Exists, content type, and size
    API->>API: Store Asset ID in Database
    API-->>Mobile: 201 Created (Asset Ready)
    
    Note over Mobile, API: 3b. Complete Phase (Failure: Checksum Mismatch)
    Mobile->>API: POST /uploads/complete (uploadId, BAD_checksum)
    API->>API: Mismatch detected against upload intent
    API-->>Mobile: 422 Unprocessable Entity (Upload failed, restart)
```

With this flow, the Cloud Run API is involved only in the **authorization/intent** step (milliseconds) and the **finalization** step (milliseconds). The actual transfer of bytes is securely offloaded to Google's highly optimized edge storage network.
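
The presign step of this flow could look roughly like the Python sketch below. It assumes the API uses the `google-cloud-storage` client, whose `Blob.generate_signed_url` accepts `version`, `method`, `expiration`, and `headers`. The function name `presign_put`, the 15-minute expiry, and the hex encoding of the checksum are assumptions made for illustration, not details taken from this design.

```python
from datetime import timedelta


def presign_put(blob, content_type: str, sha256_hex: str,
                ttl: timedelta = timedelta(minutes=15)) -> dict:
    """Build a V4 signed PUT URL plus the headers the client must send.

    `blob` is expected to be a google.cloud.storage.Blob (or any object
    with a compatible generate_signed_url method). The function name and
    the default TTL are hypothetical.
    """
    # Headers passed to the signer become part of the signature, so the
    # client must send exactly these values with its PUT request.
    required_headers = {
        "Content-Type": content_type,
        "x-goog-content-sha256": sha256_hex,  # hex encoding is an assumption
    }
    url = blob.generate_signed_url(
        version="v4",
        method="PUT",
        expiration=ttl,
        headers=dict(required_headers),  # pass a copy to the signer
    )
    return {"url": url, "headers": required_headers}
```

In a real service, `blob` would typically come from `storage.Client().bucket(bucket_name).blob(object_name)`; bucket and object naming are outside the scope of this sketch.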

## Checksum Mismatch Error Handling

The client supplies a SHA-256 checksum during presign. The API signs it into the required `x-goog-content-sha256` PUT header, so GCS rejects a mismatched upload body. The client then sends the same checksum in `POST /uploads/complete`; if that value does not match the upload intent, the API rejects finalization.
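
The checksum handling described above can be sketched in Python. This is only an illustration: `sha256_hex`, `check_completion`, the `"CREATED"` label, and the returned tuple shape are hypothetical. The sketch covers the checksum comparison alone, not the object-existence, content-type, or size checks that the API also performs at finalization.

```python
import hashlib
import hmac
from typing import Tuple


def sha256_hex(data: bytes) -> str:
    """Client side: checksum of the file body, sent at presign and complete."""
    return hashlib.sha256(data).hexdigest()


def check_completion(intent_checksum: str, submitted_checksum: str) -> Tuple[int, str]:
    """Server side: compare the checksum sent to /uploads/complete with the
    one stored on the upload intent at presign time."""
    # Normalize case so equivalent hex strings compare equal;
    # compare_digest runs in constant time for equal-length inputs.
    same = hmac.compare_digest(
        intent_checksum.lower().encode(),
        submitted_checksum.lower().encode(),
    )
    if not same:
        return 422, "UPLOAD_CHECKSUM_MISMATCH"
    return 201, "CREATED"
```

A client would call `sha256_hex(file_bytes)` once and reuse the result for both the presign request and the completion request.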

When the API rejects finalization because of a checksum mismatch:
1. The API responds with `422 UPLOAD_CHECKSUM_MISMATCH`.
2. The pending upload remains unfinalized.
3. The client may retry only if it can supply the exact checksum used during presign; otherwise the UI must restart the upload flow for that file from Phase 1 (`POST /uploads/presign`).

*Fallback:* a 24-hour GCS Lifecycle Policy automatically sweeps the temporary bucket and deletes any unfinalized or orphaned objects to save costs.

## Upload Purposes

- `profile_photo`: profile image upload, image MIME types only, max 10 MB.
- `hub_photo`: hub image upload, image MIME types only, max 10 MB.
- `post_image`: post article/comment/discussion image upload, image MIME types only, max 10 MB.
- `document`: PDF upload, `application/pdf` only, max 20 MB.
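
The limits listed under Upload Purposes could be encoded as a small lookup table that the presign endpoint checks before issuing a signed URL. The sketch below is a hedged illustration in Python: `UPLOAD_PURPOSES`, `validate_upload`, and the error strings are hypothetical, "MB" is assumed to mean MiB, and "image MIME types only" is approximated as any `image/*` type (a production service might prefer an explicit allowlist).

```python
from typing import Optional

MB = 1024 * 1024  # assumption: "MB" in the limits means MiB

# purpose -> (allowed MIME family or exact type, max size in bytes)
UPLOAD_PURPOSES = {
    "profile_photo": ("image/", 10 * MB),
    "hub_photo": ("image/", 10 * MB),
    "post_image": ("image/", 10 * MB),
    "document": ("application/pdf", 20 * MB),
}


def validate_upload(purpose: str, mime_type: str, size_bytes: int) -> Optional[str]:
    """Return None when the request is acceptable, else a short reason."""
    rule = UPLOAD_PURPOSES.get(purpose)
    if rule is None:
        return "unknown purpose"
    allowed, max_bytes = rule
    # A trailing slash marks a MIME family (any image/*); otherwise exact match.
    if allowed.endswith("/"):
        mime_ok = mime_type.startswith(allowed)
    else:
        mime_ok = mime_type == allowed
    if not mime_ok:
        return "MIME type not allowed for this purpose"
    if size_bytes <= 0 or size_bytes > max_bytes:
        return "file size out of range"
    return None
```

For example, `validate_upload("document", "image/png", 1024)` returns a MIME-type error, while a 5 MiB `image/png` with purpose `profile_photo` passes.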

## Primary Databases

*   **Cloud Spanner:** The system of record, replicated for high availability. Hierarchical tables use `INTERLEAVE IN PARENT`, and query latency stays reliably below 10 ms for single-row and interleaved reads.
*   **Memorystore (Redis):** Caching layer used for tracking HTTP session validity (refresh tokens), estimating real-time user counts (HyperLogLog), and maintaining abuse-protection buckets for OTP limits.
